The P.S. (Post Scriptum) Project undertakes the systematic research, publication and historical-linguistic study of private letters written in Portugal and Spain during the Early Modern period. These documents are unpublished letters written by authors from different social backgrounds: masters or servants, adults or children, men or women, thieves, soldiers, artisans, priests and political activists, among other kinds of social agents. In most cases the letters survived by chance, when their authors' paths crossed the prosecution mechanisms of the Inquisition and the civil courts, two institutions that used private correspondence as criminal evidence. In other, much less frequent cases, the letters were preserved in a non-criminal context, although they also belong to the realm of backstage interaction and are easy to place in their situational context. These textual resources often present an (almost) oral rhetoric, dealing with everyday issues of past centuries in a register that has not been easy to study except through brief examples. The P.S. Project not only presents a wide collection of private letters but also makes it available as a scholarly digital edition and as an annotated corpus.
Abbreviations are expanded, with the expansions signalled, and word boundaries are standardized (except for enclitic forms), as is the distribution of «i», «j», «u» and «v». A color code highlights the abbreviation expansions, the conjectures, the difficult readings, and the manuscript erasures and additions.
Standardization for the purpose of corpus annotation: token spelling is standardized and modern punctuation marks are introduced. Regional and archaic forms are also given modern spelling and then linked to the standard lexicon. Conjectured readings are adopted in the standardized format, and omitted segments are marked with ellipsis points within square brackets, regardless of the cause of the omission.
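The standardization layer described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the `Token` record, its field names, and the exact omission string `[...]` are hypothetical and are not taken from the project's actual data model or tools.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed rendering of "suspension points within square brackets".
OMISSION_MARK = "[...]"


@dataclass
class Token:
    """Hypothetical record pairing a transcribed form with its standardized form."""
    transcribed: str                     # form as it appears in the transcription
    standardized: Optional[str] = None   # modern spelling, when it differs
    omitted: bool = False                # True when the segment could not be recovered


def standardized_form(token: Token) -> str:
    """Return the form used in the standardized layer for one token."""
    if token.omitted:
        # Omitted segments get the same mark whatever caused the omission.
        return OMISSION_MARK
    # Fall back to the transcribed form when no modernization is needed.
    return token.standardized or token.transcribed


def standardized_text(tokens: List[Token]) -> str:
    """Join the standardized forms of a token sequence into one string."""
    return " ".join(standardized_form(t) for t in tokens)


# Illustrative tokens: two archaic Portuguese spellings and one omitted segment.
sample = [
    Token("he", standardized="é"),
    Token("hua", standardized="uma"),
    Token("", omitted=True),
    Token("carta"),
]
print(standardized_text(sample))
```

Keeping both forms on the same record is one way to let the edition display the transcription while the annotated corpus works from the standardized spelling.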
The P.S. Post Scriptum corpus is linguistically annotated at both the morphosyntactic and the syntactic levels. The morphosyntactic annotation follows the EAGLES tag system for Spanish with slight modifications (cf. P.S. Post Scriptum tagset). The syntactic annotation adopts the system originally developed for the Penn Parsed Corpora of Historical English. The annotation guidelines for Portuguese were established in close cooperation with the Tycho Brahe team at UNICAMP, São Paulo, and the CORDIAL-SIN and WOChWEL teams at CLUL.
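EAGLES-style tags are positional: each character of the tag encodes one feature. The sketch below decodes the first four positions of a noun tag following the general EAGLES convention for Spanish (category, type, gender, number). It is an assumption-laden illustration, not the project's code: the function name `decode_noun_tag` is hypothetical, and since the P.S. Post Scriptum tagset modifies EAGLES slightly, the project's own tagset documentation remains the authority on actual values.

```python
from typing import Dict

# Feature values for noun tags in the general EAGLES convention for Spanish.
# The P.S. Post Scriptum tagset may differ in detail (assumption).
NOUN_TYPE = {"C": "common", "P": "proper"}
GENDER = {"M": "masculine", "F": "feminine", "C": "common"}
NUMBER = {"S": "singular", "P": "plural", "N": "invariable"}


def decode_noun_tag(tag: str) -> Dict[str, str]:
    """Decode the first four positions of an EAGLES-style noun tag."""
    if len(tag) < 4 or tag[0] != "N":
        raise ValueError(f"not a noun tag: {tag!r}")
    return {
        "category": "noun",
        # "0" or any unknown character is reported as unspecified.
        "type": NOUN_TYPE.get(tag[1], "unspecified"),
        "gender": GENDER.get(tag[2], "unspecified"),
        "number": NUMBER.get(tag[3], "unspecified"),
    }


# Example: a tag such as a tagger might assign to the Spanish noun "carta".
print(decode_noun_tag("NCFS000"))
```

Other categories (verbs, adjectives, pronouns and so on) use the same positional principle with their own feature slots.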
Some current statistics on the overall corpus are available in the Word Distribution window.