Abstract:
This article presents our approach to developing a system that processes unstructured text data and, using a lexicon of markers, produces structured output that can serve as a computational linguistics resource. We first describe the research on the proposed topic and its relation to research at the national and international level, followed by a description of a tool useful for this particular research: a PoS tagger for Romanian. A separate section is dedicated to the algorithm to be used in developing our system. Finally, we describe several ways of completing the marker lexicon by means of derivation.