Specification for annotator > lemmatizer > morfologik
morfologik
Morfologik is a Polish morphological analyzer and lemmatizer. It returns morphosyntactic information for each token: base forms, grammatical class and attributes.
The values returned by Morfologik are described on the page Znaczniki Morfologika (in Polish). Morfologik's tagset is broadly similar to the tagset of the National Corpus of Polish, so http://nkjp.pl/poliqarp/help/ense2.html is also a useful reference.
Aliases
lemma-generator, lemmatise, lemmatiser, lemmatize, lemmatizer
Languages
pl
Examples
morfologik ! simple-writer --tags lexeme
Returns all base forms and grammatical classes for each word.
in:
Wszędzie dobrze, ale w domu najlepiej.
out:
wszędzie+adv
dobro+subst|dobry+adv|dobrze+adv
ala+qub|ale+conj
w+prep|wiek+brev
dom+subst
dobrze+adv
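The output above can be post-processed with a few lines of code. The following is a minimal sketch, assuming (from the sample output only) that alternative interpretations are separated by `|` and that the base form and grammatical class are joined by `+`. The function name `parse_lexeme_line` is hypothetical and not part of the toolkit.

```python
from typing import List, Optional, Tuple


def parse_lexeme_line(line: str) -> List[Tuple[str, Optional[str]]]:
    """Split one output line into (base form, grammatical class) pairs.

    Assumption: '|' separates alternatives and the last '+' separates
    the base form from its class, as in the sample output.
    """
    pairs: List[Tuple[str, Optional[str]]] = []
    for alternative in line.strip().split("|"):
        if not alternative:
            continue  # skip empty fragments, e.g. from a blank line
        lemma, sep, tag = alternative.rpartition("+")
        if sep:
            pairs.append((lemma, tag))
        else:
            # No '+' found: treat the whole fragment as a bare base form.
            pairs.append((alternative, None))
    return pairs


if __name__ == "__main__":
    for sample in ["wszędzie+adv", "dobro+subst|dobry+adv|dobrze+adv", "w+prep|wiek+brev"]:
        print(parse_lexeme_line(sample))
```

Splitting on the last `+` rather than the first is a design choice made here so that a base form containing `+` would still be kept whole; whether such forms occur in practice is not stated in this specification.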
morfologik ! simple-writer --tags lemma
Returns all base forms for each word.
in:
Ala ma kota i psa.
out:
Al|Ala
mieć|mój
kot|kota
i
pies
Options
Allowed options:
  --level arg (=3)             set word processing level 0-3 (0 - do nothing, 1 - return only base forms, 2 - add grammatical class and main attributes, 3 - add detailed attributes)
  --dict arg (=morfologik)     set dictionary, one of morfologik, morfeusz, combined
  --keep-original              keep original Morfologik's settings i.e. do not break brief forms
  --token-tag arg (=token)     tag to operate on instead of token tag
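When pipelines are assembled programmatically, the constraints listed under Options (level 0-3, three dictionary names) can be checked before the command is run. The sketch below is only an illustration: `build_morfologik_step` and `ALLOWED_DICTS` are hypothetical names, and the function merely builds the text of a pipeline step using the option names and defaults given above. It does not invoke the toolkit.

```python
ALLOWED_DICTS = ("morfologik", "morfeusz", "combined")


def build_morfologik_step(level: int = 3,
                          dictionary: str = "morfologik",
                          keep_original: bool = False,
                          token_tag: str = "token") -> str:
    """Return the text of a morfologik pipeline step after validating options."""
    if type(level) is not int or level not in range(4):
        raise ValueError("level must be an integer from 0 to 3")
    if dictionary not in ALLOWED_DICTS:
        raise ValueError("dictionary must be one of: " + ", ".join(ALLOWED_DICTS))

    parts = ["morfologik", "--level", str(level), "--dict", dictionary]
    if keep_original:
        parts.append("--keep-original")
    if token_tag != "token":
        # Only emit --token-tag when it differs from the documented default.
        parts += ["--token-tag", token_tag]
    return " ".join(parts)


if __name__ == "__main__":
    step = build_morfologik_step(level=1, dictionary="combined")
    # Join with a writer step in the same style as the examples in this page.
    print(" ! ".join([step, "simple-writer --tags lemma"]))
```

Options are written out explicitly even when they equal the defaults (except `--token-tag`), so the generated step documents exactly which level and dictionary were requested.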