Specification for thrax-normalizer
iayko
Iayko is a normalizer for diachronic normalization: it rewrites old spelling into its modern form.
Aliases
fst-normalizer, iayko-normalizer, thrax-normalizer
Languages
pl
Examples
iayko --lang pl --spec %ITSDATA%/%LANG%/all.far Rule001_09 %ITSDATA%/%LANG%/all.far Rule100_02
Normalize old text to its modern form using two transducers from the default set of rules, applied as a cascade: Rule001_09 (Feliński’s rule „y/i→j”) and Rule100_02 („puhar→puchar”).
naley wody w puhar i doday iedno iayko
nalej wody w puchar i dodaj iedno iajko
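The cascade behaviour in this example can be pictured with a small Python sketch. It is only an illustration: the regular expressions below are hypothetical stand-ins, chosen to reproduce the sample sentence, and are not the actual finite-state rules stored in all.far. The names RULES and apply_cascade are likewise assumptions made for this sketch.

```python
import re

# Hypothetical regex approximations of two rules from the default set.
# The real rules are finite-state transducers; these patterns only mimic
# their effect on the sample sentence.
RULES = {
    # "y" after a vowel and not before a vowel becomes "j" (naley -> nalej).
    "Rule001_09": (re.compile(r"(?<=[aeiouó])y(?![aeiouyó])"), "j"),
    # Word-initial "puhar" becomes "puchar".
    "Rule100_02": (re.compile(r"\bpuhar"), "puchar"),
}


def apply_cascade(text, rule_names):
    """Apply the named rules one after another, each on the previous output."""
    for name in rule_names:
        pattern, replacement = RULES[name]
        text = pattern.sub(replacement, text)
    return text


old = "naley wody w puhar i doday iedno iayko"
print(apply_cascade(old, ["Rule001_09", "Rule100_02"]))
print(apply_cascade(old, ["Rule001_09"]))
```

With both rules the sketch changes "puhar" to "puchar"; with Rule001_09 alone, "puhar" is left untouched, which mirrors the difference between the --spec and single --fst examples.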
iayko --lang pl --fsts transducers.txt
Normalize old text to its modern form. The list of transducers is given in the file transducers.txt.
naley wody w puhar i doday iedno iayko
nalej wody w puchar i dodaj iedno iajko
iayko --far my_grammar.far --fst MyTransducer
Perform normalization using the transducer MyTransducer from the FAR archive my_grammar.far.
This is a sample sentence.
Thif if a fample fentence.
iayko --grm my_grammar.grm --fst MyTransducer --save-far compiled_grammar.far
Perform normalization using the transducer MyTransducer from a text file with a grammar written in Thrax (my_grammar.grm). The grammar is compiled to a FAR archive and saved to the file compiled_grammar.far.
This is a sample sentence.
Thif if a fample fentence.
better-diachronizer
Normalization using an improved pipeline.
iayko wjeżdża, będzie w niedzielę
jajko wjeżdża, będzie w niedzielę
iayko --lang pl
Basic usage of iayko. Normalizes old text to modern version, using the default set of finite-state rules.
naley wody w puhar i doday iedno iayko
nalej wody w puchar i dodaj jedno jajko
iayko --lang pl --fst Rule001_09
Normalize old text to modern version, using only the transducer Rule001_09 (Feliński’s rule „y/i→j”) from the default set of rules.
naley wody w puhar i doday iedno iayko
nalej wody w puhar i dodaj iedno iajko
Options
Allowed options:
  --lang arg (=guess)  language
  --force-language  force using specified language even if a text was recognised otherwise
  --far arg (=%ITSDATA%/%LANG%/all.far)  far archive with rules
  --fst arg  fst name inside far
  --fsts arg (=%ITSDATA%/%LANG%/rules.txt)  file with fst names to be used as a cascade
  --spec arg  specification of more far:fst pairs to be used as cascade
  --grm arg  text file with rules written in Thrax
  --md arg  text file with rules written in Thrax and their description in Markdown
  --save-far arg  where to save the far archive compiled from grm file
  --bypass-exceptions  bypass exceptions
  --exceptions arg (=%ITSDATA%/%LANG%/exceptions.lst)  a text file with list of exceptions
  --in-tag arg (=token)  tag to operate on
niema
Niema is a conditional normalizer for diachronic normalization
Aliases
fst-normalizer, niema-normalizer, thrax-normalizer
Languages
pl
Options
Allowed options:
  --lang arg (=guess)  language
  --force-language  force using specified language even if a text was recognised otherwise
  --far arg (=%ITSDATA%/%LANG%/all.far)  far archive with rules
  --fst arg  fst name inside far
  --condition arg  condition for fst
  --conditions arg (=%ITSDATA%/%LANG%/conditions.txt)  file with conditions
  --spec arg  specification of more far:fst pairs to be used as cascade
  --grm arg  text file with rules written in Thrax
  --md arg  text file with rules written in Thrax and their description in Markdown
  --save-far arg  where to save the far archive compiled from grm file
  --save-conditions arg  where to save the conditions to file
  --bypass-exceptions  bypass exceptions
  --exceptions arg (=%ITSDATA%/%LANG%/exceptions.lst)  a text file with list of exceptions