\seekquarry\yioop\locale\ruresources

Classes

Tokenizer This class has a collection of methods for Russian locale specific tokenization. In particular, it has a stemmer, a stop word remover (for use mainly in word cloud creation). The stemmer is a modification (with bug fixes ) of Dennis Kreminsky's stemmer from: http://snowball.tartarus.org/otherlangs/russian_php5.txt Here given a word, its stem is that part of the word that is common to all its inflected variants. For example, tall is common to tall, taller, tallest. A stemmer takes a word and tries to produce its stem.