$no_stem_list
$no_stem_list : array
Words we don't want to be stemmed
Italian-specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or specifies how many characters are in a char gram.
segment(string $pre_segment) : string
This method currently does nothing. For some locales it could be used to split strings of the form "thisisastring" into a string with the words separated: "this is a string"
string | $pre_segment | string to be segmented |
string after segmentation is done (the same string in this case)
checkForSuffix( $parent_string, $substring) : \seekquarry\yioop\locale\it\resources\$pos
Checks if a string is a suffix for another string
$parent_string | is the string in which we wish to find the suffix |
$substring | is the suffix we wish to check |
as the starting position of the suffix $substring in $parent_string if it exists, else false
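The check described above can be sketched as follows. This is a minimal illustration, not the Yioop source: the function name `checkForSuffixSketch` is hypothetical, and the sketch assumes byte offsets (`strlen`/`substr`) rather than character offsets, so positions after an accented UTF-8 letter will be larger than a character count.

```php
<?php
/**
 * Hypothetical stand-in for checkForSuffix(); assumes byte-based offsets.
 * Returns the start position of $substring when $parent_string ends with
 * it, and false otherwise.
 */
function checkForSuffixSketch(string $parent_string, string $substring)
{
    $parent_len = strlen($parent_string);
    $suffix_len = strlen($substring);
    // A suffix can never be longer than the string that contains it
    if ($suffix_len > $parent_len) {
        return false;
    }
    // The only position at which a suffix of this length can start
    $pos = $parent_len - $suffix_len;
    return (substr($parent_string, $pos) === $substring) ? $pos : false;
}
```

Because a suffix equal to the whole string starts at position 0, callers of a function with this return convention should compare the result with `=== false` rather than rely on truthiness.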
maxSuffix( $string, $suffixes) : \seekquarry\yioop\locale\it\resources\$max_suffix
Computes the longest suffix for a given string from a given set of suffixes
$string | is the string for which the maximum suffix is to be found |
$suffixes | is an array of suffixes |
is the longest suffix for $string
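One way the longest-suffix search could work is sketched below. The function name `maxSuffixSketch` is hypothetical, and returning an empty string when no suffix matches is an assumption; the documentation above does not say what the real method returns in that case.

```php
<?php
/**
 * Hypothetical stand-in for maxSuffix(); not the Yioop source.
 * Returns the longest entry of $suffixes that $string ends with,
 * or "" when none matches (an assumption of this sketch).
 */
function maxSuffixSketch(string $string, array $suffixes): string
{
    $max_suffix = "";
    foreach ($suffixes as $suffix) {
        $len = strlen($suffix);
        // Only consider candidates longer than the best match so far
        if ($len > strlen($max_suffix) && $len <= strlen($string)
            && substr($string, -$len) === $suffix) {
            $max_suffix = $suffix;
        }
    }
    return $max_suffix;
}
```

For example, with the candidates `["e", "mente", "amente"]` and the word "velocemente", both "e" and "mente" match, and the sketch keeps "mente" because it is longer.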
acuteByGrave( $string) : \seekquarry\yioop\locale\it\resources\$string
Replaces all acute accents in a string with grave accents and also handles accented characters
$string | is the string in which the acute accents are to be replaced |
with changes
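The acute-to-grave replacement can be illustrated with a small mapping table. This is a sketch under stated assumptions: the function name `acuteByGraveSketch` is hypothetical, only the five lowercase vowels are covered, the source file is assumed to be saved as UTF-8, and whatever additional handling of accented characters the real method performs is not reproduced here.

```php
<?php
/**
 * Hypothetical stand-in for acuteByGrave(); covers only lowercase
 * acute vowels and assumes UTF-8 input.
 */
function acuteByGraveSketch(string $string): string
{
    // Each acute-accented vowel maps to its grave-accented counterpart
    $acute_to_grave = [
        "á" => "à",
        "é" => "è",
        "í" => "ì",
        "ó" => "ò",
        "ú" => "ù",
    ];
    // strtr() with an array replaces every key it finds in a single pass
    return strtr($string, $acute_to_grave);
}
```

Normalizing both accent styles to one form means that spellings such as "perché" and "perchè" reach the later stemming steps as the same string.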