\seekquarry\yioop\locale\in_ID\resources\Tokenizer

Indonesian-specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or specifies how many characters are in a char gram.

Summary

            Methods                       Properties                       Constants
Public      stopwordsRemover()            $stop_words, $char_gram_len      No constants found
Protected   No protected methods found    No protected properties found    N/A
Private     No private methods found      No private properties found      N/A

Properties

$stop_words

$stop_words : 

A list of frequently occurring terms for this locale that should be excluded from certain kinds of queries. This list is also used for language detection.

Type

$char_gram_len

$char_gram_len : integer

The number of characters in a char gram for this locale.

Type

integer
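
To make the idea of a char gram concrete, here is a minimal sketch of splitting a word into overlapping character n-grams. The function name charGrams() is hypothetical and not part of Yioop, the length 5 is only an illustrative value (this page does not state the actual value of $char_gram_len), and the code assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper (not part of Yioop): split a word into overlapping
// character n-grams of length $char_gram_len.
function charGrams(string $word, int $char_gram_len): array
{
    $length = mb_strlen($word, 'UTF-8');
    // A word no longer than the gram length is returned as a single gram.
    if ($length <= $char_gram_len) {
        return [$word];
    }
    $grams = [];
    // Slide a window of $char_gram_len characters across the word.
    for ($i = 0; $i <= $length - $char_gram_len; $i++) {
        $grams[] = mb_substr($word, $i, $char_gram_len, 'UTF-8');
    }
    return $grams;
}

// Illustrative gram length of 5; the real value is an assumption here.
print_r(charGrams("membaca", 5)); // grams: memba, embac, mbaca
```

Indexing char grams rather than whole words is a common fallback for languages without a stemmer, since inflected forms of a word still share most of their grams.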

Methods

stopwordsRemover()

stopwordsRemover(mixed  $data) : mixed

Removes the stop words from the page (used for word cloud generation and language detection).

Parameters

mixed $data

Either a string or an array of strings to remove stop words from.

Returns

mixed

$data with the stop words removed
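
The following sketch illustrates one way a method with this signature could behave. It is not the Yioop implementation: the class name SketchTokenizer is hypothetical, the five-word stop list is an illustrative sample rather than the real $stop_words, and the whole-word, case-insensitive matching is an assumption. It relies on preg_replace() accepting either a string or an array of strings as its subject and returning the same shape.

```php
<?php
// Sketch only: a stand-in class, not the Yioop Tokenizer itself.
class SketchTokenizer
{
    // Tiny illustrative sample; the real $stop_words list is much longer.
    public static $stop_words = ['yang', 'dan', 'di', 'ini', 'itu'];

    // Accepts a string or an array of strings and returns the same shape.
    public static function stopwordsRemover($data)
    {
        // Escape each stop word so it is safe inside a regular expression.
        $quoted = array_map(function ($word) {
            return preg_quote($word, '/');
        }, self::$stop_words);
        // Assumption: match whole words only, ignoring case.
        $pattern = '/\b(' . implode('|', $quoted) . ')\b/iu';
        $data = preg_replace($pattern, '', $data);
        // Collapse the runs of whitespace left behind by removed words.
        $data = preg_replace('/\s+/u', ' ', $data);
        return is_array($data) ? array_map('trim', $data) : trim($data);
    }
}

echo SketchTokenizer::stopwordsRemover("Buku ini dan pena itu di meja");
// prints "Buku pena meja"
```

Passing an array such as ["rumah yang besar", "dan kecil"] to this sketch returns an array with the stop words removed from each element.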