INDEX_CACHE_SIZE
Max number of IndexArchiveBundles that can be cached
Class used to manage open IndexArchiveBundles while performing a query. Provides a single place to obtain references to these bundles and ensures that only one object per bundle is instantiated, in a Singleton-esque way.
getIndex(string $index_name) : object
Returns a reference to the managed copy of an IndexArchiveBundle object with a given timestamp or feed (for handling media feeds)
string | $index_name | timestamp of desired IndexArchiveBundle |
the desired IndexArchiveBundle reference
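The one-object-per-bundle behaviour described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not Yioop's PHP implementation: the names IndexBundle and IndexManagerSketch, the cache size of 3, and the evict-oldest policy are all assumptions made for illustration, since the source only states that the cache has a maximum size.

```python
class IndexBundle:
    """Hypothetical stand-in for an opened IndexArchiveBundle."""

    def __init__(self, index_name):
        self.index_name = index_name


class IndexManagerSketch:
    """Illustrative manager that hands out one shared object per index name."""

    INDEX_CACHE_SIZE = 3  # illustrative value, not Yioop's actual setting
    _indexes = {}  # index_name -> IndexBundle, shared by all callers

    @classmethod
    def get_index(cls, index_name):
        if index_name not in cls._indexes:
            if len(cls._indexes) >= cls.INDEX_CACHE_SIZE:
                # Assumed policy: drop the entry that was inserted first.
                oldest = next(iter(cls._indexes))
                del cls._indexes[oldest]
            cls._indexes[index_name] = IndexBundle(index_name)
        return cls._indexes[index_name]


first = IndexManagerSketch.get_index("1369754208")
second = IndexManagerSketch.get_index("1369754208")
print(first is second)  # both names refer to the same cached object
```

Keeping the cache on the class rather than on an instance is what makes every caller see the same bundle object for a given name.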
getVersion(string $index_name) : integer
Returns the version of the index, so that Yioop can determine how to do word lookup. The only major change to the format was when word_ids went from 8 to 20 bytes, which happened around Unix time 1369754208.
string | $index_name | unix timestamp of index |
0 - if the index uses the original format for Yioop indexes; 1 - if it uses the 20 byte word_id format
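As a rough illustration of how a version number could follow from the index timestamp, here is a Python sketch. It assumes a plain comparison against the approximate cutoff quoted above. The function names, the constant name, and the choice of a strict less-than test are hypothetical, and Yioop's actual check may differ.

```python
# Approximate Unix time of the word_id format change, as quoted above.
WORD_ID_FORMAT_CHANGE = 1369754208


def get_version_sketch(index_name):
    """Return 0 for the original index format, 1 for the 20 byte word_id format.

    Assumes index_name is a Unix timestamp given as a string or an integer.
    """
    timestamp = int(index_name)
    return 0 if timestamp < WORD_ID_FORMAT_CHANGE else 1


def word_id_length(version):
    """Length in bytes of a word_id for the given index version."""
    return 8 if version == 0 else 20


version = get_version_sketch("1400000000")
print(version, word_id_length(version))
```

A caller would use the returned version to decide how many bytes of the dictionary key to treat as the word_id.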
getWordInfo(string $index_name, string $hash, integer $threshold = -1, integer $start_generation = -1, integer $num_distinct_generations = -1, boolean $with_remaining_total = false) : array
Gets an array of posting list positions for each shard in the bundle $index_name for the word id $hash
string | $index_name | bundle to look up $hash in |
string | $hash | hash of phrase or word to look up in bundle dictionary |
integer | $threshold | stop looking for more dictionary entries once the number of results exceeds this amount |
integer | $start_generation | generation in the index from which to start looking for occurrences of the phrase |
integer | $num_distinct_generations | how many generations to search forward from $start_generation |
boolean | $with_remaining_total | whether to also return the total number of postings found |
either [total, sequence of four tuples] or just the sequence of four tuples, where each tuple is (index_shard generation, posting_list_offset, length, exact id that matches $hash)
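Because the return value has two possible shapes, a caller needs to know which one to expect. The Python sketch below shows one way to normalise both shapes. The helper name split_word_info, the dictionary keys, and the sample values are hypothetical and only mirror the tuple layout described above.

```python
def split_word_info(result, with_remaining_total):
    """Normalise a getWordInfo-style result into (total, rows).

    total is None when the call was made without $with_remaining_total.
    """
    if with_remaining_total:
        total, postings = result
    else:
        total, postings = None, result
    rows = [
        {"generation": gen, "offset": offset, "length": length, "word_id": word_id}
        for (gen, offset, length, word_id) in postings
    ]
    return total, rows


# Made-up sample data in the documented tuple layout.
sample = [(0, 128, 40, "id-a"), (2, 512, 16, "id-b")]

total, rows = split_word_info([56, sample], with_remaining_total=True)
print(total, rows[0]["generation"], rows[1]["offset"])

total, rows = split_word_info(sample, with_remaining_total=False)
print(total, len(rows))
```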
discountedNumDocsTerm(string $term, string $index_name) : integer
Returns the number of documents in which a given term or phrase appears in the given index, discounting later generations (those with lower document rank) more heavily
string | $term | what to look up in the index's dictionary; no mask is used for this lookup |
string | $index_name | index to look up term or phrase in |
number of documents