\seekquarry\yioop\library\IndexManager

Class used to manage open IndexArchiveBundles while performing a query. It provides a single place to obtain references to these bundles and ensures that only one object per bundle is instantiated, in a Singleton-like way.

Summary

Public methods: getIndex(), clearCache(), getVersion(), getWordInfo(), discountedNumDocsTerm()

Public properties: $indexes, $index_times

Constants: INDEX_CACHE_SIZE

Protected methods and properties: none

Private methods and properties: none

Constants

INDEX_CACHE_SIZE

Maximum number of IndexArchiveBundles that can be cached

Properties

$indexes

$indexes : array

Open IndexArchiveBundles managed by this manager

Type

array

$index_times

$index_times : array

Array of entries of the form: name of bundle => time when cached

Type

array

Methods

getIndex()

getIndex(string  $index_name) : object

Returns a reference to the managed copy of an IndexArchiveBundle object with a given timestamp or feed (for handling media feeds)

Parameters

string $index_name

timestamp of the desired IndexArchiveBundle, or a feed identifier when handling media feeds

Returns

object —

the desired IndexArchiveBundle reference
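
The one-object-per-bundle behaviour described above can be illustrated with a minimal sketch. This is not Yioop's actual implementation: the class name BundleCacheSketch, the $open callback, the cache size of 3, the logical clock used in place of a real timestamp, and the evict-oldest policy are all assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for the caching pattern; not Yioop's IndexManager.
class BundleCacheSketch
{
    const INDEX_CACHE_SIZE = 3;       // assumed value, for illustration only
    public static $indexes = [];      // name of bundle => cached object
    public static $index_times = [];  // name of bundle => time when cached
    private static $clock = 0;        // logical clock (assumption; real code
                                      // would more likely store a timestamp)

    // $open is a hypothetical factory that builds the bundle object.
    public static function getIndex(string $index_name, callable $open)
    {
        if (!isset(self::$indexes[$index_name])) {
            if (count(self::$indexes) >= self::INDEX_CACHE_SIZE) {
                // Assumed policy: evict the entry that was cached first.
                asort(self::$index_times);
                $oldest = array_key_first(self::$index_times);
                unset(self::$indexes[$oldest], self::$index_times[$oldest]);
            }
            self::$indexes[$index_name] = $open($index_name);
            self::$index_times[$index_name] = ++self::$clock;
        }
        return self::$indexes[$index_name];
    }

    public static function clearCache(): void
    {
        self::$indexes = [];
        self::$index_times = [];
    }
}

// Usage: two requests for the same name yield the same object.
$open = function ($name) {
    return new stdClass();
};
$first = BundleCacheSketch::getIndex("1369754208", $open);
$second = BundleCacheSketch::getIndex("1369754208", $open);
var_dump($first === $second); // bool(true)
```

Keeping the cache in static properties is what makes the pattern Singleton-like: every caller in the same request shares the same objects without passing a manager instance around.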

clearCache()

clearCache() 

Clears the static variables that cache previously read indexes and dictionary info.

getVersion()

getVersion(string  $index_name) : integer

Returns the version of the index so that Yioop can determine how to do word lookup. The only major change to the format was when word_ids went from 8 to 20 bytes, which happened around Unix time 1369754208.

Parameters

string $index_name

Unix timestamp of the index

Returns

integer —

0 if the index uses the original Yioop index format; 1 if it uses the 20-byte word_id format
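
As a rough illustration of how a version could be derived from an index timestamp, the sketch below compares the timestamp against the changeover time quoted in the description. The function name indexVersionSketch and the strict less-than comparison are assumptions; the real method may also inspect the bundle itself, and the description only says the change happened "around" that time.

```php
<?php
// Hypothetical helper; not the body of IndexManager::getVersion().
function indexVersionSketch(string $index_name): int
{
    // Approximate time at which word_ids went from 8 to 20 bytes,
    // taken from the description above.
    $change_time = 1369754208;
    // Assumption: indexes stamped before the change use format 0.
    return (intval($index_name) < $change_time) ? 0 : 1;
}

echo indexVersionSketch("1300000000"), "\n"; // 0
echo indexVersionSketch("1400000000"), "\n"; // 1
```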

getWordInfo()

getWordInfo(string  $index_name, string  $hash, integer  $threshold = -1, integer  $start_generation = -1, integer  $num_distinct_generations = -1, boolean  $with_remaining_total = false) : array

Gets an array of posting list positions for each shard in the bundle $index_name for the word id $hash

Parameters

string $index_name

bundle to look $hash up in

string $hash

hash of phrase or word to look up in bundle dictionary

integer $threshold

once the number of results exceeds this amount, stop looking for more dictionary entries

integer $start_generation

generation in the index from which to start looking for occurrences of the phrase

integer $num_distinct_generations

how many generations to search forward, starting from $start_generation

boolean $with_remaining_total

whether to also return the total number of postings found

Returns

array —

either [total, sequence of four-tuples] or just a sequence of four-tuples, each of the form (index_shard generation, posting_list_offset, length, exact id that matches $hash)
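
Because the return value has two possible shapes, a caller needs to know which one was requested. The sketch below normalises both shapes into a total (or null) plus a list of named rows. The helper name splitWordInfoSketch, the row key names, and the sample data are hypothetical; the tuple layout is inferred only from the description above.

```php
<?php
// Hypothetical helper for consuming a getWordInfo()-style result.
function splitWordInfoSketch(array $result, bool $with_remaining_total): array
{
    if ($with_remaining_total) {
        // Shape: [total, sequence of four-tuples]
        list($total, $tuples) = $result;
    } else {
        // Shape: sequence of four-tuples only
        $total = null;
        $tuples = $result;
    }
    $rows = [];
    foreach ($tuples as $tuple) {
        list($generation, $offset, $length, $word_id) = $tuple;
        $rows[] = [
            'generation' => $generation, // index_shard generation
            'offset' => $offset,         // posting_list_offset
            'length' => $length,
            'word_id' => $word_id,       // exact id that matches $hash
        ];
    }
    return [$total, $rows];
}

// Made-up sample data, for illustration only.
$sample = [5, [[0, 128, 3, "id_a"], [1, 64, 2, "id_b"]]];
list($total, $rows) = splitWordInfoSketch($sample, true);
echo $total, " postings across ", count($rows), " generations\n";
```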

discountedNumDocsTerm()

discountedNumDocsTerm(string  $term, string  $index_name) : integer

Returns the number of documents in the given index that contain a given term or phrase, discounting later generations (those with lower document rank) more heavily

Parameters

string $term

what to look up in the index's dictionary; no mask is used for this lookup

string $index_name

index to look up term or phrase in

Returns

integer —

number of documents