Constants

RESULTS_PER_BLOCK

RESULTS_PER_BLOCK

Default number of documents returned for each block (at most)

HOST_KEY_POS

HOST_KEY_POS

Host key position + 1 (the first char says whether this is a doc, inlink, or external link)

KEY_LEN

KEY_LEN

Length of a doc key

Properties

$num_docs

$num_docs :integer

Estimate of the number of documents that this iterator can return

Type

integer

$seen_docs

$seen_docs :integer

The number of documents already iterated over

Type

integer

$count_block

$count_block :integer

The number of documents in the current block

Type

integer

$pages

$pages :array

Cache of what currentDocsWithWord returns

Type

array

$current_block_fresh

$current_block_fresh :boolean

Says whether the value in $this->count_block is up to date

Type

boolean

$results_per_block

$results_per_block :integer

Number of documents returned for each block (at most)

Type

integer

$base_query

$base_query :string

Part of query without limit and num to be processed by all queue_server machines

Type

string

$limit

$limit :string

Current limit number to be added to base query

Type

string

$queue_servers

$queue_servers :array

An array of servers to send the query to

Type

array

$more_results

$more_results :array

Flags for each server saying if there are more results for that server or not

Type

array

$filter

$filter :\seekquarry\yioop\library\index_bundle_iterators\SearchfiltersModel

Model responsible for keeping track of edited and deleted search results

Type

\seekquarry\yioop\library\index_bundle_iterators\SearchfiltersModel

$next_results_per_server

$next_results_per_server :integer

Used to adaptively change the number of pages requested from each machine, based on the number of machines that still have results

Type

integer

$hard_query

$hard_query :integer

Used to keep track of the original desired number of results to be returned in one find docs call versus the number actually retrieved.

Type

integer

Methods

reset()

reset()

Returns the iterator to the first document block that it can iterate over

advance()

advance(array  $gen_doc_offset = null)

Forwards the iterator one group of docs

Parameters

array $gen_doc_offset

a generation, doc_offset pair. If set, the next block must be of greater than or equal generation, and if the generation is equal, the next block must all have $doc_offsets larger than or equal to this value

currentGenDocOffsetWithWord()

currentGenDocOffsetWithWord(): mixed

Gets the doc_offset and generation for the next document that would be returned by this iterator. As this is not easily determined for a network iterator, this method always returns -1 for this iterator

Returns

mixed —

an array with the desired document offset and generation; -1 on fail

findDocsWithWord()

findDocsWithWord(): mixed

Hook function used by currentDocsWithWord to return the current block of docs if it is not cached

Returns

mixed —

doc ids and score if there are docs left, -1 otherwise

plan()

plan(): string

Returns a string representation of a plan by which the current iterator finds its results

Returns

string —

a representation of the current iterator and its subiterators, useful for determining how a query will be processed

genDocOffsetCmp()

genDocOffsetCmp(array  $gen_doc1,array  $gen_doc2,  $direction = self::ASCENDING): integer

Compares two arrays each containing a (generation, offset) pair.

Parameters

array $gen_doc1

first ordered pair

array $gen_doc2

second ordered pair

$direction

Returns

integer —

-1, 0, or 1 depending on which is bigger
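As a rough illustration of the comparison described above, the sketch below orders (generation, offset) pairs by generation first and by offset second. The function name `compareGenDocOffsets` and the `ASCENDING`/`DESCENDING` constants are hypothetical stand-ins. How the real method interprets `$direction` is not stated here, so the sketch assumes that a descending direction simply reverses the result.

```php
<?php
// Hypothetical direction constants; the real class's values are not
// given in this documentation.
const ASCENDING = 1;
const DESCENDING = -1;

/**
 * Compares two [generation, offset] pairs.
 * Assumption: generation is compared first, offset breaks ties, and a
 * descending direction reverses the sign of the result.
 */
function compareGenDocOffsets(array $gen_doc1, array $gen_doc2,
    int $direction = ASCENDING): int
{
    // Compare generations; fall back to offsets only on a tie.
    $cmp = $gen_doc1[0] <=> $gen_doc2[0];
    if ($cmp === 0) {
        $cmp = $gen_doc1[1] <=> $gen_doc2[1];
    }
    return ($direction === ASCENDING) ? $cmp : -$cmp;
}
```

Under these assumptions, a pair from an earlier generation always sorts before a pair from a later one, whatever the offsets are.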

getDirection()

getDirection()

currentDocsWithWord()

currentDocsWithWord(): mixed

Gets the current block of doc ids and scores associated with this iterator's word

Returns

mixed —

doc ids and score if there are docs left, -1 otherwise
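The relationship between currentDocsWithWord(), the findDocsWithWord() hook, and the $pages and $current_block_fresh properties can be sketched with a toy stand-in. `ToyBlockIterator` is hypothetical: it serves pre-split in-memory blocks instead of querying servers, and it only borrows method and property names from this page. The real class's internals may well differ.

```php
<?php
// Hypothetical in-memory stand-in; this is NOT the real network iterator.
class ToyBlockIterator
{
    public $pages = [];                  // cache of the current block
    public $current_block_fresh = false; // is the cache up to date?
    public $seen_docs = 0;
    public $count_block = 0;
    public $find_calls = 0;              // counter for illustration only
    private $blocks;
    private $position = 0;

    public function __construct(array $docs, int $results_per_block)
    {
        // Assumption: a block is a fixed-size slice of the doc list.
        $this->blocks = array_chunk($docs, $results_per_block, true);
    }
    public function reset()
    {
        $this->position = 0;
        $this->seen_docs = 0;
        $this->current_block_fresh = false;
    }
    // Hook that does the (here trivial) work of fetching a block.
    public function findDocsWithWord()
    {
        $this->find_calls++;
        return $this->blocks[$this->position] ?? -1;
    }
    // Returns the cached block, calling the hook only when it is stale.
    public function currentDocsWithWord()
    {
        if (!$this->current_block_fresh) {
            $this->pages = $this->findDocsWithWord();
            $this->count_block = is_array($this->pages) ?
                count($this->pages) : 0;
            $this->current_block_fresh = true;
        }
        return $this->pages;
    }
    public function advance()
    {
        $this->currentDocsWithWord(); // make sure count_block is current
        $this->seen_docs += $this->count_block;
        $this->position++;
        $this->current_block_fresh = false;
    }
}

// Walk all blocks until -1 signals that no docs are left.
$docs = ["doc_a" => 0.9, "doc_b" => 0.7, "doc_c" => 0.4];
$iterator = new ToyBlockIterator($docs, 2);
$collected = [];
while (is_array($block = $iterator->currentDocsWithWord())) {
    $collected += $block;   // doc id => score
    $iterator->advance();
}
```

The loop relies on the documented convention that -1 is returned once no docs are left. Because advance() reuses the cached block, the hook runs once per block rather than once per call.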

getCurrentDocsForKeys()

getCurrentDocsForKeys(array  $keys = null): array

Gets the summaries associated with the keys provided the keys can be found in the current block of docs returned by this iterator

Parameters

array $keys

keys to try to find in the current block of returned results

Returns

array —

doc summaries that match provided keys
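A minimal sketch of this kind of key lookup follows. It assumes that the current block is an array keyed by doc key and that a null `$keys` means "return the whole block". Neither detail is confirmed by this page, and `docsForKeys` is a hypothetical helper rather than the method itself.

```php
<?php
/**
 * Hypothetical helper: picks out of $current_block the summaries whose
 * keys appear in $keys. Assumption: the block is keyed by doc key, and
 * a null $keys returns the block unchanged.
 */
function docsForKeys(array $current_block, ?array $keys = null): array
{
    if ($keys === null) {
        return $current_block;
    }
    // array_flip turns the list of keys into a lookup table so that
    // array_intersect_key can match on keys.
    return array_intersect_key($current_block, array_flip($keys));
}

$block = [
    "key1" => ["title" => "First"],
    "key2" => ["title" => "Second"],
];
// Keys absent from the current block are silently skipped.
$found = docsForKeys($block, ["key2", "missing"]);
```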

nextDocsWithWord()

nextDocsWithWord(  $doc_offset = null): array

Gets the current block of doc summaries for the word iterator and advances the current pointer to the next block of documents. If a doc index is supplied, the next block must consist of docs after this doc_index

Parameters

$doc_offset

if set the next block must all have $doc_offsets equal to or larger than this value

Returns

array —

doc summaries matching the $this->restrict_phrases

advanceSeenDocs()

advanceSeenDocs()

Updates the seen_docs count during an advance() call

setResultsPerBlock()

setResultsPerBlock(integer  $num)

Sets the value of the results_per_block field. This field controls the maximum number of results that can be returned in one go by currentDocsWithWord()

Parameters

integer $num

the maximum number of results that can be returned by a block

__construct()

__construct(string  $query,array  $queue_servers,string  $timestamp,\seekquarry\yioop\library\index_bundle_iterators\SearchfiltersModel  $filter = null,string  $save_timestamp_name = "")

Creates a network iterator with the given parameters.

Parameters

string $query

the query that was supplied by the end user that we are trying to get search results for

array $queue_servers

URLs of Yioop instances on which document indexes live

string $timestamp

the timestamp of the particular current index archive bundles that we look in for results

\seekquarry\yioop\library\index_bundle_iterators\SearchfiltersModel $filter

Model responsible for keeping track of edited and deleted search results

string $save_timestamp_name

if this timestamp is nonempty, then when making queries to separate machines the save_timestamp is sent so that the queries on those machines can make savepoints. Note that the format of save_timestamp is timestamp-query_part, where query_part is the number of the item in a query presentation (usually 0).
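To make the timestamp-query_part format concrete, the following sketch builds such a name and splits it again. The values are illustrative only and are not taken from a real index.

```php
<?php
// Illustrative values only; neither comes from a real crawl.
$timestamp = "1700000000";
$query_part = 0;

// Build a name in the documented timestamp-query_part format.
$save_timestamp_name = $timestamp . "-" . $query_part;

// Split it back into its two components (limit 2: split on the first "-").
list($parsed_timestamp, $parsed_part) =
    explode("-", $save_timestamp_name, 2);
$parsed_part = (int) $parsed_part;
```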

makeLookupLink()

makeLookupLink(array  $sites,integer  $index): string

Called to make a link for AnalyticsManager about a network query performed by this iterator.

Parameters

array $sites

used by this network iterator

integer $index

which site in array to make link for

Returns

string —

html of link

serverAdjustedResultsPerBlock()

serverAdjustedResultsPerBlock(  $num_machines,  $num_results)

Büttcher, Clarke, and Cormack give an exact formula to compute this value, but it is slow to compute. We instead compute (1/$num_machines^{3/4}) * $num_results + 5.

Parameters

$num_machines
$num_results
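The approximation above can be tried out with a small sketch. `approxResultsPerServer` is a hypothetical helper, and rounding up to a whole number of results is an assumption, since this page does not say how the real method rounds. The flag array mimics the style of $more_results and is used to count the machines that still have results.

```php
<?php
/**
 * Hypothetical helper implementing the stated approximation
 * num_results / num_machines^(3/4) + 5.
 * Assumption: the quotient is rounded up to a whole number.
 */
function approxResultsPerServer(int $num_machines, int $num_results): int
{
    return (int) ceil($num_results / pow($num_machines, 0.75)) + 5;
}

// Flags in the style of $more_results: one entry per server.
$more_results = [true, true, false, true];

// Only servers that still have results are counted as machines.
$active_machines = count(array_filter($more_results));
$per_server = approxResultsPerServer($active_machines, 100);
```

With a single machine the helper asks for all 100 results plus the constant 5. With more machines, each is asked for a shrinking share, though a larger one than an even 1/$num_machines split would give.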