\seekquarry\yioop\executables

Classes

ArcTool Command line program that allows one to examine the content of the WebArchiveBundles and IndexArchiveBundles of Yioop crawls.
ClassifierTool Class used to encapsulate all the activities of the ClassifierTool.php command line script. This script allows one to automate the building and testing of classifiers, providing an alternative to the web interface when
ClassifierTrainer This class is used to finalize a classifier via the web interface.
Fetcher This class is responsible for fetching web pages for the SeekQuarry/Yioop search engine.
MediaUpdater Separate process/command-line script that can be used to update news sources for Yioop and to handle other activities such as video conversion. It is an alternative to using the web app for updating and makes use of the web app's code.
Mirror This class is responsible for syncing crawl archives between machines using the SeekQuarry/Yioop search engine.
PageIterator This class provides the same interface as an iterator over crawl mixes, but simply iterates over an array.
QueryTool Tool that provides a command-line query interface to indexes stored in the Yioop! database. Running it with no arguments gives a help message for this tool.
QueueServer Command line program responsible for managing Yioop crawls.

Functions

changeCopyrightFile()

changeCopyrightFile(string  $filename, mixed  $set_year = false) 

Callback function applied to each file in the directory being traversed by copyright(). It checks whether the file has the extension of a code file and, if so, trims whitespace from its lines and updates strings of the form 2009 - \d\d\d\d to end in the supplied copyright year.

Parameters

string $filename

name of the file to check for copyright lines and update

mixed $set_year

if false, the end of the copyright period is set to the current year; otherwise, if an int, it is set to the value of that int
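
The year rewrite described above can be sketched as follows. This is a minimal illustration, not Yioop's actual implementation: the helper name updateCopyrightYear is hypothetical, it works on a string rather than on a file, and it assumes that "trims whitespace" means trailing whitespace.

```php
<?php
/**
 * Hypothetical helper (not part of Yioop): rewrites "2009 - YYYY" ranges
 * so that they end in $set_year, or in the current year when $set_year
 * is false. Trailing whitespace is trimmed from every line first
 * (an assumption about what "trims whitespace" means).
 */
function updateCopyrightYear(string $contents, $set_year = false): string
{
    $year = ($set_year === false) ? (int)date("Y") : (int)$set_year;
    $lines = explode("\n", $contents);
    foreach ($lines as $i => $line) {
        $line = rtrim($line);
        $lines[$i] = preg_replace('/2009 - \d\d\d\d/', "2009 - $year",
            $line);
    }
    return implode("\n", $lines);
}

echo updateCopyrightYear(" * Copyright (C) 2009 - 2015  Example  \n", 2024);
```

Passing false (the default) for the second argument ends the range at the current year, mirroring the documented behavior of $set_year.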

clean()

clean(array  $args) : boolean

Used to clean trailing whitespace from the files in a folder or from a single file given on the command line. It also removes the final ?> characters to make PHP files conform with the suggested coding guidelines. Similarly, it adds a space between if, for, foreach, etc. and ( where one is missing, to match the PHP coding guidelines.

Parameters

array $args

$args[0] contains path to sub-folder/file

Returns

boolean —

$no_instructions, which is false if the CodeTool.php instructions should be output
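
The three clean-up steps can be sketched on a single string as below. The function name cleanSource and the exact keyword list are assumptions made for illustration; the real tool may use a different list, and this sketch does not try to skip keywords that appear inside strings or comments.

```php
<?php
/**
 * Hypothetical helper (not part of Yioop) showing the three clean-up
 * steps: trim trailing whitespace, put a space between control keywords
 * and "(", and drop a closing PHP tag at the very end of the file.
 */
function cleanSource(string $code): string
{
    $lines = explode("\n", $code);
    foreach ($lines as $i => $line) {
        $line = rtrim($line);
        // Keyword list is an assumption; the source only says
        // "if, for, foreach, etc".
        $lines[$i] = preg_replace(
            '/\b(if|elseif|foreach|for|while|switch)\(/', '$1 (', $line);
    }
    $code = implode("\n", $lines);
    // Remove a final closing tag and leave exactly one trailing newline.
    if (preg_match('/\?>\s*\z/', $code)) {
        $code = rtrim(preg_replace('/\?>\s*\z/', '', $code)) . "\n";
    }
    return $code;
}

echo cleanSource("<?php\nif(\$a) {   \n    foreach(\$b as \$c) { }\n}\n?>\n");
```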

cleanLinesFile()

cleanLinesFile(string  $filename) 

Callback function applied to each file in the directory being traversed by clean().

Parameters

string $filename

name of file to clean lines for

copyright()

copyright(array  $args) : boolean

Updates the copyright information (assumed to be in the format used by the Yioop docs) in the files under the supplied sub-folder, or in the supplied file. That is, it changes strings matching /2009 - \d\d\d\d/ to 2009 - current_year.

Parameters

array $args

$args[0] contains path to sub-folder/file

Returns

boolean —

$no_instructions, which is false if the CodeTool.php instructions should be output

excludedPath()

excludedPath(  $path) : boolean

Checks whether $path is among a list of paths that should be ignored.

Parameters

$path

a directory path

Returns

boolean —

whether or not it should be ignored (true == ignore)
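
A check of this kind might look like the sketch below. The source does not say which paths are excluded or how they are matched, so the list contents, the substring-matching rule, and the name isExcludedPath are all assumptions.

```php
<?php
/**
 * Hypothetical helper (not part of Yioop): returns true when $path
 * contains any of the fragments in $excluded. Both the list and the
 * substring rule are assumptions; the real check may differ.
 */
function isExcludedPath(string $path, array $excluded): bool
{
    foreach ($excluded as $fragment) {
        if (strpos($path, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

$excluded = ["/.git", "/vendor"]; // illustrative values only
var_dump(isExcludedPath("/project/.git/config", $excluded));
var_dump(isExcludedPath("/project/src/Main.php", $excluded));
```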

longlines()

longlines(array  $args) : boolean

Searches the files in the supplied sub-folder/file for lines longer than 80 characters and echoes their line numbers and contents.

Parameters

array $args

$args[0] contains path to sub-folder/file

Returns

boolean —

$no_instructions, which is false if the CodeTool.php instructions should be output
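
The per-file check can be sketched as follows. The helper name findLongLines is hypothetical, and the sketch measures length in bytes with strlen, which is an assumption; it operates on a string rather than reading a file.

```php
<?php
/**
 * Hypothetical helper (not part of Yioop): returns the lines of
 * $contents longer than $max bytes, keyed by 1-based line number.
 */
function findLongLines(string $contents, int $max = 80): array
{
    $long = [];
    foreach (explode("\n", $contents) as $i => $line) {
        if (strlen($line) > $max) {
            $long[$i + 1] = $line;
        }
    }
    return $long;
}

$sample = "short\n" . str_repeat("x", 81) . "\n" . str_repeat("y", 80);
foreach (findLongLines($sample) as $number => $line) {
    echo "$number: $line\n";
}
```

Only the 81-character line is reported; a line of exactly 80 characters passes, matching the "greater than 80" wording.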

mapPath()

mapPath(string  $path, string  $callback) 

Applies the function $callback to each file in $path

Parameters

string $path

path whose files $callback is applied to

string $callback

function name to call with filename of each file in path
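
A traversal of this shape could be written as below. This is a sketch under assumptions: the name mapFiles is hypothetical, and recursing into sub-folders and accepting a single file as $path are guesses based on the "sub-folder/file" wording used by the calling functions.

```php
<?php
/**
 * Hypothetical helper (not part of Yioop): calls $callback with the
 * name of every file under $path, recursing into sub-folders. If $path
 * is itself a file, the callback is applied to it directly.
 */
function mapFiles(string $path, callable $callback): void
{
    if (is_file($path)) {
        $callback($path);
        return;
    }
    if (!is_dir($path)) {
        return;
    }
    foreach (scandir($path) as $entry) {
        if ($entry === "." || $entry === "..") {
            continue;
        }
        mapFiles($path . DIRECTORY_SEPARATOR . $entry, $callback);
    }
}

/** Example callback, passed by name as in the documented signature. */
function recordName(string $filename): void
{
    $GLOBALS["seen"][] = basename($filename);
}

// Build a small throwaway tree to traverse.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . uniqid("map_path_demo_");
mkdir($dir . DIRECTORY_SEPARATOR . "sub", 0777, true);
file_put_contents($dir . DIRECTORY_SEPARATOR . "a.php", "<?php\n");
file_put_contents($dir . DIRECTORY_SEPARATOR . "sub" . DIRECTORY_SEPARATOR
    . "b.php", "<?php\n");

$seen = [];
mapFiles($dir, "recordName");
sort($seen);
print_r($seen);
```

Because a string naming a function is a valid PHP callable, passing "recordName" works the same way as passing a closure.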

replace()

replace(array  $args) : boolean

Performs a search and replace for given pattern in files in supplied sub-folder/file

Parameters

array $args

$args[0] contains the path to the sub-folder/file, $args[1] contains the regex to search for, and $args[2] contains what it should be replaced with. $args[3] (defaults to "effect") controls the mode of operation and is one of "effect", "change", or "interactive". effect shows the line numbers and lines matching the pattern but commits no changes; interactive prompts the user, for each match, whether to make the change; change does a global search and replace without output.

Returns

boolean —

$no_instructions, which is false if the CodeTool.php instructions should be output
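
The three modes can be illustrated with the sketch below. It is not the tool's implementation: the name replaceInText and the $confirm parameter are hypothetical, the pattern is assumed to be passed with its delimiters, and "interactive" asks once per matching line (via a callable standing in for the user prompt) rather than once per match.

```php
<?php
/**
 * Hypothetical helper (not part of Yioop) illustrating the three modes.
 * $confirm stands in for the interactive prompt; it receives the line
 * number, the old line, and the proposed new line.
 */
function replaceInText(string $contents, string $pattern, string $replace,
    string $mode = "effect", ?callable $confirm = null): string
{
    $lines = explode("\n", $contents);
    foreach ($lines as $i => $line) {
        if (!preg_match($pattern, $line)) {
            continue;
        }
        $new_line = preg_replace($pattern, $replace, $line);
        if ($mode === "effect") {
            // Report only; nothing is changed.
            echo ($i + 1) . ": $line\n";
        } else if ($mode === "change") {
            // Replace silently.
            $lines[$i] = $new_line;
        } else if ($mode === "interactive" && $confirm !== null
            && $confirm($i + 1, $line, $new_line)) {
            $lines[$i] = $new_line;
        }
    }
    return implode("\n", $lines);
}

$text = "foo bar\nbaz\nfoo qux";
replaceInText($text, '/foo/', 'FOO');
echo replaceInText($text, '/foo/', 'FOO', "change"), "\n";
```

Running in "effect" mode first is a cheap way to preview which lines a pattern touches before committing to "change".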

replaceFile()

replaceFile(string  $filename, mixed  $set_pattern = false, mixed  $set_replace = false, mixed  $set_mode = false) 

Callback function applied to each file in the directory being traversed by replace(). Searches $filename for lines matching $pattern. Depending on $mode ($args[3] as described in replace()), it outputs the matches and/or replaces them with $replace.

Parameters

string $filename

name of file to search and replace in

mixed $set_pattern

if not false, stores $set_pattern in $pattern, initializing the callback for subsequent calls. $pattern is the search pattern

mixed $set_replace

if not false, stores $set_replace in $replace, initializing the callback for subsequent calls.

mixed $set_mode

if not false, stores $set_mode in $mode, initializing the callback for subsequent calls.

search()

search(array  $args) : boolean

Performs a search for given pattern in files in supplied sub-folder/file

Parameters

array $args

$args[0] contains the path to the sub-folder/file, $args[1] contains the regex to search for

Returns

boolean —

$no_instructions, which is false if the CodeTool.php instructions should be output

searchFile()

searchFile(string  $filename, mixed  $set_pattern = false) 

Callback function applied to each file in the directory being traversed by search(). Searches $filename for lines matching $pattern and outputs their line numbers and contents.

Parameters

string $filename

name of file to search in

mixed $set_pattern

if not false, stores $set_pattern in $pattern, initializing the callback for subsequent calls. $pattern is the search pattern
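
One way a callback can be initialized once and then reused for every file is a static variable, sketched below. That the real searchFile() works this way is an assumption, as are the name searchFileSketch, the convention of passing an empty filename on the initializing call, and the returned array (added here only so the result can be inspected).

```php
<?php
/**
 * Hypothetical sketch (not Yioop's code): the first call stores the
 * pattern in a static variable; later calls, made with only a filename,
 * reuse it. Returns matching lines keyed by 1-based line number.
 */
function searchFileSketch(string $filename, $set_pattern = false): array
{
    static $pattern = false;
    if ($set_pattern !== false) {
        // Initializing call: remember the pattern and do no searching.
        $pattern = $set_pattern;
        return [];
    }
    $matches = [];
    if ($pattern === false || !is_readable($filename)) {
        return $matches;
    }
    foreach (file($filename, FILE_IGNORE_NEW_LINES) as $i => $line) {
        if (preg_match($pattern, $line)) {
            $matches[$i + 1] = $line;
            echo "$filename:" . ($i + 1) . ": $line\n";
        }
    }
    return $matches;
}

$file = tempnam(sys_get_temp_dir(), "search_demo_");
file_put_contents($file, "alpha\nbeta\nalphabet\n");
searchFileSketch("", '/^alpha/');   // initialize with the pattern
$found = searchFileSketch($file);   // later calls pass only a filename
unlink($file);
```

The design lets a traversal function that passes only a filename drive a callback that still needs extra state, at the cost of hidden state shared between calls.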