Yioop_V9.5_Source_Code_Documentation

Application

Interfaces, Classes, Traits and Enums

Compressor
A Compressor is used to apply a filter to objects before they are stored into a WebArchive. The filter is assumed to be invertible, and the typical intention is the filter carries out some kind of string compression.
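The idea of an invertible filter can be illustrated with a minimal Python sketch. This is not Yioop's PHP code: the class names GzipFilter and IdentityFilter, the method names compress and uncompress, and the helper round_trips are all hypothetical and chosen only for illustration.

```python
import gzip


class GzipFilter:
    """Hypothetical invertible filter built on Python's gzip module."""

    def compress(self, data: bytes) -> bytes:
        return gzip.compress(data)

    def uncompress(self, data: bytes) -> bytes:
        return gzip.decompress(data)


class IdentityFilter:
    """Hypothetical trivial filter: both directions return the input."""

    def compress(self, data: bytes) -> bytes:
        return data

    def uncompress(self, data: bytes) -> bytes:
        return data


def round_trips(filter_obj, data: bytes) -> bool:
    # An invertible filter must give back exactly what was stored.
    return filter_obj.uncompress(filter_obj.compress(data)) == data
```

Any object offering the same pair of methods could be swapped in, which is presumably why the filter is described separately from the archive that uses it.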
CrawlConstants
Shared constants and enums used by components that are involved in the crawling process
MediaConstants
Shared constants and enums used by components that are involved in the media related operations
Notifier
A Notifier is an object that a priority queue notifies when the index of a data item in the queue, viewed as an array, changes.
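As a rough illustration of this callback pattern, the Python sketch below has an array-backed heap report every index change to a notifier object. It is not Yioop's PHP implementation: the names IndexRecorder, NotifyingMinHeap, notify, and insert are hypothetical, and the choice of a min-heap is an assumption made only to keep the example short.

```python
class IndexRecorder:
    """Hypothetical notifier that remembers each item's latest index."""

    def __init__(self):
        self.positions = {}

    def notify(self, index, data):
        self.positions[data] = index


class NotifyingMinHeap:
    """Array-backed min-heap that reports index changes to a notifier."""

    def __init__(self, notifier):
        self.items = []  # list of (weight, data) pairs
        self.notifier = notifier

    def _place(self, index, entry):
        # Store the entry and tell the notifier where its data now lives.
        self.items[index] = entry
        self.notifier.notify(index, entry[1])

    def insert(self, data, weight):
        entry = (weight, data)
        self.items.append(entry)
        index = len(self.items) - 1
        # Sift up: move heavier parents down until the heap order holds.
        while index > 0:
            parent = (index - 1) // 2
            if self.items[parent][0] <= weight:
                break
            self._place(index, self.items[parent])
            index = parent
        self._place(index, entry)
```

With this arrangement the notifier can look up where any item currently sits in the array without searching the queue.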
ConfigureTool
Provides a command-line interface for configuring a Yioop instance.
CreditConfig
Class containing methods used to handle payment processing when keyword advertising is enabled.
AdminController
Controller used to handle admin functionalities such as modifying login and password, and CREATE, UPDATE, and DELETE operations for users, roles, locales, and crawls.
ApiController
Controller used mainly for handling JS requests for Help Wiki Pages
ArchiveController
Fetcher machines also act as archives for complete caches of web pages. This controller is used to handle access to these web page caches.
ClassifierController
This class handles XmlHttpRequests to label documents during classifier construction.
AccountaccessComponent
Component of the Yioop control panel used to handle activities for managing accounts, users, roles, and groups, i.e., settings of users and groups, what roles and groups a user has, what roles and users a group has, and what activities make up a role. It is used by AdminController.
ChatbotComponent
Provides the AdminController activity that allows users to create Chat Bot Stories. A Chat Bot story is a collection of patterns (expression, trigger state, remote call, result state, responses) that govern how a chat bot will behave under various circumstances
Component
Base component class for all components on the SeekQuarry site. A component consists of a collection of activities and their auxiliary methods that can be used by a controller
CrawlComponent
This component is used to provide activities for the admin controller related to configuring and performing a web or archive crawl
SocialComponent
Provides activities to AdminController related to creating, updating blogs (and blog entries), static web pages, and crawl mixes.
StoreComponent
Component of the Yioop control panel used to handle activities for managing advertisements, i.e., creating, activating/deactivating, and editing advertisements. It is used by AdminController.
SystemComponent
This component is used to handle activities related to the configuration of a Yioop installation, translations of text in the installation, as well as control of specifying what machines make up the installation and which processes they run.
Controller
Base controller class for all controllers on the SeekQuarry site.
CrawlController
Controller used to manage networked installations of Yioop where there might be multiple QueueServers and a NameServer. Commands sent to the name server web page are mapped out to queue_servers using this controller. Each method of the controller essentially mimics one method of CrawlModel, PhraseModel, or in general anything that extends ParallelModel and is used to proxy that information through a result web page back to the name_server.
FetchController
This class handles data coming to a queue_server from a fetcher. Basically, it receives the data from the fetcher and saves it into various files for later processing by the queue server.
GroupController
Controller used to handle user group activities outside of the admin panel setting. This either could be because the admin panel is "collapsed" or because the request concerns a wiki page.
JobsController
This class is used to handle requests from a MediaUpdater to a name server. There are three main types of requests: getUpdateProperties and, for any job that the MediaUpdater might be running, its getTasks and putTasks requests. getUpdateProperties is supposed to provide configuration settings for the MediaUpdater. A MediaUpdater might be running several periodic jobs. The getTasks request of a job is used to see if there is any new work available of that job type on the name server. A putTasks request is used to handle any computed data sent back from a MediaUpdater to the name server.
MachineController
This class handles requests from a computer that is managing several fetchers and queue_servers. This controller might be used to start and stop fetchers and queue_servers as well as to get the status of the active fetchers.
RegisterController
Controller used to handle account registration and retrieval for the Yioop website. Also handles data for suggest a url
ResourceController
Used to serve resources such as images, CSS, or scripts from APP_DIR
SearchController
Controller used to handle search requests to SeekQuarry search site. Used to both get and display search results.
StaticController
This controller is used by the Yioop web site to display PUBLIC_GROUP_ID pages more like static forward facing pages.
TestsController
Controller used to handle search requests to SeekQuarry search site. Used to both get and display search results.
StockBot
This class demonstrates a simple Stock Chat Bot using the Yioop ChatBot APIs for Yioop Discussion Groups.
WeatherBot
This class demonstrates a simple Weather Chat Bot using the Yioop ChatBot APIs for Yioop Discussion Groups.
ArcTool
Command line program that allows one to examine the content of the WebArchiveBundles and IndexArchiveBundles of Yioop crawls.
ClassifierTool
Class used to encapsulate all the activities of the ClassifierTool.php command line script. This script allows one to automate the building and testing of classifiers, providing an alternative to the web interface when
PageIterator
This class provides the same interface as an iterator over crawl mixes, but simply iterates over an array.
ClassifierTrainer
This class is used to finalize a classifier via the web interface.
DictionaryUpdater
Fetcher
This class is responsible for fetching web pages for the SeekQuarry/Yioop search engine
MediaUpdater
Separate process/command-line script which can be used to update news sources for Yioop and also handle other kinds of activities such as video conversion. This is an alternative to using the web app for updating. Makes use of the web app's code.
Mirror
This class is responsible for syncing crawl archives between machines using the SeekQuarry/Yioop search engine
QueueServer
Command line program responsible for managing Yioop crawls.
AnalyticsManager
Used to set and get SQL query and search query timing statistics between models and index_bundle_iterators
ArcArchiveBundleIterator
Used to iterate through the records of a collection of arc files stored in a WebArchiveBundle folder. Arc is the file format of the Internet Archive http://www.archive.org/web/researcher/ArcFileFormat.php. Iteration would be for the purpose of making an index of these records.
ArchiveBundleIterator
Abstract class used to model iterating documents indexed in a WebArchiveBundle or set of such bundles.
DatabaseBundleIterator
Used to iterate through the records that result from an SQL query to a database
MediaWikiArchiveBundleIterator
Used to iterate through a collection of .xml.bz2 media wiki files stored in a WebArchiveBundle folder. Here these media wiki files contain the kinds of documents used by Wikipedia. Iteration would be for the purpose of making an index of these records.
MixArchiveBundleIterator
Used to do an archive crawl based on the results of a crawl mix.
OdpRdfArchiveBundleIterator
Used to iterate through the records of a collection of one or more Open Directory RDF files stored in a WebArchiveBundle folder. Open Directory files can be found at http://rdf.dmoz.org/ . Iteration would be for the purpose of making an index of these records.
TextArchiveBundleIterator
Used to iterate through the records of a collection of text or compressed text-oriented records
WarcArchiveBundleIterator
Used to iterate through the records of a collection of warc files stored in a WebArchiveBundle folder. Warc is the newer file format of the Internet Archive and others for digital preservation: http://www.digitalpreservation.gov/formats/fdd/fdd000236.shtml http://archive-access.sourceforge.net/warc/ Iteration is done for the purpose of making an index of these records.
WebArchiveBundleIterator
Class used to model iterating documents indexed in a WebArchiveBundle. This would typically be for the purpose of re-indexing these documents.
BloomFilterBundle
A BloomFilterBundle is a directory of BloomFilterFiles.
BloomFilterFile
Code used to manage a bloom filter in-memory and in file.
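For readers unfamiliar with the data structure, here is a minimal in-memory Bloom filter sketch in Python. It is illustrative only and says nothing about how BloomFilterFile lays out or hashes its data: the class name SimpleBloomFilter, its parameters, and the use of SHA-256 to derive bit positions are all assumptions.

```python
import hashlib


class SimpleBloomFilter:
    """Hypothetical Bloom filter: a bit array plus several hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A membership test can return a false positive but never a false negative, which makes the structure suitable for cheap "have we seen this already?" checks.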
BPlusTree
This class implements the B+-tree structure over the existing file system
BZip2BlockIterator
This class is used to allow one to iterate through a Bzip2 file.
BinaryFeatures
A concrete Features subclass that represents a document as a binary vector where a one indicates that a feature is present in the document, and a zero indicates that it is not. The absent features are ignored, so the binary vector is actually sparse, containing only those feature indices where the value is one.
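The sparse representation described here can be sketched in a few lines of Python. The function name binary_features and the vocabulary mapping from terms to feature indices are hypothetical and are not taken from the Yioop source.

```python
def binary_features(terms, vocabulary):
    """Return the sorted indices of vocabulary features present in terms.

    Only indices whose value would be one are kept, so absent features
    cost no storage. Terms outside the vocabulary are ignored.
    """
    return sorted({vocabulary[t] for t in terms if t in vocabulary})
```

For example, with the assumed vocabulary {"cat": 0, "dog": 1, "fish": 2}, the document terms ["dog", "cat", "dog", "bird"] map to the index list [0, 1].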
ChiSquaredFeatureSelection
A subclass of FeatureSelection that implements chi-squared feature selection.
Classifier
The primary interface for building and using classifiers. An instance of this class represents a single classifier in memory, but the class also provides static methods to manage classifiers on disk.
ClassifierAlgorithm
An abstract class shared by classification algorithms that implement a common interface.
Features
Manages a dataset's features, providing a standard interface for converting documents to feature vectors, and for accessing feature statistics.
FeatureSelection
This is an abstract class that specifies an interface for selecting top features from a dataset.
LassoRegression
Implements the logistic regression text classification algorithm using lasso regression and a cyclic coordinate descent optimization step.
InvertedData
Stores a data matrix in an inverted index on columns with non-zero entries.
NaiveBayes
Implements the Naive Bayes text classification algorithm.
SparseMatrix
A sparse matrix implementation based on an associative array of associative arrays.
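The "associative array of associative arrays" layout corresponds to a dictionary of dictionaries in Python. The sketch below is illustrative only: the class name DictSparseMatrix and its methods are hypothetical, not the API of Yioop's SparseMatrix.

```python
class DictSparseMatrix:
    """Hypothetical sparse matrix stored as {row: {column: value}}."""

    def __init__(self):
        self.rows = {}

    def set(self, i, j, value):
        if value == 0:
            # Zero entries are not stored; drop any existing one.
            row = self.rows.get(i)
            if row is not None:
                row.pop(j, None)
                if not row:
                    del self.rows[i]
            return
        self.rows.setdefault(i, {})[j] = value

    def get(self, i, j):
        # Missing entries read as zero.
        return self.rows.get(i, {}).get(j, 0)

    def nonzero_count(self):
        return sum(len(row) for row in self.rows.values())
```

Storage grows with the number of non-zero entries rather than with the matrix dimensions.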
WeightedFeatures
A concrete Features subclass that represents a document as a vector of feature weights, where weights are computed using a modified form of TF * IDF. This feature mapping is experimental, and may not work correctly.
GzipCompressor
Implementation of a Compressor using GZIP/GUNZIP as the filter.
NonCompressor
Implementation of a trivial Compressor.
ComputerVision
Class used to encapsulate various methods related to computer vision that might be useful for indexing documents. These include recognizing text in images
ContextTagger
Abstract, base context tagger class.
CrawlDaemon
Used to run scripts as a daemon on *nix systems
CrawlQueueBundle
Encapsulates the data structures needed to have a queue of to-crawl URLs
DoubleIndexBundle
A DoubleIndexBundle encapsulates and provides methods for two IndexDocumentBundles used to store a repeating crawl. One of these bundles is used to handle current search queries, while the other is used to store an ongoing crawl. Once the crawl time has been reached, the roles of the two bundles are swapped.
FeedArchiveBundle
Subclass of IndexArchiveBundle with bloom filters to make it easy to check if a news feed item has been added to the bundle already before adding it
FeedDocumentBundle
Subclass of IndexDocumentBundle with bloom filters to make it easy to check if a news feed item has been added to the bundle already before adding it
FetchGitRepositoryUrls
Library of functions used to fetch Git internal urls
FetchUrl
Code used to manage HTTP or Gopher requests from one or more URLS
FileCache
Library of functions used to implement a simple file cache
HashTable
Code used to manage a memory efficient hash table. Weights for the queue must be floats.
DisjointIterator
Used to iterate over the documents which occur in a set of disjoint iterators all belonging to the same index
DocIterator
Used to iterate through all the documents and links associated with an IndexArchiveBundle. It iterates through each doc or link regardless of the words it contains. It also makes it easy to get the summaries of these documents.
GroupIterator
This iterator is used to group together documents or document parts which share the same url. For instance, a link document item and the document that it links to will both be stored in the IndexArchiveBundle by the QueueServer. This iterator would combine both these items into a single document result with a sum of their score, and a summary, if returned, containing text from both sources. The iterator's purpose is vaguely analogous to a SQL GROUP BY clause
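The grouping behavior described above can be sketched in Python as follows. This is a simplified illustration, not the iterator's actual logic: the function name group_by_url and the item keys "url", "score", and "text" are hypothetical.

```python
def group_by_url(items):
    """Merge items sharing a url, summing scores and collecting text."""
    grouped = {}
    for item in items:
        url = item["url"]
        if url not in grouped:
            grouped[url] = {"url": url, "score": 0.0, "summary": []}
        grouped[url]["score"] += item["score"]
        if item.get("text"):
            grouped[url]["summary"].append(item["text"])
    # Highest combined score first, as a search result list would be.
    return sorted(grouped.values(), key=lambda g: g["score"], reverse=True)
```

A link item and the page it points to would each contribute their score and text to the same grouped result, much as rows sharing a key are combined by GROUP BY with SUM.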
IndexBundleIterator
Abstract class used to model iterating documents indexed in an IndexArchiveBundle or set of such bundles.
IntersectIterator
Used to iterate over the documents which occur in all of a set of iterator results
NegationIterator
Used to iterate over the documents which don't occur in a set of iterator results
NetworkIterator
This iterator is used to handle querying a network of queue_servers with regard to a query
UnionIterator
Used to iterate over the documents which occur in any of a set of WordIterator results
WordIterator
Used to iterate through the documents associated with a word in an IndexArchiveBundle. It also makes it easy to get the summaries of these documents.
IndexArchiveBundle
Encapsulates a set of web page summaries and an inverted word-index of terms from these summaries which allow one to search for summaries containing a particular word.
IndexDictionary
Data structure used to store entries of the form: word id, index shard generation, posting list offset, and length of posting list. It has entries for all words stored in a given IndexArchiveBundle. There might be multiple entries for a given word_id if it occurs in more than one index shard in the given IndexArchiveBundle.
IndexDocumentBundle
Encapsulates a set of web page documents and an inverted word-index of terms from these documents which allow one to search for documents containing a particular word.
AddressesPlugin
Used to extract emails, phone numbers, and addresses from a web page.
IndexingPlugin
Base indexing plugin class. An indexing plugin allows a developer to do additional processing on web pages during a crawl and then, after the web crawl is over, do post-processing on the additional data that was collected. For example, during a crawl one might analyze web pages and mark pages that have recipes on them with the meta word recipe:all, then after the crawl is over do post-processing such as clustering the recipes found and adding additional meta words to retrieve recipes by principal ingredient.
WordfilterPlugin
WordFilterPlugin is used to filter documents by terms during a crawl.
IndexManager
Class used to manage open IndexArchiveBundles while performing a query. Ensures an easy place to obtain references to these bundles and ensures only one object per bundle is instantiated in a Singleton-esque way.
IndexShard
Data structure used to store one generation's worth of the word document index (inverted index). This data structure consists of three main components: word entries, word_doc entries, and document entries.
JavascriptUnitTest
Super class of all the test classes testing Javascript functions.
LinearAlgebra
Class useful for handling linear algebra operations on associative arrays with key => value pairs where the value is a number.
LinearHashTable
This class implements a linear hash table for storing records that use PackedTableTools for their format
MailServer
A small class for communicating with an SMTP server. Used to avoid configuration issues that might be needed with PHP's built-in mail() function. Here is an example of how one might use this class:
AnalyticsJob
A media job used to periodically calculate summary statistics about group, thread, page, and query impressions.
BulkEmailJob
MediaJob class for sending out emails from a Yioop instance (either in response to account registrations or in response to group posts and similar activities)
FeedsUpdateJob
A media job to download and index feeds from various search sources (RSS, HTML scraper, etc). Idea is that this job runs once an hour to get the latest news, movies, weather from those sources.
MediaJob
Base class for jobs to be carried out by a MediaUpdater process.
PodcastDownloadJob
A media job to periodically download Podcasts and store them as resources of a Wiki Page
RecommendationJob
Recommendation Job recommends the trending threads as well as threads and groups which are relevant based on the users viewing history
TrendingHighlightsJob
A media job to compute trending terms from the feed search sources, and to generate a list of top feed items for the landing page for the different subsearches displayed on the landing page.
VideoConvertJob
Media Job used to convert videos uploaded to the wiki or group feeds to a common format (mp4)
WikiThumbDetailJob
A media job to add thumbnails and animated thumbnails for wiki page media resources that have just been viewed in the browser. This is detected by the method GroupModel::getGroupPageResourceUrls, which writes a file GroupModel::NEEDS_THUMBS_DIR . L\crawlHash($folder) . ".txt" with information about the folders needing thumbs.
NamedEntityContextTagger
Machine learning based named entity recognizer.
NWordGrams
Library of functions used to create and extract n word grams
PackedTableTools
A collection of methods to encode and decode records according to a signature.
PageRuleParser
Has methods to parse user-defined page rules to apply to documents to be indexed.
PartialZipArchive
Used to extract files from an initial segment or a fragment of a ZIP Archive.
PartitionDocumentBundle
A partition document bundle is a collection of partitions, each of which in turn can hold a concatenated sequence of compressed documents, and which are managed together. It is a successor format to the earlier WebArchiveBundle of Yioop. The partition document bundle stores individual records using a record format defined via the PackedTableTools class.
PartOfSpeechContextTagger
Machine learning based Part of Speech tagger.
PersistentStructure
A PersistentStructure is a data structure which every so many operations will be saved to secondary storage (such as disk).
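The "save every so many operations" idea can be sketched in Python as below. This is a toy illustration, not Yioop's PHP class: the name PersistentCounter, the save_frequency parameter, and the use of JSON as the on-disk format are all assumptions.

```python
import json
import os
import tempfile


class PersistentCounter:
    """Hypothetical structure that saves itself every few operations."""

    def __init__(self, path, save_frequency=3):
        self.path = path
        self.save_frequency = save_frequency
        self.unsaved_operations = 0
        self.counts = {}

    def increment(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self._checkpoint()

    def _checkpoint(self):
        # Count the operation and write to disk once enough have piled up.
        self.unsaved_operations += 1
        if self.unsaved_operations >= self.save_frequency:
            self.save()

    def save(self):
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.counts, handle)
        self.unsaved_operations = 0


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "counts.json")
    counter = PersistentCounter(demo_path)
    for word in ["a", "a", "b"]:
        counter.increment(word)
    print(os.path.exists(demo_path))
```

The trade-off is the usual one: a larger save frequency means fewer disk writes but more operations lost if the process dies between saves.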
PhraseParser
Library of functions used to manipulate words and phrases
PriorityQueue
Code used to manage a memory efficient priority queue.
BmpProcessor
Used to create crawl summary information for BMP and ICO files
CompressedProcessor
Used to create crawl summary information for a gz compressed file whose uncompressed form has a processor we index.
DocProcessor
Used to create crawl summary information for binary DOC files
DocxProcessor
Used to create crawl summary information for DOCX files
EpubProcessor
Used to create crawl summary information for EPUB files (those served as application/epub+zip)
GifProcessor
Used to create crawl summary information for GIF files
GitXmlProcessor
Parent class common to all processors used to create crawl summary information that involves basically text data
GopherProcessor
Used to create crawl summary information for gopher protocol pages
HtmlProcessor
Used to create crawl summary information for HTML files
IconProcessor
Used to create crawl summary information for BMP and ICO files
ImageProcessor
Base abstract class common to all processors used to create crawl summary information from images
JavaProcessor
Parent class common to all processors used to create crawl summary information that involves basically text data
JpgProcessor
Used to create crawl summary information for JPEG files
PageProcessor
Base class common to all processors of web page data
PdfProcessor
Used to create crawl summary information for PDF files
PngProcessor
Used to create crawl summary information for PNG files
PptProcessor
Used to create crawl summary information for PPT files
PptxProcessor
Used to create crawl summary information for PPTX files
PythonProcessor
Parent class common to all processors used to create crawl summary information that involves basically text data
RobotProcessor
Processor class used to extract information from robots.txt files
RssProcessor
Used to create crawl summary information for RSS or Atom files
RtfProcessor
Used to create crawl summary information for RTF files
SitemapProcessor
Used to create crawl summary information for sitemap files
SvgProcessor
Used to create crawl summary information for SVG files. This class is a little bit weird in that it generates thumbs like the image processor classes, but when it gives up on the data it falls back to text processor handling.
TextProcessor
Parent class common to all processors used to create crawl summary information that involves basically text data
VideoProcessor
Base abstract class common to all processors used to create crawl summary information from videos
XlsxProcessor
Used to create crawl summary information for xlsx files
XmlProcessor
Used to create crawl summary information for XML files (those served as text/xml)
ScraperManager
Class used by html processors to detect if a page matches a particular signature such as that of a content management system, and also to provide scraping mechanisms for the content of such a page
StochasticTermSegmenter
Class for segmenting terms using Stochastic Finite State Word Segmentation
StringArray
Memory efficient implementation of persistent arrays
SuffixTree
Data structure used to maintain a suffix tree for a passage of words.
CentroidSummarizer
Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing. This is done by the @see getSummary method. getSummary does this by splitting the document into sentences and computing inverse sentence frequency (should be ISF, but we call it IDF) scores for each term. It then computes an average document vector (which we call the centroid) with components (total number of occurrences of term) * (IDF score of term).
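The steps named in this description (sentence splitting, inverse sentence frequency, and a centroid with components occurrence count times IDF) can be sketched in Python. The sketch is not the class's actual code. The function name centroid_summary, the regular expressions used for splitting, and especially the final step of ranking sentences by the summed centroid weight of their terms are assumptions, since the description above does not say how sentences are selected.

```python
import math
import re
from collections import Counter


def centroid_summary(text, max_sentences=1):
    # Naive sentence and term splitting (an assumption for this sketch).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    total = len(sentences)

    sentence_freq = Counter()  # number of sentences containing a term
    term_freq = Counter()      # total occurrences of a term
    for terms in tokenized:
        term_freq.update(terms)
        sentence_freq.update(set(terms))

    # Inverse sentence frequency, playing the role of IDF.
    idf = {t: math.log(total / sentence_freq[t]) for t in sentence_freq}
    # Centroid component: (occurrences of term) * (IDF score of term).
    centroid = {t: term_freq[t] * idf[t] for t in term_freq}

    def score(terms):
        # Assumed ranking rule: sum centroid weights of distinct terms.
        return sum(centroid[t] for t in set(terms))

    ranked = sorted(range(total), key=lambda i: score(tokenized[i]),
                    reverse=True)
    chosen = sorted(ranked[:max_sentences])  # keep document order
    return " ".join(sentences[i] for i in chosen)
```

Note that this assumed scoring rule favors longer sentences; a real summarizer would likely normalize for sentence length.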
CentroidWeightedSummarizer
Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing. This is done by the @see getSummary method. To generate a summary a normalized term frequency vector is computed for each sentence. An average vector is then computed by summing these and renormalizing the result.
GraphBasedSummarizer
Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing. The method @see getSummary is used to obtain such a summary. In GraphBasedSummarizer's implementation of this method, sentences are ranked using a page rank style algorithm based on sentence adjacencies calculated using a distortion score between pairs of sentences (@see LinearAlgebra::distortion for details on this).
ScrapeSummarizer
Class which may be used by TextProcessors to get a summary for a text document that may later be used for indexing.
Summarizer
Base class for all summarizers. A summarizer's chief method is getSummary, which is supposed to take a text or XML document and produce a summary of that document up to PageProcessor::$max_description_len many characters. Summarizers also contain various methods to generate a word cloud from such a summary.
Trie
Implements a trie data structure which can be used to store terms read from a dictionary in a succinct way
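A trie shares storage among terms with a common prefix, which is what makes it succinct for dictionaries. The Python sketch below shows the general technique with nested dictionaries. It is not Yioop's implementation or storage format, and the names SimpleTrie, add, contains, and with_prefix are hypothetical.

```python
class SimpleTrie:
    """Hypothetical trie built from nested dictionaries."""

    END = None  # key marking the end of a term; never equals a character

    def __init__(self):
        self.root = {}

    def add(self, term):
        node = self.root
        for ch in term:
            node = node.setdefault(ch, {})
        node[self.END] = True

    def _walk(self, prefix):
        # Follow the prefix; return None if it leaves the trie.
        node = self.root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return None
        return node

    def contains(self, term):
        node = self._walk(term)
        return node is not None and self.END in node

    def with_prefix(self, prefix):
        node = self._walk(prefix)
        if node is None:
            return []
        results, stack = [], [(node, prefix)]
        while stack:
            current, word = stack.pop()
            for key, child in current.items():
                if key is self.END:
                    results.append(word)
                else:
                    stack.append((child, word + key))
        return sorted(results)
```

Prefix lookups such as with_prefix are the kind of query a trie answers without scanning the whole dictionary.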
UnitTest
Base class for all the SeekQuarry/Yioop engine Unit tests
UrlParser
Library of functions used to manipulate and to extract components from urls
Mod9Constants
Mini-class (so not own file) used to hold encode/decode info related to Mod9 encoding (a variant of Simplified-9 specific to Yioop).
VersionManager
VersionManager can be used to create and manage versions of files in a folder so that a user can revert the files to any version desired back to the time the folder was first managed. It is used by Yioop's Wiki system to handle versions of image and other media resources for a Wiki page.
WebArchive
Code used to manage web archive files
WebArchiveBundle
A web archive bundle is a collection of web archives which are managed together. It is useful to split data across several archive files rather than just store it in one, for both read efficiency and to keep file sizes from getting too big. In some places we are using 4 byte ints to store file offsets, which restricts the size of the files we can use for web archives.
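To make the size restriction concrete: a 4 byte integer has 32 bits, so if it is treated as unsigned, the largest offset it can hold is 2^32 - 1 = 4294967295, just under 4 GiB. The Python sketch below demonstrates the limit with the standard struct module. Whether Yioop treats these offsets as signed or unsigned is not stated here, so the unsigned big-endian format is an assumption, and the function names are hypothetical.

```python
import struct


def pack_offset(offset: int) -> bytes:
    # ">I" is a big-endian unsigned 4 byte integer (assumed layout).
    return struct.pack(">I", offset)


def fits_in_four_bytes(offset: int) -> bool:
    try:
        pack_offset(offset)
    except struct.error:
        return False
    return True
```

Under a signed interpretation the usable range would be halved, to offsets below 2^31.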
WebSite
A single file, low dependency, pure PHP web server and web routing engine class.
WebException
Exception generated when a running WebSite script calls webExit()
WikiParser
Class with methods to parse mediawiki documents, both within Yioop, and when Yioop indexes mediawiki dumps as from Wikipedia.
Tokenizer
Arabic specific tokenization code. In particular, it has a stemmer. The stemmer is my stab at porting Ljiljana Dolamic's (University of Neuchatel, www.unine.ch/info/clef/) C stemming algorithm: http://members.unine.ch/jacques.savoy/clef That algorithm maps all stems to ASCII. Instead, I tried to leave everything using Arabic characters.
Tokenizer
Bengali specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
German specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Greek specific tokenization code. Contains a list of greek stop words used in making word clouds. It also has a greek stemmer.
Tokenizer
This class has a collection of methods for English locale specific tokenization. In particular, it has a stemmer, a stop word remover (for use mainly in word cloud creation), and a part of speech tagger (for question answering). The stemmer is my stab at implementing the Porter Stemmer algorithm presented at http://tartarus.org/~martin/PorterStemmer/def.txt The code is based on the non-thread safe C version given by Martin Porter.
Tokenizer
Spanish specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Persian specific tokenization code. In particular, it has a stemmer. The stemmer is a modified variant (handling prefixes slightly differently) of my stab at porting Nick Patch's Perl port, https://metacpan.org/pod/Lingua::Stem::UniNE::FA, of the stemming algorithm by Ljiljana Dolamic and Jacques Savoy of the University of Neuchâtel. The Java version of this is at http://members.unine.ch/jacques.savoy/clef/persianStemmerUnicode.txt (beware of Java's handling of Unicode).
Tokenizer
This class has a collection of methods for French locale specific tokenization. In particular, it has a stemmer, a stop word remover (for use mainly in word cloud creation). The stemmer is my stab at re-implementing the stemmer algorithm given at http://snowball.tartarus.org and was inspired by http://snowball.tartarus.org/otherlangs/french_javascript.txt Here given a word, its stem is that part of the word that is common to all its inflected variants. For example, tall is common to tall, taller, tallest. A stemmer takes a word and tries to produce its stem.
Tokenizer
Hebrew specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Hindi specific tokenization code. In particular, it has a stemmer. The stemmer is my stab at porting Ljiljana Dolamic's (University of Neuchatel, www.unine.ch/info/clef/) Java stemming algorithm: http://members.unine.ch/jacques.savoy/clef/HindiStemmerLight.java.txt Here given a word, its stem is that part of the word that is common to all its inflected variants. For example, tall is common to tall, taller, tallest. A stemmer takes a word and tries to produce its stem.
Tokenizer
Indonesian specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Italian specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Japanese specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Kannada specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Korean specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
This class has a collection of methods for Dutch locale specific tokenization. In particular, it has a stemmer.
Tokenizer
Polish specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
This class has a collection of methods for Portuguese locale specific tokenization. In particular, it has a stemmer implementing the Snowball Stemming algorithm presented in http://snowball.tartarus.org/algorithms/portuguese/stemmer.html
Tokenizer
This class has a collection of methods for Russian locale specific tokenization. In particular, it has a stemmer, a stop word remover (for use mainly in word cloud creation). The stemmer is a modification (with bug fixes) of Dennis Kreminsky's stemmer from: http://snowball.tartarus.org/otherlangs/russian_php5.txt Here given a word, its stem is that part of the word that is common to all its inflected variants. For example, tall is common to tall, taller, tallest. A stemmer takes a word and tries to produce its stem.
Tokenizer
Telugu specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Thai specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Tagalog (spoken in the Philippines) specific tokenization code.
Tokenizer
Turkish specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
Tokenizer
Vietnamese specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram. For Vietnamese, neither char gramming nor stemming seemed to make sense, so for now this file is blank.
Tokenizer
Chinese specific tokenization code. Typically, tokenizer.php either contains a stemmer for the language in question or it specifies how many characters in a char gram
ActivityModel
This class is used to handle db results related to Administration Activities
AdvertisementModel
This class is used to handle database statements related to Advertisements
BotModel
BotModel is used to handle database statements related to Bot User stories. A Bot User Story consists of a sequence of patterns for what a bot should do when another user posts a request to the bot (a message beginning with @bot_name) in a discussion group.
CaptchaModel
This class is used to handle the captcha settings for Yioop
CrawlModel
This class is used to handle getting/setting crawl parameters, CRUD operations on current crawls, start, stop, status of crawls, getting cache files out of crawls, determining what is the default index to be used, marshalling/unmarshalling crawl mixes, and handling data from suggest-a-url forms
CreditModel
This class is used to manage Advertising Credits a user may purchase or spend
CronModel
Used to remember the last time the web app ran periodic activities
DatasourceManager
This abstract class defines the interface through which the seek_quarry program communicates with a database and the filesystem.
MysqlManager
Mysql DatasourceManager
PdoManager
Pdo DatasourceManager
Sqlite3Manager
SQLite3 DatasourceManager
GroupModel
This class is used to handle db results related to Group Administration.
ImpressionModel
Model used to keep track of analytics and user experience activities that users carry out on a Yioop web site. For analytics, things that might be tracked are wiki page views, queries, and query outcomes. For UX, the impression model keeps track of the recent groups a user has visited to provide better bread crumb drop downs, to make the manage account landing page list more relevant groups, to determine whether a media item has been started, completely watched, etc.
LocaleModel
Used to encapsulate information about a locale (data about a language in a given region).
MachineModel
This class is used to handle db results related to Machine Administration
Model
This is a base class for all models in the SeekQuarry search engine. It provides support functions for formatting search results
ParallelModel
Base class of models that need access to data from multiple queue servers. Subclasses include @see CrawlModel and @see PhraseModel.
PhraseModel
This class is used to handle results for a given phrase search
ProfileModel
This class is used to handle getting and saving the Profile.php of the current search engine instance
RoleModel
This class is used to handle db results related to Role Administration
ScraperModel
Used to manage data related to the SCRAPER database table.
SearchverticalsModel
This class manages the editing of search verticals. This includes allowing one to specify that a search result should be filtered from the results of a query. It also includes altering the title and description of a result from how it is stored in a particular index. Finally, it includes creating, updating, and deleting knowledge wiki results. To handle these activities, this class leverages the existing group wiki system of Yioop. Edited and filtered search results correspond to group feed entries in a Search Group. Edited knowledge wiki entries correspond to wiki entries in the Search Group.
SigninModel
This class is used to handle db results needed for a user to login
SourceModel
Used to manage data related to video, news, and other search sources. Also used to manage data about available subsearches seen in SearchView
TrendingModel
This class is used to handle db results related to Group Administration. Groups are collections of users who might access a common blog/news feed and set of pages. This method also controls adding and deleting entries to a group feed and does limited access control checks of these operations.
UserModel
This class is used to handle database statements related to User Administration
VisitorModel
Used to keep track of ip address of failed account creation and login attempts
AdminView
View used to draw activity list and current activity for a logged in user
ApiView
View used to draw and allow editing of a wiki page when not in the admin view (so the activities panel on the side is not present). This is also used to draw wiki pages for public groups when not logged in.
ComponentView
Base class for views created by adding elements to top, sub-top, same, opposite, center columns, or bottom positions
CrawlstatusView
This view is used to display information about crawls that have been made by this seek_quarry instance
AdminbarElement
Element used to draw the navigation bar on admin pages.
AdminElement
Element used to render the admin interface for a logged in user of Yioop
AdminmenuElement
Element responsible for drawing the side menu portion of an admin page. This allows the user to signout or select from among allowed admin activities
ApiElement
Element responsible for drawing wiki pages in either admin or wiki view. It is also responsible for rendering wiki history pages, and listings of wiki pages available for a group
AppearanceElement
Element responsible for drawing the screen used to set up the search engine appearance.
BotstoryElement
Element responsible for displaying the bot story features that someone can use to create their own chat bot.
ConfigureElement
Element responsible for drawing the screen used to set up the search engine
CrawloptionsElement
Element responsible for displaying options about how a crawl will be performed. For instance, what are the seed sites for the crawl, which sites are allowed to be crawled, which sites must not be crawled, etc.
DisplayadvertisementElement
Element responsible for displaying the advertisement on the search results page
EditclassifierElement
This element renders the initial edit page for a classifier, where the user can update the classifier label and find documents to label and add to the training set. The page displays some initial statistics and a form for finding documents in any existing index, but after that it is heavily modified by JavaScript in response to user actions and XmlHttpRequests made to the server.
EditlocalesElement
Element responsible for displaying the form where users can input string translations for a given locale
EditmixElement
Element responsible for displaying info about a given crawl mix
Element
Base Element Class.
FooterElement
Element responsible for drawing footer links on search view and static view pages
GroupbarElement
Element used to draw the navigation bar on group feed and wiki pages.
GroupElement
Element used to present group feed and wiki pages for the Yioop website
GroupfeedElement
Element responsible for drawing the feeds a user is subscribed to
GroupmenuElement
Element responsible for drawing the menu side bar for group and wiki pages. These options include recently viewed wiki pages, groups, and threads
HeaderElement
Element responsible for drawing the header on admin, group, and search views
HelpElement
This element is used to display the list of available activities in the AdminView
LanguageElement
Element used to display available languages in the settings view
MachinelogElement
Element responsible for displaying the queue_server or fetcher log of a machine
ManageaccountElement
Element responsible for displaying the user account features that someone can modify for their own SeekQuarry/Yioop account.
ManageadvertisementsElement
Element responsible for displaying advertisements information that someone can create, view, and modify for their own SeekQuarry/Yioop account.
ManageclassifiersElement
This element renders the page that lists classifiers, provides a form to create new ones, and provides per-classifier action links to edit, finalize, and delete the associated classifier.
ManagecrawlsElement
Element responsible for displaying info about starting, stopping, deleting, and using a crawl. It makes use of the CrawlStatusView
ManagecreditsElement
Element responsible for displaying Ad credits purchase form and recent transaction table
ManagegroupsElement
Used to draw the admin screen on which users can create groups, delete groups and add and delete users and roles to a group
ManagelocalesElement
This Element is responsible for drawing screens in the Admin View related to localization. Namely, the ability to create and delete locales and set their text writing mode, as well as the ability to modify translations within a locale.
ManagemachinesElement
Used to draw the admin screen on which admin users can add/delete and manage machines which might act as fetchers or queue_servers.
ManagerolesElement
Used to draw the admin screen on which admin users can create roles, delete roles and add and delete activities from roles
ManageusersElement
Element responsible for drawing the activity screen for User manipulation in the AdminView.
MediajobsElement
Element used to draw toggles indicating which jobs the Media Updater will run and letting the user turn these jobs on/off.
MixcrawlsElement
Element responsible for displaying info to allow a user to create a crawl mix or edit an existing one
PageOptionsElement
This element is used to render the Page Options admin activity. This activity lets a user control the number of web pages downloaded, the recrawl frequency, the file types, etc. of the pages crawled
PaginationElement
Element responsible for drawing the sequence of available pages for search results.
QuerystatsElement
Element responsible for displaying statistics about recent queries that have been run against the search engine
ResultsEditorElement
Element used to control how urls are filtered out of search results (if desired) after a crawl has already been performed.
ScrapersElement
Contains the forms for managing Web Page Scrapers.
SearchbarElement
Element used to draw the navigation bar on search pages.
SearchcalloutElement
Element responsible for drawing search wiki callouts for search results
SearchElement
Element used to present search results. It also contains the landing page search box for people to type searches into
SearchmenuElement
Element responsible for drawing the side menu with sign in/create account, search source options, search settings, and tool info for search pages
SearchsourcesElement
This element renders the forms for managing search sources for news, etc.
SecurityElement
Element used to handle configurations of Yioop related to authentication, captchas, and recovery of missing passwords
ServersettingsElement
Element used to draw forms to set up the various external servers that might be connected with a Yioop installation
SideadvertisementElement
Element used to draw an external server advertisement (if there is one) as a column on the opposite side of a search results page
StatisticsElement
Draws an element displaying statistical information about a web crawl such as number of hosts visited, distribution of file sizes, distribution of file type, distribution of languages, etc
TopadvertisementElement
This element is used to draw the keyword advertisement above search results (if present)
TrendingElement
Class to draw statistics and charts about trending news feed terms
UsermessagesElement
Element responsible for drawing the feeds a user is subscribed to
WelcomemenuElement
Element responsible for drawing the side menu portion of an admin page. This allows the user to signout or select from among allowed admin activities
WikiElement
Element responsible for drawing wiki pages in group view. It is also responsible for rendering wiki history pages, and listings of wiki pages available for a group
FeedstatusView
This view is drawn to refresh a group feed that has recently been posted to. Redrawing is invoked from a client script every so many seconds.
FetchView
This view is displayed by the fetch_controller.php to send information to a fetcher about things like what to crawl next
GroupView
View used to draw and allow editing of group feeds when not in the admin view (so the activities panel on the side is not present). This is also used to draw group feeds for public feeds when not logged in.
CloseHelper
This is a helper class used to handle closing an option window for an activity
EmojipickerHelper
This is a helper class used to handle drawing of Emojis in the Usermessages activity
FeedsHelper
Helper used to draw links and snippets for RSS feeds
FiletypeHelper
This is a helper class used to render the filetype based on the supplied mimetype. It is mainly intended to be used in outputting webpage results for non html pages.
FileUploadHelper
This helper is used to render a drag and drop file upload region
GrouplistHelper
This is a helper class used to draw grouped view discussions for group feeds on a variety of elements such as ManageAccountElement, GroupfeedElement, ManagegroupElement.
HamburgerHelper
This is a helper class used to draw the hamburger menu symbol and associated link to the settings menu
HelpbuttonHelper
This is a helper class used to draw the help button for context sensitive help.
Helper
Base Helper Class.
IconlinkHelper
This is a helper class used to draw icon buttons and links
ImagesHelper
Helper used to draw thumbnail strips for images
OptionsHelper
This is a helper class used to draw select options form elements
PaginationHelper
This is a helper class used to handle pagination of search results
PagingtableHelper
Used to create links to go backward/forwards and search a database table. An HTML table with data representing a database table might have millions of rows, so we want to limit what the user actually gets at one time and just allow the user to "page" through in increments of 10, 20, 50, 100, 200 rows at a time.
SearchformHelper
Used to draw the form to do advanced search for items in a user, group, locale, etc folder
ToggleHelper
This is a helper class used to draw an On-Off switch in a web page
VideosHelper
Helper used to draw thumbnail strips for images
VideourlHelper
Helper used to draw thumbnails for video sites
ApiLayout
Layout used for the seek_quarry Website including pages such as search landing page and settings page
Layout
Base layout Class. Layouts are used to render the headers and footer of the page on which a View lives
RssLayout
Layout used for the seek_quarry Website including pages such as search landing page and settings page
WebLayout
Layout used for the seek_quarry Website including pages such as search landing page and settings page
MachinestatusView
This view is used to display information about the on/off state of the queue_servers and fetchers managed by this instance of Yioop.
MediadetailView
View used to draw and allow editing of a single media resource when a media resource gallery is drawn in detail view.
NocacheView
This view is drawn when someone clicks on the cached link of a web-page for which no cache is available
RecoverView
This View is responsible for drawing the screen for recovering a forgotten password
RegisterView
Draws the page that allows a user to register for an account
ResendEmailView
This View is responsible for drawing the screen for resending the confirm account link
RssView
Web page used to present search results. It also contains the search box for people to type searches into
SearchView
Web page used to present search results. It also contains the search box for people to type searches into
SigninView
This View is responsible for drawing the login screen for the admin panel of the Seek Quarry app
StaticView
This View is responsible for drawing forward-facing wiki pages in a more static cleaned up way
SuggestView
View responsible for drawing the form where a user can suggest a URL
TestsView
Draws the view on which people can control their search settings such as num links per screen and the language settings
View
Base View Class. A View is used to display the output of controller activity
BloomFilterFileTest
Used to test that the BloomFilterFile class provides the basic functionality of a persistent set. I.e., we can insert things into it, and we can do membership testing
BmpProcessorTest
UnitTest for the BmpProcessor class. A BmpProcessor is used to process a .bmp file and extract a summary from it. This class tests the processing of a .bmp file.
BPlusTreeTest
Yioop B+-tree Unit Class
CrawlQueueBundleTest
UnitTest for the CrawlQueueBundle class.
DeTokenizerTest
Code used to test the German stemming algorithm.
DocxProcessorTest
UnitTest for the DocxProcessor class. It is used to process docx files which are a zip of an xml-based format
ElTokenizerTest
Code used to test the Greek stemming algorithm.
EnTokenizerTest
Code used to test the English stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/porter/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/porter/output.txt Code uses original Porter stemmer, not Porter 2
EpubProcessorTest
UnitTest for the EpubProcessor class. An EpubProcessor is used to process a .epub (ebook publishing standard) file and extract a summary from it. This class tests the processing of a .epub file format by EpubProcessor.
EsTokenizerTest
Code used to test the Spanish stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/spanish/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/spanish/output.txt
FaTokenizerTest
Code used to test the Persian stemming algorithm. The inputs for the algorithm came from the sample text file for the Hamshahri Collection found at http://ece.ut.ac.ir/DBRG/Hamshahri/download.html The stemmed results come from the Java program that the PHP stemmer is based off of at http://members.unine.ch/jacques.savoy/clef/persianStemmerArabic.txt
FetchUrlTest
Used to test auxiliary functions related to downloading pages with the FetchUrl class.
FrTokenizerTest
Code used to test the French stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/french/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/french/output.txt
HashTableTest
Used to test that the HashTable class properly stores key value pairs, handles insert, deletes, collisions okay. It should also detect when table is full
HiTokenizerTest
Code used to test the Hindi stemming algorithm. The inputs for the algorithm came from the sample text file for the The stemmed results come from the Java program that the PHP stemmer is based off of at http://members.unine.ch/jacques.savoy/clef/HindiStemmerLight.java.txt which has since been modified to try to improve accuracy
IconProcessorTest
UnitTest for the IconProcessor class. An IconProcessor is used to process a .ico file and extract a summary from it. This class tests the processing of a .ico file.
IndexDictionaryTest
Used to test that the IndexDictionary class can properly add shards and retrieve correct posting slice ranges in the shards.
IndexDocumentBundleTest
Used to test that the IndexDocumentBundle class can properly add and retrieve documents. Check its prepareMethod correctly deduplicates documents before inverted index creation. Tests inverted index creation and adding terms to IndexDocumentBundle's BPlusTree. Check look up of documents according to term.
IndexManagerTest
Used to run unit tests for the IndexManager class. IndexManager acts as a resource manager for the open indexes used to process a query.
IndexShardTest
Used to test that the IndexShard class can properly add new documents and retrieve those documents by word. Checks that doc offsets can be updated, shards can be saved and reloaded
ItTokenizerTest
Code used to test the Italian stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/italian/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/italian/output.txt
LinearHashTableTest
Used to test that the LinearHashTable class properly stores key value pairs, handles insert, deletes, retrievals okay.
NlTokenizerTest
Code used to test the Dutch stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/Dutch/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/Dutch/output.txt
PackedTableToolsTest
Used to test the PackedTableTools class. PackedTableTools are used for reading and storing rows with respect to some signature
PdfProcessorTest
UnitTest for the PdfProcessor class. A PdfProcessor is used to process a .pdf file and extract a summary from it. This class tests the processing of a .pdf file.
PhraseParserTest
Used to test the PhraseParser class. Want to make sure bigram extracting works correctly
PptxProcessorTest
UnitTest for the PptxProcessor class. It is used to process pptx files which are a zip of an xml-based format
PriorityQueueTest
Used to test the PriorityQueue class that is used to figure out which URL to crawl next
PtTokenizerTest
Code used to test the Portuguese stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/porter/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/porter/output.txt Code uses original Porter stemmer, not Porter 2
QueueServerTest
Used to test functions related to scheduling websites to crawl for a web crawl (the responsibility of a QueueServer)
RuTokenizerTest
Code used to test the Russian stemming algorithm. The inputs for the algorithm are words in http://snowball.tartarus.org/algorithms/russian/voc.txt and the resulting stems are compared with the stem words in http://snowball.tartarus.org/algorithms/russian/output.txt
ScraperManagerTest
Code used to test Web Scrapers.
Sha1JavascriptTest
Used to test the Javascript implementation of the sha1 function.
StringArrayTest
Used to test that the StringArray class properly stores/retrieves values, and can handle loading and saving
TrieTest
Used to test that the Trie class properly stores words that could be used for an autosuggest dictionary
UrlParserTest
Used to test the UrlParser class. For now, want to see that the method canonicalLink is working correctly and that isPathMemberRegexPaths (used in robot_processor.php) works
UtilityTest
Used to test the various methods in utility, in particular, those related to posting lists and time.
VersionManagerTest
UnitTests for the VersionManager class.
WebArchiveTest
UnitTest for the WebArchive class. A web archive is used to store array-based objects persistently to a file. This class tests storing and retrieving from such an archive.
WikiParserTest
Tests the functionality of WikiParser used when processing Wikipedia dumps and used for Yioop's internal wiki infrastructure
WordIteratorTest
Tests the functionality of the WordIterator class used to iterate over documents in an IndexDocumentBundle containing a term.
XlsxProcessorTest
Used to test that the XlsxProcessor class provides the basic functionality of getting the title, description, languages and links
ZhTokenizerTest
Used to test Named Entity Tagging and Part of Speech Tagging for the Chinese Language. Word segmentation is already tested in
CrawlcontrolsElement
Used to draw the control buttons on the manage account, manage crawls, etc. pages
SocialcontrolsElement
Used to draw the control buttons on the manage account, manage groups, group feed, etc. pages
LRUCache
Implements a least recently used cache
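The following is a minimal sketch of the least recently used eviction idea, relying on the fact that PHP arrays keep insertion order. The class name MiniLruCache and its get/put methods are hypothetical and should not be read as the API of Yioop's LRUCache.

```php
<?php
// Hypothetical sketch of LRU eviction; not Yioop's LRUCache interface.
class MiniLruCache
{
    private array $items = [];   // oldest entry first, newest last
    private int $capacity;

    public function __construct(int $capacity)
    {
        $this->capacity = $capacity;
    }

    public function get(string $key)
    {
        if (!array_key_exists($key, $this->items)) {
            return null;
        }
        // Re-insert so the key becomes the most recently used entry
        $value = $this->items[$key];
        unset($this->items[$key]);
        $this->items[$key] = $value;
        return $value;
    }

    public function put(string $key, $value): void
    {
        unset($this->items[$key]);
        $this->items[$key] = $value;
        if (count($this->items) > $this->capacity) {
            // The first key is the least recently used one
            unset($this->items[array_key_first($this->items)]);
        }
    }
}
```

With a capacity of 2, putting keys a and b, reading a, and then putting c evicts b, because b is the entry that was touched least recently.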
DescriptionUpdateJob
A media job to periodically update descriptions of Wiki resources using Description Search Sources

Table of Contents

QUERY_AGENT_NAME  = "QUERY_CACHER"
TIME_BETWEEN_REQUEST_IN_SECONDS  = 5
YIOOP_URL  = "http://localhost/"
Script to run a sequence of queries against a Yioop instance so that they can be cached
e()  : mixed
shorthand for echo
getTrainingFileNames()  : array<string|int, mixed>
Returns an array of filenames to be used for training the current task in TokenTool
makeKwikiEntriesGetSeedSites()  : mixed
Generates knowledge wiki callouts for search results pages based on the first paragraph of a Wikipedia Page that matches a given query.
getNextPage()  : mixed
Gets the next wiki page from a file handle pointing to the wiki dump file
removeTags()  : string
Remove all occurrences of an open/close tag pair from $text
getBraceTag()  : array<string|int, mixed>
Get a substring offset pair matching the input open close brace tag pattern
getTagOffsetPage()  : mixed
Get the outer contents of an xml open/close tag pair from a text source together with a new offset location after
getTopPages()  : array<string|int, mixed>
Returns title and page counts of the top $max_pages many entries in a $page_count_file for a locale $locale_tag
smartOpen()  : array<string|int, mixed>
Gets a read file handle for $file_open appropriate for whether it is uncompressed, bz2 compressed, or gz compressed. It also returns function pointers to the functions needed to do reading and closing for the file handle.
translateLocale()  : mixed
Translates Yioop web app strings to a given locale ($locale_tag) and writes the LOCALE_DIR/$locale_tag/configure.ini file for these translations.
wikiHeaderPageToString()  : string
Converts an array of wiki header information and a wiki page contents string into a string suitable to be stored into the GROUP_PAGE_HISTORY database table.
translatePhrase()  : mixed
Translates a string from English to a given locale using an online translation tool.
makeNWordGramsFiles()  : mixed
Makes an n or all word gram Bloom filter based on the supplied arguments and writes it into the resources folder of the given locale. Wikipedia files are assumed to have been placed in the PREP_DIR before this is run.
makeSuggestTrie()  : mixed
Makes a trie that can be used to make word suggestions as someone enters terms into the Yioop! search box. Outputs the result into the file suggest_trie.txt.gz in the supplied locale dir
fileWithTrim()  : array<string|int, mixed>
Reads file into an array or outputs file not found. For each entry in array trims it. Any blank lines are deleted
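The behaviour described for fileWithTrim() can be sketched as follows. The function name readTrimmedLines, the wording of the message, and the choice to return an empty array when the file is missing are assumptions made for illustration, not details taken from the Yioop source.

```php
<?php
// Hypothetical sketch: read a file into an array of trimmed,
// non-blank lines. Returning [] on a missing file is an assumption.
function readTrimmedLines(string $file_name): array
{
    if (!is_readable($file_name)) {
        echo "$file_name not found\n";
        return [];
    }
    $lines = array_map('trim', file($file_name));
    // Drop blank lines and reindex from 0
    return array_values(array_filter($lines, fn($line) => $line !== ""));
}
```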
tl()  : string
Translate the supplied arguments into the current locale.
e()  : mixed
shorthand for echo
tl()  : string
Translate the supplied arguments into the current locale.
e()  : mixed
shorthand for echo
exceptionErrorHandler()  : mixed
Error handler so that errors can be caught as exceptions too
webError()  : mixed
Used to handle request errors in non-cli, non-webserver redirect case
clean()  : bool
Used to clean trailing whitespace from files in a folder or just from a file given in the command line. It also removes final ?> characters to make php files conform with suggested coding guidelines. Similarly, adds a space between if, for, foreach, etc and ( if not present to match PHP coding guidelines
copyright()  : bool
Updates the copyright info (assuming in Yioop docs format) on files in supplied sub-folder/file. That is, it changes strings matching /2009 - \d\d\d\d/ to 2009 - current_year in those files/file.
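The replacement described above can be sketched with preg_replace() and the pattern quoted in the description. The function name updateCopyrightYears is hypothetical; the real copyright() command also walks the supplied folder, which is omitted here.

```php
<?php
// Hypothetical sketch: rewrite "2009 - dddd" to "2009 - $year" in a string.
function updateCopyrightYears(string $text, int $year): string
{
    return preg_replace('/2009 - \d\d\d\d/', "2009 - " . $year, $text);
}
```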
longlines()  : bool
Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in the supplied sub-folder/file.
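A minimal sketch of the long line check on a single string of file contents is given below. The function name findLongLines is hypothetical, and strlen() counts bytes rather than characters, which may differ from how the real longlines() measures length.

```php
<?php
// Hypothetical sketch: return [line number => line] for lines longer
// than $max_length bytes. Line numbers start at 1.
function findLongLines(string $contents, int $max_length = 80): array
{
    $long_lines = [];
    foreach (explode("\n", $contents) as $index => $line) {
        if (strlen($line) > $max_length) {
            $long_lines[$index + 1] = $line;
        }
    }
    return $long_lines;
}
```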
needsdocs()  : bool
Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in the supplied sub-folder/file.
replace()  : bool
Performs a search and replace for given pattern in files in supplied sub-folder/file
search()  : bool
Performs a search for given pattern in files in supplied sub-folder/file
unit()  : bool
Used to run or list Yioop unit tests given in $args
changeCopyrightFile()  : mixed
Callback function applied to each file in the directory being traversed by @see copyright(). It checks if the file has the extension of a code file and if so trims whitespace from its lines and then updates the lines of the form 2009 - \d\d\d\d to the supplied copyright year
cleanLinesFile()  : mixed
Callback function applied to each file in the directory being traversed by @see clean().
searchFile()  : mixed
Callback function applied to each file in the directory being traversed by @see search(). Searches $filename matching $pattern and outputs line numbers and lines
replaceFile()  : mixed
Callback function applied to each file in the directory being traversed by @see replace(). Searches $filename matching $pattern. Depending on $mode ($arg[2] as described in replace()), it outputs and replaces with $replace
mapPath()  : mixed
Applies the function $callback to each file in $path
excludedPath()  : bool
Checks if $path is amongst a list of paths which should be ignored
bootstrap()  : mixed
Main entry point to the Yioop web app.
checkCookieConsent()  : bool
Checks if a cookie consent form was obtained. This function returns true if a session cookie was received from the browser, or a form variable saying cookies are okay was received, or the Yioop profile says the cookie consent mechanism is disabled
configureRewrites()  : mixed
Used to set up and handle url rewriting for the Yioop Web app
routeAppFile()  : bool
Used to handle routes that will eventually just serve files from the APP_DIR. These include files like css, scripts, suggest tries, images, and videos.
routeBaseFile()  : bool
Used to handle routes that will eventually just serve files from the BASE_DIR. These include files like css, scripts, images, and robots.txt.
routeDirect()  : bool
Used to route page requests to pages that are fixed Public Group wiki that should always be present. For example, 404 page.
directUrl()  : string
Given the name of a fixed public group static page creates the url where it can be accessed in this instance of Yioop, making use of the defined variable REDIRECTS_ON.
routeBlog()  : bool
Used to route page requests for the website's public blog
routeFeeds()  : bool
Used to route page requests for pages corresponding to a group, user, or thread feed. If redirects are on, then urls ending with /feed_type/id map to a page for the id'th item of that feed_type
feedsUrl()  : string
Given the type of feed, the identifier of the feed instance, and which controller is being used creates the url where that feed item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.
routeUserMessages()  : bool
routeController()  : bool
Used to route page requests to end-user controllers such as register, admin. Urls ending with /controller_name will be routed to that controller.
controllerUrl()  : string
Given the name of a controller for which an easy end-user link is useful creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON. Examples of end-user controllers would be the admin, and register controllers.
routeSubsearch()  : bool
Used to route page requests for subsearches such as news, video, and images (the site owner can define others). Urls of the form /s/subsearch will go to the page handling the subsearch.
subsearchUrl()  : string
Given the name of a subsearch creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON.
routeSerpIcon()  : bool
Used to route requests for favicons for pages in search results
serpIconUrl()  : string
Return the url to request the favicon for a page in the search results, making use of the defined variable REDIRECTS_ON.
routeSuggest()  : bool
Used to route requests for the suggest-a-url link on the tools page.
suggestUrl()  : string
Return the url for the suggest-a-url link on the more tools page, making use of the defined variable REDIRECTS_ON.
routeWiki()  : bool
Used to route page requests for pages corresponding to a wiki page of a group. If it is a wiki page for the public group viewed without being logged in, the route might come in as yioop_instance/p/page_name if redirects are on. If it is for a non-public wiki or a page accessed while logged in, the url will look like either: yioop_instance/group/group_id?a=wiki&page_name=some_name or yioop_instance/admin/group_id?a=wiki&page_name=some_name&csrf_token_string
wikiUrl()  : string
Given the name of a wiki page, the group it belongs to, and which controller is being used creates the url where that wiki item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.
main()  : mixed
Command-line shell for testing the class
tl()  : mixed
import a tl function into Controller Namespace
e()  : mixed
shorthand for echo
localesWithStopwordsList()  : array<string|int, mixed>
Returns an array of locales that have a stop words list and a stop words remover method
localeTagToIso639_2Tag()  : string
Converts a $locale_tag (major-minor) to an ISO 639-2 language name
guessLocale()  : string
Attempts to guess the user's locale based on the request, session, and user-agent data
guessLocaleFromString()  : string
Attempts to guess the user's locale based on a string sample
checkQuery()  : string
Tries to find whether query belongs to a programming language
guessLangEncoding()  : string
Tries to guess at a language tag based on the name of a character encoding
guessEncodingHtmlXml()  : mixed
Tries to guess the encoding used for an Html document
convertUtf8IfNeeded()  : mixed
Converts page data in a site associative array to UTF-8 if it is not already in UTF-8
tl()  : string
Translate the supplied arguments into the current locale.
setLocaleObject()  : mixed
Sets the language to be used for locale settings
getLocaleTag()  : string
Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.
getLocaleDirection()  : string
Returns the current language directions.
getLocaleQueryStatistics()  : array<string|int, mixed>
Returns the query statistics info for the current locale.
getBlockProgression()  : string
Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages like classical Chinese list them from right to left.
getWritingMode()  : string
Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).
w1256ToUTF8()  : string
Convert the string $str encoded in Windows-1256 into UTF-8
utf8chr()  : string
Given a unicode codepoint convert it to UTF-8
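As an illustration of the conversion utf8chr() describes, here is a minimal Python sketch (not Yioop's PHP code) that encodes a single code point by hand using the standard UTF-8 bit layout. The name utf8_chr is hypothetical.

```python
def utf8_chr(codepoint: int) -> bytes:
    """Encode one Unicode code point as UTF-8 bytes."""
    if codepoint < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([codepoint])
    if codepoint < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (codepoint >> 6), 0x80 | (codepoint & 0x3F)])
    if codepoint < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (codepoint >> 12),
                      0x80 | ((codepoint >> 6) & 0x3F),
                      0x80 | (codepoint & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (codepoint >> 18),
                  0x80 | ((codepoint >> 12) & 0x3F),
                  0x80 | ((codepoint >> 6) & 0x3F),
                  0x80 | (codepoint & 0x3F)])
```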
formatDateByLocale()  : string
Function for formatting a date string based on the locale.
upgradeLocalesCheck()  : mixed
Checks to see if the locale data for a locale in the work directory is older than that of the currently running Yioop!
upgradeLocales()  : mixed
If the locale data of Yioop! in the work directory is older than the currently running Yioop! then this function is called to at least try to copy the new strings into the old profile.
upgradePublicHelpWiki()  : mixed
Used to force push the default Public and Wiki pages into the current database
upgradeDatabaseWorkDirectoryCheck()  : mixed
Checks to see if the database data or work_dir folder of Yioop! is from an older version of Yioop! than the currently running Yioop!
upgradeDatabaseWorkDirectory()  : mixed
If the database data of Yioop is older than the version of the currently running Yioop then this function is called to try to upgrade the database to the new version
updateVersionNumber()  : mixed
Update the database version number to a new number
getWikiHelpPages()  : mixed
Reads the Help articles from default db and returns the array of pages.
addActivityAtId()  : mixed
Used to insert a new activity into the database at a given activity_id
updateTranslationForStringId()  : mixed
Adds or replaces a translation for a database message string for a given IANA locale tag.
addRegexDelimiters()  : string
Adds delimiters to a regex that may or may not have them
preg_search()  : mixed
Searches for a pcre pattern in a subject from a given offset; returns the position of the first match if found, -1 otherwise.
preg_offset_replace()  : string
Replaces a pcre pattern with a replacement in $subject starting from some offset.
parse_ini_with_fallback()  : array<string|int, mixed>
Yioop replacement for parse_ini_file($name, true) in case parse_ini_file is on the disable_functions list. Name has underscores to match original function. This function checks if parse_ini_file is disabled or not. If not, it just calls parse_ini_file; otherwise, it simulates it enough so that configure.ini files used for string translations can be read.
getIniAssignMatch()  : mixed
Auxiliary function called from parse_ini_with_fallback to extract from the $matches array produced by the former function's preg_match what kind of assignment occurred in the ini file being parsed.
charCopy()  : mixed
Copies from $source string beginning at position $start, $length many bytes to destination string
vByteEncode()  : string
Encodes an integer using variable byte coding.
vByteDecode()  : int
Decodes from a string using variable byte coding an integer.
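A minimal Python sketch of variable byte coding follows. It is not Yioop's PHP implementation: the byte order (low-order 7-bit groups first) and the use of the high bit as a continuation flag are assumptions, and the names vbyte_encode and vbyte_decode are hypothetical.

```python
def vbyte_encode(n: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order groups first."""
    out = bytearray()
    while n >= 0x80:
        out.append(0x80 | (n & 0x7F))  # high bit set: more bytes follow
        n >>= 7
    out.append(n)                      # high bit clear: last byte
    return bytes(out)


def vbyte_decode(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode one integer starting at offset; return (value, next offset)."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:                # last byte of this integer
            return value, offset
        shift += 7
```

Small values take one byte and larger ones grow by a byte per 7 bits, which is why this kind of coding suits lists of small gaps.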
appendUnary()  : mixed
Appends a number re-encoded in unary to the end of an input string starting at a given bit offset into the string. Here n in unary has bit representation n-1 0's followed by a 1.
decodeUnary()  : int
Decodes a unary number from an input string at a given bit offset. Here n in unary has bit representation n-1 0's followed by a 1.
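The unary convention stated above (n is written as n-1 0's followed by a 1) can be sketched in Python as follows. For readability the sketch works on strings of '0' and '1' characters rather than packed bits, which is an assumption of the illustration; the function names are hypothetical.

```python
def unary_bits(n: int) -> str:
    """Return n in unary: n-1 zeros followed by a one (n must be >= 1)."""
    if n < 1:
        raise ValueError("unary code is only defined for n >= 1")
    return "0" * (n - 1) + "1"


def decode_unary(bits: str, offset: int = 0) -> tuple[int, int]:
    """Decode one unary number at offset; return (value, next offset)."""
    end = bits.index("1", offset)      # position of the terminating 1
    return end - offset + 1, end + 1
```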
appendBits()  : string
Appends $num_bits bits from the start of the binary rep of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. If $num_bits == -1, then appends all of $number.
decodeBits()  : int
Decodes $num_bits many bits from the $input string beginning at offset $start_bit_offset. As a result of this operation, $start_bit_offset is moved up by the number of bits that were able to be decoded.
appendGamma()  : string
Appends gamma code of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. $start_bit_offset is updated to bit position after append.
decodeGammaList()  : array<string|int, mixed>
Decodes up to $num_decode gamma encoded integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers.
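For readers unfamiliar with gamma codes, the following Python sketch shows the standard Elias gamma code on strings of '0' and '1' characters. Whether Yioop's bit layout matches this exactly is an assumption, and the names gamma_bits and decode_gamma_list are hypothetical.

```python
def gamma_bits(n: int) -> str:
    """Elias gamma code of n >= 1: (length - 1) zeros, then n in binary."""
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def decode_gamma_list(bits: str, offset: int, num_decode: int):
    """Decode up to num_decode gamma codes; return (values, next offset)."""
    values = []
    while len(values) < num_decode and offset < len(bits):
        zeros = 0
        while offset + zeros < len(bits) and bits[offset + zeros] == "0":
            zeros += 1
        end = offset + 2 * zeros + 1   # zeros prefix plus zeros + 1 payload bits
        if end > len(bits):
            break                      # incomplete code at end of input
        values.append(int(bits[offset + zeros:end], 2))
        offset = end
    return values, offset
```

Like the documented functions, the decoder hands back the bit position just past the last integer it decoded.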
appendRiceSequence()  : string
Appends using a Rice coding a sequence of integers $int_sequence at offset $start_bit_offset to the string $output, overwriting any bits present at that location. $start_bit_offset is updated to bit position after append.
decodeRiceSequence()  : array<string|int, mixed>
Decodes up to $num_decode rice encoded difference list of integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers. If $delta_start >= 0 then the first int is assumed to be the difference from $delta_start;
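A minimal Python sketch of Rice coding with modulus 2^k: each value is split into a quotient, written in unary, and a k-bit remainder. It works on '0'/'1' strings, writes the quotient q as q zeros followed by a 1, and omits the difference-list handling around $delta_start; all of these are assumptions of the illustration rather than a description of Yioop's PHP code.

```python
def rice_encode(values, k: int) -> str:
    """Rice-code non-negative integers using modulus 2**k."""
    parts = []
    for value in values:
        quotient, remainder = value >> k, value & ((1 << k) - 1)
        parts.append("0" * quotient + "1")          # quotient in unary
        if k:
            parts.append(format(remainder, f"0{k}b"))  # remainder in k bits
    return "".join(parts)


def rice_decode(bits: str, k: int, num_decode: int, offset: int = 0):
    """Decode up to num_decode values; return (values, next offset)."""
    values = []
    while len(values) < num_decode and offset < len(bits):
        one = bits.index("1", offset)               # end of unary quotient
        quotient = one - offset
        remainder = int(bits[one + 1:one + 1 + k], 2) if k else 0
        values.append((quotient << k) | remainder)
        offset = one + 1 + k
    return values, offset
```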
encodePositionList()  : string
Encodes a list of integer positions of a term in a document. This is done as a gamma code of the first integer followed by the Rice coding of the remaining integers using a modulus based on the average gap between integers. If the number of positions is 1 or 2 then a gamma of each position only is used.
decodePositionList()  : array<string|int, mixed>
Decodes up to $num_decode term in document position integers from string $input under the assumption $input is encoded as per
encode255()  : string
Recodes a string in a 1-1 fashion to a string not involving \xFF (255). I.e., it maps characters \xFE -> \xFE\xFD and \xFF -> \xFE\xFE
decode255()  : string
Decodes a string in a 1-1 fashion from a string not involving \xFF (255). I.e., it maps characters \xFE\xFE -> \xFF and \xFE\xFD -> \xFE
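The escaping scheme can be sketched in Python on byte strings: \xFE becomes \xFE\xFD and \xFF becomes \xFE\xFE, so the result never contains \xFF and decoding reverses the two substitutions. This is an illustration of the mapping only, not Yioop's PHP code, and the function names are hypothetical.

```python
def encode_255(data: bytes) -> bytes:
    """Escape data so that the byte 0xFF never appears in the output."""
    # Escape 0xFE first so the 0xFE bytes introduced for 0xFF stay intact.
    return data.replace(b"\xFE", b"\xFE\xFD").replace(b"\xFF", b"\xFE\xFE")


def decode_255(data: bytes) -> bytes:
    """Invert encode_255."""
    return data.replace(b"\xFE\xFE", b"\xFF").replace(b"\xFE\xFD", b"\xFE")
```

Because the encoded form is free of \xFF, that byte is available as a separator when several encoded strings are joined.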
encodeUnderscore()  : string
Recodes a string in a 1-1 fashion to a string not involving underscore (_). I.e., it maps characters - -> -- and _ -> -=
decodeUnderscore()  : string
Decodes a string in a 1-1 fashion from a string not involving underscore (_). I.e., it maps characters -= -> _ and -- -> -
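A Python sketch of the underscore escaping described above (- becomes --, _ becomes -=). The decoder scans left to right because two chained replace calls would misread inputs such as "a-=" that were themselves escaped. This illustrates the mapping and is not Yioop's PHP code; the treatment of malformed input is an assumption, and the names are hypothetical.

```python
def encode_underscore(text: str) -> str:
    """Escape text so that the result contains no underscore."""
    return text.replace("-", "--").replace("_", "-=")


def decode_underscore(text: str) -> str:
    """Invert encode_underscore by reading escape pairs left to right."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "-" and i + 1 < len(text):
            # "-=" stands for "_"; "--" (or any malformed pair) stands for "-"
            out.append("_" if text[i + 1] == "=" else "-")
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```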
packEncode255()  : string
Encodes a list of strings as their @see encode255 versions separated by \xFF's
unpackDecode255()  : array<string|int, mixed>
Decodes a list of strings from a string that was encoded as the @see encode255 versions of its elements separated by \xFF's
packPosting()  : string
Makes a packed integer string from a docindex and the number of occurrences of a word in the document with that docindex.
unpackPosting()  : array<string|int, mixed>
Given a packed integer string, uses the top three bytes to calculate a doc_index of a document in the shard, and uses the low order byte to compute a number of occurrences of a word in that document.
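The layout described here (three high-order bytes for the doc_index, one low-order byte for the occurrence count) can be sketched in Python as below. Big-endian byte order and capping the count at 255 are assumptions of the sketch, and the names are hypothetical; Yioop's actual posting format may carry more information.

```python
import struct


def pack_posting(doc_index: int, occurrences: int) -> bytes:
    """Pack a doc index (< 2**24) and an occurrence count into 4 bytes."""
    occurrences = min(occurrences, 255)   # assumed: counts saturate at 255
    return struct.pack(">I", (doc_index << 8) | occurrences)


def unpack_posting(packed: bytes) -> tuple[int, int]:
    """Return (doc_index, occurrences) from a 4-byte packed posting."""
    (value,) = struct.unpack(">I", packed)
    return value >> 8, value & 0xFF
```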
addDocIndexPostings()  : string
This method is used while appending one index shard to another.
deltaList()  : array<string|int, mixed>
Computes the difference of a list of integers.
deDeltaList()  : array<string|int, mixed>
Given an array of differences of integers reconstructs the original list. This computes the inverse of the deltaList function
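A short Python sketch of the delta/inverse-delta pair. It assumes the first output element is the difference from 0 (that is, the first value itself); the function names are hypothetical and this is not Yioop's PHP code.

```python
from itertools import accumulate


def delta_list(values):
    """Replace each value by its difference from the previous one."""
    previous = 0
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def de_delta_list(deltas):
    """Rebuild the original list as running sums of the differences."""
    return list(accumulate(deltas))
```

For an increasing list such as document positions, the differences are small numbers, which the gamma and Rice style codes listed above store compactly.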
encodeModified9()  : string
Encodes a sequence of integers x, such that 1 <= x <= 2<<28-1 as a string. NOTICE x>=1.
packListModified9()  : string
Packs the contents of a single word of a sequence being encoded using Modified9.
nextPostString()  : string
Returns the next complete posting string from $input_string beginning at offset.
decodeModified9()  : array<string|int, mixed>
Decodes a sequence of positive integers from a string that has been encoded using Modified 9
unpackListModified9()  : array<string|int, mixed>
Decode a single word with high two bits off according to modified 9
docIndexModified9()  : int
Given an int encoding a doc_index followed by a position list using Modified 9, extracts just the doc_index.
unpackInt()  : int
Unpacks an int from a 4 char string
packInt()  : string
Packs an int into a 4 char string
unpackFloat()  : float
Unpacks a float from a 4 char string
packFloat()  : string
Packs a float into a 4 char string
renameSerializedObject()  : string
Used to change the namespace of a serialized php object (assumes doesn't have nested subobjects)
getDomFromString()  : DOMDocument
Parses a provided string to make a DOM object. First tries to parse using XML and, if this fails, uses the more robust HTML DOM parser and manipulates the resulting DOM tree to make it correspond to the original tags for XML that isn't HTML
getTags()  : array<string|int, mixed>
Returns an array of DOMDocuments for the nodes that match an xpath query on $dom, a DOMDocument
toHexString()  : string
Converts a string to string where each char has been replaced by its hexadecimal equivalent
toIntString()  : string
Converts a string to string where each char has been replaced by its integer equivalent
toBinString()  : string
Converts a string to string where each char has been replaced by its binary equivalent
metricToInt()  : int
Converts a string of the form some int followed by K, M, or G to the integer it represents.
intToMetric()  : string
Converts a number to a string followed by nothing, K, M, G, T depending on whether number is < 1000, < 10^6, < 10^9, or < 10^(12)
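The two metric conversions can be sketched in Python as follows. Using decimal multiples of 1000 follows the thresholds quoted above; truncating rather than rounding (so 1500 becomes "1K") is an assumption, and the names are hypothetical. This is not Yioop's PHP code.

```python
MULTIPLIERS = {"K": 10**3, "M": 10**6, "G": 10**9}


def int_to_metric(number: int) -> str:
    """Format number with no suffix, K, M, G, or T (truncating)."""
    for suffix in ["", "K", "M", "G"]:
        if number < 1000:
            return f"{number}{suffix}"
        number //= 1000
    return f"{number}T"


def metric_to_int(text: str) -> int:
    """Parse strings such as '12', '5K', '3M', or '2G'."""
    text = text.strip()
    if text and text[-1].upper() in MULTIPLIERS:
        return int(text[:-1]) * MULTIPLIERS[text[-1].upper()]
    return int(text)
```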
crawlLog()  : mixed
Logs a message to a logfile or the screen. The super-global field $_SERVER['LOG_TO_FILES'] determines if this will log to a file. If not, then in cli mode, will log to stdout, otherwise it will use error_log. When logging to file $_SERVER["NO_ROTATE_LOGS"] controls whether or not there will be a log file rotation. The first call to this method is typically used to set up a process to check for liveness. For example a call: crawlLog("\n\nInitialize logger..", $this->process_name, true); says $this->process_name should be checked for liveness as part of any subsequent logging activity such as a call crawlLog("Another Message"); (note subsequent calls don't need to specify the process name).
makeTimestamp()  : string
Used to make a log file entry time string of format: entry number, time in r format.
crawlTimeoutLog()  : bool
Writes a log message $msg if more than LOG_TIMEOUT time has passed since the last time crawlTimeoutLog was called. Useful in loops to write a message as progress is made through the loop (but not on every iteration, but say every 30 seconds).
crawlHash()  : string
Computes an 8 byte hash of a string for use in storing documents.
crawlHashWord()  : string
Used to create a 20 byte hash of a string (typically a word or phrase with a wikipedia page). Format is 8 byte crawlHash of term (md5 of term two halves XOR'd), followed by a \x00, followed by the first 11 characters from the term. If there are not enough chars to make 20 bytes, then the string is padded with \x00s to 20 bytes.
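Read literally, the format described above can be sketched in Python as follows: the 8 byte hash is the XOR of the two halves of the 16 byte md5 digest, and the 20 byte word hash appends \x00 and up to 11 bytes of the term, padded with \x00. The sketch follows only this description and ignores any extra options Yioop's PHP functions may accept; the names are hypothetical.

```python
import hashlib


def crawl_hash(data: bytes) -> bytes:
    """8 byte hash: the two halves of the md5 digest XOR'd together."""
    digest = hashlib.md5(data).digest()
    return bytes(a ^ b for a, b in zip(digest[:8], digest[8:]))


def crawl_hash_word(term: bytes) -> bytes:
    """20 byte id: 8 byte hash, a NUL, first 11 bytes of term, NUL padded."""
    return (crawl_hash(term) + b"\x00" + term[:11]).ljust(20, b"\x00")
```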
canonicalTerm()  : string
Takes a $term that might have come from a document and converts it to a string of 16 bytes which is either the original term padded by underscores or the first seven chars of the term followed by an underscore followed by the base64 encoding of the first 6 chars of its md5 hash.
compareWordHashes()  : int
Used to compare two ids for index dictionary lookup. ids are an 8 byte crawlHash together with a 12 byte non-hash suffix.
base64Hash()  : string
Converts a crawl hash number to something closer to base64 coded, but in a form that doesn't get confused in urls or DBs
unbase64Hash()  : string
Decodes a crawl hash number from base64 to raw ASCII
webencode()  : string
Encodes a string in a format suitable for post data (mainly, base64, but str_replace data that might mess up post in result)
webdecode()  : string
Decodes a string encoded by webencode
crawlCrypt()  : string
The crawlCrypt function is used to encrypt passwords stored in the database.
partitionByHash()  : array<string|int, mixed>
Used by a controller to take a table and return those rows in the table that a given queue_server would be responsible for handling
calculatePartition()  : int
Used by a controller to say which queue_server should receive a given input
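One common way to spread work across queue servers is to hash a key and reduce it modulo the number of servers. The Python sketch below illustrates that idea only; the choice of md5, the row format, and the function names are all assumptions, not a description of how Yioop computes its partitions.

```python
import hashlib


def calculate_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition number in range(num_partitions)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def partition_by_hash(rows, field, num_partitions, instance):
    """Keep only the rows whose field value hashes to the given instance."""
    return [row for row in rows
            if calculate_partition(row[field], num_partitions) == instance]
```

Because the partition depends only on the key, every machine that runs the same calculation agrees on which server owns a given row.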
changeInMicrotime()  : float
Measures the change in time in seconds between two timestamps to microsecond precision
microTimestamp()  : string
Timestamp of current epoch with microsecond precision useful for situations where time() might cause too many collisions (account creation, etc)
checkTimeInterval()  : int
Checks that a timestamp is within the time interval given by a start time (HH:mm) and a duration
convertPixels()  : int
Converts a CSS unit string into its equivalent in pixels. This is used by @see SvgProcessor.
countFiles()  : int
Returns the number of files in a folder
makePath()  : bool
Creates folders along a filesystem path if they don't exist
deleteFileOrDir()  : mixed
This is a callback function used in the process of recursively deleting a directory
setWorldPermissions()  : mixed
This is a callback function used in the process of recursively chmoding to 777 all files in a folder
fileInfo()  : an
This is a callback function used in the process of recursively calculating an array of file modification times and file sizes for a directory
orderCallback()  : int
Callback function used to sort documents by a field
stringOrderCallback()  : int
Callback function used to sort documents by a field where the field is assumed to be a string
stringROrderCallback()  : int
Callback function used to sort documents in reverse order by a field where the field is assumed to be a string
rorderCallback()  : int
Callback function used to sort documents by a field in reverse order
lessThan()  : int
Callback to check if $a is less than $b
greaterThan()  : int
Callback to check if $a is greater than $b
e()  : mixed
shorthand for echo
remoteAddress()  : mixed
Compute the real remote address of the incoming connection including forwarding
readInput()  : string
Used to read a line of input from the command-line
readPassword()  : string
Used to read a line of input from the command-line (on unix machines without echoing it)
readMessage()  : string
Used to read several lines from the terminal up until a last line consisting of just a "."
mimeType()  : string
Returns the mime type of the provided file name if it can be determined.
generalIsA()  : bool
Checks if class_1 is the same as class_2 or has class_2 as a parent. Behaves like the 3 param version (last param true) of the PHP is_a function that came into being with Version 5.3.9.
stripAttributes()  : string
Given the contents of a start XML/HTML tag strips out all the attributes not listed in $safe_attribute_list
parseCsv()  : array<string|int, mixed>
Used to parse into a two dimensional array a string that contains CSV data.
arraytoCsv()  : string
Converts an array of values to a comma separated value formatted string.
diff()  : string
Computes a Unix-style diff of two strings. That is it only outputs lines which disagree between the two strings. It outputs +line if a line occurs in the second but not first string and -line if a line occurs in the first string but not the second.
computeLCS()  : mixed
Computes the longest common subsequence of two arrays
extractLCSFromTable()  : mixed
Extracts from a table of longest common sequence moves (probably calculated by @see computeLCS) and a starting coordinate $i, $j in that table, a longest common subsequence
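The longest common subsequence computation underlying a line diff is usually done with a dynamic programming table followed by a backward walk. Here is a self-contained Python sketch of that standard technique; it folds the table building and extraction into one hypothetical function, compute_lcs, and is not a transcription of Yioop's PHP code.

```python
def compute_lcs(first, second):
    """Return one longest common subsequence of two sequences as a list."""
    rows, cols = len(first), len(second)
    # table[i][j] = LCS length of first[:i] and second[:j]
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if first[i - 1] == second[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner collecting matched items.
    lcs = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if first[i - 1] == second[j - 1]:
            lcs.append(first[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return lcs[::-1]
```

Lines of the first input that are missing from the result correspond to "-line" entries in a diff, and lines of the second input that are missing correspond to "+line" entries.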
tail()  : array<string|int, mixed>
Returns an array of the last $num_lines many lines out of a file
lineFilter()  : array<string|int, mixed>
Given an array of lines returns a subarray of those lines containing the filter string or filter array
logLineTimestamp()  : int
Tries to extract a timestamp from a line which is presumed to come from a Yioop log file
isPositiveInteger()  : bool
Returns whether an input can be parsed to a positive integer
measureCall()  : mixed
Used to measure the memory footprint in bytes and time spent calling a method of an object. It also records the number of times the method has been called.
measureObject()  : mixed
Used to measure the memory footprint of an object in Yioop and save it to a statistics file. No recording is done until an initial call to the function measureCall(null, save_statistics_file) where save_statistics_file is the name of the file you want to store statistics to.
measureObjectCall()  : mixed
General method called by @see measureCall and @see measureObject. Used to measure the memory footprint in bytes of an object or memory and time spent calling a method of an object. It also records the number of times the method has been called. When used to call a method before initialization, just calls the method without any recording or timing. To initialize, an initial call to the function measureCall(null, save_statistics_file) where save_statistics_file is the name of the file you want to store statistics to should be done.
variableClone()  : mixed
Makes a deep copy of a variable regardless of its type
garbageCollect()  : int
Runs various system garbage collection functions and returns number of bytes freed.
utf8SafeSaveHtml()  : string
The dom method saveHTML has a tendency to replace UTF-8, non-ascii characters with html entities. This function is supposed to perform the save while avoiding that replacement.
utf8WordWrap()  : string
A UTF-8 safe version of PHP's wordwrap function that wraps a string to a given number of characters
upgradeDatabaseVersion1()  : mixed
Upgrades a Version 0 version of the Yioop database to a Version 1 version
upgradeDatabaseVersion2()  : mixed
Upgrades a Version 1 version of the Yioop database to a Version 2 version
upgradeDatabaseVersion3()  : mixed
Upgrades a Version 2 version of the Yioop database to a Version 3 version
upgradeDatabaseVersion4()  : mixed
Upgrades a Version 3 version of the Yioop database to a Version 4 version
upgradeDatabaseVersion5()  : mixed
Upgrades a Version 4 version of the Yioop database to a Version 5 version
upgradeDatabaseVersion6()  : mixed
Upgrades a Version 5 version of the Yioop database to a Version 6 version
upgradeDatabaseVersion7()  : mixed
Upgrades a Version 6 version of the Yioop database to a Version 7 version
upgradeDatabaseVersion8()  : mixed
Upgrades a Version 7 version of the Yioop database to a Version 8 version
upgradeDatabaseVersion9()  : mixed
Upgrades a Version 8 version of the Yioop database to a Version 9 version
upgradeDatabaseVersion10()  : mixed
Upgrades a Version 9 version of the Yioop database to a Version 10 version
upgradeDatabaseVersion11()  : mixed
Upgrades a Version 10 version of the Yioop database to a Version 11 version
upgradeDatabaseVersion12()  : mixed
Upgrades a Version 11 version of the Yioop database to a Version 12 version
upgradeDatabaseVersion13()  : mixed
Upgrades a Version 12 version of the Yioop database to a Version 13 version
upgradeDatabaseVersion14()  : mixed
Upgrades a Version 13 version of the Yioop database to a Version 14 version
upgradeDatabaseVersion15()  : mixed
Upgrades a Version 14 version of the Yioop database to a Version 15 version
upgradeDatabaseVersion16()  : mixed
Upgrades a Version 15 version of the Yioop database to a Version 16 version
upgradeDatabaseVersion17()  : mixed
Upgrades a Version 16 version of the Yioop database to a Version 17 version
upgradeDatabaseVersion18()  : mixed
Upgrades a Version 17 version of the Yioop database to a Version 18 version
upgradeDatabaseVersion19()  : mixed
Upgrades a Version 18 version of the Yioop database to a Version 19 version. This update has been superseded by the Version20 update and so its contents have been eliminated.
upgradeDatabaseVersion20()  : mixed
Upgrades a Version 19 version of the Yioop database to a Version 20 version. This is a major upgrade as the user table has changed. This also acts as a cumulative upgrade since version 0.98. It involves a web form that has only been localized to English
upgradeDatabaseVersion21()  : mixed
Upgrades a Version 20 version of the Yioop database to a Version 21 version
upgradeDatabaseVersion22()  : mixed
Upgrades a Version 21 version of the Yioop database to a Version 22 version
upgradeDatabaseVersion23()  : mixed
Upgrades a Version 22 version of the Yioop database to a Version 23 version
upgradeDatabaseVersion24()  : mixed
Upgrades a Version 23 version of the Yioop database to a Version 24 version
upgradeDatabaseVersion25()  : mixed
Upgrades a Version 24 version of the Yioop database to a Version 25 version. This version upgrade includes creation of the Help group that holds help pages.
upgradeDatabaseVersion26()  : mixed
Upgrades a Version 25 version of the Yioop database to a Version 26 version. This version upgrade includes updating the Help pages in the database to work with the changes to the way hyperlinks are specified in wiki markup.
upgradeDatabaseVersion27()  : mixed
Upgrades a Version 26 version of the Yioop database to a Version 27 version
upgradeDatabaseVersion28()  : mixed
Upgrades a Version 27 version of the Yioop database to a Version 28 version
upgradeDatabaseVersion29()  : mixed
Upgrades a Version 28 version of the Yioop database to a Version 29 version
upgradeDatabaseVersion30()  : mixed
Upgrades a Version 29 version of the Yioop database to a Version 30 version
upgradeDatabaseVersion31()  : mixed
Upgrades a Version 30 version of the Yioop database to a Version 31 version
upgradeDatabaseVersion32()  : mixed
Upgrades a Version 31 version of the Yioop database to a Version 32 version
upgradeDatabaseVersion33()  : mixed
Upgrades a Version 32 version of the Yioop database to a Version 33 version
upgradeDatabaseVersion34()  : mixed
Upgrades a Version 33 version of the Yioop database to a Version 34 version
upgradeDatabaseVersion35()  : mixed
Upgrades a Version 34 version of the Yioop database to a Version 35 version
upgradeDatabaseVersion36()  : mixed
Upgrades a Version 35 version of the Yioop database to a Version 36 version
upgradeDatabaseVersion37()  : mixed
Upgrades a Version 36 version of the Yioop database to a Version 37 version
upgradeDatabaseVersion38()  : mixed
Upgrades a Version 37 version of the Yioop database to a Version 38 version
upgradeDatabaseVersion39()  : mixed
Upgrades a Version 38 version of the Yioop database to a Version 39 version
upgradeDatabaseVersion40()  : mixed
Upgrades a Version 39 version of the Yioop database to a Version 40 version
upgradeDatabaseVersion41()  : mixed
Upgrades a Version 40 version of the Yioop database to a Version 41 version
upgradeDatabaseVersion42()  : mixed
Upgrades a Version 41 version of the Yioop database to a Version 42 version
upgradeDatabaseVersion43()  : mixed
Upgrades a Version 42 version of the Yioop database to a Version 43 version
upgradeDatabaseVersion44()  : mixed
Upgrades a Version 43 version of the Yioop database to a Version 44 version
upgradeDatabaseVersion45()  : mixed
Upgrades a Version 44 version of the Yioop database to a Version 45 version
upgradeDatabaseVersion46()  : mixed
Upgrades a Version 45 version of the Yioop database to a Version 46 version
upgradeDatabaseVersion47()  : mixed
Upgrades a Version 46 version of the Yioop database to a Version 47 version
upgradeDatabaseVersion48()  : mixed
Upgrades a Version 47 version of the Yioop database to a Version 48 version
upgradeDatabaseVersion49()  : mixed
Upgrades a Version 48 version of the Yioop database to a Version 49 version
upgradeDatabaseVersion50()  : mixed
Upgrades a Version 49 version of the Yioop database to a Version 50 version
upgradeDatabaseVersion51()  : mixed
Upgrades a Version 50 version of the Yioop database to a Version 51 version
upgradeDatabaseVersion52()  : mixed
Upgrades a Version 51 version of the Yioop database to a Version 52 version
upgradeDatabaseVersion53()  : mixed
Upgrades a Version 52 version of the Yioop database to a Version 53 version
upgradeDatabaseVersion54()  : mixed
Upgrades a Version 53 version of the Yioop database to a Version 54 version
upgradeDatabaseVersion55()  : mixed
Upgrades a Version 54 version of the Yioop database to a Version 55 version
upgradeDatabaseVersion57()  : mixed
Upgrades a Version 56 version of the Yioop database to a Version 57 version
upgradeDatabaseVersion58()  : mixed
Upgrades a Version 57 version of the Yioop database to a Version 58 version
upgradeDatabaseVersion59()  : mixed
Upgrades a Version 58 version of the Yioop database to a Version 59 version
upgradeDatabaseVersion60()  : mixed
Upgrades a Version 59 version of the Yioop database to a Version 60 version
upgradeDatabaseVersion61()  : mixed
Upgrades a Version 60 version of the Yioop database to a Version 61 version
upgradeDatabaseVersion62()  : mixed
Upgrades a Version 61 version of the Yioop database to a Version 62 version
upgradeDatabaseVersion64()  : mixed
Upgrades a Version 63 version of the Yioop database to a Version 64 version
upgradeDatabaseVersion65()  : mixed
Upgrades a Version 64 version of the Yioop database to a Version 65 version
upgradeDatabaseVersion66()  : mixed
Upgrades a Version 65 version of the Yioop database to a Version 66 version
upgradeDatabaseVersion67()  : mixed
Upgrades a Version 66 version of the Yioop database to a Version 67 version
upgradeDatabaseVersion68()  : mixed
Upgrades a Version 67 version of the Yioop database to a Version 68 version
upgradeDatabaseVersion69()  : mixed
Upgrades a Version 68 version of the Yioop database to a Version 69 version
upgradeDatabaseVersion70()  : mixed
Upgrades a Version 69 version of the Yioop database to a Version 70 version
upgradeDatabaseVersion71()  : mixed
Upgrades a Version 70 version of the Yioop database to a Version 71 version
upgradeDatabaseVersion72()  : mixed
Upgrades a Version 71 version of the Yioop database to a Version 72 version
upgradeDatabaseVersion73()  : mixed
Upgrades a Version 72 version of the Yioop database to a Version 73 version
upgradeDatabaseVersion74()  : mixed
Upgrades a Version 73 version of the Yioop database to a Version 74 version
upgradeDatabaseVersion75()  : mixed
Upgrades a Version 74 version of the Yioop database to a Version 75 version
upgradeDatabaseVersion76()  : mixed
Upgrades a Version 75 version of the Yioop database to a Version 76 version
upgradeDatabaseVersion77()  : mixed
Upgrades a Version 76 version of the Yioop database to a Version 77 version
upgradeDatabaseVersion78()  : mixed
Upgrades a Version 77 version of the Yioop database to a Version 78 version
upgradeDatabaseVersion79()  : mixed
Upgrades a Version 78 version of the Yioop database to a Version 79 version
upgradeDatabaseVersion80()  : mixed
Upgrades a Version 79 version of the Yioop database to a Version 80 version
upgradeDatabaseVersion81()  : mixed
Upgrades a Version 80 version of the Yioop database to a Version 81 version
webExit()  : mixed
Function to call instead of exit() to indicate that the script processing the current web page is done processing. Use this rather than exit(), as exit() will also terminate WebSite.
makeTableCallback()  : mixed
Callback used by a preg_replace_callback in nextPage to make a table
citeCallback()  : string
Used to convert {{cite }} to a numbered link to a citation
fixLinksCallback()  : string
Used to change spaces to underscores in links generated from our earlier matching rules
base64EncodeCallback()  : string
Callback used to base64 encode the contents of nowiki tags so they won't be manipulated by wiki replacements.
spaceEncodeCallback()  : string
Callback used to encode the contents of pre tags so they won't accidentally get sub-pre tags because a bunch of leading lines have spaces
spanEncodeCallback()  : string
Callback used to encode the contents of span tags so that newlines within them don't accidentally get treated as new wiki paragraphs
base64DecodeCallback()  : string
Callback used to base64 decode the contents of previously base64 encoded (@see base64EncodeCallback) nowiki tags after all mediawiki substitutions have been done
spaceDecodeCallback()  : string
Cleans up pre tags after other wiki rules applied
lessThanLocale()  : int
Function for comparing two locale arrays by locale tag so can sort
tl()  : string
Translate the supplied arguments into the current locale.
e()  : mixed
shorthand for echo
tl()  : string
Translate the supplied arguments into the current locale.
e()  : mixed
shorthand for echo
tl()  : string
Translate the supplied arguments into the current locale.
e()  : mixed
shorthand for echo
tl()  : string
Translate the supplied arguments into the current locale.
e()  : mixed
shorthand for echo

Constants

QUERY_AGENT_NAME

public mixed QUERY_AGENT_NAME = "QUERY_CACHER"

TIME_BETWEEN_REQUEST_IN_SECONDS

public mixed TIME_BETWEEN_REQUEST_IN_SECONDS = 5

YIOOP_URL

Script to run a sequence of queries against a Yioop instance so that they can be cached

public mixed YIOOP_URL = "http://localhost/"

Functions

e()

shorthand for echo

e(string $text) : mixed
Parameters
$text : string

string to send to the current output

Return values
mixed

getTrainingFileNames()

Returns an array of filenames to be used for training the current task in TokenTool

getTrainingFileNames(array<string|int, mixed> $command_line_args[, int $start_index = 4 ]) : array<string|int, mixed>
Parameters
$command_line_args : array<string|int, mixed>

supplied to TokenTool.php. Assume array of the format: [ ... max_file_names_to_consider, file_glob1, file_glob2, ...]

$start_index : int = 4

index in $command_line_args of max_file_names_to_consider

Return values
array<string|int, mixed>

$file_names of files with training data

makeKwikiEntriesGetSeedSites()

Generates knowledge wiki callouts for search results pages based on the first paragraph of a Wikipedia Page that matches a given query.

makeKwikiEntriesGetSeedSites(string $locale_tag, string $page_count_file, string $wiki_dump_file, int $max_entries, int $max_seed_sites) : mixed

Also generates an initial list of potential seed sites for a crawl based off urls scraped from the wiki pages.

Parameters
$locale_tag : string

the IANA language tag of the locale to create knowledge wiki entries and seed sites for

$page_count_file : string

the file name of a wiki page count dump file (or folder of such files). Such a file contains the names of wiki pages and how many times they were accessed

$wiki_dump_file : string

a dump of wikipedia pages and meta pages

$max_entries : int

maximum number of kwiki entries to create. Will pick the ones with the highest counts in $page_count_file

$max_seed_sites : int

maximum number of seed sites to add to Yioop's set of seed sites. Again chooses those with highest page count score

Return values
mixed

getNextPage()

Gets the next wiki page from a file handle pointing to the wiki dump file

getNextPage(resource $fr, function $read, int $block_size, mixed &$input_buffer) : mixed
Parameters
$fr : resource

file handle (might be a compressed file handle, for example, corresponding to gzopen or bzopen)

$read : function

a function for reading from the given file handle

$block_size : int

size of blocks to use when reading

$input_buffer : mixed
Return values
mixed
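The block-reading pattern suggested by this signature can be sketched in Python: keep appending blocks to a shared buffer until a complete page is present, return it, and leave the remainder in the buffer for the next call. The assumptions here are that pages are delimited by <page ...> and </page>, that None signals the end of input, and that a bytearray stands in for the by-reference $input_buffer; the sketch is not Yioop's PHP code.

```python
def get_next_page(handle, read, block_size: int, buffer: bytearray):
    """Return the next <page>...</page> chunk as bytes, or None at end of input."""
    end_tag = b"</page>"
    while end_tag not in buffer:
        block = read(handle, block_size)
        if not block:
            return None                      # no complete page remains
        buffer.extend(block)
    start = max(buffer.find(b"<page"), 0)    # skip anything before the page
    end = buffer.find(end_tag) + len(end_tag)
    page = bytes(buffer[start:end])
    del buffer[:end]                         # keep the unread tail for later calls
    return page
```

A caller would pass a reader suited to the handle, for example `lambda h, n: h.read(n)` for a handle opened with Python's gzip or bz2 modules.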

removeTags()

Removes all occurrences of open/close tag pairs from $text

removeTags(string $text, string $open, string $close) : string
Parameters
$text : string

to remove tag pair from

$open : string

string pattern for open tag

$close : string

string pattern for close tag

Return values
string

text after tag removed
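
As a rough illustration of removing open/close tag pairs, the following sketch strips every non-nested region between two delimiters. The function name sketchRemoveTags is hypothetical, and treating $open and $close as literal strings (rather than regex patterns) is an assumption made for the example. It is not Yioop's implementation.

```php
<?php
// Hypothetical helper, not Yioop's removeTags(): $open and $close are
// treated as literal strings and nested pairs are not handled.
function sketchRemoveTags(string $text, string $open, string $close): string
{
    // Non-greedy match from each $open to the nearest following $close;
    // the s modifier lets the region span several lines.
    $pattern = '/' . preg_quote($open, '/') . '.*?' .
        preg_quote($close, '/') . '/s';
    return preg_replace($pattern, '', $text);
}

echo sketchRemoveTags("a <ref>x</ref> b <ref>y</ref> c", "<ref>", "</ref>");
```

Because the match is non-greedy, two separate tag pairs on the same line are removed independently instead of swallowing the text between them.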

getBraceTag()

Get a substring offset pair matching the input open close brace tag pattern

getBraceTag(string $page, string $brace_open, string $brace_close, string $tag, int $offset) : array<string|int, mixed>
Parameters
$page : string

source text to search for the tag in. For example, lala {{infobox {{blah yoyoy}} }} dada.

$brace_open : string

character sequence starting the tag region. For example {{

$brace_close : string

character sequence ending the tag region. For example }}

$tag : string

tag that might be associated with the opening of the sequence. For example infobox.

$offset : int

offset to start searching from

Return values
array<string|int, mixed>

ordered pair [substring containing the brace tag, offset after the tag]. For example, given "lala {{infobox {{blah yoyoy}} }} dada" as input and searching on {{, }}, infobox, 0, the result would be ["{{infobox {{blah yoyoy}}", 31]
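
Matching nested brace regions is usually done by tracking nesting depth. The sketch below shows that idea using a hypothetical function sketchBraceRegion. It returns the whole balanced region and the offset just past it, so its exact return values are an assumption and may differ from what getBraceTag() itself returns.

```php
<?php
// Hypothetical depth-counting matcher, not Yioop's getBraceTag().
function sketchBraceRegion(string $page, string $open, string $close,
    string $tag, int $offset): array
{
    $start = strpos($page, $open . $tag, $offset);
    if ($start === false) {
        return [false, $offset];
    }
    $depth = 0;
    $pos = $start;
    $len = strlen($page);
    while ($pos < $len) {
        if (substr_compare($page, $open, $pos, strlen($open)) === 0) {
            $depth++; // entering a (possibly nested) region
            $pos += strlen($open);
        } else if (substr_compare($page, $close, $pos,
            strlen($close)) === 0) {
            $depth--; // leaving a region
            $pos += strlen($close);
            if ($depth == 0) {
                return [substr($page, $start, $pos - $start), $pos];
            }
        } else {
            $pos++;
        }
    }
    return [false, $offset]; // unbalanced input
}

print_r(sketchBraceRegion("lala {{infobox {{blah yoyoy}} }} dada",
    "{{", "}}", "infobox", 0));
```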

getTagOffsetPage()

Get the outer contents of an xml open/close tag pair from a text source together with a new offset location after

getTagOffsetPage(string $page, string $tag, int $offset) : mixed
Parameters
$page : string

text source to search the tag pair in

$tag : string

the xml tag to look for

$offset : int

offset to start searching after for the open/close pair

Return values
mixed

getTopPages()

Returns title and page counts of the top $max_pages many entries in a $page_count_file for a locale $locale_tag

getTopPages(string $page_count_file, string $locale_tag, int $max_pages[, array<string|int, mixed> $title_counts = [] ]) : array<string|int, mixed>
Parameters
$page_count_file : string

page count file to use to search for title counts with respect to a locale

$locale_tag : string

locale to get top pages for

$max_pages : int

number of pages

$title_counts : array<string|int, mixed> = []

title counts that might have come from analyzing a previous file. These will be in the output and contribute to $max_pages

Return values
array<string|int, mixed>

$title_counts wiki page titles => num_views associative array

smartOpen()

Gets a read file handle for $file_name appropriate for whether it is uncompressed, bz2 compressed, or gz compressed. It also returns function pointers to the functions needed to do reading and closing for the file handle.

smartOpen(string $file_name) : array<string|int, mixed>
Parameters
$file_name : string

name of the file to get a read file handle for

Return values
array<string|int, mixed>

[file_handle, read_function_ptr, close_function_ptr]
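
The returned triple lets calling code read a file without caring how it is compressed. A minimal sketch of the idea follows. The name sketchSmartOpen and the choice to detect compression from the file extension are assumptions for illustration, and the gz/bz2 branches need PHP's zlib and bz2 extensions.

```php
<?php
// Hypothetical sketch, not Yioop's smartOpen(): compression is guessed
// from the file extension.
function sketchSmartOpen(string $file_name): array
{
    if (preg_match('/\.gz$/i', $file_name)) {
        return [gzopen($file_name, "r"), "gzread", "gzclose"];
    }
    if (preg_match('/\.bz2$/i', $file_name)) {
        return [bzopen($file_name, "r"), "bzread", "bzclose"];
    }
    return [fopen($file_name, "r"), "fread", "fclose"];
}

// Usage on a plain temporary file
$path = tempnam(sys_get_temp_dir(), "so");
file_put_contents($path, "hello\n");
[$fr, $read, $close] = sketchSmartOpen($path);
$data = $read($fr, 8192); // same call shape for fread, gzread, bzread
$close($fr);
unlink($path);
echo $data;
```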

translateLocale()

Translates Yioop web app strings to a given locale ($locale_tag) and writes the LOCALE_DIR/$locale_tag/configure.ini file for these translations.

translateLocale(string $locale_tag, int $with_wiki_pages) : mixed

Currently, translations are done using the Yandex.translate (https://translate.yandex.com/) API.

Parameters
$locale_tag : string

of locale to translate

$with_wiki_pages : int

if this is <=0, public and help wiki pages are not translated, if it is 1, they are translated to the locale if the locale does not already have a translation. If it is >1 then it is force translated to locale.

Return values
mixed

wikiHeaderPageToString()

Converts an array of wiki header information and a wiki page contents string into a string suitable to be stored in the GROUP_PAGE_HISTORY database table.

wikiHeaderPageToString(array<string|int, mixed> $wiki_header, string $wiki_page_data) : string
Parameters
$wiki_header : array<string|int, mixed>

of wiki header information

$wiki_page_data : string

mediawiki data

Return values
string

suitable to be stored in GROUP_PAGE_HISTORY

translatePhrase()

Translates a string from English to a given locale using an online translation tool.

translatePhrase(string $translate_text, string $locale_tag) : mixed
Parameters
$translate_text : string

text to be translated

$locale_tag : string

locale to translate to

Return values
mixed

translated string on success, false otherwise

makeNWordGramsFiles()

Makes an n or all word gram Bloom filter based on the supplied arguments and writes it into the resources folder of the given locale. Wikipedia files are assumed to have been placed in the PREP_DIR before this is run.

makeNWordGramsFiles(array<string|int, mixed> $args) : mixed
Parameters
$args : array<string|int, mixed>

command line arguments with first two elements of $argv removed. For details on which arguments do what see the $usage variable

Return values
mixed

makeSuggestTrie()

Makes a trie that can be used to make word suggestions as someone enters terms into the Yioop! search box. Outputs the result into the file suggest_trie.txt.gz in the supplied locale dir

makeSuggestTrie(string $dict_file, string $locale, string $end_marker) : mixed
Parameters
$dict_file : string

where the word list is stored, one word per line

$locale : string

which locale to write the suggest file to

$end_marker : string

used to indicate end of word in the trie

Return values
mixed

fileWithTrim()

Reads file into an array or outputs file not found. For each entry in array trims it. Any blank lines are deleted

fileWithTrim( $file_name) : array<string|int, mixed>
Parameters
$file_name :

file to read into array

Return values
array<string|int, mixed>

of trimmed lines

tl()

Translate the supplied arguments into the current locale.

tl() : string

This function is a convenience copy of the same function to this subnamespace.

Tags
see
tl()

Return values
string

translated string

e()

shorthand for echo

e(string $text) : mixed
Parameters
$text : string

string to send to the current output

Return values
mixed

tl()

Translate the supplied arguments into the current locale.

tl() : string

This function is a convenience copy of the same function to this subnamespace.

Tags
see
tl()

Return values
string

translated string

e()

shorthand for echo

e(string $text) : mixed
Parameters
$text : string

string to send to the current output

Return values
mixed

exceptionErrorHandler()

Error handler so that errors can be caught as exceptions too

exceptionErrorHandler(int $errno, string $errstr, string $errfile, int $errline) : mixed
Parameters
$errno : int

number code of error

$errstr : string

text of error message

$errfile : string

filename of file in which error occurred

$errline : int

line number of error

Return values
mixed
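
The four parameters above match the callback signature PHP's set_error_handler() expects. A common way to turn errors into catchable exceptions with such a handler is sketched below. The name sketchExceptionErrorHandler is hypothetical, and throwing ErrorException is an assumption about the general technique rather than a statement of what Yioop's handler does.

```php
<?php
// Hypothetical handler illustrating the error-to-exception pattern.
function sketchExceptionErrorHandler(int $errno, string $errstr,
    string $errfile, int $errline)
{
    if (!(error_reporting() & $errno)) {
        return false; // error level is not being reported; use default
    }
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
}

set_error_handler('sketchExceptionErrorHandler');
$caught = "";
try {
    trigger_error("demo warning", E_USER_WARNING);
} catch (ErrorException $e) {
    $caught = $e->getMessage();
}
restore_error_handler();
echo $caught, "\n";
```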

webError()

Used to handle request errors in non-cli, non-webserver redirect case

webError() : mixed
Return values
mixed

clean()

Used to clean trailing whitespace from files in a folder or just from a file given in the command line. It also removes final ?> characters to make php files conform with suggested coding guidelines. Similarly, it adds a space between if, for, foreach, etc and ( if not present, to match PHP coding guidelines

clean(array<string|int, mixed> $args) : bool
Parameters
$args : array<string|int, mixed>

$args[0] contains path to sub-folder/file

Return values
bool

$no_instructions false if should output CodeTool.php instructions

copyright()

Updates the copyright info (assuming in Yioop docs format) on files in supplied sub-folder/file. That is, it changes strings matching /2009 - \d\d\d\d/ to 2009 - current_year in those files/file.

copyright(array<string|int, mixed> $args) : bool
Parameters
$args : array<string|int, mixed>

$args[0] contains path to sub-folder/file

Return values
bool

$no_instructions false if should output CodeTool.php instructions

longlines()

Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in supplied sub-folder/file.

longlines(array<string|int, mixed> $args) : bool
Parameters
$args : array<string|int, mixed>

$args[0] contains path to sub-folder/file

Return values
bool

$no_instructions false if should output CodeTool.php instructions

needsdocs()

Searches for and echoes line numbers and lines for lines of length greater than 80 characters in files in supplied sub-folder/file.

needsdocs(array<string|int, mixed> $args) : bool
Parameters
$args : array<string|int, mixed>

$args[0] contains path to sub-folder/file

Return values
bool

$no_instructions false if should output CodeTool.php instructions

replace()

Performs a search and replace for given pattern in files in supplied sub-folder/file

replace(array<string|int, mixed> $args) : bool
Parameters
$args : array<string|int, mixed>

$args[0] contains the path to the sub-folder/file, $args[1] contains the regex to search for, $args[2] contains what it should be replaced with, and $args[3] (defaults to effect) controls the mode of operation: one of "effect", "change", or "interactive". effect shows the line numbers and lines matching the pattern, but commits no changes; interactive prompts the user, for each match, whether to make the change; change does a global search and replace without output

Return values
bool

$no_instructions false if should output CodeTool.php instructions

search()

Performs a search for given pattern in files in supplied sub-folder/file

search(array<string|int, mixed> $args) : bool
Parameters
$args : array<string|int, mixed>

$args[0] contains path to sub-folder/file, $args[1] contains the regex searching for

Return values
bool

$no_instructions false if should output CodeTool.php instructions

unit()

Used to run or list Yioop unit tests given in $args

unit(array<string|int, mixed> $args) : bool
Parameters
$args : array<string|int, mixed>

if empty, run all tests. If $args[0] == 'list', then list the available tests. If $args[0] is the name of a particular test, then run just that test. If $args[1] is the name of a particular test case, then run just that test case of the particular test.

Return values
bool

whether $args made sense so could process

changeCopyrightFile()

Callback function applied to each file in the directory being traversed by copyright(). It checks if the file has the extension of a code file and if so trims whitespace from its lines and then updates the lines of the form 2009 - \d\d\d\d to the supplied copyright year

changeCopyrightFile(string $filename[, mixed $set_year = false ]) : mixed
Parameters
$filename : string

name of file to check for copyright lines and update

$set_year : mixed = false

if false then set the end of the copyright period to the current year, otherwise, if an int sets it to the value of the int

Return values
mixed

cleanLinesFile()

Callback function applied to each file in the directory being traversed by clean().

cleanLinesFile(string $filename) : mixed
Parameters
$filename : string

name of file to clean lines for

Return values
mixed

searchFile()

Callback function applied to each file in the directory being traversed by search(). Searches $filename for lines matching $pattern and outputs line numbers and lines

searchFile(string $filename[, mixed $set_pattern = false ]) : mixed
Parameters
$filename : string

name of file to search in

$set_pattern : mixed = false

if not false, then sets $set_pattern in $pattern to initialize the callback on subsequent calls. $pattern here is the search pattern

Return values
mixed
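
The $set_pattern parameter describes a callback that is first "primed" with its configuration and then called once per file with only a filename. One way to get that behaviour is a static variable, as in the sketch below. The name sketchSearchFile, the returned array, and the use of static are assumptions for illustration, not a description of Yioop's code.

```php
<?php
// Hypothetical callback: a static variable remembers the pattern
// between calls, so later calls need only the filename.
function sketchSearchFile(string $filename, $set_pattern = false): array
{
    static $pattern = "/./";
    if ($set_pattern !== false) {
        $pattern = $set_pattern; // initialization call, no searching
        return [];
    }
    $matches = [];
    foreach (file($filename) as $index => $line) {
        if (preg_match($pattern, $line)) {
            $matches[$index + 1] = rtrim($line); // 1-based line numbers
        }
    }
    return $matches;
}

$tmp = tempnam(sys_get_temp_dir(), "sf");
file_put_contents($tmp, "alpha\nbeta\ngamma\n");
sketchSearchFile("", "/^b/");    // first call only stores the pattern
$found = sketchSearchFile($tmp); // later calls search with it
unlink($tmp);
print_r($found);
```

This keeps the per-file signature small enough to hand to a directory-walking function that passes only a filename to its callback.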

replaceFile()

Callback function applied to each file in the directory being traversed by replace(). Searches $filename for lines matching $pattern. Depending on $mode ($arg[2] as described in replace()), it outputs and replaces with $replace

replaceFile(string $filename[, mixed $set_pattern = false ][, mixed $set_replace = false ][, mixed $set_mode = false ]) : mixed
Parameters
$filename : string

name of file to search and replace in

$set_pattern : mixed = false

if not false, then sets $set_pattern in $pattern to initialize the callback on subsequent calls. $pattern here is the search pattern

$set_replace : mixed = false

if not false, then sets $set_replace in $replace to initialize the callback on subsequent calls.

$set_mode : mixed = false

if not false, then sets $set_mode in $mode to initialize the callback on subsequent calls.

Return values
mixed

mapPath()

Applies the function $callback to each file in $path

mapPath(string $path, string $callback) : mixed
Parameters
$path : string

to apply map $callback to

$callback : string

function name to call with filename of each file in path

Return values
mixed

excludedPath()

Checks if $path is amongst a list of paths which should be ignored

excludedPath( $path) : bool
Parameters
$path :

a directory path

Return values
bool

whether or not it should be ignored (true == ignore)

bootstrap()

Main entry point to the Yioop web app.

bootstrap([object $web_site = null ][, bool $start_new_session = true ]) : mixed

Initialization is done in a function to avoid polluting the global namespace with variables.

Parameters
$web_site : object = null
$start_new_session : bool = true

whether to start a session or not

Return values
mixed

checkCookieConsent()

Checks if a cookie consent form was obtained. This function returns true if a session cookie was received from the browser, or a form variable saying cookies are okay was received, or the cookie Yioop profile says the consent mechanism is disabled

checkCookieConsent() : bool
Return values
bool

true if cookie consent was obtained, false otherwise

configureRewrites()

Used to set up and handle url rewriting for the Yioop Web app

configureRewrites(object $web_site) : mixed

Developers can add new routes by creating a Routes class in the app_dir with a static method getRoutes which should return an associative array of incoming_path => handler function

Parameters
$web_site : object

used to send error pages if configuration fails

Return values
mixed

routeAppFile()

Used to handle routes that will eventually just serve files from the APP_DIR. These include files like css, scripts, suggest tries, images, and videos.

routeAppFile(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash)

Return values
bool

whether was able to compute a route or not

routeBaseFile()

Used to handle routes that will eventually just serve files from the BASE_DIR. These include files like css, scripts, images, and robots.txt.

routeBaseFile(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

routeDirect()

Used to route page requests to fixed Public Group wiki pages that should always be present. For example, the 404 page.

routeDirect(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

directUrl()

Given the name of a fixed public group static page creates the url where it can be accessed in this instance of Yioop, making use of the defined variable REDIRECTS_ON.

directUrl(string $name[, bool $with_delim = false ][, bool $with_base_url = false ]) : string
Parameters
$name : string

of static page

$with_delim : bool = false

whether it should be terminated with nothing or ? or &

$with_base_url : bool = false

whether to use BASE_URL (true) or SHORT_BASE_URL (false).

Return values
string

url for the page in question

routeBlog()

Used to route page requests for the website's public blog

routeBlog(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

routeFeeds()

Used to route page requests for pages corresponding to a group, user, or thread feed. If redirects on then urls ending with /feed_type/id map to a page for the id'th item of that feed_type

routeFeeds(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

feedsUrl()

Given the type of feed, the identifier of the feed instance, and which controller is being used creates the url where that feed item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.

feedsUrl(string $type, int $id[, bool $with_delim = false ][, string $controller = "group" ][, bool $use_short_base_url = true ]) : string
Parameters
$type : string

of feed: group, user, user messages, thread

$id : int

the identifier for that feed.

$with_delim : bool = false

whether it should be terminated with nothing or ? or &

$controller : string = "group"

which controller is being used to access the feed: usually admin or group

$use_short_base_url : bool = true

whether to create the url as a relative url using C\SHORT_BASE_URL or as a full url using C\BASE_URL (the latter is useful for mail notifications)

Return values
string

url for the page in question

routeUserMessages()

routeUserMessages(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

routeController()

Used to route page requests to end-user controllers such as register, admin. urls ending with /controller_name will be routed to that controller.

routeController(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

controllerUrl()

Given the name of a controller for which an easy end-user link is useful creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON. Examples of end-user controllers would be the admin, and register controllers.

controllerUrl(string $name[, bool $with_delim = false ]) : string
Parameters
$name : string

of controller

$with_delim : bool = false

whether it should be terminated with nothing or ? or &

Return values
string

url for the page in question

routeSubsearch()

Used to route page requests for subsearches such as news, video, and images (site owner can define others). Urls of the form /s/subsearch will go to the page handling the subsearch.

routeSubsearch(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

subsearchUrl()

Given the name of a subsearch creates the url where it can be accessed on this instance of Yioop, making use of the defined variable REDIRECTS_ON.

subsearchUrl(string $name[, bool $with_delim = false ]) : string

Examples of subsearches include news, video, and images. A site owner can add to these and delete from these.

Parameters
$name : string

of subsearch

$with_delim : bool = false

whether it should be terminated with nothing or ? or &

Return values
string

url for the page in question

routeSerpIcon()

Used to route requests for favicons for pages in search results

routeSerpIcon(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

serpIconUrl()

Return the url to request the favicon for a page in the search results, making use of the defined variable REDIRECTS_ON.

serpIconUrl(mixed $url, mixed $crawl_time[, bool $with_delim = false ]) : string
Parameters
$url : mixed
$crawl_time : mixed
$with_delim : bool = false

whether it should be terminated with nothing or ? or &

Return values
string

url for the page in question

routeSuggest()

Used to route requests for the suggest-a-url link on the tools page.

routeSuggest(array<string|int, mixed> $route_args) : bool

If redirects on, then /suggest routes to this suggest-a-url page.

Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

suggestUrl()

Return the url for the suggest-a-url link on the more tools page, making use of the defined variable REDIRECTS_ON.

suggestUrl([bool $with_delim = false ]) : string
Parameters
$with_delim : bool = false

whether it should be terminated with nothing or ? or &

Return values
string

url for the page in question

routeWiki()

Used to route page requests for pages corresponding to a wiki page of a group. If it is a wiki page for the public group viewed without being logged in, the route might come in as yioop_instance/p/page_name if redirects are on. If it is for a non-public wiki or a page accessed while logged in, the url will look like either: yioop_instance/group/group_id?a=wiki&page_name=some_name or yioop_instance/admin/group_id?a=wiki&page_name=some_name&csrf_token_string

routeWiki(array<string|int, mixed> $route_args) : bool
Parameters
$route_args : array<string|int, mixed>

of url parts (split on slash).

Return values
bool

whether was able to compute a route or not

wikiUrl()

Given the name of a wiki page, the group it belongs to, and which controller is being used creates the url where that wiki item can be accessed from the instance of Yioop. It makes use of the defined variable REDIRECTS_ON.

wikiUrl(string $name[, bool $with_delim = false ][, string $controller = "static" ][, int $id = C\PUBLIC_GROUP_ID ]) : string
Parameters
$name : string

of wiki page

$with_delim : bool = false

whether it should be terminated with nothing or ? or &

$controller : string = "static"

which controller is being used to access the wiki page: usually static (for the public group), admin, or group

$id : int = C\PUBLIC_GROUP_ID

the group the wiki page belongs to

Return values
string

url for the page in question

main()

Command-line shell for testing the class

main() : mixed
Return values
mixed

tl()

import a tl function into Controller Namespace

tl() : mixed
Return values
mixed

e()

shorthand for echo

e(string $text) : mixed
Parameters
$text : string

string to send to the current output

Return values
mixed

localesWithStopwordsList()

Returns an array of locales that have a stop words list and a stop words remover method

localesWithStopwordsList() : array<string|int, mixed>
Return values
array<string|int, mixed>

list of locales that have a stopwords list;

localeTagToIso639_2Tag()

Converts a $locale_tag (major-minor) to an ISO 639-2 language name

localeTagToIso639_2Tag(string $locale_tag) : string
Parameters
$locale_tag : string

locale tag to convert

Return values
string

corresponding ISO 639-2 language tag

guessLocale()

Attempts to guess the user's locale based on the request, session, and user-agent data

guessLocale() : string
Return values
string

IANA language tag of the guessed locale

guessLocaleFromString()

Attempts to guess the user's locale based on a string sample

guessLocaleFromString(string $phrase_string[, string $locale_tag = null ]) : string
Parameters
$phrase_string : string

used to make guess

$locale_tag : string = null

language tag to use if can't guess -- if not provided uses current locale's value

Return values
string

IANA language tag of the guessed locale

checkQuery()

Tries to find whether query belongs to a programming language

checkQuery(string $query) : string
Parameters
$query : string

query entered by user

Return values
string

$lang programming language for the query provided

guessLangEncoding()

Tries to guess at a language tag based on the name of a character encoding

guessLangEncoding(string $encoding) : string
Parameters
$encoding : string

a character encoding name

Return values
string

guessed language tag

guessEncodingHtmlXml()

Tries to guess the encoding used for an Html document

guessEncodingHtmlXml(string $html[, string $return_loc_info = false ]) : mixed
Parameters
$html : string

the HTML or XML document string to guess the encoding of

$return_loc_info : string = false

if meta http-equiv info was used to find the encoding, then if $return_loc_info is true, we return the location of charset substring. This allows converting to UTF-8 later so cached pages will display correctly and redirects without char encoding won't be given a different hash.

Return values
mixed

either a string or an array. If a string, it is the guessed encoding; if an array, it is [guessed encoding, start_pos of where the charset info came from, length]

convertUtf8IfNeeded()

Converts page data in a site associative array to UTF-8 if it is not already in UTF-8

convertUtf8IfNeeded(array<string|int, mixed> &$site, string $page_field, string $encoding_field[, function $log_function = "" ]) : mixed
Parameters
$site : array<string|int, mixed>

an associative array of info about a web site

$page_field : string

the field in the associative array that contains the $site's web page as a string.

$encoding_field : string

the field in the associative array that contains the character encoding the page is currently in

$log_function : function = ""

a callback function used to write log messages with, if desired.

Return values
mixed

tl()

Translate the supplied arguments into the current locale.

tl() : string

This function takes a variable number of arguments. The first being an identifier to translate. Additional arguments are used to interpolate values in for %s's in the translation.

Return values
string

translated string
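
The interpolation described above (an identifier followed by values for the %s placeholders) can be pictured with the following sketch. The function sketchTl, the string id demo_results, its translation text, and the fallback of returning the identifier when no translation exists are all assumptions for illustration. Yioop's tl() looks strings up in its locale data instead.

```php
<?php
// Hypothetical stand-in for tl(): the table and string id are made up.
function sketchTl(string $string_id, ...$args): string
{
    static $translations = [
        'demo_results' => 'Showing %s - %s of %s',
    ];
    // Assumption: fall back to the identifier if no translation exists
    $template = $translations[$string_id] ?? $string_id;
    return $args ? vsprintf($template, $args) : $template;
}

echo sketchTl('demo_results', 1, 10, 42), "\n";
```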

setLocaleObject()

Sets the language to be used for locale settings

setLocaleObject(string $locale_tag) : mixed
Parameters
$locale_tag : string

the tag of the language to use to determine locale settings

Return values
mixed

getLocaleTag()

Gets the language tag (for instance, en_US for American English) of the locale that is currently being used. This function has the side effect of setting Yioop's current locale.

getLocaleTag() : string
Return values
string

the tag of the language currently being used for locale settings

getLocaleDirection()

Returns the current language direction.

getLocaleDirection() : string
Return values
string

ltr or rtl depending on if the language is left-to-right or right-to-left

getLocaleQueryStatistics()

Returns the query statistics info for the current locale.

getLocaleQueryStatistics() : array<string|int, mixed>
Return values
array<string|int, mixed>

consisting of queries and elapsed times for locale computations

getBlockProgression()

Returns the current locale's method of writing blocks (things like divs or paragraphs). A language like English puts blocks one after another from the top of the page to the bottom. Other languages like classical Chinese list them from right to left.

getBlockProgression() : string
Return values
string

tb, lr, or rl depending on the current locale's block progression

getWritingMode()

Returns the writing mode of the current locale. This is a combination of the locale direction and the block progression. For instance, for English the writing mode is lr-tb (left-to-right top-to-bottom).

getWritingMode() : string
Return values
string

the locale's writing mode
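
To make the combination concrete, here is a small sketch that derives a writing mode string from a direction (ltr or rtl) and a block progression (tb, lr, or rl). The function sketchWritingMode and the rule used for the vertical cases (tb- followed by the block progression) are assumptions, not necessarily the mapping Yioop uses.

```php
<?php
// Hypothetical combination rule for direction and block progression.
function sketchWritingMode(string $direction,
    string $block_progression): string
{
    $inline = ($direction == "rtl") ? "rl" : "lr";
    if ($block_progression == "tb") {
        // horizontal text: lines run sideways, blocks stack downward
        return "$inline-tb";
    }
    // vertical text: lines run top-to-bottom, blocks progress sideways
    return "tb-$block_progression";
}

echo sketchWritingMode("ltr", "tb"), "\n";
echo sketchWritingMode("rtl", "tb"), "\n";
echo sketchWritingMode("ltr", "rl"), "\n";
```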

w1256ToUTF8()

Convert the string $str encoded in Windows-1256 into UTF-8

w1256ToUTF8(string $str) : string
Parameters
$str : string

Windows-1256 string to convert

Return values
string

the UTF-8 equivalent

utf8chr()

Given a unicode codepoint convert it to UTF-8

utf8chr(int $code) : string
Parameters
$code : int

the codepoint to convert

Return values
string

the corresponding UTF-8 string
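
The standard UTF-8 bit layout uses one to four bytes depending on the size of the code point. The sketch below, with the hypothetical name sketchUtf8Chr, shows that layout. It does no validation (for example of surrogates or values above 0x10FFFF), which is a simplification for the example.

```php
<?php
// Hypothetical encoder showing the UTF-8 byte layout; no validation.
function sketchUtf8Chr(int $code): string
{
    if ($code < 0x80) {            // 0xxxxxxx
        return chr($code);
    }
    if ($code < 0x800) {           // 110xxxxx 10xxxxxx
        return chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) {         // 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($code >> 12)) .
            chr(0x80 | (($code >> 6) & 0x3F)) .
            chr(0x80 | ($code & 0x3F));
    }
    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($code >> 18)) .
        chr(0x80 | (($code >> 12) & 0x3F)) .
        chr(0x80 | (($code >> 6) & 0x3F)) .
        chr(0x80 | ($code & 0x3F));
}

echo bin2hex(sketchUtf8Chr(0x20AC)), "\n"; // bytes of the euro sign
```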

formatDateByLocale()

Function for formatting a date string based on the locale.

formatDateByLocale( $timestamp,  $locale_tag) : string
Parameters
$timestamp :

is the crawl time

$locale_tag :

is the tag for locale

Return values
string

formatted date string

upgradeLocalesCheck()

Checks to see if the locale data of Yioop! of a locale in the work dir is older than the currently running Yioop!

upgradeLocalesCheck(string $locale_tag) : mixed
Parameters
$locale_tag : string

locale to check directory of

Return values
mixed

upgradeLocales()

If the locale data of Yioop! in the work directory is older than the currently running Yioop! then this function is called to at least try to copy the new strings into the old profile.

upgradeLocales() : mixed
Return values
mixed

upgradePublicHelpWiki()

Used to force push the default Public and Help Wiki pages into the current database

upgradePublicHelpWiki(resource &$db) : mixed
Parameters
$db : resource

datasource to use to upgrade

Return values
mixed

upgradeDatabaseWorkDirectoryCheck()

Checks to see if the database data or work_dir folder of Yioop! is from an older version of Yioop! than the currently running Yioop!

upgradeDatabaseWorkDirectoryCheck() : mixed
Return values
mixed

upgradeDatabaseWorkDirectory()

If the database data of Yioop is older than the version of the currently running Yioop then this function is called to try to upgrade the database to the new version

upgradeDatabaseWorkDirectory() : mixed
Return values
mixed

updateVersionNumber()

Update the database version number to a new number

updateVersionNumber(object &$db, int $number) : mixed
Parameters
$db : object

datasource for Yioop database

$number : int

the new database number

Return values
mixed

getWikiHelpPages()

Reads the Help articles from default db and returns the array of pages.

getWikiHelpPages() : mixed
Return values
mixed

addActivityAtId()

Used to insert a new activity into the database at a given activity_id

addActivityAtId(resource &$db, string $string_id, string $method_name, int $activity_id) : mixed

Inserting at an ID rather than at the end is useful since activities are displayed in admin panel in order of increasing id.

Parameters
$db : resource

database handle where Yioop database stored

$string_id : string

message identifier to give translations for the activity

$method_name : string

admin_controller method to be called to perform this activity

$activity_id : int

the id location at which to create this activity. Activities at and below this location will be shifted down by 1.

Return values
mixed

updateTranslationForStringId()

Adds or replaces a translation for a database message string for a given IANA locale tag.

updateTranslationForStringId(resource &$db, string $string_id, string $locale_tag, string $translation) : mixed
Parameters
$db : resource

database handle where Yioop database stored

$string_id : string

message identifier to give translation for

$locale_tag : string

the IANA language tag to update the strings of

$translation : string

the translation for $string_id in the language $locale_tag

Return values
mixed

addRegexDelimiters()

Adds delimiters to a regex that may or may not have them

addRegexDelimiters(string $expression) : string
Parameters
$expression : string

a regex

Return values
string

regex with delimiters added if they were not already there

preg_search()

Searches for a pcre pattern in a subject from a given offset. Returns the position of the first match if found, -1 otherwise.

preg_search(string $pattern, string $subject, int $offset[, bool $return_match = false ]) : mixed
Parameters
$pattern : string

a Perl compatible regular expression

$subject : string

to search for pattern in

$offset : int

character offset into $subject to begin searching from

$return_match : bool = false

whether to return as well what the match was for the pattern

Return values
mixed

if $return_match is false then the integer position of first match, otherwise, it returns the ordered pair [$pos, $match].
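
The described behaviour maps naturally onto PHP's preg_match() with the PREG_OFFSET_CAPTURE flag and an offset argument. The sketch below uses the hypothetical name sketchPregSearch. Returning an empty string as the match when nothing is found is an assumption made for the example.

```php
<?php
// Hypothetical sketch of an offset-aware regex search.
function sketchPregSearch(string $pattern, string $subject, int $offset,
    bool $return_match = false)
{
    $found = preg_match($pattern, $subject, $matches,
        PREG_OFFSET_CAPTURE, $offset);
    // $matches[0] is [matched text, byte position in $subject]
    $pos = ($found === 1) ? $matches[0][1] : -1;
    if ($return_match) {
        return [$pos, ($found === 1) ? $matches[0][0] : ""];
    }
    return $pos;
}

var_dump(sketchPregSearch('/\d+/', "ab12cd345", 4, true));
```

Note that the position reported by PREG_OFFSET_CAPTURE is measured in bytes from the start of $subject, not from $offset.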

preg_offset_replace()

Replaces a pcre pattern with a replacement in $subject starting from some offset.

preg_offset_replace(string $pattern, string $replacement, string $subject, int $offset) : string
Parameters
$pattern : string

a Perl compatible regular expression

$replacement : string

what to replace the pattern with

$subject : string

to search for pattern in

$offset : int

character offset into $subject to begin searching from

Return values
string

result of the replacements
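
One simple way to restrict a replacement to the part of a string after an offset is to split the string, replace in the tail, and rejoin. The sketch below does that under the hypothetical name sketchPregOffsetReplace. It is an assumption about the technique, and anchors such as ^ or lookbehinds would see the tail as if it were the whole string.

```php
<?php
// Hypothetical sketch: leave the first $offset bytes untouched and
// apply the replacement only to the remainder.
function sketchPregOffsetReplace(string $pattern, string $replacement,
    string $subject, int $offset): string
{
    $head = substr($subject, 0, $offset);
    $tail = substr($subject, $offset);
    return $head . preg_replace($pattern, $replacement, $tail);
}

echo sketchPregOffsetReplace('/a/', 'X', "banana", 2), "\n";
```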

parse_ini_with_fallback()

Yioop replacement for parse_ini_file($name, true) in case parse_ini_file is on the disable_functions list. Name has underscores to match original function. This function checks if parse_ini_file is disabled or not. If not, it just calls parse_ini_file; otherwise, it simulates it enough so that configure.ini files used for string translations can be read.

parse_ini_with_fallback(string $file) : array<string|int, mixed>
Parameters
$file : string

filename of ini data to parse into an array

Return values
array<string|int, mixed>

data parsed from the file
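
The check-then-fall-back pattern described for parse_ini_with_fallback might look roughly like the following. This is only a sketch: the names sketchIniFallback and sketchParseIni are hypothetical, and the fallback handles just comment lines, [section] lines, and key = "value" assignments, which is an assumption about what translation files need.

```php
<?php
// Hypothetical minimal parser: [section] lines and key = value lines only.
function sketchIniFallback(string $file): array
{
    $data = [];
    $section = null;
    $lines = file($file, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return $data;
    }
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === "" || $line[0] === ";") {
            continue; // skip blank lines and comments
        }
        if (preg_match('/^\[(.+)\]$/', $line, $m)) {
            $section = $m[1];
            $data[$section] = [];
        } elseif (preg_match('/^([^=]+?)\s*=\s*(.*)$/', $line, $m)) {
            $value = trim($m[2]);
            if (strlen($value) >= 2 && $value[0] === '"'
                && substr($value, -1) === '"') {
                $value = substr($value, 1, -1); // drop surrounding quotes
            }
            if ($section === null) {
                $data[$m[1]] = $value;
            } else {
                $data[$section][$m[1]] = $value;
            }
        }
    }
    return $data;
}

// Uses parse_ini_file unless it is named in disable_functions.
function sketchParseIni(string $file): array
{
    $disabled = array_map("trim",
        explode(",", (string) ini_get("disable_functions")));
    if (!in_array("parse_ini_file", $disabled, true)) {
        $parsed = parse_ini_file($file, true);
        return ($parsed === false) ? [] : $parsed;
    }
    return sketchIniFallback($file);
}
```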

getIniAssignMatch()

Auxiliary function called from parse_ini_with_fallback to extract from the $matches array produced by the former function's preg_match what kind of assignment occurred in the ini file being parsed.

getIniAssignMatch(string $matches) : mixed
Parameters
$matches : string

produced by a preg_match in parse_ini_with_fallback

Return values
mixed

value of ini file assignment

charCopy()

Copies from $source string beginning at position $start, $length many bytes to destination string

charCopy(string $source, string &$destination, int $start, int $length[, string $timeout_msg = "" ]) : mixed
Parameters
$source : string

string to copy from

$destination : string

string to copy to

$start : int

starting offset

$length : int

number of bytes to copy

$timeout_msg : string = ""

message to print if taking more than 30 seconds

Return values
mixed

vByteEncode()

Encodes an integer using variable byte coding.

vByteEncode(int $pos_int) : string
Parameters
$pos_int : int

integer to encode

Return values
string

a string of 1-5 chars depending on how big $pos_int was

vByteDecode()

Decodes an integer from a string using variable byte coding.

vByteDecode(string $str, int &$offset) : int
Parameters
$str : string

string to use for decoding

$offset : int

byte offset into the string where the variable byte int is stored

Return values
int

the decoded integer
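
As an illustration of variable byte coding in general, here is a sketch of one common convention: 7 data bits per byte, low-order group first, with the high bit set on every byte except the last. The byte order and flag convention are assumptions, and the names sketchVByteEncode and sketchVByteDecode are hypothetical; Yioop's own layout may differ. Under this convention a 32 bit value needs at most 5 bytes, which is consistent with the 1-5 chars mentioned above.

```php
<?php
// Hypothetical sketch of one variable byte convention (not Yioop's code).
function sketchVByteEncode(int $pos_int): string
{
    $result = "";
    while ($pos_int >= 128) {
        // emit the low 7 bits with the "more bytes follow" flag set
        $result .= chr(128 | ($pos_int & 127));
        $pos_int >>= 7;
    }
    return $result . chr($pos_int); // last byte has the flag clear
}

function sketchVByteDecode(string $str, int &$offset): int
{
    $value = 0;
    $shift = 0;
    $len = strlen($str);
    while ($offset < $len) {
        $byte = ord($str[$offset++]);
        $value |= ($byte & 127) << $shift;
        if ($byte < 128) {
            break; // flag clear: this was the last byte
        }
        $shift += 7;
    }
    return $value;
}
```

Because $offset is passed by reference and advanced past the decoded bytes, several encoded integers can be read back to back from one string.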

appendUnary()

Appends a number re-encoded in unary to the end of an input string starting at a given bit offset into the string. Here n in unary has bit representation n-1 0's followed by a 1.

appendUnary(int $number, mixed $input, mixed &$start_bit_offset[, mixed $just_bit_offset = false ]) : mixed
Parameters
$number : int

number to append

$input : mixed
$start_bit_offset : mixed
$just_bit_offset : mixed = false
Return values
mixed

either the resulting string or its length

decodeUnary()

Decodes a unary number from an input string at a given bit offset. Here n in unary has bit representation n-1 0's followed by a 1.

decodeUnary(string $input, int &$start_bit_offset) : int
Parameters
$input : string

the string that we want to decode a unary number from

$start_bit_offset : int

the starting bit offset in $input to start decoding from. After the call it will be the position after the decode

Return values
int

the decoded unary number
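
The unary convention stated above (n is written as n-1 0's followed by a 1) can be illustrated on readable strings of "0" and "1" characters. The real functions work on packed bits inside a byte string; the text-based representation and the names sketchUnaryBits and sketchDecodeUnaryBits are assumptions made only to keep the sketch short. The input n must be at least 1.

```php
<?php
// Hypothetical sketch using "0"/"1" characters instead of packed bits.
function sketchUnaryBits(int $n): string
{
    return str_repeat("0", $n - 1) . "1";
}

function sketchDecodeUnaryBits(string $bits, int &$offset): int
{
    $n = 1;
    $len = strlen($bits);
    while ($offset < $len && $bits[$offset] === "0") {
        $n++;
        $offset++;
    }
    $offset++; // step past the terminating 1
    return $n;
}
```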

appendBits()

Appends $num_bits bits from the start of the binary rep of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. If $num_bits == -1, then appends all of $number.

appendBits(int $number, string $input, int &$start_bit_offset[,  $num_bits = -1 ]) : string
Parameters
$number : int

to append

$input : string

the string to append to.

$start_bit_offset : int

starting location to begin append from

$num_bits : = -1

number of bits of $number to append.

Return values
string

resulting string

decodeBits()

Decodes $num_bits many bits from the $input string beginning at offset $start_bit_offset. As a result of this operation, $start_bit_offset is increased by the number of bits that were able to be decoded.

decodeBits(string $input, int &$start_bit_offset, int $num_bits) : int
Parameters
$input : string

string to decode bits from

$start_bit_offset : int

bit offset to start decoding from in $input

$num_bits : int

number of bits to try to decode

Return values
int

the number decoded
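
Reading a run of bits out of a byte string at an arbitrary bit offset can be sketched as below. The bit order (most significant bit of each byte first) is an assumption, and sketchDecodeBits is a hypothetical name; the point is only to show how the offset advances by the number of bits actually read and stops at the end of the input.

```php
<?php
// Hypothetical sketch; assumes bits are numbered MSB-first within a byte.
function sketchDecodeBits(string $input, int &$bit_offset,
    int $num_bits): int
{
    $value = 0;
    $total_bits = 8 * strlen($input);
    for ($i = 0; $i < $num_bits && $bit_offset < $total_bits; $i++) {
        $byte = ord($input[$bit_offset >> 3]);         // byte holding the bit
        $bit = ($byte >> (7 - ($bit_offset & 7))) & 1; // pick the bit out
        $value = ($value << 1) | $bit;
        $bit_offset++;
    }
    return $value;
}
```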

appendGamma()

Appends gamma code of $number beginning at offset $start_bit_offset of $input string overwriting any bits present. $start_bit_offset is updated to bit position after append.

appendGamma(int $number, string $input, int &$start_bit_offset) : string
Parameters
$number : int

to append

$input : string

the string to append to.

$start_bit_offset : int

starting bit location to begin append from

Return values
string

resulting string

decodeGammaList()

Decodes up to $num_decode gamma encoded integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers.

decodeGammaList(string $input, int &$start_bit_offset, int $num_decode) : array<string|int, mixed>
Parameters
$input : string

the string to decode from

$start_bit_offset : int

starting bit location to decode from

$num_decode : int

number of int's to decode

Return values
array<string|int, mixed>

decoded int's
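
For readers unfamiliar with gamma codes, the sketch below shows the standard Elias gamma code: a positive integer whose binary form has L bits is written as L-1 0's followed by those L bits. It works on strings of "0" and "1" characters rather than packed bits, and the names sketchGammaBits and sketchDecodeGammaList are hypothetical; whether Yioop's appendGamma uses exactly this variant is an assumption.

```php
<?php
// Hypothetical sketch of Elias gamma coding on "0"/"1" character strings.
function sketchGammaBits(int $n): string
{
    $binary = decbin($n); // $n must be >= 1
    return str_repeat("0", strlen($binary) - 1) . $binary;
}

function sketchDecodeGammaList(string $bits, int &$offset,
    int $num_decode): array
{
    $out = [];
    $len = strlen($bits);
    while (count($out) < $num_decode && $offset < $len) {
        $zeros = 0;
        while ($offset + $zeros < $len
            && $bits[$offset + $zeros] === "0") {
            $zeros++;
        }
        $end = $offset + 2 * $zeros + 1;
        if ($end > $len) {
            break; // incomplete code at end of input
        }
        // the value occupies the $zeros + 1 bits after the leading zeros
        $out[] = bindec(substr($bits, $offset + $zeros, $zeros + 1));
        $offset = $end;
    }
    return $out;
}
```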

appendRiceSequence()

Appends using a Rice coding a sequence of integers $int_sequence at offset $start_bit_offset to the string $output, overwriting any bits present at that location. $start_bit_offset is updated to bit position after append.

appendRiceSequence(array<string|int, mixed> $int_sequence, int $modulus, string $output, int &$start_bit_offset[, int $delta_start = -1 ]) : string

Encoding is done as a difference list. If $delta_start is set to a value >= 0 then the first gap is assumed to be from int $delta_start

Parameters
$int_sequence : array<string|int, mixed>

int's to append

$modulus : int

i in the 2^i modulus to use for Rice code

$output : string

the string to append to.

$start_bit_offset : int

starting bit location to begin append from

$delta_start : int = -1

if >= 0 previous int to use for difference list otherwise the first integer is encoded as itself rather than a difference

Return values
string

resulting string

decodeRiceSequence()

Decodes a Rice encoded difference list of up to $num_decode integers beginning at $start_bit_offset. $start_bit_offset is updated to the bit position after the decoded integers. If $delta_start >= 0 then the first int is assumed to be the difference from $delta_start.

decodeRiceSequence(string $input, int &$start_bit_offset, int $num_decode[, int $delta_start = -1 ]) : array<string|int, mixed>
Parameters
$input : string

the string to decode from

$start_bit_offset : int

starting bit location to decode from

$num_decode : int

number of int's to decode

$delta_start : int = -1

if >= 0 previous int to use for difference list otherwise the first integer is decoded as itself rather than a difference

Return values
array<string|int, mixed>

decoded int's
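
A Rice code with modulus 2^i splits each value into a quotient (value >> i), written in unary, and a remainder, written in i bits. The sketch below applies this to a difference list in the way described above, using strings of "0" and "1" characters. It is an illustration under assumptions: the names are hypothetical, the quotient q is written as q 0's followed by a 1, and the modulus is passed to the decoder explicitly, whereas the documented decodeRiceSequence takes no modulus argument.

```php
<?php
// Hypothetical sketch of Rice coding a nondecreasing list as gaps.
function sketchRiceBits(array $ints, int $modulus,
    int $delta_start = -1): string
{
    $bits = "";
    $previous = $delta_start;
    foreach ($ints as $int) {
        // first value is coded as itself unless $delta_start >= 0
        $gap = ($previous >= 0) ? $int - $previous : $int;
        $previous = $int;
        $quotient = $gap >> $modulus;
        $remainder = $gap & ((1 << $modulus) - 1);
        $bits .= str_repeat("0", $quotient) . "1"; // unary quotient
        if ($modulus > 0) {
            $bits .= str_pad(decbin($remainder), $modulus, "0",
                STR_PAD_LEFT);
        }
    }
    return $bits;
}

function sketchDecodeRiceBits(string $bits, int &$offset, int $num_decode,
    int $modulus, int $delta_start = -1): array
{
    $out = [];
    $previous = $delta_start;
    $len = strlen($bits);
    while (count($out) < $num_decode && $offset < $len) {
        $quotient = 0;
        while ($offset < $len && $bits[$offset] === "0") {
            $quotient++;
            $offset++;
        }
        $offset++; // the 1 that ends the unary part
        $remainder = ($modulus > 0)
            ? bindec(substr($bits, $offset, $modulus)) : 0;
        $offset += $modulus;
        $gap = ($quotient << $modulus) | $remainder;
        $previous = ($previous >= 0) ? $previous + $gap : $gap;
        $out[] = $previous;
    }
    return $out;
}
```

With modulus 2, the list [3, 10, 11] has gaps 3, 7, 1 and is written as 111, 0111, 101.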

encodePositionList()

Encodes a list of integer positions of a term in a document. This is done as a gamma code of the first integer followed by the Rice coding of the remaining integers using a modulus based on the average gap between integers. If the number of positions is 1 or 2 then a gamma of each position only is used.

encodePositionList(array<string|int, mixed> $positions) : string
Parameters
$positions : array<string|int, mixed>

integer term positions

Return values
string

encoded position list

decodePositionList()

Decodes up to $num_decode term-in-document position integers from string $input under the assumption $input is encoded as per encodePositionList.

decodePositionList(string $input, int $num_decode) : array<string|int, mixed>
Parameters
$input : string

string to decode from

$num_decode : int

number of integers to decode

Tags
see
encodePositionList


Return values
array<string|int, mixed>

decoded positions

encode255()

Recodes a string in a 1-1 fashion to a string not involving \xFF (255). I.e., it maps characters \xFE -> \xFE\xFD and \xFF -> \xFE\xFE

encode255(string $str) : string
Parameters
$str : string

to be encoded

Return values
string

encoded string without \xFF

decode255()

Decodes a string in a 1-1 fashion from a string not involving \xFF (255). I.e., it maps characters \xFE\xFE -> \xFF and \xFE\xFD -> \xFE

decode255(string $str) : string
Parameters
$str : string

to be decoded

Return values
string

decoded string

encodeUnderscore()

Recodes a string in a 1-1 fashion to a string not involving underscore (_). I.e., it maps characters - -> -- and _ -> -=

encodeUnderscore(string $str) : string
Parameters
$str : string

to be encoded

Return values
string

encoded string without _

decodeUnderscore()

Decodes a string in a 1-1 fashion from a string not involving underscore (_). I.e., it maps characters -= -> _ and -- -> -

decodeUnderscore(string $str) : string
Parameters
$str : string

to be decoded

Return values
string

decoded string
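
The underscore escaping described above is a small two-symbol escape scheme, and can be sketched with PHP's strtr, which scans left to right and never rescans text it has already replaced. The names sketchEncodeUnderscore and sketchDecodeUnderscore are hypothetical; this is an illustration of the mapping, not Yioop's implementation.

```php
<?php
// Hypothetical sketch of the mapping  - -> --  and  _ -> -=
function sketchEncodeUnderscore(string $str): string
{
    return strtr($str, ["-" => "--", "_" => "-="]);
}

// Inverse mapping  -= -> _  and  -- -> -
function sketchDecodeUnderscore(string $str): string
{
    return strtr($str, ["-=" => "_", "--" => "-"]);
}
```

Since every "-" in an encoded string starts a two-character escape, decoding pairs from the left recovers the original exactly; for example, a literal "-=" in the input is encoded as "--=" and decodes back to "-=".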

packEncode255()

Encodes a list of strings as their encode255 versions separated by \xFF's

packEncode255(array<string|int, mixed> $strs) : string
Parameters
$strs : array<string|int, mixed>

strings to encode as a single string

Return values
string

encoded list

unpackDecode255()

Decodes a list of strings from a string that was encoded as the encode255 versions of its elements separated by \xFF's

unpackDecode255(string $encoded_strs) : array<string|int, mixed>
Parameters
$encoded_strs : string

string to decode into a list of strings

Return values
array<string|int, mixed>

decoded list
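
The reason for escaping \xFF is that it then becomes free to act as a separator. The sketch below puts the encode255 mapping and the packing step together. All four function names are hypothetical, and the use of strtr, implode and explode is an assumption about one convenient way to realise the scheme, not Yioop's actual code.

```php
<?php
// Hypothetical sketch: \xFE -> \xFE\xFD and \xFF -> \xFE\xFE
function sketchEncode255(string $str): string
{
    return strtr($str, ["\xFE" => "\xFE\xFD", "\xFF" => "\xFE\xFE"]);
}

// Inverse: \xFE\xFE -> \xFF and \xFE\xFD -> \xFE
function sketchDecode255(string $str): string
{
    return strtr($str, ["\xFE\xFE" => "\xFF", "\xFE\xFD" => "\xFE"]);
}

// Encoded strings contain no \xFF, so \xFF can separate them
function sketchPackEncode255(array $strs): string
{
    return implode("\xFF", array_map("sketchEncode255", $strs));
}

function sketchUnpackDecode255(string $encoded_strs): array
{
    return array_map("sketchDecode255", explode("\xFF", $encoded_strs));
}
```

Note that in this sketch an empty list and a list holding one empty string both pack to the empty string, so the round trip is only exact for non-empty lists.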

packPosting()

Makes a packed integer string from a doc index and the positions of the occurrences of a word in the document with that doc index.

packPosting(int $doc_index, array<string|int, mixed> $position_list[, bool $delta = true ]) : string
Parameters
$doc_index : int

index (i.e., a count of which document it is rather than a byte offset) of a document in the document string

$position_list : array<string|int, mixed>

integer positions word occurred in that doc

$delta : bool = true

if true then stores the position_list as a sequence of differences (a delta list)

Return values
string

a modified9 (our compression scheme) packed string containing this info.

unpackPosting()

Given a packed integer string, uses the top three bytes to calculate a doc_index of a document in the shard, and uses the low order byte to compute a number of occurrences of a word in that document.

unpackPosting(string $posting, int &$offset[, bool $dedelta = true ]) : array<string|int, mixed>
Parameters
$posting : string

a string containing a doc index, position list pair encoded using modified9

$offset : int

an offset into the string where the modified9 posting is encoded

$dedelta : bool = true

if true then assumes the list is a sequence of differences (a delta list) and undoes the difference to get the original sequence

Return values
array<string|int, mixed>

consisting of integer doc_index and a subarray consisting of integer positions of word in doc.

addDocIndexPostings()

This method is used while appending one index shard to another.

addDocIndexPostings(string &$postings, int $add_offset) : string

Given a string of postings, adds $add_offset to each offset to the document map in each posting.

Parameters
$postings : string

a string of index shard postings

$add_offset : int

a fixed amount to add to each posting's doc map offset

Return values
string

$new_postings where each doc offset has had $add_offset added to it

deltaList()

Computes the difference of a list of integers.

deltaList(array<string|int, mixed> $list) : array<string|int, mixed>

i.e., (a1, a2, a3, a4) becomes (a1, a2-a1, a3-a2, a4-a3)

Parameters
$list : array<string|int, mixed>

a nondecreasing list of integers

Return values
array<string|int, mixed>

the corresponding list of differences of adjacent integers

deDeltaList()

Given an array of differences of integers reconstructs the original list. This computes the inverse of the deltaList function

deDeltaList(array<string|int, mixed> &$delta_list) : array<string|int, mixed>
Parameters
$delta_list : array<string|int, mixed>

a list of nonnegative integers

Tags
see
deltaList
Return values
array<string|int, mixed>

a nondecreasing list of integers
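
The relationship between deltaList and deDeltaList is a pair of running-difference and running-sum passes. A minimal sketch, with hypothetical names sketchDeltaList and sketchDeDeltaList and returning new arrays rather than modifying the argument in place:

```php
<?php
// Hypothetical sketch: (a1, a2, a3) becomes (a1, a2-a1, a3-a2)
function sketchDeltaList(array $list): array
{
    $deltas = [];
    $previous = 0;
    foreach ($list as $value) {
        $deltas[] = $value - $previous;
        $previous = $value;
    }
    return $deltas;
}

// Inverse: running sums rebuild the original list
function sketchDeDeltaList(array $delta_list): array
{
    $list = [];
    $sum = 0;
    foreach ($delta_list as $delta) {
        $sum += $delta;
        $list[] = $sum;
    }
    return $list;
}
```

For a nondecreasing input such as [3, 7, 7, 12] the differences [3, 4, 0, 5] are nonnegative and usually small, which is what makes them cheaper to store with the variable-length codes in this file.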

encodeModified9()

Encodes a sequence of integers x, such that 1 <= x <= 2^28 - 1 (the largest value that fits in 28 bits), as a string. NOTICE x >= 1.

encodeModified9(array<string|int, mixed> $list) : string

The encoded string is a sequence of 4 byte words (packed int's). The high order 2 bits of a given word indicate whether or not to look at the next word. The codes are as follows: 11 start of encoded string, 10 continue four more bytes, 01 end of encoded, and 00 indicates whole sequence encoded in one word.

After the high order 2 bits, the next most significant bits indicate the format of the current word. There are nine possibilities: 00 - 1 28 bit number, 01 - 2 14 bit numbers, 10 - 3 9 bit numbers, 1100 - 4 6 bit numbers, 1101 - 5 5 bit numbers, 1110 - 6 4 bit numbers, 11110 - 7 3 bit numbers, 111110 - 12 2 bit numbers, 111111 - 24 1 bit numbers.

Parameters
$list : array<string|int, mixed>

a list of positive integers satisfying the above

Return values
string

encoded string

packListModified9()

Packs the contents of a single word of a sequence being encoded using Modified9.

packListModified9(int $continue_bits, int $cnt, array<string|int, mixed> $pack_list) : string
Parameters
$continue_bits : int

the high order 2 bits of the word

$cnt : int

the number of elements that will be packed in this word

$pack_list : array<string|int, mixed>

a list of positive integers to pack into word

Tags
see
encodeModified9
Return values
string

encoded 4 byte string

nextPostString()

Returns the next complete posting string from $input_string beginning at $offset.

nextPostString(string &$input_string, int &$offset) : string

Does not do any decoding.

Parameters
$input_string : string

a string of postings

$offset : int

an offset to this string which will be updated after call

Return values
string

undecoded posting

decodeModified9()

Decodes a sequence of positive integers from a string that has been encoded using Modified 9

decodeModified9(string $input_string, int &$offset) : array<string|int, mixed>
Parameters
$input_string : string

string to decode from

$offset : int

where to start in the string; after decoding, it points to the position just past what was decoded.

Tags
see
encodeModified9
Return values
array<string|int, mixed>

sequence of positive integers that were decoded

unpackListModified9()

Decode a single word with high two bits off according to modified 9

unpackListModified9(string $encoded_list) : array<string|int, mixed>
Parameters
$encoded_list : string

four byte string to decode

Return values
array<string|int, mixed>

sequence of integers that results from the decoding.

docIndexModified9()

Given an int encoding a doc_index followed by a position list using Modified 9, extracts just the doc_index.

docIndexModified9(int $encoded_list) : int
Parameters
$encoded_list : int

in the just described format

Return values
int

a doc index into an index shard document map.

unpackInt()

Unpacks an int from a 4 char string

unpackInt(string $str) : int
Parameters
$str : string

where to extract int from

Return values
int

extracted integer

packInt()

Packs an int into a 4 char string

packInt(int $my_int) : string
Parameters
$my_int : int

the integer to pack

Return values
string

the packed string

unpackFloat()

Unpacks a float from a 4 char string

unpackFloat(string $str) : float
Parameters
$str : string

where to extract float from

Return values
float

extracted float

packFloat()

Packs a float into a four char string

packFloat(float $my_float) : string
Parameters
$my_float : float

the float to pack

Return values
string

the packed string
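
Fixed-width packing of the kind described for packInt, unpackInt, packFloat and unpackFloat can be done with PHP's pack and unpack. In the sketch below the function names are hypothetical and the big-endian format codes "N" (32 bit unsigned int) and "G" (32 bit float) are assumptions; the byte order Yioop actually uses is not stated in this documentation.

```php
<?php
// Hypothetical sketch; big-endian byte order is an assumption.
function sketchPackInt(int $my_int): string
{
    return pack("N", $my_int); // 4 bytes, most significant first
}

function sketchUnpackInt(string $str): int
{
    return unpack("N", $str)[1]; // unpack results are indexed from 1
}

function sketchPackFloat(float $my_float): string
{
    return pack("G", $my_float); // 4 byte single precision float
}

function sketchUnpackFloat(string $str): float
{
    return unpack("G", $str)[1];
}
```

A four byte float keeps only single precision, so values that are not exactly representable in 32 bits will come back slightly changed.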

renameSerializedObject()

Used to change the namespace of a serialized php object (assumes doesn't have nested subobjects)

renameSerializedObject(string $class_name, string $object_string) : string
Parameters
$class_name : string

new fully qualified name with namespace

$object_string : string

serialized object

Return values
string

serialized object with new name

getDomFromString()

Parses a provided string to make a DOM object. First tries to parse using XML and, if this fails, uses the more robust HTML DOM parser and manipulates the resulting DOM tree to make it correspond to the original tags for XML that isn't HTML

getDomFromString(string $to_parse) : DOMDocument
Parameters
$to_parse : string

the string to parse a DOMDocument from

Return values
DOMDocument

computed based on the provided string

getTags()

Returns an array of DOMDocuments for the nodes that match an xpath query on $dom, a DOMDocument

getTags(DOMDocument $dom, string $query) : array<string|int, mixed>
Parameters
$dom : DOMDocument

document to run xpath query on

$query : string

xpath query to run

Return values
array<string|int, mixed>

of DOMDocuments one for each node matching the xpath query in the original DOMDocument

toHexString()

Converts a string to string where each char has been replaced by its hexadecimal equivalent

toHexString(string $str) : string
Parameters
$str : string

what we want rewritten in hex

Return values
string

the hexified string

toIntString()

Converts a string to a string where each char has been replaced by its integer equivalent

toIntString(string $str) : string
Parameters
$str : string

what we want rewritten as integers

Return values
string

the integer string

toBinString()

Converts a string to string where each char has been replaced by its binary equivalent

toBinString(string $str) : string
Parameters
$str : string

what we want rewritten in binary

Return values
string

the binary string

metricToInt()

Converts a string of the form some int followed by K, M, or G into its integer equivalent.

metricToInt(string $metric_num) : int

For example, 4K would become 4000, 16M would become 16000000, and 1G would become 1000000000. Note that base 10, not base 2, is used for K, M, G.

Parameters
$metric_num : string

metric number to convert

Return values
int

number the metric string corresponded to
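
The conversion described for metricToInt amounts to a suffix lookup with decimal multipliers. A minimal sketch, with the hypothetical name sketchMetricToInt and the assumption that a string without a recognised suffix is treated as a plain integer:

```php
<?php
// Hypothetical sketch; K, M, G are powers of 10, not powers of 2.
function sketchMetricToInt(string $metric_num): int
{
    $multipliers = ["K" => 1000, "M" => 1000000, "G" => 1000000000];
    $metric_num = trim($metric_num);
    $suffix = strtoupper(substr($metric_num, -1));
    if (isset($multipliers[$suffix])) {
        return intval(substr($metric_num, 0, -1)) * $multipliers[$suffix];
    }
    return intval($metric_num); // no suffix: plain integer
}
```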

intToMetric()

Converts a number to a string followed by nothing, K, M, G, T depending on whether number is < 1000, < 10^6, < 10^9, or < 10^(12)

intToMetric(int $num) : string
Parameters
$num : int

number to convert

Return values
string

the metric string corresponding to the number

crawlLog()

Logs a message to a logfile or the screen. The super-global field $_SERVER['LOG_TO_FILES'] determines if this will log to a file. If not, then in cli mode it will log to stdout; otherwise, it will use error_log. When logging to a file, $_SERVER["NO_ROTATE_LOGS"] controls whether or not there will be a log file rotation. The first call to this method is typically used to set up a process to check for liveness. For example, a call crawlLog("\n\nInitialize logger..", $this->process_name, true); says $this->process_name should be checked for liveness as part of any subsequent logging activity, such as a call crawlLog("Another Message"); (note that subsequent calls don't need to specify the process name).

crawlLog(string $msg[, string $lname = null ][, bool $check_process_handler = false ]) : mixed
Parameters
$msg : string

message to log. If empty then no message written

$lname : string = null

name of log file in the LOG_DIR directory. Rotated logs also use this as their basename, followed by a number, and are gzipped. (Older versions of Yioop used bzip. Some distros don't have bzip but do have gzip, and gzip was already being used elsewhere in Yioop, so bzip was replaced to remove the dependency.)

$check_process_handler : bool = false

by default set to false. After the first call in which this is set to true, subsequent calls with it set to false will call processHandler to check how long the code has run since the last time processHandler was called.

Return values
mixed

makeTimestamp()

Used to make a log file entry time string of format: entry number, time in r format.

makeTimestamp([int $time = -1 ]) : string
Parameters
$time : int = -1

a unix timestamp

Return values
string

[line_count_in_log r_formatted_date]

crawlTimeoutLog()

Writes a log message $msg if more than LOG_TIMEOUT time has passed since the last time crawlTimeoutLog was called. Useful in loops to write a message as progress is made through the loop (but not on every iteration, but say every 30 seconds).

crawlTimeoutLog(mixed $msg) : bool
Parameters
$msg : mixed

usually a string with what to be printed out after the timeout period. If $msg === true then clears the timeout cache

Return values
bool

whether a log message was written

crawlHash()

Computes an 8 byte hash of a string for use in storing documents.

crawlHash(string $string[, bool $raw = false ]) : string

An eight byte hash was chosen so that the odds of collision even for a few billion documents via the birthday problem are still reasonable. If the raw flag is set to false then an 11 byte base64 encoding of the 8 byte hash is returned. The hash is calculated as the xor of the two halves of the 16 byte md5 of the string. (8 bytes takes less storage which is useful for keeping more doc info in memory)

Parameters
$string : string

the string to hash

$raw : bool = false

whether to leave raw or base 64 encode

Return values
string

the hash of $string
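
The hash computation described above (xor of the two halves of the raw md5, optionally made printable) can be sketched directly. The name sketchCrawlHash is hypothetical, and the printable form shown here, a URL-safe base64 alphabet with the padding removed, is an assumption that happens to yield the 11 characters mentioned above; it is not taken from Yioop's source.

```php
<?php
// Hypothetical sketch of an 8 byte hash built from md5.
function sketchCrawlHash(string $string, bool $raw = false): string
{
    $md5 = md5($string, true); // 16 raw bytes
    // ^ on two strings xors them byte by byte
    $hash = substr($md5, 0, 8) ^ substr($md5, 8, 8);
    if ($raw) {
        return $hash;
    }
    // assumption: URL-safe alphabet, "=" padding stripped
    return rtrim(strtr(base64_encode($hash), "+/", "-_"), "=");
}
```

Eight bytes base64-encode to 12 characters including one "=" of padding, so stripping the padding leaves 11.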

crawlHashWord()

Used to create a 20 byte hash of a string (typically a word or phrase with a wikipedia page). Format is 8 byte crawlHash of term (md5 of term two halves XOR'd), followed by a \x00, followed by the first 11 characters from the term. If there are not enough chars to make 20 bytes, then the string is padded with \x00s to 20 bytes.

crawlHashWord(string $string[, bool $raw = false ]) : string
Parameters
$string : string

word to hash

$raw : bool = false

whether to base64Hash the result

Return values
string

first 8 bytes of md5 of $string concatenated with \x00 to indicate the hash is of a word not a phrase concatenated with the padded to 11 byte $meta_string.

canonicalTerm()

Takes a $term that might have come from a document and converts it to a string of 16 bytes, which is either the original term padded by underscores or the first seven chars of the term followed by an underscore followed by the base64 encoding of the first 6 chars of its md5 hash.

canonicalTerm(string $term) : string

Base64 used to make this all nice and printable.

Parameters
$term : string

to be made into a canonical form

Return values
string

canonicalized version of the term, as described above.

compareWordHashes()

Used to compare two ids for index dictionary lookup. ids are an 8 byte crawlHash together with a 12 byte non-hash suffix.

compareWordHashes(string $id1, string $id2) : int
Parameters
$id1 : string

20 byte word id to compare

$id2 : string

20 byte word id to compare

Return values
int

negative if $id1 smaller, positive if bigger, and 0 if same

base64Hash()

Converts a crawl hash number to something close to base64 coded, but in a form that doesn't get confused in urls or DBs

base64Hash(string $string) : string
Parameters
$string : string

a hash to base64 encode

Return values
string

the encoded hash

unbase64Hash()

Decodes a crawl hash number from base64 to raw ASCII

unbase64Hash(string $base64) : string
Parameters
$base64 : string

a hash to decode

Return values
string

the decoded hash

webencode()

Encodes a string in a format suitable for post data (mainly base64, but with characters that might mess up a post replaced in the result)

webencode(string $str) : string
Parameters
$str : string

string to encode

Return values
string

encoded string

webdecode()

Decodes a string encoded by webencode

webdecode(string $str) : string
Parameters
$str : string

string to decode

Return values
string

decoded string

crawlCrypt()

The crawlCrypt function is used to encrypt passwords stored in the database.

crawlCrypt(string $string[, int $salt = null ]) : string

It tries to use the best version of the Blowfish variant of php's crypt function available on the current system.

Parameters
$string : string

the string to encrypt

$salt : int = null

salt value to be used (needed to verify if a password is valid)

Return values
string

the crypted string where crypting is done using crawlHash

partitionByHash()

Used by a controller to take a table and return those rows in the table that a given queue_server would be responsible for handling

partitionByHash(array<string|int, mixed> $table, string $field, int $num_partition, int $instance[, object $callback = null ]) : array<string|int, mixed>
Parameters
$table : array<string|int, mixed>

an array of rows of associative arrays which a queue_server might need to process

$field : string

column of $table whose values should be used for partitioning

$num_partition : int

number of queue_servers to choose between

$instance : int

the id of the particular server we are interested in

$callback : object = null

function or static method that might be applied to input before deciding the responsible queue_server. For example, if input was a url we might want to get the host before deciding on the queue_server

Return values
array<string|int, mixed>

the reduced table that the $instance queue_server is responsible for

calculatePartition()

Used by a controller to say which queue_server should receive a given input

calculatePartition(string $input, int $num_partition[, object $callback = null ]) : int
Parameters
$input : string

can be viewed as a key that might be processed by a queue_server. For example, in some cases input might be a url and we want to determine which queue_server should be responsible for queuing that url

$num_partition : int

number of queue_servers to choose between

$callback : object = null

function or static method that might be applied to input before deciding the responsible queue_server. For example, if the input was a url we might want to get the host before deciding on the queue_server

Return values
int

id of server responsible for input
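
The partitioning idea behind partitionByHash and calculatePartition is to hash the (optionally transformed) key and reduce it to a server id, then keep only the rows that map to the server of interest. The sketch below uses crc32 modulo the number of partitions purely as a stand-in; the hash Yioop actually uses is not given here, and both function names are hypothetical.

```php
<?php
// Hypothetical sketch; crc32 is a stand-in hash, not Yioop's choice.
function sketchCalculatePartition(string $input, int $num_partition,
    ?callable $callback = null): int
{
    if ($callback !== null) {
        $input = $callback($input); // e.g. reduce a url to its host
    }
    return crc32($input) % $num_partition;
}

function sketchPartitionByHash(array $table, string $field,
    int $num_partition, int $instance, ?callable $callback = null): array
{
    $rows = [];
    foreach ($table as $row) {
        $owner = sketchCalculatePartition($row[$field], $num_partition,
            $callback);
        if ($owner === $instance) {
            $rows[] = $row; // this server is responsible for the row
        }
    }
    return $rows;
}
```

Passing a callback that extracts the host, for instance one built on parse_url with PHP_URL_HOST, sends every url from the same site to the same queue_server, which is the use case mentioned in the parameter description.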

changeInMicrotime()

Measures the change in time in seconds between two timestamps to microsecond precision

changeInMicrotime(string $start[, string $end = null ]) : float
Parameters
$start : string

starting time with microseconds

$end : string = null

ending time with microseconds, if null use current time

Return values
float

time difference in seconds

microTimestamp()

Timestamp of current epoch with microsecond precision useful for situations where time() might cause too many collisions (account creation, etc)

microTimestamp() : string
Return values
string

timestamp, to microsecond precision, of the time in seconds since the start of the current epoch

checkTimeInterval()

Checks that a timestamp is within the time interval given by a start time (HH:mm) and a duration

checkTimeInterval(string $start_time, string $duration[, int $time = -1 ]) : int
Parameters
$start_time : string

string of the form (HH:mm)

$duration : string

string containing an int in seconds

$time : int = -1

a Unix timestamp.

Return values
int

-1 if the time of day of $time is not within the given interval. Otherwise, the Unix timestamp at which the interval will be over for the same day as $time.
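
The interval check can be illustrated with strtotime and date. This is a simplified sketch under stated assumptions: sketchCheckTimeInterval is a hypothetical name, the interval is anchored to the calendar day of $time in the default time zone, and intervals that began on the previous day and run past midnight are not handled.

```php
<?php
// Hypothetical sketch; ignores intervals carried over from the day before.
function sketchCheckTimeInterval(string $start_time, string $duration,
    int $time = -1): int
{
    if ($time < 0) {
        $time = time(); // default to the current time
    }
    // start of the interval on the same calendar day as $time
    $start = strtotime(date("Y-m-d", $time) . " " . $start_time);
    if ($start === false) {
        return -1; // $start_time was not of the form HH:mm
    }
    $end = $start + intval($duration);
    return ($time >= $start && $time < $end) ? $end : -1;
}
```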

convertPixels()

Converts a CSS unit string into its equivalent in pixels. This is used by SvgProcessor.

convertPixels(string $value) : int
Parameters
$value : string

a number followed by a legal CSS unit

Return values
int

a number in pixels

countFiles()

Returns the number of files in a folder

countFiles(string $folder) : int
Parameters
$folder : string

path to folder to count

Return values
int

number of files

makePath()

Creates folders along a filesystem path if they don't exist

makePath(string $path) : bool
Parameters
$path : string

a file system path

Return values
bool

success or failure
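
Creating every missing folder along a path is something PHP's mkdir can do on its own when its recursive flag is set. A minimal sketch, with the hypothetical name sketchMakePath and 0777 permissions chosen only for illustration:

```php
<?php
// Hypothetical sketch built on mkdir's recursive mode.
function sketchMakePath(string $path): bool
{
    if (is_dir($path)) {
        return true; // nothing to create
    }
    // third argument true creates intermediate folders as needed
    return mkdir($path, 0777, true);
}
```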

deleteFileOrDir()

This is a callback function used in the process of recursively deleting a directory

deleteFileOrDir(string $file_or_dir) : mixed
Parameters
$file_or_dir : string

the filename or directory name to be deleted

Tags
see
DatasourceManager::unlinkRecursive()
Return values
mixed

setWorldPermissions()

This is a callback function used in the process of recursively chmoding to 777 all files in a folder

setWorldPermissions(string $file) : mixed
Parameters
$file : string

the filename or directory name whose permissions are to be changed

Tags
see
DatasourceManager::setWorldPermissionsRecursive()
Return values
mixed

fileInfo()

This is a callback function used in the process of recursively calculating an array of file modification times and files sizes for a directory

fileInfo(string $file) : array<string|int, mixed>
Parameters
$file : string

a name of a file in the file system

Return values
array<string|int, mixed>

an array whose single element contains an associative array with the size and modification time of the file

orderCallback()

Callback function used to sort documents by a field

orderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: orderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

-1 if first doc bigger 1 otherwise
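
The "initialize, then hand to usort" pattern mentioned above relies on a static variable that remembers the field name between calls. The sketch below shows that pattern in isolation; sketchOrderCallback is a hypothetical name, the documents are assumed to be associative arrays, and returning 0 from the initialization call is an assumption.

```php
<?php
// Hypothetical sketch of a comparison callback with a remembered field.
function sketchOrderCallback($doc_a, $doc_b, $order_field = null): int
{
    static $field = null;
    if ($order_field !== null) {
        $field = $order_field; // initialization call: remember the field
        return 0;
    }
    // -1 if the first doc is bigger, so larger values sort first
    return ($doc_a[$field] > $doc_b[$field]) ? -1 : 1;
}

$docs = [["score" => 2], ["score" => 9], ["score" => 5]];
$tmp = [];
sketchOrderCallback($tmp, $tmp, "score"); // set the field to sort by
usort($docs, "sketchOrderCallback");      // usort passes only two args
```

After the usort call the documents are ordered by descending score. Because the field is kept in a static variable, a second sort on a different field needs another initialization call first.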

stringOrderCallback()

Callback function used to sort documents by a field where the field is assumed to be a string

stringOrderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: stringOrderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

-1 if first doc smaller 1 otherwise

stringROrderCallback()

Callback function used to sort documents by a field where the field is assumed to be a string

stringROrderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: stringROrderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

-1 if first doc bigger 1 otherwise

rorderCallback()

Callback function used to sort documents by a field in reverse order

rorderCallback(string $word_doc_a, string $word_doc_b[, string $order_field = null ]) : int

Should be initialized before using in usort with a call like: rorderCallback($tmp, $tmp, "field_want");

Parameters
$word_doc_a : string

doc id of first document to compare

$word_doc_b : string

doc id of second document to compare

$order_field : string = null

which field of these associative arrays to sort by

Return values
int

1 if first doc bigger -1 otherwise

lessThan()

Callback to check if $a is less than $b

lessThan(float $a, float $b) : int

Used to help sort document results returned in PhraseModel called in IndexArchiveBundle

Parameters
$a : float

first value to compare

$b : float

second value to compare

Tags
see
IndexArchiveBundle::getSelectiveWords()
see
PhraseModel::getPhrasePageResults()
Return values
int

-1 if $a is less than $b; 1 otherwise

greaterThan()

Callback to check if $a is greater than $b

greaterThan(float $a, float $b) : int

Used to help sort document results returned in PhraseModel called in IndexArchiveBundle

Parameters
$a : float

first value to compare

$b : float

second value to compare

Tags
see
IndexArchiveBundle::getSelectiveWords()
see
PhraseModel::getTopPhrases()
Return values
int

-1 if $a is greater than $b; 1 otherwise

e()

shorthand for echo

e(string $text) : mixed
Parameters
$text : string

string to send to the current output

Return values
mixed

remoteAddress()

Compute the real remote address of the incoming connection including forwarding

remoteAddress() : mixed
Return values
mixed

readInput()

Used to read a line of input from the command-line

readInput() : string
Return values
string

from the command-line

readPassword()

Used to read a line of input from the command-line (on unix machines without echoing it)

readPassword() : string
Return values
string

from the command-line

readMessage()

Used to read several lines from the terminal up until a last line consisting of just a "."

readMessage() : string
Return values
string

from the command-line

mimeType()

Returns the mime type of the provided file name if it can be determined.

mimeType(string $file_name[, bool $use_extension = false ]) : string
Parameters
$file_name : string

(name of file including path to figure out mime type for)

$use_extension : bool = false

whether to just try to guess from the file extension rather than looking at the file

Return values
string

mime type or unknown if can't be determined

generalIsA()

Checks if class_1 is the same as class_2 or has class_2 as a parent. Behaves like the 3 param version (last param true) of the PHP is_a function that came into being with Version 5.3.9.

generalIsA(mixed $class_1, mixed $class_2) : bool
Parameters
$class_1 : mixed

object or string class name to see if in class2

$class_2 : mixed

object or string class name to see if contains class1

Return values
bool

equal or contains class
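For comparison, this is how the built-in `is_a()` behaves when its third argument is true, which is the behavior the description refers to. The class names are made up for the example.

```php
<?php
// Illustrates the built-in is_a() with string class names allowed.
class BaseSketch {}
class ChildSketch extends BaseSketch {}

$same_class = is_a("ChildSketch", "ChildSketch", true);     // same class
$has_parent = is_a("ChildSketch", "BaseSketch", true);      // parent match
$from_object = is_a(new ChildSketch(), "BaseSketch", true); // object input
$reversed = is_a("BaseSketch", "ChildSketch", true);        // not a child
```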

stripAttributes()

Given the contents of a start XML/HTML tag, strips out all the attributes not listed in $safe_attribute_list

stripAttributes(string $start_tag_contents[, array<string|int, mixed> $safe_attribute_list = [] ]) : string
Parameters
$start_tag_contents : string

the contents of an HTML/XML tag. I.e., if the tag was <tag stuff> then $start_tag_contents could be stuff

$safe_attribute_list : array<string|int, mixed> = []

a list of attributes which should be kept

Return values
string

containing only safe attributes and their values
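A minimal sketch of attribute whitelisting, assuming attributes are written as name="value", name='value', or name=value. `stripAttributesSketch` is a hypothetical helper, and it returns only the kept attributes; how the real function treats the tag name is not stated here.

```php
<?php
// Keeps only the attributes whose names appear in the safe list.
function stripAttributesSketch(string $start_tag_contents,
    array $safe_attribute_list = []): string
{
    $pattern = '/([\w\-]+)\s*=\s*("[^"]*"|\'[^\']*\'|[^\s"\']+)/';
    preg_match_all($pattern, $start_tag_contents, $matches, PREG_SET_ORDER);
    $kept = [];
    foreach ($matches as $match) {
        if (in_array(strtolower($match[1]), $safe_attribute_list, true)) {
            $kept[] = $match[1] . "=" . $match[2];
        }
    }
    return implode(" ", $kept);
}

$cleaned = stripAttributesSketch(
    'href="a.html" onclick="evil()" title=\'Home\'', ["href", "title"]);
```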

parseCsv()

Used to parse into a two dimensional array a string that contains CSV data.

parseCsv(string $csv_string) : array<string|int, mixed>
Parameters
$csv_string : string

string with csv data

Return values
array<string|int, mixed>

two dimensional array of elements from csv

arraytoCsv()

Converts an array of values to a comma separated value formatted string.

arraytoCsv(array<string|int, mixed> $arr) : string
Parameters
$arr : array<string|int, mixed>

values to convert

Return values
string

CSV string after conversion
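The two conversions above can be approximated with PHP built-ins. The helper names below are hypothetical and this is not Yioop's parser; in particular, the sketch splits on line breaks first, so quoted fields that contain newlines are not handled.

```php
<?php
// Parse CSV text into a two dimensional array, one row per line.
function parseCsvSketch(string $csv_string): array
{
    $rows = [];
    foreach (preg_split('/\r\n|\n|\r/', trim($csv_string)) as $line) {
        $rows[] = str_getcsv($line, ",", '"', "\\");
    }
    return $rows;
}
// Convert one array of values into a single CSV line.
function arrayToCsvSketch(array $arr): string
{
    $fields = [];
    foreach ($arr as $value) {
        // Quote every field and double any embedded quotes.
        $fields[] = '"' . str_replace('"', '""', (string)$value) . '"';
    }
    return implode(",", $fields);
}

$table = parseCsvSketch("name,city\n\"Doe, Jane\",Paris");
$line = arrayToCsvSketch(["Doe, Jane", 'say "hi"']);
```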

diff()

Computes a Unix-style diff of two strings. That is, it only outputs lines which differ between the two strings. It outputs +line if a line occurs in the second but not the first string, and -line if a line occurs in the first string but not the second.

diff(string $data1, string $data2[, bool $html = false ]) : string
Parameters
$data1 : string

first string to compare

$data2 : string

second string to compare

$html : bool = false

whether to output html highlighting

Return values
string

representing info about where $data1 and $data2 don't match

computeLCS()

Computes the longest common subsequence of two arrays

computeLCS(array<string|int, mixed> $lines1, array<string|int, mixed> $lines2, int $offset) : mixed
Parameters
$lines1 : array<string|int, mixed>

an array of lines to compute LCS of

$lines2 : array<string|int, mixed>

an array of lines to compute LCS of

$offset : int

an offset to shift over array addresses in output by

Return values
mixed

extractLCSFromTable()

Given a table of longest common subsequence moves (probably calculated by computeLCS) and a starting coordinate $i, $j in that table, extracts a longest common subsequence

extractLCSFromTable(array<string|int, mixed> $lcs_moves, array<string|int, mixed> $lines, int $i, int $j, int $offset, array<string|int, mixed> &$lcs) : mixed
Parameters
$lcs_moves : array<string|int, mixed>

a table of moves computed by computeLCS

$lines : array<string|int, mixed>

from first of the two arrays computing LCS of

$i : int

a line number in string 1

$j : int

a line number in string 2

$offset : int

a number to add to each line number output into $lcs. This is useful if we have trimmed off the initially common lines from our two strings we are trying to compute the LCS of

$lcs : array<string|int, mixed>

an array of triples (index_string1, index_string2, line). The indexes indicate the line number in each string; line is the line common to the two strings

Return values
mixed
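To make the relationship between diff(), computeLCS() and extractLCSFromTable() concrete, here is a compact sketch of an LCS-based line diff. `lcsSketch` and `diffSketch` are hypothetical names; the table layout, the lack of an offset parameter, and the output format are assumptions rather than Yioop's implementation.

```php
<?php
// Returns (index in lines1, index in lines2, line) triples of one LCS.
function lcsSketch(array $lines1, array $lines2): array
{
    $m = count($lines1);
    $n = count($lines2);
    // $len[$i][$j] = LCS length of the first $i and first $j lines.
    $len = array_fill(0, $m + 1, array_fill(0, $n + 1, 0));
    for ($i = 1; $i <= $m; $i++) {
        for ($j = 1; $j <= $n; $j++) {
            $len[$i][$j] = ($lines1[$i - 1] === $lines2[$j - 1]) ?
                $len[$i - 1][$j - 1] + 1 :
                max($len[$i - 1][$j], $len[$i][$j - 1]);
        }
    }
    // Walk back from the bottom-right corner of the table.
    $lcs = [];
    for ($i = $m, $j = $n; $i > 0 && $j > 0; ) {
        if ($lines1[$i - 1] === $lines2[$j - 1]) {
            array_unshift($lcs, [$i - 1, $j - 1, $lines1[$i - 1]]);
            $i--;
            $j--;
        } else if ($len[$i - 1][$j] >= $len[$i][$j - 1]) {
            $i--;
        } else {
            $j--;
        }
    }
    return $lcs;
}
// Emits -line for lines only in $data1 and +line for lines only in $data2.
function diffSketch(string $data1, string $data2): string
{
    $lines1 = explode("\n", $data1);
    $lines2 = explode("\n", $data2);
    $out = [];
    $i = 0;
    $j = 0;
    $lcs = lcsSketch($lines1, $lines2);
    // A sentinel triple flushes the lines after the last common line.
    $lcs[] = [count($lines1), count($lines2), null];
    foreach ($lcs as [$k1, $k2, $line]) {
        for (; $i < $k1; $i++) {
            $out[] = "-" . $lines1[$i];
        }
        for (; $j < $k2; $j++) {
            $out[] = "+" . $lines2[$j];
        }
        $i++; // skip the common line in both inputs
        $j++;
    }
    return implode("\n", $out);
}

$common = lcsSketch(["a", "b", "c"], ["a", "x", "c"]);
$changes = diffSketch("a\nb\nc", "a\nx\nc");
```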

tail()

Returns an array of the last $num_lines many lines out of a file

tail(string $file_name, string $num_lines) : array<string|int, mixed>
Parameters
$file_name : string

name of file to return lines from

$num_lines : string

number of lines to retrieve

Return values
array<string|int, mixed>

retrieved lines
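A simple way to get the same result for small files is shown below. `tailSketch` is a hypothetical helper that loads the whole file into memory, which the real function may well avoid.

```php
<?php
// Returns the last $num_lines lines of a file as an array.
function tailSketch(string $file_name, int $num_lines): array
{
    $lines = file($file_name, FILE_IGNORE_NEW_LINES);
    if ($lines === false || $num_lines <= 0) {
        return [];
    }
    return array_slice($lines, -$num_lines);
}

// Write a small temporary file to try the helper on.
$path = tempnam(sys_get_temp_dir(), "tail");
file_put_contents($path, "one\ntwo\nthree\nfour\n");
$last_two = tailSketch($path, 2);
unlink($path);
```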

lineFilter()

Given an array of lines returns a subarray of those lines containing the filter string or filter array

lineFilter(string $lines, mixed $filters[, bool $case_insensitive = true ]) : array<string|int, mixed>
Parameters
$lines : string

to search

$filters : mixed

either string to filter lines with or an array of strings (any of which can be present to pass the filter)

$case_insensitive : bool = true

whether search should be done case insensitively or not.

Return values
array<string|int, mixed>

lines containing the string
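The filtering rule (keep a line if any filter string occurs in it) can be sketched as follows. `lineFilterSketch` is a hypothetical name, and the sketch takes the lines as an array, following the description above.

```php
<?php
// Keeps the lines that contain at least one of the filter strings.
function lineFilterSketch(array $lines, $filters,
    bool $case_insensitive = true): array
{
    $filters = (array)$filters; // accept a single string or a list
    $kept = [];
    foreach ($lines as $line) {
        foreach ($filters as $filter) {
            $position = $case_insensitive ? stripos($line, $filter) :
                strpos($line, $filter);
            if ($position !== false) {
                $kept[] = $line;
                break; // one matching filter is enough
            }
        }
    }
    return $kept;
}

$log = ["Fetcher started", "ERROR disk full", "warning: slow", "done"];
$hits = lineFilterSketch($log, ["error", "warning"]);
$exact = lineFilterSketch($log, "error", false);
```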

logLineTimestamp()

Tries to extract a timestamp from a line which is presumed to come from a Yioop log file

logLineTimestamp(string $line) : int
Parameters
$line : string

to search

Return values
int

timestamp of that log entry

isPositiveInteger()

Returns whether an input can be parsed to a positive integer

isPositiveInteger(mixed $input) : bool
Parameters
$input : mixed
Return values
bool

whether $input can be parsed to a positive integer.
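One way to express this check with PHP's filter extension is sketched below. `isPositiveIntegerSketch` is a hypothetical name, and the real function may accept or reject edge cases differently.

```php
<?php
// True when $input is an integer, or an integer-like string, that is >= 1.
function isPositiveIntegerSketch($input): bool
{
    return filter_var($input, FILTER_VALIDATE_INT,
        ["options" => ["min_range" => 1]]) !== false;
}
```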

measureCall()

Used to measure the memory footprint in bytes and time spent calling a method of an object. It also records the number of times the method has been called.

measureCall(object $object, string $method[, mixed $arguments = [] ][, string $call_name = "" ]) : mixed

Just calls the method without any recording or timing until an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.

Parameters
$object : object

name of object whose method we want to call and measure

$method : string

method we're calling

$arguments : mixed = []
$call_name : string = ""

name to use when outputting stats for this call, defaults to $method.

Return values
mixed

whatever the method would normally return when called as above
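The measuring idea can be sketched with `microtime()` and `memory_get_usage()`. In this hypothetical `measureCallSketch`, statistics are collected in an array passed by reference rather than written to a file, which is a simplification of what is described above.

```php
<?php
// Calls $object->$method(...$arguments) and records count, time and memory.
function measureCallSketch(array &$stats, object $object, string $method,
    array $arguments = [], string $call_name = "")
{
    $call_name = ($call_name === "") ? $method : $call_name;
    $start_time = microtime(true);
    $start_memory = memory_get_usage();
    $result = call_user_func_array([$object, $method], $arguments);
    if (!isset($stats[$call_name])) {
        $stats[$call_name] = ["calls" => 0, "seconds" => 0.0, "bytes" => 0];
    }
    $stats[$call_name]["calls"]++;
    $stats[$call_name]["seconds"] += microtime(true) - $start_time;
    $stats[$call_name]["bytes"] += memory_get_usage() - $start_memory;
    return $result; // same value the method itself returned
}

$stats = [];
$list = new ArrayObject([3, 1, 2]);
$size = measureCallSketch($stats, $list, "count");
measureCallSketch($stats, $list, "append", [4], "append_item");
```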

measureObject()

Used to measure the memory footprint of an object in Yioop and save it to a statistics file. No recording is done until an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.

measureObject(object $object[, string $save_file = "" ][, mixed $class_name = "" ]) : mixed
Parameters
$object : object

name of object whose size we want to measure

$save_file : string = ""

statistics file to write info to

$class_name : mixed = ""
Return values
mixed

measureObjectCall()

General method called by measureCall and measureObject. Used to measure the memory footprint in bytes of an object, or the memory and time spent calling a method of an object. It also records the number of times the method has been called. When used to call a method before initialization, it just calls the method without any recording or timing. To initialize, make an initial call to the function measureCall(null, save_statistics_file), where save_statistics_file is the name of the file you want to store statistics to.

measureObjectCall(object $object, string $method[, mixed $arguments = [] ][, string $call_name = "" ]) : mixed
Parameters
$object : object

name of object whose method we want to call and measure

$method : string

method we're calling

$arguments : mixed = []
$call_name : string = ""

name to use when outputting stats for this call, defaults to $method.

Return values
mixed

whatever the method would normally return when called as above

variableClone()

Makes a deep copy of a variable regardless of its type

variableClone(mixed $var) : mixed
Parameters
$var : mixed

variable to deep copy

Return values
mixed

the deep copy
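The difference between a shallow and a deep copy can be seen in the sketch below. The serialize/unserialize round trip in the hypothetical `variableCloneSketch` is one common technique, assumed here for illustration; it does not work for closures or resources and may not be how variableClone() is implemented.

```php
<?php
// Deep copy via a serialize/unserialize round trip.
function variableCloneSketch($var)
{
    return unserialize(serialize($var));
}

$original = new stdClass();
$original->child = new stdClass();
$original->child->value = 1;

$shallow = clone $original;             // still shares ->child
$deep = variableCloneSketch($original); // has its own ->child
$original->child->value = 2;
```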

garbageCollect()

Runs various system garbage collection functions and returns number of bytes freed.

garbageCollect() : int
Return values
int

number of bytes freed
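A minimal sketch of measuring freed memory around a collection pass, using only `gc_collect_cycles()` and `memory_get_usage()`. Which collection functions the real garbageCollect() runs is not stated here, so `garbageCollectSketch` is an assumption.

```php
<?php
// Runs the cycle collector and reports how much memory usage dropped.
function garbageCollectSketch(): int
{
    $before = memory_get_usage();
    gc_collect_cycles(); // collect unreachable reference cycles
    $after = memory_get_usage();
    return max(0, $before - $after);
}

$freed = garbageCollectSketch();
```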

utf8SafeSaveHtml()

The DOM method saveHTML has a tendency to replace UTF-8, non-ASCII characters with HTML entities. This function is meant to save the document while avoiding that replacement.

utf8SafeSaveHtml(DOMDocument $dom) : string

What it does is to first save the dom, then it replaces htmlentities of the form &single_char; or &#some_number; with the UTF-8 they correspond to. It leaves all other entities as they are

Parameters
$dom : DOMDocument
Return values
string

output of saving html

utf8WordWrap()

A UTF-8 safe version of PHP's wordwrap function that wraps a string to a given number of characters

utf8WordWrap(string $string[, int $width = 75 ][, string $break = " " ][, bool $cut = false ]) : string
Parameters
$string : string

the input string

$width : int = 75

the number of characters at which the string will be wrapped

$break : string = " "

string used to break a line into two

$cut : bool = false

whether to always force wrap at $width characters even if word hasn't ended

Return values
string

the given string wrapped at the specified length
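The reason a UTF-8 aware wrap is needed is that PHP's `wordwrap()` counts bytes, so accented characters count more than once. The hypothetical `utf8WordWrapSketch` below counts characters instead; it breaks only at spaces, uses a newline as the default break, and leaves out the $cut option, all of which are simplifications.

```php
<?php
// Number of UTF-8 characters (not bytes) in $text.
function utf8LengthSketch(string $text): int
{
    return (int)preg_match_all('/./us', $text);
}
// Greedy wrap at spaces, measuring line length in characters.
function utf8WordWrapSketch(string $string, int $width = 75,
    string $break = "\n"): string
{
    $lines = [];
    $current = "";
    foreach (explode(" ", $string) as $word) {
        $candidate = ($current === "") ? $word : $current . " " . $word;
        if ($current !== "" && utf8LengthSketch($candidate) > $width) {
            $lines[] = $current; // line is full, start a new one
            $current = $word;
        } else {
            $current = $candidate;
        }
    }
    $lines[] = $current;
    return implode($break, $lines);
}

// "naïve café" is 10 characters but 12 bytes.
$wrapped = utf8WordWrapSketch("naïve café déjà vu", 10);
```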

upgradeDatabaseVersion1()

Upgrades a Version 0 version of the Yioop database to a Version 1 version

upgradeDatabaseVersion1(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion2()

Upgrades a Version 1 version of the Yioop database to a Version 2 version

upgradeDatabaseVersion2(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion3()

Upgrades a Version 2 version of the Yioop database to a Version 3 version

upgradeDatabaseVersion3(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion4()

Upgrades a Version 3 version of the Yioop database to a Version 4 version

upgradeDatabaseVersion4(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion5()

Upgrades a Version 4 version of the Yioop database to a Version 5 version

upgradeDatabaseVersion5(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion6()

Upgrades a Version 5 version of the Yioop database to a Version 6 version

upgradeDatabaseVersion6(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion7()

Upgrades a Version 6 version of the Yioop database to a Version 7 version

upgradeDatabaseVersion7(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion8()

Upgrades a Version 7 version of the Yioop database to a Version 8 version

upgradeDatabaseVersion8(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion9()

Upgrades a Version 8 version of the Yioop database to a Version 9 version

upgradeDatabaseVersion9(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion10()

Upgrades a Version 9 version of the Yioop database to a Version 10 version

upgradeDatabaseVersion10(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion11()

Upgrades a Version 10 version of the Yioop database to a Version 11 version

upgradeDatabaseVersion11(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion12()

Upgrades a Version 11 version of the Yioop database to a Version 12 version

upgradeDatabaseVersion12(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion13()

Upgrades a Version 12 version of the Yioop database to a Version 13 version

upgradeDatabaseVersion13(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion14()

Upgrades a Version 13 version of the Yioop database to a Version 14 version

upgradeDatabaseVersion14(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion15()

Upgrades a Version 14 version of the Yioop database to a Version 15 version

upgradeDatabaseVersion15(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion16()

Upgrades a Version 15 version of the Yioop database to a Version 16 version

upgradeDatabaseVersion16(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion17()

Upgrades a Version 16 version of the Yioop database to a Version 17 version

upgradeDatabaseVersion17(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion18()

Upgrades a Version 17 version of the Yioop database to a Version 18 version

upgradeDatabaseVersion18(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion19()

Upgrades a Version 18 version of the Yioop database to a Version 19 version. This update has been superseded by the Version20 update and so its contents have been eliminated.

upgradeDatabaseVersion19(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion20()

Upgrades a Version 19 version of the Yioop database to a Version 20 version. This is a major upgrade as the user table has changed. This also acts as a cumulative upgrade since version 0.98. It involves a web form that has only been localized to English.

upgradeDatabaseVersion20(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion21()

Upgrades a Version 20 version of the Yioop database to a Version 21 version

upgradeDatabaseVersion21(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion22()

Upgrades a Version 21 version of the Yioop database to a Version 22 version

upgradeDatabaseVersion22(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion23()

Upgrades a Version 22 version of the Yioop database to a Version 23 version

upgradeDatabaseVersion23(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion24()

Upgrades a Version 23 version of the Yioop database to a Version 24 version

upgradeDatabaseVersion24(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion25()

Upgrades a Version 24 version of the Yioop database to a Version 25 version. This version upgrade includes creation of a Help group that holds help pages.

upgradeDatabaseVersion25(object &$db) : mixed

The Help Group is created with GROUP_ID=HELP_GROUP_ID. If a group with GROUP_ID=HELP_GROUP_ID already exists, then that group is moved to the end of the GROUPS table (the max group id is used).

Parameters
$db : object

data source to use to upgrade

Return values
mixed

upgradeDatabaseVersion26()

Upgrades a Version 25 version of the Yioop database to a Version 26 version. This version upgrade includes an update of the Help pages in the database to work with the changes to the way hyperlinks are specified in wiki markup.

upgradeDatabaseVersion26(object &$db) : mixed

The changes were implemented to point all articles with page names containing %20 to be able to work with '_' and vice versa.

Parameters
$db : object

data source to use to upgrade

Return values
mixed

upgradeDatabaseVersion27()

Upgrades a Version 26 version of the Yioop database to a Version 27 version

upgradeDatabaseVersion27(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion28()

Upgrades a Version 27 version of the Yioop database to a Version 28 version

upgradeDatabaseVersion28(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion29()

Upgrades a Version 28 version of the Yioop database to a Version 29 version

upgradeDatabaseVersion29(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion30()

Upgrades a Version 29 version of the Yioop database to a Version 30 version

upgradeDatabaseVersion30(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion31()

Upgrades a Version 30 version of the Yioop database to a Version 31 version

upgradeDatabaseVersion31(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion32()

Upgrades a Version 31 version of the Yioop database to a Version 32 version

upgradeDatabaseVersion32(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion33()

Upgrades a Version 32 version of the Yioop database to a Version 33 version

upgradeDatabaseVersion33(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion34()

Upgrades a Version 33 version of the Yioop database to a Version 34 version

upgradeDatabaseVersion34(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion35()

Upgrades a Version 34 version of the Yioop database to a Version 35 version

upgradeDatabaseVersion35(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion36()

Upgrades a Version 35 version of the Yioop database to a Version 36 version

upgradeDatabaseVersion36(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion37()

Upgrades a Version 36 version of the Yioop database to a Version 37 version

upgradeDatabaseVersion37(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion38()

Upgrades a Version 37 version of the Yioop database to a Version 38 version

upgradeDatabaseVersion38(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion39()

Upgrades a Version 38 version of the Yioop database to a Version 39 version

upgradeDatabaseVersion39(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion40()

Upgrades a Version 39 version of the Yioop database to a Version 40 version

upgradeDatabaseVersion40(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion41()

Upgrades a Version 40 version of the Yioop database to a Version 41 version

upgradeDatabaseVersion41(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion42()

Upgrades a Version 41 version of the Yioop database to a Version 42 version

upgradeDatabaseVersion42(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion43()

Upgrades a Version 42 version of the Yioop database to a Version 43 version

upgradeDatabaseVersion43(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion44()

Upgrades a Version 43 version of the Yioop database to a Version 44 version

upgradeDatabaseVersion44(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion45()

Upgrades a Version 44 version of the Yioop database to a Version 45 version

upgradeDatabaseVersion45(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion46()

Upgrades a Version 45 version of the Yioop database to a Version 46 version

upgradeDatabaseVersion46(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion47()

Upgrades a Version 46 version of the Yioop database to a Version 47 version

upgradeDatabaseVersion47(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion48()

Upgrades a Version 47 version of the Yioop database to a Version 48 version

upgradeDatabaseVersion48(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion49()

Upgrades a Version 48 version of the Yioop database to a Version 49 version

upgradeDatabaseVersion49(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion50()

Upgrades a Version 49 version of the Yioop database to a Version 50 version

upgradeDatabaseVersion50(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion51()

Upgrades a Version 50 version of the Yioop database to a Version 51 version

upgradeDatabaseVersion51(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion52()

Upgrades a Version 51 version of the Yioop database to a Version 52 version

upgradeDatabaseVersion52(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion53()

Upgrades a Version 52 version of the Yioop database to a Version 53 version

upgradeDatabaseVersion53(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion54()

Upgrades a Version 53 version of the Yioop database to a Version 54 version

upgradeDatabaseVersion54(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion55()

Upgrades a Version 54 version of the Yioop database to a Version 55 version

upgradeDatabaseVersion55(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion57()

Upgrades a Version 56 version of the Yioop database to a Version 57 version

upgradeDatabaseVersion57(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion58()

Upgrades a Version 57 version of the Yioop database to a Version 58 version

upgradeDatabaseVersion58(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion59()

Upgrades a Version 58 version of the Yioop database to a Version 59 version

upgradeDatabaseVersion59(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion60()

Upgrades a Version 59 version of the Yioop database to a Version 60 version

upgradeDatabaseVersion60(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion61()

Upgrades a Version 60 version of the Yioop database to a Version 61 version

upgradeDatabaseVersion61(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion62()

Upgrades a Version 61 version of the Yioop database to a Version 62 version

upgradeDatabaseVersion62(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion64()

Upgrades a Version 63 version of the Yioop database to a Version 64 version

upgradeDatabaseVersion64(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion65()

Upgrades a Version 64 version of the Yioop database to a Version 65 version

upgradeDatabaseVersion65(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion66()

Upgrades a Version 65 version of the Yioop database to a Version 66 version

upgradeDatabaseVersion66(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion67()

Upgrades a Version 66 version of the Yioop database to a Version 67 version

upgradeDatabaseVersion67(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion68()

Upgrades a Version 67 version of the Yioop database to a Version 68 version

upgradeDatabaseVersion68(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion69()

Upgrades a Version 68 version of the Yioop database to a Version 69 version

upgradeDatabaseVersion69(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion70()

Upgrades a Version 69 version of the Yioop database to a Version 70 version

upgradeDatabaseVersion70(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade.

Return values
mixed

upgradeDatabaseVersion71()

Upgrades a Version 70 version of the Yioop database to a Version 71 version

upgradeDatabaseVersion71(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion72()

Upgrades a Version 71 version of the Yioop database to a Version 72 version

upgradeDatabaseVersion72(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion73()

Upgrades a Version 72 version of the Yioop database to a Version 73 version

upgradeDatabaseVersion73(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion74()

Upgrades a Version 73 version of the Yioop database to a Version 74 version

upgradeDatabaseVersion74(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion75()

Upgrades a Version 74 version of the Yioop database to a Version 75 version

upgradeDatabaseVersion75(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion76()

Upgrades a Version 75 version of the Yioop database to a Version 76 version

upgradeDatabaseVersion76(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion77()

Upgrades a Version 76 version of the Yioop database to a Version 77 version

upgradeDatabaseVersion77(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion78()

Upgrades a Version 77 version of the Yioop database to a Version 78 version

upgradeDatabaseVersion78(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion79()

Upgrades a Version 78 version of the Yioop database to a Version 79 version

upgradeDatabaseVersion79(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion80()

Upgrades a Version 79 version of the Yioop database to a Version 80 version

upgradeDatabaseVersion80(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

upgradeDatabaseVersion81()

Upgrades a Version 80 version of the Yioop database to a Version 81 version

upgradeDatabaseVersion81(object &$db) : mixed
Parameters
$db : object

datasource to use to upgrade

Return values
mixed

webExit()

Function to call instead of exit() to indicate that the script processing the current web page is done processing. Use this rather than exit(), as exit() will also terminate WebSite.

webExit([string $err_msg = "" ]) : mixed
Parameters
$err_msg : string = ""

error message to send on exiting

Tags
throws
WebException
Return values
mixed
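The point of throwing instead of calling exit() is that a long-running server loop can catch the exception and carry on with the next request. A minimal sketch follows; `WebExceptionSketch`, `webExitSketch` and `handleRequestSketch` are hypothetical names, not Yioop's classes or functions.

```php
<?php
class WebExceptionSketch extends Exception {}

// Ends the current request by throwing rather than stopping the process.
function webExitSketch(string $err_msg = "")
{
    throw new WebExceptionSketch($err_msg);
}
// Stand-in for a server loop body that survives an early request exit.
function handleRequestSketch(string $path): string
{
    try {
        if ($path === "/stop") {
            webExitSketch("stopped early");
        }
        return "page for " . $path;
    } catch (WebExceptionSketch $e) {
        return "request ended: " . $e->getMessage();
    }
}

$first = handleRequestSketch("/stop");
$second = handleRequestSketch("/home"); // still served after the exit
```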

makeTableCallback()

Callback used by a preg_replace_callback in nextPage to make a table

makeTableCallback(array<string|int, mixed> $matches) : mixed
Parameters
$matches : array<string|int, mixed>

of table cells

Return values
mixed

citeCallback()

Used to convert {{cite }} to a numbered link to a citation

citeCallback(array<string|int, mixed> $matches[, int $init = -1 ]) : string
Parameters
$matches : array<string|int, mixed>

from regular expression to check for {{cite }}

$init : int = -1

used to initialize counter for citations

Return values
string

a HTML link to citation in current document
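A counter-based callback of this kind can be sketched as follows. The anchor format and the name `citeCallbackSketch` are assumptions; only the idea of an $init argument that resets the counter comes from the description above.

```php
<?php
// Replaces each {{cite ...}} with a numbered link; $init resets the count.
function citeCallbackSketch(array $matches, int $init = -1): string
{
    static $count = 0;
    if ($init !== -1) {
        $count = $init; // reset call, produces no output
        return "";
    }
    $count++;
    return '<a href="#cite_' . $count . '">[' . $count . ']</a>';
}

citeCallbackSketch([], 0); // start numbering from zero
$html = preg_replace_callback('/\{\{cite [^}]*\}\}/', "citeCallbackSketch",
    "A{{cite x}} B{{cite y}}");
```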

fixLinksCallback()

Used to change spaces to underscores in links generated from our earlier matching rules

fixLinksCallback(array<string|int, mixed> $matches) : string
Parameters
$matches : array<string|int, mixed>

from regular expression to check for links

Return values
string

result of correcting link
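As an illustration, a callback like the hypothetical `fixLinksCallbackSketch` below rewrites only the link target and leaves the link text alone. The regular expression used to find links is an assumption.

```php
<?php
// Replaces spaces with underscores inside an href attribute value.
function fixLinksCallbackSketch(array $matches): string
{
    return 'href="' . str_replace(" ", "_", $matches[1]) . '"';
}

$fixed = preg_replace_callback('/href="([^"]*)"/', "fixLinksCallbackSketch",
    '<a href="Main Page">Main Page</a>');
```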

base64EncodeCallback()

Callback used to base64 encode the contents of nowiki tags so they won't be manipulated by wiki replacements.

base64EncodeCallback(array<string|int, mixed> $matches) : string
Parameters
$matches : array<string|int, mixed>

$matches[1] should contain the contents of a nowiki tag

Return values
string

base 64 encoded contents surrounded by an escaped nowiki tag.

spaceEncodeCallback()

Callback used to encode the contents of pre tags so they won't accidentally get sub-pre tags because a bunch of leading lines have spaces

spaceEncodeCallback(array<string|int, mixed> $matches) : string
Parameters
$matches : array<string|int, mixed>

$matches[1] should contain the contents of a pre tag

Return values
string

encoded contents surrounded by an escaped pre tag.

spanEncodeCallback()

Callback used to encode the contents of span tags so that newlines within them don't accidentally get treated as new wiki paragraphs

spanEncodeCallback(array<string|int, mixed> $matches) : string
Parameters
$matches : array<string|int, mixed>

$matches[1] should contain the contents of a span tag

Return values
string

encoded contents surrounded by an escaped span tag.

base64DecodeCallback()

Callback used to base64 decode the contents of previously base64 encoded (@see base64EncodeCallback) nowiki tags after all mediawiki substitutions have been done

base64DecodeCallback(array<string|int, mixed> $matches) : string
Parameters
$matches : array<string|int, mixed>

$matches[1] should contain the contents of a nowiki tag

Return values
string

base 64 decoded, entity decoded contents.
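The encode-then-decode approach works because base64 output contains no characters that wiki rules act on. The sketch below walks through the three stages with an example italic rule; the plain (unescaped) nowiki tags and the inline closures are simplifications, not Yioop's callbacks.

```php
<?php
$text = "''bold'' <nowiki>''raw''</nowiki>";

// 1. Hide nowiki contents so later rules cannot touch them.
$protected = preg_replace_callback('/<nowiki>(.*?)<\/nowiki>/s',
    function ($matches) {
        return "<nowiki>" . base64_encode($matches[1]) . "</nowiki>";
    }, $text);

// 2. Apply a wiki rule: ''x'' becomes <i>x</i>.
$converted = preg_replace("/''(.+?)''/", '<i>$1</i>', $protected);

// 3. Restore the hidden contents once all rules have run.
$result = preg_replace_callback('/<nowiki>(.*?)<\/nowiki>/s',
    function ($matches) {
        return base64_decode($matches[1]);
    }, $converted);
```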

spaceDecodeCallback()

Cleans up pre tags after other wiki rules applied

spaceDecodeCallback(array<string|int, mixed> $matches) : string
Parameters
$matches : array<string|int, mixed>

$matches[1] should contain the contents of a pre tag

Return values
string

cleaned contents surrounded by a pre-formatted tag.

lessThanLocale()

Function for comparing two locale arrays by locale tag so they can be sorted

lessThanLocale(array<string|int, mixed> $a, array<string|int, mixed> $b) : int
Parameters
$a : array<string|int, mixed>

an associative array of locale info

$b : array<string|int, mixed>

an associative array of locale info

Return values
int

-1, 0, or 1 depending on which locale tag is alphabetically smaller or whether they are the same
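A comparator of this shape can be passed straight to `usort`. In the sketch below the array key "LOCALE_TAG" and the name `lessThanLocaleSketch` are assumptions made for illustration.

```php
<?php
// Compares two locale arrays by their tag, returning -1, 0, or 1.
function lessThanLocaleSketch(array $a, array $b): int
{
    return strcmp($a["LOCALE_TAG"], $b["LOCALE_TAG"]) <=> 0;
}

$locales = [["LOCALE_TAG" => "fr-FR"], ["LOCALE_TAG" => "en-US"],
    ["LOCALE_TAG" => "de"]];
usort($locales, "lessThanLocaleSketch");
$tags = array_column($locales, "LOCALE_TAG");
```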

tl()

Translate the supplied arguments into the current locale.

tl() : string

This function is a convenience copy of the same function to this subnamespace.

Tags
see
tl()
Return values
string

translated string
