seek_quarry

Procedural File: arc_tool.php

Source Location: /bin/arc_tool.php



Classes:

ArcTool
Command line program that allows one to examine the content of the WebArchiveBundles and IndexArchiveBundles of Yioop crawls.


Page Details:

SeekQuarry/Yioop -- Open Source Pure PHP Search Engine, Crawler, and Indexer

Copyright (C) 2009 - 2013 Chris Pollett chris@pollett.org

LICENSE:

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

END LICENSE




Tags:

author:  Chris Pollett chris@pollett.org
copyright:  2009 - 2013
link:  http://www.seekquarry.com/
license:  GPL3


Includes:

require_once(BASE_DIR."/lib/index_bundle_iterators/word_iterator.php") [line 68]
Used to look up information about a word in an index dictionary

require_once(BASE_DIR."/lib/url_parser.php") [line 80]
Used for manipulating urls

require_once($filename) [line 76]
Load the iterator classes for non-yioop archives

require_once(BASE_DIR."/lib/web_queue_bundle.php") [line 62]
Load the class that maintains our URL queue

require_once(BASE_DIR."/lib/index_manager.php") [line 71]
Used by word_iterator.php

require_once(BASE_DIR."/lib/utility.php") [line 83]
For crawlHash function

require_once(BASE_DIR."/models/datasources/".DBMS."_manager.php") [line 86]
Get the database library based on the current database type

require_once(BASE_DIR."/lib/fetch_url.php") [line 89]
Load FetchUrl, used by the MediaWiki archive iterator

require_once(BASE_DIR.'/configs/config.php') [line 48]
Load in global configuration settings

require_once(BASE_DIR."/lib/crawl_constants.php") [line 92]
Loads common constants for web crawling

require_once(BASE_DIR."/lib/index_archive_bundle.php") [line 65]
Load the index class that maps each word to the array of documents containing it
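
The DBMS-dependent include listed above builds its file name from two constants. The following is a minimal sketch of that concatenation, written as a standalone function so it runs without Yioop's configuration. The helper name datasource_manager_path, the base directory, and the "sqlite3" value are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper; BASE_DIR and DBMS are passed as arguments here
// so the sketch runs without Yioop's configuration being loaded.
function datasource_manager_path($base_dir, $dbms)
{
    // Same concatenation as the require_once() argument in the Includes list
    return $base_dir . "/models/datasources/" . $dbms . "_manager.php";
}

// Both arguments are example values, not taken from this page
echo datasource_manager_path("/var/www/yioop", "sqlite3"), "\n";
```

Because the path depends on DBMS, that constant has to be defined before this include is reached, which is consistent with configs/config.php being loaded earlier in the file (line 48 versus line 86).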






Constants:

BASE_DIR [line 37]

BASE_DIR = substr(dirname(realpath($_SERVER['PHP_SELF'])),0,-strlen("/bin"))
Calculate the base directory of the script by stripping the trailing "/bin" from the script's directory
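
As an illustration of the expression above, the following sketch applies the same dirname/substr steps to a fixed path string. It leaves out realpath() and $_SERVER['PHP_SELF'] so that it runs without the file existing on disk. The function name base_dir_from_script and the sample path are hypothetical.

```php
<?php
// Hypothetical helper mirroring the BASE_DIR expression; not part of Yioop.
function base_dir_from_script($script_path)
{
    // dirname() drops the file name, leaving e.g. /var/www/yioop/bin
    $script_dir = dirname($script_path);
    // A negative length makes substr() stop that many characters
    // before the end, which removes the trailing "/bin"
    return substr($script_dir, 0, -strlen("/bin"));
}

// Example path only; prints /var/www/yioop
echo base_dir_from_script("/var/www/yioop/bin/arc_tool.php"), "\n";
```

Note that the expression removes the last four characters unconditionally, so it relies on the script actually residing in a directory named bin.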





LOG_TO_FILES [line 45]

LOG_TO_FILES = false
This tool does not need logging





NO_CACHE [line 56]

NO_CACHE = true
NO_CACHE means don't try to use memcache





USE_CACHE [line 59]

USE_CACHE = false
Setting USE_CACHE to false rules out the file cache as well
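
A minimal sketch of how a command-line script might set the three flags documented above before loading shared configuration. The values are the ones listed on this page. The defined() guard at the end is an assumption about how a later-loaded file could avoid redefining them, since this page does not show the contents of configs/config.php.

```php
<?php
// Values as documented on this page for arc_tool.php
define("LOG_TO_FILES", false); // this tool does not need logging
define("NO_CACHE", true);      // don't try to use memcache
define("USE_CACHE", false);    // rules out the file cache as well

// Assumption: a later-loaded config file could guard its own default
// like this so the value set above wins. Hypothetical, not Yioop's code.
if (!defined("USE_CACHE")) {
    define("USE_CACHE", true);
}

// Show the effective values
var_dump(LOG_TO_FILES, NO_CACHE, USE_CACHE);
```

PHP constants cannot be redefined once set, so the order matters: whichever file defines a constant first determines its value for the rest of the run.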






Documentation generated by phpDocumentor 1.4.3