diff --git a/en-US/pages/documentation.thtml b/en-US/pages/documentation.thtml
index 7609026..9064e25 100755
--- a/en-US/pages/documentation.thtml
+++ b/en-US/pages/documentation.thtml
@@ -73,6 +73,17 @@
there, you can check the Yioop <a href="#requirements">Requirements</a>
section followed by the more general <a
href="#installation">Installation and Configuration</a> instructions.
+ <a href="http://www.yioop.com/">Yioop.com</a>, the demo site of Yioop, allows
+ people to register accounts. This would give you an account with User
+ rather than Admin access, but it would allow you to experiment with
+ some of the features of Yioop beyond search such as
+ Yioop Groups, Wikis, and Crawl Mixes without needing to install the
+ software yourself. The <a href="#search-interface">Search Interface</a>,
+ <a href="#userrolegroups">Managing Users, Roles, and Groups</a>,
+ <a href="#feeds-wikis">Feeds and Wikis</a>, and
+ <a href="#mixes">Mixing Crawl Indexes</a> sections below could serve
+ as a guide to testing the portion of the site general users have access to
+ on Yioop.com.
</p>
<h3 id="intro">Introduction</h3>
<p>The Yioop search engine is designed to allow users
@@ -84,14 +95,16 @@
control over the exact sites which are being indexed with Yioop, you have
much better control over the kinds of results that a search will return.
Yioop provides a traditional web interface to do queries, an rss api,
- and a function api. In this section we discuss some of the different
+ and a function api. It also supports many common features of a search
+ portal such as user discussion groups, blogs, wikis, and a news aggregator.
+ In this section we discuss some of the different
search engine technologies which exist today, how Yioop fits into this
eco-system, and when Yioop might be the right choice for your search
engine needs. In the remainder of this document after the introduction,
we discuss how to get and install Yioop; the files and folders used
- in Yioop; the various crawl, search and administration facilities in the
- Yioop; localization in the Yioop system; building a site using the Yioop
- framework; embedding Yioop in an existing web-site;
+ in Yioop; the various crawl, search, social portal, and administration
+ facilities in Yioop; localization in the Yioop system; building a site
+ using the Yioop framework; embedding Yioop in an existing web-site;
customizing Yioop; and the Yioop command-line tools.
</p>
<p>Since the mid-1990s a wide variety of search engine technologies
@@ -131,7 +144,8 @@
<a href="http://www.opensearch.org">Open Search RSS results</a> or
a JSON variant. These can be used to embed Yioop within your existing site.
If you want to create a new search engine site, Yioop offers a web-based,
- model-view-controller framework with a web-interface for localization
+ model-view-adapter (a variation on model-view-controller) framework
+ with a web-interface for localization
that can serve as the basis for your app.
</p>
<p>
@@ -329,7 +343,7 @@
content of each downloaded url. Amongst urls with the same hash only the
one that is linked to the most will be returned after grouping. Finally,
if a user wants to do more sophisticated post-processing such as clustering
- or computing page, Yioop supports a straightforward architecture
+ or computing page rank, Yioop supports a straightforward architecture
for indexing plugins.
</p>
<p>
@@ -430,13 +444,35 @@
<li>Using web archives, crawls can be mirrored amongst several machines
to speed-up serving search results. This can be further sped-up
by using memcache or filecache.</li>
- <li>Yioop comes with its own extendable model-view-controller
+ <li>Yioop comes with its own extendable model-view-adapter
framework that you can use directly to create new sites that use
Yioop search technology. This framework also comes with a GUI
which makes it easy to localize strings and static pages.</li>
+ <li>Yioop has been optimized to work well with smart phone web browsers
+ and with tablet devices.</li>
</ul>
</li>
- <li><b>Search and User Interface</b>
+ <li><b>Social and User Interface</b>
+ <ul>
+ <li>Yioop can be configured to allow or not to allow users to
+ register for accounts.
+ </li>
+ <li>If allowed, user accounts can create discussion groups, blogs, and
+ wikis.
+ </li>
+ <li>Users can share their own mixes of crawls that exist in the Yioop
+ system.</li>
+ <li>If user accounts are enabled, Yioop has a search tools page on which
+ people can suggest urls to crawl.</li>
+ <li>Yioop has three different captcha'ing mechanisms that can be
+ used in account registration and for suggest urls: a standard graphics
+ based captcha, a text-based captcha, and a hash-cash-like captcha.</li>
+ <li>Password authentication can be configured to either use a
+ standard password hash based system, or make use of Fiat-Shamir
+ zero-knowledge authentication.</li>
+ </ul>
+ </li>
+ <li><b>Search</b>
<ul>
<li>Yioop supports subsearches geared towards presenting certain
kinds of media such as images, video, and news. The list of video and
@@ -446,9 +482,9 @@
can be combined like a simplified relational algebra.</li>
<li>Yioop can be configured to display word suggestions as a user
types a query. It can also suggest spell corrections for mis-typed
- queries. This feature can be localized</li>
- <li>Yioop has been optimized to work well with smart phone web browsers
- and with tablet devices.</li>
+ queries. This feature can be localized.</li>
+ <li>Yioop can also make use of a thesaurus facility such as that provided
+ by WordNet to suggest related queries.</li>
<li>Yioop supports the ability to filter out urls from search
results after a crawl has been performed. It also has the ability
to edit summary information that will be displayed for urls.</li>
@@ -466,9 +502,9 @@
<li>Yioop is capable of indexing small sites to sites or
collections of sites containing low hundreds of millions
of documents.</li>
- <li>For indexes starting with v0.96, Yioop uses a hybrid inverted
+ <li>Yioop uses a hybrid inverted
index/suffix tree approach for word lookup to make multi-word
- queries faster.</li>
+ queries faster on disk-bound machines.</li>
<li>Yioop indexes are positional rather than
bag of word indexes, and a index compression scheme called Modified9
is used.</li>
@@ -483,11 +519,15 @@
also attempting to extract information from unknown filetypes.</li>
<li>Yioop has a simple page rule language for controlling what content
should be extracted from a page or record.</li>
+ <li>Yioop has two different kinds of text summarizers which can be used
+ to further affect what words are indexed: a basic web
+ page scraper, and a centroid algorithm summarizer. The latter can be
+ used to generate word clouds of crawled documents.</li>
<li>Indexing occurs as crawling happens, so when a crawl is stopped,
it is ready to be used to handle search queries immediately.</li>
<li>Yioop Indexes can be used to create classifiers which then
can be used in labeling and ranking future indexes.</li>
- <li>Yioop come with a stemmer for English and Italian, and a
+ <li>Yioop comes with a stemmer for English and Italian, and a
word segmenter for Chinese. It uses char-gramming for other languages.
Yioop has a simple architecture for adding stemmers for other languages.
</li>
@@ -518,6 +558,7 @@
attributes. It also supports X-Robots-Tag HTTP headers.</li>
<li>Yioop has its own DNS caching mechanism.</li>
<li>Yioop supports crawling TOR networks (.onion urls).</li>
+ <li>Yioop supports crawling through a list of proxy servers.</li>
<li>Yioop supports crawl quotas for web sites. I.e., one can control
the number of urls/hour downloaded from a site.</li>
<li>Yioop can detect website congestion and slow down crawling