Modifications to intro and features for V1.0, a=chris

Chris Pollett [2014-06-27]
Filename
en-US/pages/documentation.thtml
diff --git a/en-US/pages/documentation.thtml b/en-US/pages/documentation.thtml
index 7609026..9064e25 100755
--- a/en-US/pages/documentation.thtml
+++ b/en-US/pages/documentation.thtml
@@ -73,6 +73,17 @@
     there, you can check the Yioop <a href="#requirements">Requirements</a>
     section followed by the more general <a
     href="#installation">Installation and Configuration</a> instructions.
+    <a href="http://www.yioop.com/">Yioop.com</a>, the demo site of Yioop, allows
+    people to register accounts. This would give you an account with User
+    rather than Admin access, but it would allow you to experiment with
+    some of the features of Yioop beyond search such as
+    Yioop Groups, Wikis, and Crawl Mixes without needing to install the
+    software yourself. The <a href="#search-interface">Search Interface</a>,
+    <a href="#userrolegroups">Managing Users, Roles, and Groups</a>,
+    <a href="#feeds-wikis">Feeds and Wikis</a>, and
+    <a href="#mixes">Mixing Crawl Indexes</a> sections below could serve
+    as a guide to testing the portion of the site general users have access to
+    on Yioop.com.
     </p>
     <h3 id="intro">Introduction</h3>
     <p>The Yioop search engine is designed to allow users
@@ -84,14 +95,16 @@
     control over the exact sites which are being indexed with Yioop, you have
     much better control over the kinds of results that a search will return.
     Yioop provides a traditional web interface to do queries, an rss api,
-    and a function api. In this section we discuss some of the different
+    and a function api. It also supports many common features of a search
+    portal such as user discussion groups, blogs, wikis, and a news aggregator.
+    In this section we discuss some of the different
     search engine technologies which exist today, how Yioop fits into this
     eco-system, and when Yioop might be the right choice for your search
     engine needs. In the remainder of this document after the introduction,
     we discuss how to get and install Yioop; the files and folders used
-    in Yioop; the various crawl, search and administration facilities in the
-    Yioop; localization in the Yioop system; building a site using the Yioop
-    framework; embedding Yioop in an existing web-site;
+    in Yioop; the various crawl, search, social portal, and administration
+    facilities in Yioop; localization in the Yioop system; building a site
+    using the Yioop framework; embedding Yioop in an existing web-site;
     customizing Yioop; and the Yioop command-line tools.
     </p>
     <p>Since the mid-1990s a wide variety of search engine technologies
@@ -131,7 +144,8 @@
     <a href="http://www.opensearch.org">Open Search RSS results</a> or
     a JSON variant. These can be used to embed Yioop within your existing site.
     If you want to create a new search engine site, Yioop offers a web-based,
-    model-view-controller framework with a web-interface for localization
+    model-view-adapter (a variation on model-view-controller) framework
+    with a web-interface for localization
     that can serve as the basis for your app.
     </p>
     <p>
@@ -329,7 +343,7 @@
     content of each downloaded url. Amongst urls with the same hash only the
     one that is linked to the most will be returned after grouping. Finally,
     if a user wants to do more sophisticated post-processing such as clustering
-    or computing page, Yioop supports a straightforward architecture
+    or computing page rank, Yioop supports a straightforward architecture
     for indexing plugins.
     </p>
     <p>
@@ -430,13 +444,35 @@
     <li>Using web archives, crawls can be mirrored amongst several machines
     to speed-up serving search results. This can be further sped-up
     by using memcache or filecache.</li>
-    <li>Yioop comes with its own extendable model-view-controller
+    <li>Yioop comes with its own extendable model-view-adapter
     framework that you can use directly to create new sites that use
     Yioop search technology. This framework also comes with a GUI
     which makes it easy to localize strings and static pages.</li>
+    <li>Yioop has been optimized to work well with smart phone web browsers
+    and with tablet devices.</li>
     </ul>
     </li>
-    <li><b>Search and User Interface</b>
+    <li><b>Social and User Interface</b>
+    <ul>
+    <li>Yioop can be configured to allow or not to allow users to
+    register for accounts.
+    </li>
+    <li>If allowed, user accounts can create discussion groups, blogs, and
+    wikis.
+    </li>
+    <li>Users can share their own mixes of crawls that exist in the Yioop
+    system.</li>
+    <li>If user accounts are enabled, Yioop has a search tools page on which
+    people can suggest urls to crawl.</li>
+    <li>Yioop has three different captcha mechanisms that can be
+    used in account registration and for suggesting urls: a standard graphics
+    based captcha, a text-based captcha, and a hash-cash-like captcha.</li>
+    <li>Password authentication can be configured to either use a
+    standard password hash based system, or make use of Fiat-Shamir
+    zero-knowledge authentication.</li>
+    </ul>
+    </li>
+    <li><b>Search</b>
     <ul>
     <li>Yioop supports subsearches geared towards presenting certain
     kinds of media such as images, video, and news. The list of video and
@@ -446,9 +482,9 @@
     can be combined like a simplified relational algebra.</li>
     <li>Yioop can be configured to display word suggestions as a user
     types a query. It can also suggest spell corrections for mis-typed
-    queries. This feature can be localized</li>
-    <li>Yioop has been optimized to work well with smart phone web browsers
-    and with tablet devices.</li>
+    queries. This feature can be localized.</li>
+    <li>Yioop can also make use of a thesaurus facility such as that provided
+    by WordNet to suggest related queries.</li>
     <li>Yioop supports the ability to filter out urls from search
     results after a crawl has been performed. It also has the ability
     to edit summary information that will be displayed for urls.</li>
@@ -466,9 +502,9 @@
     <li>Yioop is capable of indexing small sites to sites or
     collections of sites containing low hundreds of millions
     of documents.</li>
-    <li>For indexes starting with v0.96, Yioop uses a hybrid inverted
+    <li>Yioop uses a hybrid inverted
     index/suffix tree approach for word lookup to make multi-word
-    queries faster.</li>
+    queries faster on disk bound machines.</li>
     <li>Yioop indexes are positional rather than
     bag of word indexes, and a index compression scheme called Modified9
     is used.</li>
@@ -483,11 +519,15 @@
     also attempting to extract information from unknown filetypes.</li>
     <li>Yioop has a simple page rule language for controlling what content
     should be extracted from a page or record.</li>
+    <li>Yioop has two different kinds of text summarizers which can be used
+    to further affect what words are indexed: a basic web
+    page scraper, and a centroid algorithm summarizer. The latter can be
+    used to generate word clouds of crawled documents.</li>
     <li>Indexing occurs as crawling happens, so when a crawl is stopped,
     it is ready to be used to handle search queries immediately.</li>
     <li>Yioop Indexes can be used to create classifiers which then
     can be used in labeling and ranking future indexes.</li>
-    <li>Yioop come with a stemmer for English and Italian, and a
+    <li>Yioop comes with a stemmer for English and Italian, and a
     word segmenter for Chinese. It uses char-gramming for other languages.
     Yioop has a simple architecture for adding stemmers for other languages.
     </li>
@@ -518,6 +558,7 @@
     attributes. It also supports X-Robots-Tag HTTP headers.</li>
     <li>Yioop has its own DNS caching mechanism.</li>
     <li>Yioop supports crawling TOR networks (.onion urls).</li>
+    <li>Yioop supports crawling through a list of proxy servers.</li>
     <li>Yioop supports crawl quotas for web sites. I.e., one can control
     the number of urls/hour downloaded from a site.</li>
     <li>Yioop can detect website congestion and slow down crawling