Add install guide for Ubuntu Linux, a=chris

Chris Pollett [2012-10-06 15:Oct:th]
Filename
en-US/pages/install.thtml
diff --git a/en-US/pages/install.thtml b/en-US/pages/install.thtml
index 3298c32..5f3fedb 100755
--- a/en-US/pages/install.thtml
+++ b/en-US/pages/install.thtml
@@ -2,6 +2,7 @@
     <ul>
         <li><a href="#xampp">XAMPP on Windows</a></li>
         <li><a href="#wamp">WAMP</a></li>
+        <li><a href="#linux">Ubuntu Linux</a></li>
         <li><a href="#cpanel">CPanel</a></li>
         <li><a href="#multiple">System with Multiple Queue Servers</a></li>
     </ul>
@@ -163,11 +164,16 @@ Click edit and add to the path variable:
 Exit control panel, then re-enter to double check that path really was added
  to end</li>
 <li> Next go to
-wamp =&gt; apache =&gt; restart service. In a browser go to Yioop =&gt;
+wamp =&gt; apache =&gt; restart service. In a browser, go to
+http://localhost/yioop/ . You should see a configure screen
+where you can enter C:/yioop_data for the Work Directory. It
+will ask you to log in again. Use the login root with no password.
+Now go to Yioop =&gt;
 Configure and input the following settings:
 <pre>
-Search Engine Work Directory: C:/yioop_data
+Search Engine Work Directory: C:/yioop_data
 Default Language: English
+(initially, only the above)
 Debug Display: (all checked)
 Search access: (all checked)
 Database Set-up: (left unchanged)
@@ -202,7 +208,105 @@ crawls list. Set it as the default crawl. You should be
 able to search using this index
 </li>
 </ol>
-
+<h2 id="linux">Ubuntu Linux</h2>
+<ol>
+<li>Set up PHP and Apache by running the following commands as needed
+(you might have already run some of them):
+<pre>
+sudo apt-get install curl
+sudo apt-get install apache2
+sudo apt-get install php5
+sudo apt-get install php5-cli
+sudo apt-get install php5-sqlite
+sudo apt-get install php5-curl
+sudo apt-get install php5-gd
+</pre>
+</li>
+<li>After this sequence, the files /etc/apache2/mods-enabled/php5.conf
+and /etc/apache2/mods-enabled/php5.load should exist and link
+to the corresponding files in /etc/apache2/mods-available. The configuration
+files for php are /etc/php5/apache2/php.ini (for the apache module)
+and /etc/php5/cli/php.ini (for the command-line interpreter).
+Make the following change in both files. Using your favorite
+text editor (vi, nano, gedit, etc.), change the line:
+<pre>
+post_max_size = 8M
+to
+post_max_size = 32M
+</pre>
+</li>
+<li>Looking in the folders /etc/php5/apache2/conf.d and
+/etc/php5/cli/conf.d you can see which extensions are being loaded
+by PHP. Look for the files curl.ini, gd.ini, and sqlite.ini to confirm
+that these extensions will be loaded.</li>
+<li>Restart the web server after making your changes:
+<pre>
+sudo apachectl stop
+sudo apachectl start
+</pre>
+</li>
+<li>We are going to configure Yioop so that fetchers and queue_servers
+can be started from the GUI. On a Linux machine, Yioop makes
+use of the Unix "at" command. Under Ubuntu, "at" is typically enabled;
+however, you might need to give your web server access to schedule
+"at" jobs. To do this, check that the web server user (www-data)
+is not listed in the file /etc/at.deny .</li>
+<li>The DocumentRoot for web sites (virtual hosts) served by an Ubuntu Linux
+machine is typically specified by files in /etc/apache2/sites-enabled.
+In this example, it was given in a file 000-default and specified to
+be /var/www/.</li>
+<li><a href="http://www.seekquarry.com/viewgit/?a=summary&amp;p=yioop"
+>Download Yioop</a>, unpack it into /var/www and use
+mv to rename the Yioop folder to yioop.</li>
+<li>Make a folder for your crawl data:
+<pre>
+sudo mkdir /var/www/yioop_data
+sudo chmod 777 /var/www/yioop_data
+</pre>
+</li>
+<li>In a browser, go to the page http://localhost/yioop/ .
+You should see a configure screen
+where you can enter /var/www/yioop_data for the Work Directory. It
+will ask you to log in again. Use the login root with no password.
+Now go to Yioop =&gt;
+Configure and input the following settings:
+<pre>
+Search Engine Work Directory: /var/www/yioop_data
+Default Language: English
+Debug Display: (all checked)
+Search access: (all checked)
+Database Set-up: (left unchanged)
+Search Auxiliary Links Displayed: (all checked)
+Name Server Set-up
+Server Key: 0
+Name Server Url: http://localhost/yioop/
+Crawl Robot Name: TestBot
+Robot Instance: A
+Robot Description: TestBot should be disallowed from everywhere because
+the installer of Yioop did not customize this to his system.
+Please block this ip.
+</pre>
+</li>
+<li>Go to Manage Machines. Add a single machine under Add Machine using the
+settings:
+<pre>
+Machine Name: Local
+Machine Url: http://localhost/yioop/
+Is Mirror: (uncheck)
+Has Queue Server: (check)
+Number of Fetchers: 1
+Submit
+</pre>
+</li>
+<li>Under Machine Information turn the Queue Server and Fetcher On.</li>
+<li>Go to Manage Crawls. Click on the options to set up where you want to crawl.
+Type in a name for the crawl and click start crawl.</li>
+<li>Let it crawl for a while, until you see Total URLs Seen &gt; 1.</li>
+<li>Then click Stop Crawl and wait for the crawl to appear in the previous
+crawls list. Set it as the default crawl. You should be
+able to search using this index.
+</li>
+</ol>
 <h2 id="cpanel">CPanel</h2>
 <p>
 Generally, it is not practical to do your crawling in a cPanel hosted website.
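
Review note: the php.ini edit and the /etc/at.deny check described in the Ubuntu steps of this diff could be scripted rather than done by hand. The following is only a minimal sketch under stated assumptions. The helper names `set_ini_value` and `user_in_at_deny` are hypothetical and not part of Yioop. Both work on file contents passed in as strings, so the caller decides how to read and write /etc/php5/apache2/php.ini, /etc/php5/cli/php.ini, and /etc/at.deny. The sketch assumes at.deny lists one user name per line.

```python
import re


def set_ini_value(ini_text: str, key: str, value: str) -> str:
    """Return ini_text with an active `key = value` line.

    Existing uncommented assignments of `key` are rewritten; if there are
    none, a new line is appended. Lines starting with ';' are left alone.
    (Hypothetical helper for illustration, not a Yioop or PHP API.)
    """
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{key} = {value}"
    # A callable replacement avoids any backslash processing of `value`.
    updated, count = pattern.subn(lambda _match: new_line, ini_text)
    if count == 0:
        separator = "" if (not ini_text or ini_text.endswith("\n")) else "\n"
        updated = ini_text + separator + new_line + "\n"
    return updated


def user_in_at_deny(at_deny_text: str, user: str = "www-data") -> bool:
    """Report whether `user` appears on its own line in at.deny contents.

    Assumes one user name per line, as in the guide's description.
    """
    return any(line.strip() == user for line in at_deny_text.splitlines())


if __name__ == "__main__":
    sample_ini = "; sample settings\npost_max_size = 8M\n"
    print(set_ini_value(sample_ini, "post_max_size", "32M"))
    print(user_in_at_deny("daemon\nwww-data\n"))
```

In use, one would read each php.ini, pass its text through `set_ini_value(text, "post_max_size", "32M")`, write the result back with sudo rights, and then restart Apache as in the guide. If `user_in_at_deny` returns True for the contents of /etc/at.deny, the www-data line would need to be removed before Yioop can schedule "at" jobs.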