Cloud 2 - Search

A study of the cloud computing technologies available for search. The course introduces students to these technologies, their benefits, and how to leverage them. The class concludes with a two-day lab in which students apply that knowledge to real scenarios. The course uses the Java programming language.



Prerequisites:

Course requires the ability to use the Java programming language.


Course Duration:

2 days (16 hours) classroom time


Appropriate Roles:

Advanced Technical

Optional: Technical


Required Textbooks and Materials:

David Smiley, Solr 1.4 Enterprise Search Server

http://www.amazon.com/Solr-1-4-Enterprise-Search-Server/dp/1847195881/ref=sr_1_1?s=books&ie=UTF8&qid=1300202076&sr=1-1


Otis Gospodnetic, Erik Hatcher, Lucene in Action (In Action Series)

http://www.amazon.com/Lucene-Action-Otis-Gospodnetic/dp/1932394281


     



Upon completion of this course the student will be able to:



Syllabus:

  1. Apache Lucene information retrieval software library

    1. Features

    2. Limitations

    3. API

    4. Example uses

  2. Apache Solr enterprise search platform

    1. Features

    2. Limitations

    3. Communicating with Solr

      1. HTTP

      2. API

    4. Example uses

    5. Schema Design

      1. Fields

    6. Text Analysis

      1. Tokenization

    7. Indexing Data

      1. Direct Database

      2. Solr Cell

      3. Direct File

    8. Basic Searching

      1. Query types

      2. Query syntax

    9. Sorting and Filtering

  3. Solr Plugin: Carrot2 search result clustering engine

    1. Integrating with Solr

    2. Clustering Results

  4. Implementing cloud search technologies

    1. Lucene

      1. Deploying Lucene

      2. Indexing with Lucene

      3. Querying Lucene

    2. Solr

      1. Deploying Solr

      2. Indexing with Solr

      3. Querying Solr

  5. Implementing cloud search technologies in a project

    1. Integrating Solr into a project

    2. Scaling Solr

    3. Enhanced Searching
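
As an illustration of the "Communicating with Solr" (HTTP), "Basic Searching", and "Sorting and Filtering" topics above, the following is a minimal sketch of how a Solr select request URL can be assembled in plain Java. It is not taken from the course materials. The request parameters `q`, `fq`, `sort`, and `rows` are standard Solr query parameters; the base URL `http://localhost:8983/solr`, the field names `title`, `inStock`, and `price`, and the class name `SolrSelectUrl` are assumptions chosen for the example. It uses `URLEncoder.encode(String, Charset)`, which requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a Solr select URL from ordered request parameters.
final class SolrSelectUrl {

    // Joins URL-encoded key=value pairs onto <baseUrl>/select?
    static String build(String baseUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return baseUrl + "/select?" + query;
    }

    // Form-style encoding: spaces become '+', reserved characters become %XX.
    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "title:lucene");   // main query (assumed field name)
        params.put("fq", "inStock:true");  // filter query (assumed field name)
        params.put("sort", "price asc");   // sort order (assumed field name)
        params.put("rows", "10");          // maximum number of results
        System.out.println(build("http://localhost:8983/solr", params));
    }
}
```

The resulting URL can be pasted into a browser or fetched with any HTTP client. Keeping the parameters in a `LinkedHashMap` preserves their order, which makes the generated URL easier to read and compare during a lab.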




Agenda:

  1. Course Overview

    1. What is the Cloud

    2. Open source search technologies: Lucene, Solr, Nutch

      1. [apache-lucene-searching-the-web-and-everything-else-jazoon0711-35pg.odp]

    3. Learning Objectives

      1. [Syllabus_Intro_Cloud_Computing.doc]

    4. Java Coding Best Practices

      1. [5RulesOfSoftwareDevelopment.ppt]

  2. Setting up your Environment Lab

    1. Course code and library distribution

      1. [Intro_Cloud_Computing_Materials.tgz]

    2. Configuring your environment

      1. Follow readme in tgz archive

    3. Verify proper configuration

  3. Lucene Basics

    1. General Lucene Functionality

      1. [Lucene Basics.ppt]

      2. [Lucene2.ppt]

  4. Lucene Demo Lab 1

    1. Install Lucene Demo/Tutorial Project Part 1

      1. [Lab 1, Lucene Basics.doc]

      2. [Lucene_demo.doc]

      3. [Lucene-3.1.0.tar.gz]

    2. Run a couple of sample indexes

    3. Run a couple of sample queries

    4. End of Lab Review

  5. Lucene Basics 2

    1. Query Parsing

      1. [Lucene_Query_Parser_Syntax.doc]

    2. Scoring

      1. [Lucene_Scoring.doc]

  6. Lucene Demo Lab 2

    1. Extend Lucene Demo Lab 1 (Part 2)

      1. [Lab 1, Lucene Basics.doc]

    2. End of Lab Review

  7. Lucene Wrap Up

    1. Companies using Lucene

      1. [PoweredBy - Lucene-java Wiki.doc]

    2. Advanced Features

      1. [Lucene3.ppt]

      2. [Lucene4.ppt]

  8. Search (Solr)

    1. Solr Features

      1. [Solr Features.doc]

    2. Solr Overview

      1. [SolrTutorial - Solr Wiki.doc]

  9. Solr Tutorial Lab 3

    1. [SolrTutorial - Solr Wiki.doc]

      1. Indexing Data

      2. Updating Data

      3. Querying Data

      4. Search UI

      5. Text Analysis

    2. End of Lab Review

  10. Solr Basics

    1. Solr Basics

      1. [apache-solr-out-of-the-box.ppt]

    2. Solr Plugins

      1. [apache-solr-beyond-the-box.ppt]

    3. Distributed Search

      1. [DistributedSearch - Solr Wiki.doc]

    4. Clustering Component and Carrot2

      1. [ClusteringComponent - Solr Wiki.doc]

  11. Solr Wrap Up

    1. Companies using Solr

      1. [Websites_Powered_By_Solr_Wiki.doc]

  12. Search Wrap Up

    1. Review / Test, Class Exercise

      1. [Search Final Exam.doc]

    2. End of Search Course Review
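
As background for the Lucene labs in the agenda above (indexing sample data, then querying it), the following is a toy sketch of the inverted-index idea behind a search library. It does not use the Lucene API and is not part of the course materials; the class `ToyInvertedIndex`, its methods, and the lowercase-and-split tokenizer are all assumptions made for illustration. Lucene's real analyzers and index format are far more capable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index: maps each token to the sorted set of document ids containing it.
final class ToyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Simplistic analysis step: lowercase, then split on anything that is not a-z or 0-9.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexing: record the document id under every token of the text.
    void add(int docId, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Single-term query: the term goes through the same analysis as indexed text.
    Set<Integer> search(String term) {
        List<String> tokens = tokenize(term);
        if (tokens.isEmpty()) {
            return Collections.emptySet();
        }
        return postings.getOrDefault(tokens.get(0), Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "Lucene is a search library");
        index.add(2, "Solr builds on Lucene");
        System.out.println(index.search("LUCENE"));
        System.out.println(index.search("solr"));
    }
}
```

Running the same tokenizer over both documents and queries is the key design point: it is why a query for "LUCENE" matches text indexed as "Lucene".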