Cloud 2 - Search
A study of the cloud computing technologies available for search. Introduces students to the technologies, their benefits, and how to leverage them. The class concludes with a two-day lab in which students apply what they have learned to real-world scenarios. The course uses the Java programming language.
Prerequisites:
Course requires the ability to use the Java programming language.
Course Duration:
2 days (16 hours) classroom time
Appropriate Roles:
Advanced Technical
Optional: Technical
Required Textbooks and Materials:
David Smiley, Solr 1.4 Enterprise Search Server
Otis Gospodnetic, Erik Hatcher, Lucene in Action (In Action Series)
http://www.amazon.com/Lucene-Action-Otis-Gospodnetic/dp/1932394281
Upon completion of this course the student will be able to:
Rapidly implement search requirements in an application using existing open source (Apache) technologies.
Set up Lucene and query using a variety of strategies.
Set up Solr and query using a variety of strategies.
Set up Carrot2 and cluster search results using a variety of strategies.
Describe fault tolerance and load balancing in Solr.
Syllabus:
Apache Lucene information retrieval software library
Features
Limitations
API
Example uses
Apache Solr enterprise search platform
Features
Limitations
Communicating with Solr
HTTP
API
Example uses
Schema Design
Fields
Text Analysis
Tokenization
Indexing Data
Direct Database
Solr Cell
Direct File
Basic Searching
Query types
Query syntax
Sorting and Filtering
Solr Plugin: Carrot2 search result clustering engine
Integrating with Solr
Clustering Results
Implementing cloud search technologies
Lucene
Deploying Lucene
Indexing with Lucene
Querying Lucene
Solr
Deploying Solr
Indexing with Solr
Querying Solr
Implementing cloud search technologies in a project
Integrating Solr into a project
Scaling Solr
Enhanced Searching
Agenda:
Course Overview
What is the Cloud?
Open source search technologies: Lucene, Solr, Nutch
[apache-lucene-searching-the-web-and-everything-else-jazoon0711-35pg.odp]
Learning Objectives
[Syllabus_Intro_Cloud_Computing.doc]
Java Coding Best Practices
[5RulesOfSoftwareDevelopment.ppt]
Setting up your Environment Lab
Course code and library distribution
[Intro_Cloud_Computing_Materials.tgz]
Configuring your environment
Follow readme in tgz archive
Verify proper configuration
Lucene Basics
General Lucene Functionality
[Lucene Basics.ppt]
[Lucene2.ppt]
Lucene Demo Lab 1
Install Lucene Demo/Tutorial Project Part 1
[Lab 1, Lucene Basics.doc]
[Lucene_demo.doc]
[Lucene-3.1.0.tar.gz]
Build a couple of sample indexes
Run a couple of sample queries
End of Lab Review
Lucene Basics 2
Query Parsing
[Lucene_Query_Parser_Syntax.doc]
Scoring
[Lucene_Scoring.doc]
Lucene Demo Lab 2
Extend Lucene Demo Lab 1 Part 2
[Lab 1, Lucene Basics.doc]
End of Lab Review
Lucene Wrap Up
Companies using Lucene
[PoweredBy - Lucene-java Wiki.doc]
Advanced Features
[Lucene3.ppt]
[Lucene4.ppt]
Search (Solr)
Solr Features
[Solr Features.doc]
Solr Overview
[SolrTutorial - Solr Wiki.doc]
Solr Tutorial Lab 3
[SolrTutorial - Solr Wiki.doc]
Indexing Data
Updating Data
Querying Data
Search UI
Text Analysis
End of Lab Review
Solr Basics
Solr Basics
[apache-solr-out-of-the-box.ppt]
Solr Plugins
[apache-solr-beyond-the-box.ppt]
Distributed Search
[DistributedSearch - Solr Wiki.doc]
Clustering Component and Carrot2
[ClusteringComponent - Solr Wiki.doc]
Solr Wrap Up
Companies using Solr
[Websites_Powered_By_Solr_Wiki.doc]
Search Wrap Up
Review / Test, Class Exercise
[Search Final Exam.doc]
End of Search Course Review