Core Solr Training, San Francisco – Fall 2016

Oct 4, 2016 (Tue) – 9:00 am to 5:00 pm

Completed

Overview

Rafal Kuć

Comprehensive 1-day Solr class taught by Rafal Kuć, a seasoned Solr instructor and consultant from Sematext and the author of several Solr and Elasticsearch books. After taking this course you will be able to configure and deploy Solr, index documents, and run a wide range of queries, including queries with facets and aggregations. You will learn about the inverted index, the Solr schema, analysis, tokens, token filters, highlighting, query parsing, and more; see the course outline below for details. Each section is followed by a lab with multiple hands-on exercises.


Who Should Attend

The course is designed for technical attendees who need to configure, tune, and manage Solr and have at most basic Solr knowledge. No prior Solr experience is required. Experience with Linux systems is not a must, but basic familiarity with running shell commands (e.g., using the curl command) will make the course more enjoyable.

Prerequisites

None, just a desire to learn!

Things To Remember

  • Arrive at least 20 minutes before class starts, and return on time after each break.
  • Participants must bring their own laptop running OS X, Linux, or Windows. Laptops should have the latest version of Java installed. You should be comfortable using a terminal / command line.
  • If you have any dietary restrictions, be sure to let us know at least a week prior to the training.

What We Provide

For this training Sematext provides:
  • A digital copy of the training material
  • A VM with all configs, scripts, exercises, etc.
  • Breakfast, lunch, snacks, coffee, tea, juices, soft drinks, and water

Course Outline

Modules

  1. Getting Started with Solr
    • What is Apache Solr
    • General principles
    • Architecture types
  2. Introduction to Solr
    • Starting Solr with schema-less configuration
    • Inverted index
    • Relevancy basics
    • Indexing documents
    • Retrieving documents by identifier
    • Searching for documents
    • Deleting documents
    • Lab
      • Using start scripts
      • Working with configuration
      • CRUD operations
  3. Indexing Data
    • Data structure
    • Index structure configuration
    • Defining custom field types
    • String vs. text-based types
    • Basic field usage examples
    • Tokenizers
    • Char filters
    • Filters
    • Language-oriented data
    • Dynamic fields
    • Copy fields
    • Running Solr with our own configuration
    • XML data format explained
    • JSON data format explained
    • CSV data format explained
    • Batch indexing
    • Doc values
    • Additional field properties
    • Nested documents support
    • Lab
      • Creating fields and types structure
      • Using copy fields
      • Using Solr language analysis capabilities
      • Indexing data in various formats
  4. Searching
    • Simple URI search
    • Paging
    • Sorting
    • Filters
    • Choosing display fields
    • Pseudo fields
    • Debug query
    • Lucene query language
    • Standard query parser
    • Dismax query parser
    • Extended dismax query parser
    • XML query parser
    • Examples of other parsers
    • Timing out searches
    • Using a cursor for deep paging
    • Nested documents support
    • Dealing with relevancy
    • Lab
      • Paging
      • Sorting
      • Term searching
      • Using various query parsers
      • Using a cursor
  5. Data Analysis
    • Introduction to faceting
    • Basic use cases
    • Field faceting
    • Field prefix faceting
    • Sorting faceting results
    • Limiting faceting
    • Faceting execution control
    • Range faceting
    • Query faceting
    • Hierarchical faceting
    • Interval faceting
    • Lab
      • Building a tag cloud using field faceting
      • Using prefixes to build a simple autocomplete feature
      • Sorting faceting results
      • Working with numerical data and faceting
      • Using hierarchical faceting to get more insight into the data
      • Interval faceting
  6. JSON Facets
    • Introduction to JSON request API
    • Facet functions
    • Nested JSON facets
    • Execution type
    • Lab
      • Searching using JSON request API
      • Finding top tags
      • Retrieving statistics using range faceting
      • Using terms JSON facets to retrieve term counts
      • Using functions with JSON facets
      • Nesting JSON facets
  7. Highlighting and More Like This
    • Introduction to highlighting
    • Highlighting query hits
    • Specifying fields to highlight
    • Choosing highlighting tags
    • Using FastVectorHighlighter
    • Using PostingsHighlighter
    • Finding similar documents
    • Prerequisites for More Like This functionality
    • Configuring More Like This functionality
    • Lab
      • Highlighting field matches
      • Using custom tags to mark highlighted fragments
      • Using various parsers with highlighting
      • Using a different query for highlighting and matching
      • Finding documents similar to a given one
      • Using term frequency and length to find similar documents