Intermediate Solr Training, San Francisco – Fall 2016

Oct 5, 2016 (Wed) – 9:00 am to 5:00 pm

Where:

  • MicroTek Labs
  • 655 Montgomery Street, Suite 400, San Francisco, CA 94111
Also available remotely through Virtual Training Room

Cost:

  • $700 early bird rate, $800 regular
  • 25% off regular price for the second seat

Overview

Rafal Kuć

Comprehensive 1-day Solr class taught by Rafal Kuć, a seasoned Solr instructor and consultant from Sematext and the author of several Solr and Elasticsearch books. After taking this course you will understand the differences between Solr and SolrCloud and the use cases for each, and you will be able to create Spatial Search and Function Queries, perform Document Grouping, and configure and tune Query Spellchecking and Suggesters. During the second half of the course you will learn how to tune and scale Solr and SolrCloud, and you will cover operational topics such as monitoring and backups. See the course outline below for more. Each section is followed by a lab with multiple hands-on exercises.


Who Should Attend

The course is designed for technical attendees with basic Solr experience. Attendees should be able to index data into Solr, run queries, work with Solr analysis, and use faceting. Experience with Linux systems is not required, but basic familiarity with running shell commands (e.g., using the curl command) will make the course more enjoyable. If you do not have prior Solr experience or have just started working with Solr, please consider attending the Core Solr class.

Prerequisites

Sematext’s Core Solr class, or pre-existing knowledge of the Solr concepts covered in Core Solr.


Things To Remember

  • Arrive at least 20 minutes early to class and on time after each break.
  • Participants must bring their own laptop running OS X, Linux, or Windows to the workshop. Laptops should have the latest version of Java installed. You should be comfortable using a terminal / command line.
  • If you have any dietary restrictions, be sure to let us know at least a week prior to the training.

What We Provide

For this training Sematext provides:
  • A digital copy of the training material
  • A VM with all configs, scripts, exercises, etc.
  • Breakfast, lunch, snacks, coffee, tea, juices, soft drinks, and water

Course Outline

Modules

  1. Solr Architecture
    • Solr master – slave architecture
    • SolrCloud architecture
    • Solr master – slave vs SolrCloud
  2. Spatial Search
    • Indexing spatial data
    • Spatial filters
    • Distance function queries
    • Bounding box field
    • Heatmap faceting
    • Lab
      • Configuring spatial field types
      • Indexing spatial data
      • Searching for documents within distance from a point
      • Sorting documents on the basis of a distance
      • Boosting documents on the basis of distance
  3. Document Grouping
    • Grouping documents by field value
    • Grouping documents by function value
    • Grouping documents by query
    • Paging in grouped results
    • Controlling the number of groups and documents
    • Sorting inside groups
    • Document grouping and faceting
    • Using collapse query parser
    • Using expand component
    • Lab
      • Displaying top matching document per group
      • Sorting grouping results
      • Controlling number of displayed documents and groups
      • Sorting inside groups
      • Using queries for creating document groups
      • Displaying number of calculated groups
      • Using faceting with grouping
      • Using collapse parser to execute efficient grouping
  4. Function Queries
    • Using function queries
    • Math function queries
    • Term function queries
    • Example use cases
    • Boosting by using functions
    • Sorting by function
    • External file field type
    • Using external file field type for boosting
    • Lab
      • Using efficient range filtering
      • Sorting on the basis of function value
      • Including function value in returned documents
      • Boosting using function value
  5. Spellchecking
    • Spellchecker with its own index
    • File based spellchecker
    • Index based spellchecker
    • Building spellchecker
    • Including spell checking results with queries
    • Querying spellchecker independently
    • Maximum number of suggestions
    • Collation
    • Controlling collation
    • Accuracy
    • Extended results
    • Lab
      • Working with Spellchecker configuration
      • Running queries with Spellchecker
      • Using various Spellchecker implementations
  6. Suggesters
    • What are suggesters
    • Suggester types
    • Configuring suggesters
    • Using different dictionary factories
    • Lab
      • Creating suggester configuration for a field
      • Building suggester dictionary automatically
      • Creating separate suggester configuration
      • Using created suggester
  7. Configuring Solr Internals
    • General solrconfig.xml section
    • Replication
    • Update request processors
    • Language detection
    • Configuring logging
    • Slow query log
    • Lab
      • Preparing master – slave replication
      • Language detection during document indexing
      • Configuring slow query logging
  8. Tuning Solr
    • Indexing threads
    • Indexing buffer size
    • Auto commit tuning
    • Caches
    • Replication throttling
    • Warming up
    • Lab
      • Configuring indexing buffer
      • Configuring indexing threads
      • Configuring auto commits
      • Configuring warming queries
  9. Scaling Solr & SolrCloud
    • Solr master – slave configuration
    • ZooKeeper role explained
    • Working with ZooKeeper
    • Sharding
    • Using Collections API
    • Caches in SolrCloud
    • Aliases
    • Lab
      • Creating collections
      • Creating aliases
      • Setting up caches for rapidly changing data
      • Setting up caches for high querying scenarios
  10. Operations
    • Running Solr as a service on Linux and Windows systems
    • Backing up Solr master – slave
    • Backing up SolrCloud
    • Current cluster state view
    • Authentication and authorization
    • Monitoring using JMX
    • Monitoring using SPM
    • Key metrics to monitor