

Virtual Course Outline – Advanced Solr


Advanced Solr, March 12-13, 2018

Days: March 12-13, 2018

Time: 9:00 AM to 1:00 PM EDT each day

Cost: $800 per participant

Overview

This comprehensive 2-day Solr online class (two 4-hour sessions) is taught by Rafal Kuć, a seasoned Solr instructor and consultant from Sematext, the author of several Solr and Elasticsearch books, and a frequent conference speaker. The training is held online from 9:00 am to 1:00 pm (ET).

In this course, you will learn about query routing, results re-ranking, term vectors, schema API, custom similarity, merge policy, codecs, language identification, data import handler, advanced Solr and SolrCloud tuning and scaling, shard splitting, data migrations, handling a large number of collections, authentication, Solr and HDFS, and more. Each section is followed by a lab with multiple hands-on exercises. See the course outline below for details.

Who Should Attend

The course is designed for technical attendees who are experienced with Solr and looking to extend their Solr knowledge. Attendees should be able to index data to Solr, run queries, work with Solr analysis, use faceting and grouping, and know basic Solr configuration and tuning principles. Experience with Linux systems is not a must, but a basic familiarity with running shell commands (e.g., using the curl command) will make the course more enjoyable. If you do not have prior Solr experience and would like to take advantage of advanced Solr training, please consider attending the Core Solr and Intermediate Solr classes first.

Prerequisites

Sematext’s Intermediate Solr class, or pre-existing knowledge of the Solr concepts covered in the Intermediate Solr and Core Solr classes.

Why Attend

The virtual Solr training gives you and your team the skills needed to use Solr's capabilities successfully, improving your workflow and increasing efficiency. Further benefits:

  • a customized learning experience
  • same high-quality instruction as our public or private Elasticsearch classes
  • more affordable than public training
  • more flexible – no need to travel

Things to Remember

For the online training, all participants must use their own computer running OSX, Linux, or Windows, with the latest version of Java installed. Participants should be comfortable using a terminal / command line. Sematext provides:

  • a digital copy of the training material
  • a VM with all configs, scripts, exercises, etc.

Course Outline

Modules

  1. Search Under Control
    • Routing
    • Index time routing
    • Query time routing
    • Basic syntax for local params
    • Parameter dereferencing
    • Using parameter dereferencing in handlers configuration
    • Using filter tagging
    • Using faceting exclusions
    • Using pivot facets with stats component and query faceting
    • Advanced facets control
    • JSON facets control
    • Re-ranking query results
    • Timing out searches
    • Lab
      • Indexing documents with routing
      • Running queries with routing
      • Using parameter dereferencing to create your own parameters in query
      • Tagging and excluding filters
  2. Term Vectors
    • What are term vectors
    • Retrieving additional information from Solr
    • Understanding term vector component
    • Lab
      • Configure fields to use term vectors
      • Creating handlers that use term vectors
      • Retrieving term vectors when searching
      • Retrieving term positions and offsets
  3. Configuring Solr Index
    • Schema API
    • Managed resources
    • Various relevancy algorithms
    • Lab
      • Using Schema API to retrieve information about collection structure
      • Adding new field type using Schema API
      • Adding new field using Schema API
      • Adding copy field using Schema API
      • Adding and removing synonyms using API
      • Adding and removing stopwords using API
  4. Configuring Solr Internals
    • Lucene directory configuration
    • Schema factory settings
    • Codecs
    • Merge policy
    • Merge scheduler
    • Transaction log configuration
    • Config API
    • Lab
      • Configuring Solr to use managed schema
      • Creating new handler using API
      • Configuring merge policy for faster indexing
      • Configuring merge policy for fewer segments
  5. Data Import Handler
    • Configuring data import handler
    • Using data import handler
    • Entity processors
    • Transformers
    • Lab
      • Importing data from SQL database
      • Partial data import from SQL database
      • Importing data from XML files using
  6. Streaming Aggregations
    • Streaming expressions basics
    • Stream sources
    • Stream decorators
    • Scheduling streams
    • Streaming statistical language
    • SQL over MapReduce in SolrCloud
    • Export request handler
    • Lab
      • Searching using streaming aggregations
      • Merging two result streams
      • Retrieving unique documents based on a given field
      • Using scheduling streams
  7. Expert Tuning Solr
    • Memory considerations
    • Indexing threads
    • Auto commit tuning
    • Caches
    • Replication throttling
    • Lab
      • Configuring indexing threads
      • Configuring auto commits
      • Throttle replication
  8. Scaling Solr
    • Proper Solr master configuration
    • Proper Solr slaves configuration
    • Multiple masters architecture
    • Setting up Solr slaves for multiple masters
    • Indexing data in multi-master environment
    • Querying in multi-master environment
  9. Expert SolrCloud
    • ZooKeeper role explained
    • Sharding and replication
    • Rule based shard placement
    • Cluster state explained
    • Caches in SolrCloud
    • Shard splitting
    • Migrating data between collections
    • Working with large number of collections
    • Lab
      • Creating collection matching environment needs
      • Adding and removing replicas
      • Moving shards around the cluster
      • Adding shards to collection
      • Migrating data between collections
  10. Other Solr Features
    • Enabling security in SolrCloud
    • Basic authorization
    • Other authorization options
    • Cross datacenter replication
    • Learning to Rank
    • Running Solr on HDFS
    • Lab
      • Securing Solr instance
      • Giving permissions to users
      • Removing permissions from users
      • Setting up cross datacenter replication
