Advanced Solr Training

Learn Advanced Solr & SolrCloud Tuning and Scaling

In this Solr course you will learn about query routing, results re-ranking, term vectors, schema API, custom similarity, merge policy, codecs, language identification, data import handler, advanced Solr and SolrCloud tuning and scaling, shard splitting, data migrations, handling a large number of collections, authentication, Solr and HDFS, and more. Each section is followed by a lab with multiple hands-on exercises. See the course outline below for details.

Your Apache Solr instructor, Rafal Kuć, is an active Solr engineer and consultant with years of experience helping enterprise, medium, and small organizations. Rafal has worked with clients from 15+ industries and is the author of the Solr Cookbook series, which offers ready-to-use recipes for common problems encountered when working with Solr. Here are some problems Rafal solved for Sematext clients recently:

  • Designed and deployed Master-Slave and SolrCloud-based architectures for use cases ranging from small businesses to large-scale enterprises
  • Improved search relevancy to provide on-point results in various business use cases from e-commerce to health industries
  • Optimized clusters handling thousands of queries per second
  • Helped clients reduce operational costs by optimizing the amount of hardware needed as a result of SolrCloud tuning
  • Diagnosed and suggested solutions for various JVM-related issues, from garbage collector problems to reducing heap usage without increasing costs

What’s Included

  • 8 hours online training
  • A digital copy of the training material
  • Docker Compose files, configs, scripts, etc.
  • Certificate of Completion

Next Class: May 13-14 See Upcoming Classes

$800.00 -10% Early Bird Register Now

On-site training available upon request

Looking for extended, knowledge-based Solr training covering beginner to advanced levels? You’ve come to the right place.

Request Now

Why attend?

  • Small, interactive, instructor-led classes
  • Lots of hands-on exercises
  • Customized learning experience
  • More flexible – no need to travel
  • Get our Solr certification – Certificate of Completion included
 

Who should attend?

This Solr course is designed for technical attendees who are experienced with Solr and want to extend their Solr knowledge. Attendees should be able to index data to Solr, run queries, work with Solr analysis, use faceting and grouping, and know basic Solr configuration and tuning principles. Experience with Linux systems is not a must, but basic familiarity with running shell commands (e.g., using the curl command) will make the course more enjoyable. If you do not have prior Solr experience and would like to take advantage of advanced Solr training, please consider attending the Core Solr and Intermediate Solr classes first.

Prerequisites: Sematext’s Intermediate Solr or pre-existing knowledge of Solr concepts covered in Intermediate and Core Solr.

What attendees say

Thank you for a very informative training, such a wealth of information that I truly enjoyed learning about.

Vickie Jean Charles, Sr. System Engineer – Xactly Corporation

Course Outline

Search Under Control
  • Routing
  • Index time routing
  • Query time routing
  • Basic syntax for local params
  • Parameter dereferencing
  • Using parameter dereferencing in handlers configuration
  • Using filter tagging
  • Using faceting exclusions
  • Using pivot facets with stats component and query faceting
  • Advanced facets control
  • JSON facets control
  • Re-ranking query results
  • Timing out searches
  • Lab
    • Indexing documents with routing
    • Running queries with routing
    • Using parameter dereferencing to create your own parameters in a query
    • Tagging and excluding filters
Term Vectors
  • What are term vectors
  • Retrieving additional information from Solr
  • Understanding term vector component
  • Lab
    • Configure fields to use term vectors
    • Creating handlers that use term vectors
    • Retrieving term vectors when searching
    • Retrieving term positions and offsets
Configuring Solr Index
  • Schema API
  • Managed resources
  • Various relevancy algorithms
  • Lab
    • Using Schema API to retrieve information about collection structure
    • Adding new field type using Schema API
    • Adding new field using Schema API
    • Adding copy field using Schema API
    • Adding and removing synonyms using API
    • Adding and removing stopwords using API
Configuring Solr Internals
  • Lucene directory configuration
  • Schema factory settings
  • Codecs
  • Merge policy
  • Merge scheduler
  • Transaction log configuration
  • Config API
  • Lab
    • Configuring Solr to use managed schema
    • Creating new handler using API
    • Configuring merge policy for faster indexing
    • Configuring merge policy for fewer segments
Data Import Handler
  • Configuring data import handler
  • Using data import handler
  • Entity processors
  • Transformers
  • Lab
    • Importing data from SQL database
    • Partial data import from SQL database
    • Importing data from XML files using
Streaming Aggregations
  • Streaming expressions basics
  • Stream sources
  • Stream decorators
  • Scheduling streams
  • Streaming statistical language
  • SQL over MapReduce in SolrCloud
  • Export request handler
  • Lab
    • Searching using streaming aggregations
    • Merging two result streams
    • Retrieve unique documents based on a given field
    • Using scheduling streams
Expert Tuning Solr
  • Memory considerations
  • Indexing threads
  • Auto commit tuning
  • Caches
  • Replication throttling
  • Lab
    • Crld
    • User
API v2
  • What is API v2
  • Nested documents
Configuring Solr Internals
  • General solrconfig.xml section
  • Replication
  • Update request processors
  • Language detection
  • Configuring logging
  • Slow query log
  • Lab
    • Preparing master-slave replication
    • Language detection during document indexing
    • Configuring slow logging
Tuning Solr
  • Indexing threads
  • Indexing buffer size
  • Auto commit tuning
  • Caches
  • Replication throttling
  • Warming up
  • Lab
    • Configuring indexing buffer
    • Configuring auto commits
    • Configuring warming queries
Scaling Solr & SolrCloud
  • Solr master & slaves configuration
  • ZooKeeper role explained
  • Working with ZooKeeper
  • Sharding
  • Using Collections API
  • SolrCloud Replica Types
  • Caches in SolrCloud
  • Aliases
  • Lab
    • Creating collections
    • Creating aliases
    • Setting up caches for rapidly changing data
    • Setting up caches for high querying scenarios
Operations
  • Running Solr as a service on Linux and Windows systems
  • Backing up Solr master-slave
  • Backing up SolrCloud
  • Monitoring using JMX
  • Monitoring using SPM
  • Key Metrics to Monitor
  • Lab
    • Install Solr using install scripts
    • Create a backup of a master-slave environment

Main Topics

  • Solr Master-Slave Architecture
  • SolrCloud Architecture
  • Using Spatial Search & Heatmap Faceting
  • Grouping Documents using Various Criteria
  • Working with Relations
  • Using Function Queries for Relevancy
  • Using New v2 API
  • Understanding Apache Solr Configuration
  • Indexing Buffer, Automatic Commits, Caches & Warming Up
  • Sharding and Replication in SolrCloud
  • Working with Collections
  • Installing Solr
  • Key Apache Solr Metrics to Monitor

Upcoming Classes

Pick from a range of online classes, covering from beginner to advanced level. Choose the ones matching your needs.

Date | Class | Price | Registration
May 13-14, 2019 | Advanced Solr | $800 / person (only $720 / person before Apr 6th) | Register Now
Sept 23-24, 2019 | Advanced Solr | $800 / person (only $720 / person before 20 July) | Register Now
Dec 9-10, 2019 | Advanced Solr | $800 / person (only $720 / person before 30 Sep) | Register Now

Course key takeaways

After taking this course you will:

  • Understand the differences between Solr and SolrCloud and the use cases for each
  • Create Spatial Search and Function Queries
  • Perform Document Grouping
  • Configure and tune Query Spellchecking and Suggesters

Things to remember

Participants must use their own computer running macOS, Linux, or Windows, with a recent version of Java installed.

Participants should be comfortable using a terminal/command line.

Sematext provides:
  • A digital copy of the training material
  • A VM with all configs, scripts, exercises, etc.

Need On-Site or Remote Training

Get in touch with us
