
Solr Training in New York City — October 19-20

By sematext

[Note: since this workshop has already taken place, stay up to date with future workshops at our Solr Training page]


For those of you interested in comprehensive Solr training taught by a Sematext expert who knows Solr inside and out, we’re running a hands-on training workshop in New York City on October 19-20.

This two-day workshop will be taught by Rafal Kuc, Sematext engineer and author of several Solr books.

Target audience:

Developers and DevOps engineers who want to configure, tune, and manage Solr at scale.

What you’ll get out of it:

In two days of training, Rafal will:

  • bring Solr novices to the level where they are comfortable taking Solr to production
  • give experienced Solr users proven, practical advice for their most advanced and pressing issues, based on years of experience designing, tuning, and operating numerous Solr clusters

* See the Course Outline at the bottom of this post for details

When & Where:

  • Dates:        October 19 & 20 (Monday & Tuesday)
  • Time:         9:00 a.m. — 5:00 p.m.
  • Location:     New Horizons Computer Learning Center in Midtown Manhattan (map)
  • Cost:         $1,200 “early bird rate” (valid through September 1) and $1,500 afterward.  And…we’re also offering a 50% discount for the purchase of a 2nd seat!
  • Food/Drinks: Light breakfast and lunch will be provided


Attendees will go through several sequences of short lectures followed by interactive, group, hands-on exercises. There will be a Q&A session after each such lecture-practicum block.

Got any questions or suggestions for the course? Just drop us a line or hit us @sematext!

Lastly, if you can’t make it…watch this space or follow @sematext — we’ll be adding more Solr training workshops in the US, Europe and possibly other locations in the coming months.  We are also known worldwide for our Solr Consulting Services and Solr Production Support.

Hope to see you in the Big Apple in October!


Solr Training Workshop – Course Outline

  • Introduction to Solr
      1. What is Solr and its use cases
      2. Solr master-slave architecture
      3. SolrCloud architecture
      4. Why & When SolrCloud
      5. Solr master-slave vs. SolrCloud
      6. Starting Solr with schema-less configuration
      7. Indexing documents
      8. Retrieving documents using URI request
      9. Deleting documents
  • Indexing data
      1. Index structure configuration
      2. Defining custom field types
      3. Tokenizers
      4. Char filters
      5. Filters
      6. Dynamic fields
      7. Copy fields
      8. Running Solr with our own configuration
      9. XML data format explained
      10. JSON data format explained
      11. CSV data format explained
      12. Batch indexing
      13. Doc values
      14. Norms
      15. Term vectors
      16. Nested documents support
  • Searching
      1. Simple URI search
      2. Paging
      3. Sorting
      4. Filters
      5. Choosing display fields
      6. Pseudo fields
      7. Debug query
      8. Lucene query language
      9. Standard query parser
      10. Dismax query parser
      11. Extended dismax query parser
      12. Examples of other parsers
      13. Timing out searches
      14. Using cursor for deep paging
      15. Nested documents support
  • Data analysis
      1. Introduction to faceting
      2. Basic use cases
      3. Field faceting
      4. Field prefix faceting
      5. Sorting faceting results
      6. Limiting faceting
      7. Faceting execution control
      8. Range faceting
      9. Date faceting
      10. Interval faceting
      11. Hierarchical faceting
      12. JSON facets
      13. Facet functions
      14. Nested JSON facets
      15. Using stats component to generate statistics for field
      16. Using stats component with function queries
      17. Using stats component with faceting
      18. Using stats component to calculate distinct field values
  • Beyond Search – highlighting and more like this
      1. Introduction to highlighting
      2. Highlighting query hits
      3. Specifying fields to highlight
      4. Choosing highlighting tags
      5. Merging phrases
      6. Using FastVectorHighlighter
      7. Using PhraseHighlighter
      8. Finding similar documents
      9. Prerequisites for More Like This functionality
      10. Configuring More Like This functionality
  • Beyond Search – Spellchecking
      1. Spellchecker with its own index
      2. File based spellchecker
      3. Index based spellchecker
      4. Building spellchecker
      5. Including spell checking results with queries
      6. Querying spellchecker independently
      7. Maximum number of suggestions
      8. Collation
      9. Controlling collation
      10. Accuracy
      11. Extended results
      12. Performance considerations
  • Beyond Search – Documents grouping
      1. Grouping documents by field value
      2. Grouping documents by function value
      3. Grouping documents by query
      4. Paging in grouped results
      5. Controlling number of groups and documents count
      6. Sorting inside groups
      7. Documents grouping and faceting
      8. Using collapse query parser
      9. Using expand component
  • Function queries
      1. Using function queries
      2. Math function queries
      3. Term function queries
      4. Example use cases
      5. Boosting by using functions
      6. Sorting by function
      7. External file field type
      8. Using external file field type for boosting
  • Search under control
      1. Routing
      2. Index time routing
      3. Query time routing
      4. Basic syntax for local params
      5. Parameter dereferencing
      6. Using parameter dereferencing in handlers configuration
      7. Using faceting tagging
      8. Using faceting exclusions
  • Tuning Solr
      1. General solrconfig.xml sections
      2. Lucene directory configuration
      3. Codec factory settings
      4. Schema factory settings
      5. Indexing threads
      6. Indexing buffer size
      7. Merge policy
      8. Merge scheduler
      9. Auto commit tuning
      10. Transaction log configuration
      11. Slow query threshold
      12. Caches
      13. Replication
      14. Replication throttling
  • Scaling Solr
      1. Proper Solr master configuration
      2. Proper Solr slaves configuration
      3. Replication for master-slave
      4. Querying in master-slave deployment
      5. Multiple masters architecture
      6. Setting up Solr slaves for multiple masters
      7. Indexing data in multi-master environment
      8. Querying in multi-master environment
  • Scaling SolrCloud
      1. ZooKeeper role explained
      2. Uploading configuration to ZooKeeper
      3. Sharding
      4. Using collections API
      5. Cluster state explained
      6. Creating replicas
      7. Removing replicas
      8. Caches in SolrCloud
      9. Shard splitting
      10. Migrating data between collections
      11. Aliases
      12. Adding shards with implicit routing
  • Operations
      1. Running Solr as a service on Linux systems
      2. Running Solr as a service on MS Windows systems
      3. Backing up Solr master-slave
      4. Backing up SolrCloud
      5. Current cluster state view
      6. Monitoring using JMX
      7. Monitoring using SPM
      8. Configuration of Solr logging
  • Developer APIs
      1. Connecting to Solr using Java
      2. Connecting to SolrCloud using Java
      3. Using SolrJ to index data
      4. Using SolrJ to query Solr
      5. Connecting to Solr using Python
      6. Using pySolr to index data
      7. Using pySolr to query Solr
      8. Streaming aggregations explained
  • Ecosystem
      1. Using Flume with Solr
      2. Using Logstash with Solr
      3. Solritas as the out-of-the-box tool for data discovery
      4. Visualizing data using Banana

Yeah — you’ll learn a TON in just two days!