
Solr V2 API – Quick Look

May 10, 2017


Last updated on Jan 11, 2018

We are all used to the Solr API that has been present in Solr from its beginnings. We send the data using the HTTP protocol, we include all parameters in the URL itself, and we are bound to that. Some people loved this, some not so much. Starting with Solr 6.5 and continuing with Solr 7, we got our hands on a new, self-documenting API called v2. Let’s look at this new API, how to use it, and how it differs from the old-fashioned Solr API.

This post has been updated to match Solr 7.



Introducing the New Solr API

Let’s just immediately start working with the new API.  It’s probably the best way to learn about it.  Here’s the most basic request we can execute against the new Solr API:

$ curl http://localhost:8983/v2

The first thing you’ll notice is that the new API is not available in the usual Solr context – there is no /solr in the URL. Instead, we talk to it using the /v2 URI path or just /api. This lets a single Solr instance expose two separate sets of APIs and leaves room for new APIs introduced in the future. The response to the above call looks as follows:

{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "documentation":"https://lucene.apache.org/solr/guide/v2-api.html",
  "description":"V2 API root path"}

As we can see, the new API returns quite different information – we have the same response header, plus information on where we can find the documentation for the API or the given endpoint. This is the first major difference – the new v2 API is self-descriptive.
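
Since the root response carries a documentation field, a script can pull the link out without a full JSON parser. The following is only a minimal sketch, not an official client: the doc_url helper name is made up for this example, the sample response is copied from the output above so the snippet runs without a live Solr node, and the sed pattern assumes the compact "key":"value" formatting shown in this post.

```shell
# Sample response copied from the /v2 call above, so no Solr node is needed.
response='{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "documentation":"https://lucene.apache.org/solr/guide/v2-api.html",
  "description":"V2 API root path"}'

# Hypothetical helper: print the value of the first "documentation" field
# read from standard input. Assumes no whitespace around the colon.
doc_url() {
  sed -n 's/.*"documentation":"\([^"]*\)".*/\1/p' | head -n 1
}

printf '%s\n' "$response" | doc_url

# Against a live node (assumed to listen on localhost:8983):
#   curl -s http://localhost:8983/v2 | doc_url
```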

You can expect some of the new and old API calls to be very similar. For example, let’s try getting the list of collections that are present in our SolrCloud cluster. With the old API we could run the following command:

$ curl 'http://localhost:8983/solr/admin/collections?action=LIST'

The response contains the list of collections that are present and looks as follows:

{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "collections":["test"]}

The new way of retrieving this information would look as follows:

$ curl 'http://localhost:8983/v2/c'

or

$ curl 'http://localhost:8983/v2/collections'

As you can see, we can use either the full path or its abbreviated form, which is quite nice if you write requests manually.
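
If you keep the long form in scripts for readability but prefer the short one at the prompt, the mapping between the two is a simple prefix rewrite. Here is a small sketch of that idea; the abbreviate_v2 function is a hypothetical helper written for this post, not something that ships with Solr, and it only touches the /v2/collections prefix.

```shell
# Hypothetical helper: rewrite the long /v2/collections prefix to /v2/c.
# Takes a URL or path as $1 and prints the abbreviated form.
abbreviate_v2() {
  printf '%s\n' "$1" | sed \
    -e 's#/v2/collections/#/v2/c/#' \
    -e 's#/v2/collections?#/v2/c?#' \
    -e 's#/v2/collections$#/v2/c#'
}

abbreviate_v2 'http://localhost:8983/v2/collections'
abbreviate_v2 'http://localhost:8983/v2/collections/test/shards'

# Against a live node (assumed to listen on localhost:8983):
#   curl "$(abbreviate_v2 'http://localhost:8983/v2/collections')"
```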


So, why is that different?

First things first – as already mentioned, the new API is self-documenting. That means we can ask the API itself which operations and options are available. By appending _introspect to any v2 API path, we get the list of possible operations for that endpoint. For example:

$ curl 'http://localhost:8983/v2/collections/_introspect?indent=true'

The response returned by Solr is rather large, so we’ll just show a portion of that, but you can see that the API contains not only the response with the data we are looking for but also some additional descriptions which make the API self-documenting:

{
  "responseHeader":{
    "status":0,
    "QTime":2},
  "spec":[{
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api6",
      "description":"Deletes a collection.",
      "methods":["DELETE"],
      "url":{"paths":["/collections/{collection}",
          "/c/{collection}"]}},
    {
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1",
      "description":"Create collections and collection aliases, backup or restore collections, and delete collections and aliases.",
      "methods":["POST"],
      "url":{"paths":["/collections",
          "/c"]},
.
.
.

As you can already tell, the new v2 API is more modern and most of the parameters are sent in the request body, instead of the URI. Once the new v2 API covers all the functionality of the old API, SolrJ and Solr admin will start using the new API and after that, it is expected that the old API will be deprecated and then removed. Because of that, it might be a good idea to start getting used to the new API right away, so you have an easier learning curve and faster adoption when you finally decide to move to the new way of talking to Solr.

V2 Solr API Capabilities

The response returned by the commands that we’ve seen above is large, so I encourage you to check the response yourself. What I would like to do is provide you with a brief description of what can be done using the v2 API:

  • Creating, deleting and managing collections
  • Creating aliases, backing up and restoring collections
  • Sending data
  • Updating collection configuration
  • Managing schema and managed resources
  • Using request handlers – for example running search requests
  • Adding and removing replicas
  • Managing cores
  • Performing overseer operations
  • Managing node roles
  • Setting cluster properties
  • Uploading and downloading blobs and metadata

As you can see, we can already do lots of things with the new API, and because the API is self-documenting we can quickly see how to work with it without searching for the documentation. For example, if we wanted to see what we can do with shards, we could run a command like this (we’ll use a collection called test that was created using the _default configuration that is bundled with Solr 7):

$ curl 'localhost:8983/v2/c/test/shards/_introspect?indent=true'

The response shows us what we can do with /shards API:

{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "spec":[{
      "documentation":"https://lucene.apache.org/solr/guide/collections-api.html#deleteshard",
      "description":"Deletes a shard by unloading all replicas of the shard, removing it from clusterstate.json, and by default deleting the instanceDir and dataDir. Only inactive shards or those which have no range for custom sharding will be deleted.",
      "methods":["DELETE"],
      "url":{
        "paths":["/collections/{collection}/shards/{shard}",
          "/c/{collection}/shards/{shard}"],
        "params":{
          "deleteInstanceDir":{
            "type":"boolean",
            "description":"By default Solr will delete the entire instanceDir of each replica that is deleted. Set this to false to prevent the instance directory from being deleted."},
          "deleteDataDir":{
            "type":"boolean",
            "description":"By default Solr will delete the dataDir of each replica that is deleted. Set this to false to prevent the data directory from being deleted."},
          "async":{
            "type":"string",
            "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined. This command can be long-running, so running it asynchronously is recommended."}}}},
    {
      "documentation":"https://lucene.apache.org/solr/guide/collections-api.html",
      "description":"Allows you to create a shard, split an existing shard or add a new replica.",
      "methods":["POST"],
      "url":{"paths":["/collections/{collection}/shards",
          "/c/{collection}/shards"]},
      "commands":{
        "split":{
          "type":"object",
          "documentation":"https://lucene.apache.org/solr/guide/collections-api.html#splitshard",
          "description":"Splits an existing shard into two or more new shards. During this action, the existing shard will continue to contain the original data, but new data will be routed to the new shards once the split is complete. New shards will have as many replicas as the existing shards. A soft commit will be done automatically. An explicit commit request is not required because the index is automatically saved to disk during the split operation. New shards will use the original shard name as the basis for their names, adding an underscore and a number to differentiate the new shard. For example, 'shard1' would become 'shard1_0' and 'shard1_1'. Note that this operation can take a long time to complete.",
          "properties":{
            "shard":{
              "type":"string",
              "description":"The name of the shard to be split."},
            "ranges":{
              "description":"A comma-separated list of hexadecimal hash ranges that will be used to split the shard into new shards containing each defined range, e.g. ranges=0-1f4,1f5-3e8,3e9-5dc. This is the only option that allows splitting a single shard into more than 2 additional shards. If neither this parameter nor splitKey are defined, the shard will be split into two equal new shards.",
              "type":"string"},
            "splitKey":{
              "description":"A route key to use for splitting the index. If this is defined, the shard parameter is not required because the route key will identify the correct shard. A route key that spans more than a single shard is not supported. If neither this parameter nor ranges are defined, the shard will be split into two equal new shards.",
              "type":"string"},
            "coreProperties":{
              "type":"object",
              "documentation":"https://lucene.apache.org/solr/guide/defining-core-properties.html",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set, the node name, the data directory, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined. This command can be long-running, so running it asynchronously is recommended."},
            "waitForFinalState":{
              "type":"boolean",
              "description":"If true then request will complete only when all affected replicas become active.",
              "default":false}}},
        "create":{
          "type":"object",
          "properties":{
            "nodeSet":{
              "description":"Defines nodes to spread the new collection across. If not provided, the collection will be spread across all live Solr nodes. The names to use are the 'node_name', which can be found by a request to the cluster/nodes endpoint.",
              "type":"array",
              "items":{"type":"string"}},
            "shard":{
              "description":"The name of the shard to be created.",
              "type":"string"},
            "coreProperties":{
              "type":"object",
              "documentation":"https://lucene.apache.org/solr/guide/defining-core-properties.html",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set, the node name, the data directory, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined."},
            "waitForFinalState":{
              "type":"boolean",
              "description":"If true then request will complete only when all affected replicas become active.",
              "default":false}},
          "required":["shard"]},
        "add-replica":{
          "documentation":"https://lucene.apache.org/solr/guide/collections-api.html#addreplica",
          "description":"",
          "type":"object",
          "properties":{
            "shard":{
              "type":"string",
              "description":"The name of the shard in which this replica should be created. If this parameter is not specified, then '_route_' must be defined."},
            "_route_":{
              "type":"string",
              "description":"If the exact shard name is not known, users may pass the _route_ value and the system would identify the name of the shard. Ignored if the shard param is also specified. If the 'shard' parameter is also defined, this parameter will be ignored."},
            "node":{
              "type":"string",
              "description":"The name of the node where the replica should be created."},
            "instanceDir":{
              "type":"string",
              "description":"An optional custom instanceDir for this replica."},
            "dataDir":{
              "type":"string",
              "description":"An optional custom directory used to store index data for this replica."},
            "coreProperties":{
              "type":"object",
              "documentation":"https://lucene.apache.org/solr/guide/defining-core-properties.html",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set and the node name, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined."},
            "type":{
              "type":"string",
              "enum":["NRT",
                "TLOG",
                "PULL"],
              "description":"The type of replica to add. NRT (default), TLOG or PULL"},
            "waitForFinalState":{
              "type":"boolean",
              "description":"If true then request will complete only when all affected replicas become active.",
              "default":false}},
          "required":["shard"]}}},
    {
      "documentation":"https://lucene.apache.org/solr/guide/collections-api.html#list",
      "description":"Lists all collections, with details on shards and replicas in each collection.",
      "methods":["GET"],
      "url":{"paths":["/collections/{collection}",
          "/c/{collection}",
          "/collections/{collection}/shards",
          "/c/{collection}/shards",
          "/collections/{collection}/shards/{shard}",
          "/c/{collection}/shards/{shard}",
          "/collections/{collection}/shards/{shard}/{replica}",
          "/c/{collection}/shards/{shard}/{replica}"]}}],
  "WARNING":"This response format is experimental.  It is likely to change in the future.",
  "availableSubPaths":{
    "/c/test/shards/{shard}/{replica}":["DELETE",
      "GET"],
    "/c/test/shards/{shard}":["DELETE",
      "POST",
      "GET"]}}

As you can see, the API provides all the information about itself that we need – the HTTP verbs that we can use, the parameters that can be present, and their descriptions, so that we know what each parameter is all about. We can also get information about a given command and/or HTTP verb, for example:

$ curl 'http://localhost:8983/v2/c/test/shards/shard2/_introspect?method=DELETE&indent=true'
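
When you introspect many endpoints, it can help to build these URLs in one place. The sketch below assumes Solr listens on localhost:8983 as in the rest of this post; the SOLR_BASE variable and the introspect_url function are names invented for this example. It only prints the URL, so it runs without a Solr node.

```shell
# Base address of the Solr node; assumed to match the examples in this post.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983}"

# Hypothetical helper: print the introspection URL for a v2 path.
# $1 = path below /v2 (for example c/test/shards/shard2)
# $2 = optional HTTP method to filter by (for example DELETE)
introspect_url() {
  url="$SOLR_BASE/v2/$1/_introspect?indent=true"
  if [ -n "${2:-}" ]; then
    url="$url&method=$2"
  fi
  printf '%s\n' "$url"
}

introspect_url c/test/shards
introspect_url c/test/shards/shard2 DELETE

# Against a live node:
#   curl "$(introspect_url c/test/shards/shard2 DELETE)"
```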

Judging from the response further above we could, for example, delete a replica by running the following command:

$ curl -XDELETE 'localhost:8983/v2/c/test/shards/shard1/core_node6'

The response to the last command would look as follows:

{
  "responseHeader":{
    "status":0,
    "QTime":136},
  "success":{
    "192.168.1.15:8983_solr":{
      "responseHeader":{
        "status":0,
        "QTime":16}}}}

This means that the replica of shard1 has been removed, which can also be checked via the Solr admin panel. As we know, we can also add replicas using the new API, and this operation is a good way to illustrate how to pass parameters with the request. Let’s add a replica to shard1 by using the following command:

$ curl -XPOST 'localhost:8983/v2/c/test/shards/' -H 'Content-type:application/json' -d '{
 "add-replica" : {
  "shard" : "shard1",
  "node" : "192.168.1.15:8983_solr"
 }
}'

We added the header identifying the content type of the body and we provided the add-replica command along with two parameters – shard and node. The shard parameter specifies which part of the collection we are interested in and the node parameter tells Solr on which Solr instance the replica should be created. Please note that the node name is not only the IP address; it also includes the port and the usual _solr suffix.

The response would look as follows:

{
  "responseHeader":{
    "status":0,
    "QTime":370},
  "success":{
    "192.168.1.15:8983_solr":{
      "responseHeader":{
        "status":0,
        "QTime":238},
      "core":"test_shard1_replica_n13"}}}

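If you issue the add-replica request from a script, you will probably want to build the body from variables and check the returned status. The following is a rough sketch under a few assumptions: add_replica_body and first_status are hypothetical helpers, the sample response is copied from the output above so nothing here needs a running Solr, and the grep pattern relies on the "status":0 formatting shown in this post rather than on real JSON parsing.

```shell
# Hypothetical helper: print the JSON body for the add-replica command.
# $1 = shard name, $2 = node name (host:port_solr, as in the example above).
# Values are inserted verbatim, so they must not contain quotes or backslashes.
add_replica_body() {
  printf '{"add-replica":{"shard":"%s","node":"%s"}}\n' "$1" "$2"
}

# Hypothetical helper: print the first "status" value found on standard input.
# In the responses shown above that is the top-level responseHeader status.
first_status() {
  grep -o '"status":[0-9]*' | head -n 1 | cut -d: -f2
}

add_replica_body shard1 192.168.1.15:8983_solr

# Sample response copied from the add-replica output above.
response='{
  "responseHeader":{
    "status":0,
    "QTime":370},
  "success":{
    "192.168.1.15:8983_solr":{
      "responseHeader":{
        "status":0,
        "QTime":238},
      "core":"test_shard1_replica_n13"}}}'

if [ "$(printf '%s\n' "$response" | first_status)" = "0" ]; then
  echo "replica added"
fi

# Against a live node:
#   curl -XPOST 'localhost:8983/v2/c/test/shards/' \
#     -H 'Content-type:application/json' \
#     -d "$(add_replica_body shard1 192.168.1.15:8983_solr)"
```
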
What’s Next

The API we just introduced is still a work in progress. We are still missing a few things, but the V2 API is fairly new, so we can expect lots of changes in the next few Solr versions.

Want to learn more about Solr? Subscribe to our blog or follow @sematext. If you need any help with Solr / SolrCloud – don’t forget that we provide Solr Consulting, Solr Production Support, and offer Solr Training!
