In this tutorial, you’ll see how to deploy Solr on Kubernetes. You’ll also see how to use the Solr Operator to autoscale a SolrCloud cluster based on CPU with the help of the Horizontal Pod Autoscaler. Let’s get going!
Prerequisites
For SolrCloud we’ll need Zookeeper. Also, if we want to autoscale Solr with the Horizontal Pod Autoscaler (HPA) based on CPU – admittedly, the simplest scenario – we’ll need a Metrics Server.
Zookeeper
Here, you have three options. The first is to set up Zookeeper manually: you create a StatefulSet with your Zookeeper nodes (typically 3), then add a headless service on top. Here’s an example gist.
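If you’re curious what the manual route looks like, here’s a minimal, simplified sketch (not the gist linked above): a headless Service plus a StatefulSet running the official zookeeper image. Note that a real ensemble also needs a unique myid and the full server list per pod (for example via the image’s ZOO_MY_ID and ZOO_SERVERS environment variables, or an init container), which is exactly why the Helm route below is cleaner:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: zk-headless
spec:
  clusterIP: None        # headless: each pod gets a stable DNS name
  selector:
    app: zookeeper
  ports:
    - name: client
      port: 2181
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: zookeeper
spec:
  serviceName: zk-headless
  replicas: 3            # typical ensemble size
  selector:
    matchLabels:
      app: zookeeper
  template:
    metadata:
      labels:
        app: zookeeper
    spec:
      containers:
        - name: zookeeper
          image: zookeeper:3.8
          ports:
            - containerPort: 2181  # client
            - containerPort: 2888  # follower
            - containerPort: 3888  # leader election
          # TODO: set ZOO_MY_ID (unique per pod) and ZOO_SERVERS (the ensemble)
EOF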
A cleaner way is to use Helm. There’s a well-maintained chart by Bitnami that you can install like this:
helm install bitnami-zookeeper oci://registry-1.docker.io/bitnamicharts/zookeeper \
--set image.tag=3.8 \
--set fourlwCommandsWhitelist="mntr\,conf\,ruok" \
--set autopurge.purgeInterval=1 \
--set heapSize=512 \
--set replicaCount=3
Where:
- bitnami-zookeeper is the name we give to this release (package installation)
- image.tag is your Zookeeper version. In this tutorial, we’ll use Solr 9.3, which uses Zookeeper 3.8
- fourlwCommandsWhitelist will inject these four-letter commands into Zookeeper’s configuration file, so that Solr can check the health of the Zookeeper ensemble
- autopurge.purgeInterval makes Zookeeper check every hour for old snapshots and remove them
- heapSize is the JVM heap in MB that we allow Zookeeper to use
- replicaCount is the number of Zookeeper nodes
Have a look at the chart readme for more details. In this tutorial, the most important thing that we’re missing is persistence: you’ll want your Zookeeper nodes to use persistent volumes to store data. We’re not doing persistence here to keep things simple and environment-agnostic (i.e. you can run this tutorial on your local Docker Desktop).
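If you do want persistence, the Bitnami chart exposes settings along these lines (double-check the exact keys and defaults in the chart readme):
helm upgrade bitnami-zookeeper oci://registry-1.docker.io/bitnamicharts/zookeeper \
  --reuse-values \
  --set persistence.enabled=true \
  --set persistence.size=8Gi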
Once the Zookeeper pods are up, you can try to connect to them via a port forward:
kubectl port-forward svc/bitnami-zookeeper 2181:2181
And then (in a different terminal), check if it responds:
echo ruok | nc localhost 2181
It should say imok. These are Zookeeper’s four-letter words for “Are you OK?” and “I am OK” respectively.
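Since we also whitelisted mntr, you can get a quick overview of the node you’re forwarded to, including whether it’s a leader or follower:
echo mntr | nc localhost 2181 | grep zk_server_state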
There is a third way to deploy Zookeeper: directly with the Solr Operator. This is actually the default method: you’d do nothing at this stage. In fact, for the other two options, we have to tell the Solr Operator not to deploy Zookeeper and use the “external” Zookeeper instead. We’ll do that later.
Why an external Zookeeper? I don’t have a strong preference; it’s just that I find the “embedded” Zookeeper Operator to be too much on the complex side and too little on the reliable side. Don’t quote me on it.
Metrics Server
Metrics Server is a lightweight source of basic metrics (CPU, memory) specifically designed to work with the Horizontal Pod Autoscaler or the Vertical Pod Autoscaler. Later in this tutorial, we’ll dynamically add and remove nodes based on CPU, and we’ll need a provider for these metrics. The simplest way to get there is with a Metrics Server.
As always, Helm would be the most straightforward way. Add the official repo:
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
Then install it:
helm install metrics-server metrics-server/metrics-server --set args\[0\]="--kubelet-insecure-tls"
Notice the --kubelet-insecure-tls parameter. This is needed for development setups (e.g. Docker Desktop) where the kubelet’s certificate isn’t signed by a trusted CA. In a production Kubernetes cluster you shouldn’t need this.
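Once the Metrics Server pod is up, you can verify that it’s actually serving metrics (it may take a minute or so after installation):
kubectl top nodes
kubectl top pods
If these return CPU and memory figures, HPA will have the data it needs later on.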
Installing Solr
Solr will be managed by the Solr Operator and we’ll need three things:
- Set up CRDs for the objects we’ll use. In this tutorial we’ll only refer to SolrClouds, but there are others under https://dlcdn.apache.org/solr/solr-operator/ in the “crds” directory of your Solr Operator version
- Install the Solr Operator
- Set up a SolrCloud cluster
CRDs
If you want to be lean and mean, you’d just install the SolrCloud CRD for this tutorial:
kubectl create -f https://dlcdn.apache.org/solr/solr-operator/v0.7.1/crds/solrclouds.yaml
If you need to work with backups and other objects, you’ll likely want to try the all.yaml or all-with-dependencies.yaml instead. The latter is useful if you want to use the Zookeeper that the Solr Operator can install on its own.
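Either way, you can confirm that the CRDs are registered with:
kubectl get crds | grep solr.apache.org
This should list solrclouds.solr.apache.org, plus the backup and Prometheus exporter CRDs if you installed one of the all-in-one files.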
Installing the Solr Operator
Weâll use Helm once again. Add the repository:
helm repo add apache-solr https://solr.apache.org/charts
If you had the repo already there, you might want to do helm repo update. This applies to all repositories, like you’d do apt-get update for a local package manager.
When installing the Solr Operator, you’ll want to specify the version and whether you want it to use the Zookeeper Operator to manage Zookeeper. We don’t, since we installed Zookeeper separately earlier:
helm install solr-operator apache-solr/solr-operator --version 0.7.1 \
--set zookeeper-operator.install=false
You’ll find other configuration options in the Solr Operator Helm chart values documentation. Note that this documents the main branch; some options might differ for older versions of the Solr Operator.
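Before moving on, it’s worth checking that the operator came up (the pod name starts with the release name we chose, solr-operator):
kubectl get pods | grep solr-operator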
Deploying a Solr Cluster
Helm is your friend. Here’s an example:
helm install example-solr apache-solr/solr --version 0.7.1 \
--set image.tag=9.3 \
--set solrOptions.javaMemory="-Xms500m -Xmx500m" \
--set zk.address="bitnami-zookeeper-headless:2181" \
--set podOptions.resources.requests.cpu=200m
Where:
- The Helm chart version should match your Solr Operator version, they’re tested together.
- image.tag is the Solr version.
- solrOptions.javaMemory is one of the Solr-specific configuration options. In this case, heap size.
- zk.address is the Zookeeper headless service. In our case, it’s the one set up by the Bitnami Helm chart. If you’re using the Zookeeper Operator referenced by the Solr Operator, skip this option. Note that we’re using the headless service here because we need Solr to know the IPs of the individual Zookeeper pods, rather than connecting to them through a single IP, which is what a regular service does.
- podOptions.resources.requests.cpu will request 0.2 vCPUs from Kubernetes. The value isn’t very important (unless it’s too high and the default of 3 pods won’t fit in your cluster). It’s just that it’s required for autoscaling: if you want to use the Horizontal Pod Autoscaler on a metric from the Metrics Server, you need to request that metric for the pod first. If you don’t want to autoscale Solr, you can skip this option – though resource requests and limits are generally a good practice. You can always scale the cluster manually with a command like:
kubectl scale --replicas=5 solrcloud/example
To see the complete list of options for the Solr Helm chart, have a look at the Solr Helm chart values documentation. As with Zookeeper, persistent storage might be a useful one.
After installing the Helm chart, you can see the overall status of your Solr cluster with:
kubectl get solrclouds
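You can also watch the individual pods come up; with the defaults above you should end up with three pods named example-solrcloud-0 through example-solrcloud-2:
kubectl get pods -w
Once they’re all Running and ready, the cluster is good to go.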
Accessing Solr from Outside
At this point, Solr can only be accessed from within the Kubernetes cluster. If you want to, say, open the Admin UI, you have two options: port forwarding and Ingress.
You’d normally use port-forwarding in a development environment. You can connect to the SolrCloud service like this (it listens to port 80 by default):
kubectl port-forward svc/example-solrcloud-common 8983:80
Then you can see the Admin UI at http://localhost:8983
For production, you’ll want to use an Ingress. There’s a tutorial on how to install one locally in the official Solr Operator documentation. Once you have Ingress set up, you’ll need to point your SolrCloud to it via the addressability.external options when you deploy it – there’s an example in the docs as well.
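As a rough idea of what that looks like, the external addressability settings are just more Helm values. Something along these lines (the option names and domain are illustrative, so double-check them against the chart docs for your version):
helm upgrade example-solr apache-solr/solr --version 0.7.1 \
  --reuse-values \
  --set addressability.external.method=Ingress \
  --set addressability.external.useExternalAddress=true \
  --set addressability.external.domainName="your.domain.here"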
Solr + HPA = ❤️
To autoscale in Kubernetes you’re most likely going to use the Horizontal Pod Autoscaler (HPA). There’s also the Vertical Pod Autoscaler (VPA), but it’s less common. The difference is that HPA will adjust the number of pods to match your capacity needs, while VPA will adjust the resource requests and (currently) restart pods so they can use more resources. With Solr, you usually have one pod per host, so using VPA wouldn’t help that much. We’ll concentrate on HPA.
If you’ve followed along so far, you should have 3 Solr pods (that’s the default). And as long as you didn’t set replicas manually, HPA can change the number of replicas, if you choose to set it up. So let’s do that:
kubectl autoscale solrcloud example --cpu-percent=2 --min=3 --max=6
In plain English, we just told Kubernetes to set up an autoscaler (HPA) object that looks at our SolrCloud object named “example” (our test cluster) and aims for a CPU usage of 2%, while keeping at least 3 pods and no more than 6.
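If you prefer declarative manifests over kubectl autoscale, the equivalent autoscaling/v2 HPA object would look roughly like this (the HPA name is arbitrary; the scaleTargetRef points at the SolrCloud resource, which is what lets HPA drive it):
kubectl apply -f - <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-solr-hpa
spec:
  scaleTargetRef:
    apiVersion: solr.apache.org/v1beta1
    kind: SolrCloud
    name: example
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 2   # deliberately low, to provoke a scale-up
EOF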
Why 2%? Because I want to test scaling up: your idle CPU usage is likely to be higher than that, so if you run:
kubectl get hpa
You’ll probably see that we’re using more than 2% CPU and HPA is adding more Solr pods. It should get to 6 quickly. You can check that in Solr Admin at http://localhost:8983/solr/#/~cloud?view=nodes
Similarly, you can “provoke” the cluster to scale down by updating the rule to a high CPU value:
kubectl autoscale solrcloud example --cpu-percent=90 --min=3 --max=6
Rebalancing Solr Shards
Adding and removing Solr pods isn’t terribly useful if data doesn’t populate new pods while scaling up, or doesn’t move off of nodes before they get evicted on scale-down. This kind of functionality is supported starting from Solr Operator version 0.8.0, which – at the time of writing this – is just around the corner.
In short, if you have Solr Operator 0.8.0 or later, feel free to create a collection that has more shards (e.g. 12). Then you’ll see (in e.g. the nodes view of the Solr Admin UI) not only nodes being added/removed, but also shards rebalancing.
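For example, with the port forward from earlier still running, you could create such a collection through the Collections API (the collection name and counts are just placeholders):
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=test&numShards=12&replicationFactor=1"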
For more information about how this rebalancing works under the hood, have a look at this talk from Houston Putman.
Conclusions
Now that we’ve set up a Solr cluster in Kubernetes and autoscaled it, I hope you’ve learned a few things. In my opinion, the devil is in the details. For example:
- Which metrics should determine when to scale up or down? It’s quite likely that CPU alone doesn’t cut it for your use case. If it does, HPA might just be enough, as it has some useful options around how often to add nodes and so on. If it doesn’t, you might want to have a look at KEDA, which can trigger HPA to autoscale on custom metrics, as defined by Scalers.
- What exactly happens to the cluster during autoscaling? It may be under heavy load, in which case it might even be counterproductive to start shuffling shards. It might move too much data at once (in that case maybe replication throttling will help) or it might be in the middle of an upgrade.
- Would it be better to add/remove replicas than to shuffle shards? For example, if you have a spike in traffic every morning when people start working, you might want to add additional nodes and replicas, then remove them when the spike is gone. You might not even need HPA for this, a couple of CronJobs that change the number of replicas at specific times could do the trick (there’s a rough sketch right after this list).
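For the record, here’s a rough sketch of what such a CronJob could look like: it runs kubectl in a pod and scales the SolrCloud up every weekday morning. The solr-scaler ServiceAccount is hypothetical – you’d need to create it and give it RBAC permissions on solrclouds – and you’d add a second CronJob to scale back down in the evening:
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: CronJob
metadata:
  name: solr-scale-up
spec:
  schedule: "0 7 * * 1-5"              # weekdays at 07:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: solr-scaler   # hypothetical; needs rights on solrclouds
          restartPolicy: OnFailure
          containers:
            - name: scale
              image: bitnami/kubectl:latest  # ideally pin a version matching your cluster
              command: ["kubectl", "scale", "--replicas=6", "solrcloud/example"]
EOF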
Either way, I’d suggest watching this space, as we plan to write more in-depth content about such scenarios.
And if you’re into Solr, you might not be aware that Sematext is a one-stop-shop for it. We offer:
- Solr Training classes: public, private, remote and on-site.
- Solr Consulting: if you need help developing your Solr project.
- Solr Production Support: for when production fires happen.
- Solr Monitoring and Solr Log Analysis: so that you know what’s going on.
Last but not least, if you’d like to provide such services, we’re looking for new colleagues.