If you’re thinking of running OpenSearch on Kubernetes, you should check out the OpenSearch Kubernetes Operator. It’s by far the easiest way to get going: you can configure pretty much everything, and it has nice functionality such as rolling upgrades and draining nodes before shutting them down.
Also, keep in mind that Sematext offers services for OpenSearch consulting, support, and training. Check out this short video below to learn more:
Let’s get going 🙂
You need three commands to start. First, add the Helm repository:
helm repo add opensearch-operator https://opster.github.io/opensearch-k8s-operator/
Then install the Operator:
helm install opensearch-operator opensearch-operator/opensearch-operator
Finally, install your first OpenSearch cluster:
helm install example-opensearch opensearch-operator/opensearch-cluster
If you’re on an empty Kubernetes cluster and have the luxury to run kubectl get all and grok the output, you’ll notice the following (after a few minutes):
- A StatefulSet where OpenSearch nodes run. You have 3 OpenSearch pods by default
- A Deployment for OpenSearch Dashboards – by default with one pod
- A Deployment for the Operator itself
- A service to connect to OpenSearch (by default, my-cluster, listening on port 9200)
- A service to connect to OpenSearch Dashboards (by default, my-cluster-dashboards, listening on port 5601)
- A couple of “side” pods that take care of bootstrapping the cluster and applying the security configuration via securityadmin.sh
Once your cluster is up (e.g. your StatefulSet is 3/3), you should be able to connect to OpenSearch via port forwarding:
kubectl port-forward svc/my-cluster 9200:9200
And then test it via:
curl -k -u 'admin:admin' 'https://localhost:9200'
And/or you can port forward the OpenSearch Dashboards port:
kubectl port-forward svc/my-cluster-dashboards 5601:5601
After that, you can connect to http://localhost:5601 and log in with the user admin and the password admin. Don’t worry, we’ll change the default credentials in a bit. But until then, let’s tackle two immediate issues:
- If the curl above worked, you might notice that you’re running an old version of OpenSearch. At the time of writing this, with OpenSearch Operator 2.4, the latest version of OpenSearch is 2.9, yet you’ll see OpenSearch 2.3.
- The installation may fail, because by default the cluster asks for a lot of resources. In my case, it didn’t fit the local Docker Desktop with 8GB of RAM: one OpenSearch pod failed, and kubectl describe told me there weren’t enough resources. But surely 8GB of RAM is enough for a test OpenSearch cluster.
Basic Changes: Version, Resources
To change the default parameters, we’ll need a values file. You could also override them via the command line, but you’ll see there are quite a few such parameters. Here’s a sample values file; let’s call it values.yaml:
```yaml
opensearchCluster:
  enabled: true
  general:
    httpPort: "9200"
    version: 2.9.0
    serviceName: "my-cluster"
    drainDataNodes: true
    setVMMaxMapCount: true
  nodePools:
    - component: nodes
      replicas: 3
      roles:
        - "cluster_manager"
        - "data"
      jvm: -Xmx256M -Xms256M
      resources:
        requests:
          memory: "500Mi"
          cpu: "500m"
  security:
    tls:
      transport:
        generate: true
      http:
        generate: true
  dashboards:
    enable: true
    replicas: 1
    version: 2.9.0
    resources:
      requests:
        memory: "500Mi"
        cpu: "500m"
```
You’d apply it via:
helm install -f values.yaml example-opensearch opensearch-operator/opensearch-cluster
If you previously installed an OpenSearch cluster, you’ll want to remove it first via helm uninstall example-opensearch. You may also want to remove the persistent volumes of the old nodes (check kubectl get pvc for details).
Once installation completes, you should have an OpenSearch 2.9 cluster asking for less memory than before. Let’s take a closer look at the parameters from values.yaml:
- The general section has to do with your OpenSearch cluster as a whole:
  - HTTP port to listen to.
  - Version 2.9.
  - Name of the Kubernetes service. It’s different from the cluster name: the cluster name is the Helm release name (example-opensearch in our case).
  - Whether to drain the data nodes on shutdown. We want that, otherwise we can lose data.
  - Whether to set vm.max_map_count via sysctl on the host (required for OpenSearch to allocate lots of virtual memory when mmapping files).
- nodePools defines different types of nodes: for example, cluster managers vs data nodes, though here our nodes hold both roles. Here you can also change the number of pods (nodes) of each type, as well as their resource requests and limits. By default, the heap size is going to be half of the requested memory, but you can override Xmx and other Java parameters via the jvm option.
- The security section allows you to configure TLS for inter-node communication (transport) as well as HTTP, and whether you’d have the Operator generate certificates for you (vs supplying your own, which is probably what you’ll do in production).
- Finally, you may enable OpenSearch Dashboards (it’s enabled by default), and there you can also provide the number of replicas (pods), version, resource requests, etc. Dashboards requests 1GB of RAM by default; we can reduce that for a local test, too.
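Speaking of setVMMaxMapCount: the Operator takes care of this setting for you, but if you want to sanity-check the current value on a Linux host yourself (OpenSearch needs vm.max_map_count to be at least 262144), you can read it directly:

```shell
# Read the current mmap count limit on a Linux host;
# OpenSearch needs vm.max_map_count to be at least 262144
cat /proc/sys/vm/max_map_count
```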
At this point, we have the simplest of clusters, with three nodes holding all roles. In production, we’ll want dedicated cluster managers, or maybe a more complex configuration, such as hot-cold. The Operator supports all this, so let’s have a look at how.
So far, all nodes are the same because they’re part of the same nodePool. But we can define more than one. Here’s an example cluster with three dedicated cluster manager nodes and two data nodes:
```yaml
nodePools:
  - component: cluster-managers
    diskSize: "1Gi"
    replicas: 3
    roles:
      - "cluster_manager"
    resources:
      requests:
        memory: "500Mi"
        cpu: "500m"
  - component: data-nodes
    diskSize: "30Gi"
    replicas: 2
    roles:
      - "data"
    resources:
      requests:
        memory: "500Mi"
        cpu: "500m"
```
This will create two StatefulSets: one called example-opensearch-cluster-managers with the three dedicated cluster managers, and one for the data nodes with a similarly constructed name: Helm release + node pool name.
Note the new diskSize option: it dictates the size of the Persistent Volume that the Operator creates for each OpenSearch node to store data.
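As a side note, the resources section takes standard Kubernetes fields, so if you also want to cap usage (not just request it), a node pool can carry limits as well. Here’s a sketch based on the data-nodes pool from above; the limit values are just placeholders to adjust for your setup:

```yaml
# Hypothetical data-nodes pool with both requests and limits
- component: data-nodes
  diskSize: "30Gi"
  replicas: 2
  roles:
    - "data"
  resources:
    requests:
      memory: "500Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1"
```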
By default, OpenSearch Dashboards listens on plain HTTP. If you want it to listen on HTTPS, the dashboards section of your values file can look like this:
```yaml
dashboards:
  enable: true
  tls:
    enable: true
    generate: true
  replicas: 1
  version: 2.9.0
```
Last but not least, you might want to change the default admin user credentials, or any other security aspect, such as roles or roles mapping. To do that, you’ll need a secret that holds your security configuration. You can find an example here. In that secret, you’ll provide the content of all the security YAML files that normally live under config/opensearch-security/ in a tar.gz installation.
For changing the default admin user, the essential part is this one:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: securityconfig-secret
type: Opaque
stringData:
  internal_users.yml: |-
    _meta:
      type: "internalusers"
      config_version: 2
    admin:
      hash: "$2y$12$B1MZUbsRd6AbhUXiSw2GOejrlPrnqgoDwgPm/LqH0VTlF8xM2.leO"
      reserved: true
      backend_roles:
        - "admin"
      description: "Demo admin user"
  action_groups.yml: |-
    …
```
In this case, I’m keeping the username as admin, but I’m changing the password to admin123, as the Operator documentation also exemplifies. OpenSearch validates the password via the hash, which we have to compute. If you have the tar.gz archive unpacked somewhere, you can compute the hash of a password via:
plugins/opensearch-security/tools/hash.sh -p admin123
If not, you can try using Python, as the Operator documentation suggests:
python -c 'import bcrypt; print(bcrypt.hashpw("admin123".encode("utf-8"), bcrypt.gensalt(12, prefix=b"2a")).decode("utf-8"))'
You’d apply the secret via:
kubectl create -f securityconfig-secret.yaml
Now OpenSearch can validate our new password, once security is initialized. But in order to initialize security, we need to give the Operator the default admin credentials. To do that, we’ll use another secret, where we base64-encode both the username and the password of our admin user. Let’s call it admin-credentials-secret:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: admin-credentials-secret
type: Opaque
data:
  # admin
  username: YWRtaW4=
  # admin123
  password: YWRtaW4xMjM=
```
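If you’re wondering where those base64 values come from, you can generate them yourself. Note the -n flag, which keeps echo from appending a newline that would otherwise end up inside the encoded value:

```shell
# Base64-encode the username and password for the secret's data fields
echo -n 'admin' | base64      # YWRtaW4=
echo -n 'admin123' | base64   # YWRtaW4xMjM=
```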
Finally, we’ll put the secrets in our Helm values file. The security section would look like this:
```yaml
security:
  config:
    # these are the admin credentials for the Operator to use
    adminCredentialsSecret:
      name: admin-credentials-secret
    # this is the whole security configuration for OpenSearch
    securityConfigSecret:
      name: securityconfig-secret
  tls:
    transport:
      generate: true
    http:
      generate: true
```
If you have OpenSearch Dashboards enabled, it also needs to communicate with OpenSearch (internally, separately from the user you log in with when you go to Dashboards). You’ll provide that username and password with a secret as well, just like we did with the security configuration.
To keep things simple, we’ll use the same default admin user (i.e. the same admin-credentials-secret), so the dashboards section becomes:
```yaml
dashboards:
  enable: true
  tls:
    enable: true
    generate: true
  # doesn't have to be the same as adminCredentialsSecret
  opensearchCredentialsSecret:
    name: admin-credentials-secret
  replicas: 1
  version: 2.9.0
```
And that’s it! With your new secrets and values file, if you reinstall your opensearch-cluster release, you should have your own admin credentials, and Dashboards will work over HTTPS.
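For reference, here’s how the relevant parts of the values file look once everything above is put together (same fields as before, just in one place; the nodePools section stays as defined earlier):

```yaml
opensearchCluster:
  enabled: true
  general:
    httpPort: "9200"
    version: 2.9.0
    serviceName: "my-cluster"
    drainDataNodes: true
    setVMMaxMapCount: true
  security:
    config:
      adminCredentialsSecret:
        name: admin-credentials-secret
      securityConfigSecret:
        name: securityconfig-secret
    tls:
      transport:
        generate: true
      http:
        generate: true
  dashboards:
    enable: true
    tls:
      enable: true
      generate: true
    opensearchCredentialsSecret:
      name: admin-credentials-secret
    replicas: 1
    version: 2.9.0
```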
Cool! What’s Next?
That depends on what you’d like to do 🙂 Here’s an FAQ:
Other Options for the Operator?
Anything Exciting Coming Up?
Yes, I’m looking forward to autoscaling; check out the roadmap for more. Solr can already do that, and there’s another operator that works for both Elasticsearch and OpenSearch and supports autoscaling: you’ll find a tutorial focused on the logs use case here.
How Do I Troubleshoot?
I’m usually doing kubectl get statefulsets, kubectl get pods and the like to see what’s going on. If something is wrong, kubectl describe and kubectl logs are my best friends.
Though at some point I got sick of going back and forth between the logs of the Operator, those of the OpenSearch pods, and those of the “helper” pods. So I just set up Sematext Logs: a helm repo add, a helm install, then I chose the logs to ship via log discovery. In 5 minutes, I could filter out all the “I can’t connect to TLS” noise (of course TLS doesn’t work yet, it didn’t get set up) and focus on the important messages.
If Sematext Logs sounds interesting, click on the big orange button at the end of this post to start your trial. Or check out the OpenSearch performance monitoring and OpenSearch log monitoring integrations.
Where Do I Learn More About OpenSearch or Get Help?
Any Interesting Jobs with OpenSearch and Kubernetes?
Yes, I just happen to know a company that’s hiring worldwide.