I'm asking this probably for the second time, as most resources out there say the same things, and I need more answers/help to determine the proper way to improve my Elasticsearch (ES) cluster's performance.
We are using ES 5.4.0, and our cluster consists of client nodes, master nodes, and 4 data nodes. In addition, 2 Logstash (LS) instances stream log events to ES.
My issues/questions are as follows:
1. I have applied most of the suggested options to increase performance, but I can't seem to get beyond a primary indexing rate of 4k events/s, and this puts back pressure on LS. LS without the ES output plugin can do about 10k events/s, but with the ES output plugin enabled, throughput drops by 50% or more.
2. I changed the flush size and other settings on LS, but saw no big change.
3. I have monitored the ES queues, and some of them barely go beyond 35.
4. How much of a difference can setting the number of replicas to 0 make in this case?
5. Do all my non-data nodes need to be set to node.ingest: false? Does it affect performance if they are not?
6. The disks are not SSDs. Is this a big factor in indexing performance?
7. Would setting index.merge.scheduler.max_thread_count: 1 help in my case?
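For questions 4 and 7, this is a minimal sketch of the settings body I have in mind, written in Python only to keep it concrete. The helper name `bulk_indexing_settings` is my own, and I'm assuming the body would be sent to the index `_settings` endpoint. I haven't confirmed that these values are right for our cluster.

```python
import json


def bulk_indexing_settings(replicas=0, merge_threads=1):
    """Build the index settings body for questions 4 and 7.

    Assumption: the result is sent as the JSON body of a request to
    <index>/_settings. The function name and defaults are illustrative.
    """
    return {
        # Question 4: no replicas while indexing is the bottleneck.
        "index.number_of_replicas": replicas,
        # Question 7: a single merge thread, since the disks are not SSDs.
        "index.merge.scheduler.max_thread_count": merge_threads,
    }


if __name__ == "__main__":
    print(json.dumps(bulk_indexing_settings(), indent=2))
```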
The one piece I have never understood since I started working with ES is how I can easily achieve, for example, a 10k events/s primary indexing rate, given that CPU and RAM don't seem to be highly utilized.
Any help on this topic is appreciated, as every time I think we have a good result we end up back at this performance issue.
I'm not sure whether this is related to how fast my ES nodes are writing, but today I looked at the disk I/O utilization and it is very high at all times, around 90%!
So I'm wondering: if I had multiple logical volumes mapped to different devices, would that increase the speed of ES I/O operations? That way I could spread multiple data paths on each ES data node across multiple logical volumes.
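To make the idea concrete, here is a rough sketch of what I mean by multiple data paths. The mount points are hypothetical placeholders, and the helper (my own name, `render_path_data`) just renders the `path.data` entry I would put in elasticsearch.yml on each data node. Whether this actually spreads the I/O load is exactly what I'm unsure about.

```python
def render_path_data(mount_points):
    """Render a path.data entry listing one data path per logical volume.

    Assumption: each mount point is a separate logical volume backed by
    a different physical device. The paths used below are placeholders.
    """
    lines = ["path.data:"]
    lines += [f"  - {path}" for path in mount_points]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical mount points, one per logical volume.
    print(render_path_data(["/mnt/es_data1", "/mnt/es_data2"]))
```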