We've got, again, a little mystery here. Our main text collection has been running at a snail's pace since very early Monday morning, when the monitoring graph for response time went up. This is not unusual for Solr, so the JVMs were all restarted. That always fixes a sluggish collection, but not this time. They were restarted yesterday as well, with no change. The VMs Solr is running on were rebooted today, also with no change.
Not all queries are slow all the time; a random query is just slow sometimes, or sometimes most of the time. All 6 replicas are sometimes slow.
We also took a good look at our monitoring: JVM heap was normal, IO was normal, and CPU was normal until the first restart. Since the first restart CPU usage has been erratic, but not worryingly off the charts, just not 'normal' as usual.
No changes were made to the collection for days before it became sluggish.
CPU sampling with VisualVM is not helpful either; nothing really stands out, especially when I compare it to another cluster that is still healthy. GC is also normal.
On 8/8/2018 7:26 AM, Markus Jelsma wrote:
> We also took a good look at our monitoring: JVM heap was normal, IO was normal, and CPU was normal until the first restart. Since the first restart CPU usage has been erratic, but not worryingly off the charts, just not 'normal' as usual.
I've seen systems with severe performance issues where the user did not see anything out of the ordinary for these metrics. Sometimes this is because they do not know what to look for. What exactly does "normal" mean to you?
> No changes were made to the collection for days before it became sluggish.
>
> CPU sampling with VisualVM is not helpful either; nothing really stands out, especially when I compare it to another cluster that is still healthy. GC is also normal.
>
> So, any ideas out here?
Here are the initial questions for a performance issue, to see whether it's related to available memory or not:
* What OS is it running on?
* How much memory does the server have?
* How much index data is being handled by all Solr instances on that machine?
* What is the total size of all Solr heaps on that machine?
* Is there any other software besides Solr on the machine?
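One way to read these questions, and this is an assumption about where they lead rather than something stated in this thread, is as inputs to a rough estimate: whatever RAM is not claimed by the Solr heaps and other software is what the OS has left to cache index data. A minimal Python sketch of that arithmetic follows. The function name and the numbers are hypothetical and for illustration only.

```python
def disk_cache_headroom(total_ram_gb, solr_heaps_gb, other_software_gb, index_size_gb):
    """Estimate how much RAM is left for the OS disk cache.

    Returns (available_gb, fraction_of_index_that_fits).
    This is plain arithmetic on numbers you supply; it ignores kernel
    overhead and JVM off-heap memory, so treat it as a rough upper bound.
    """
    available = total_ram_gb - sum(solr_heaps_gb) - other_software_gb
    available = max(available, 0.0)
    if index_size_gb <= 0:
        return available, 1.0
    return available, min(available / index_size_gb, 1.0)


if __name__ == "__main__":
    # Hypothetical machine: 64 GB RAM, two 8 GB Solr heaps,
    # 2 GB of other software, 200 GB of index data.
    cache_gb, fraction = disk_cache_headroom(64, [8, 8], 2, 200)
    print(f"{cache_gb:.0f} GB left for cache, covers {fraction:.0%} of the index")
```

A low fraction does not prove memory is the problem, but it is a reason to look at the memory numbers more closely.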
If the OS is Linux or another POSIX operating system that has the GNU version of "top" installed, then the following information is *extremely* helpful, and can answer most of the questions asked above:
Run the "top" program. Don't use htop or some other variant; it must be the actual program named "top", and it should be the version of that program from the GNU project so that GNU keyboard shortcuts work.
Press shift-M to sort the listing by resident memory size. If your version of top is not from the GNU project, this might not work ... but this is an extremely important step in these instructions, so if you don't have GNU top, you should see if you can get your version to sort by the resident memory column, descending.
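If your top cannot sort by resident memory, a similar process ordering can be produced another way. The Python sketch below is a fallback, not a replacement for the top screenshot, which also shows the overall memory summary. It assumes your `ps` accepts `-eo pid,rss,comm` and reports RSS in KiB, as the common Linux and BSD versions do. The function name is hypothetical.

```python
import subprocess


def top_by_resident_memory(ps_output, limit=10):
    """Parse `ps -eo pid,rss,comm` output and return the largest processes.

    Returns a list of (pid, rss_kib, command) tuples, sorted by resident
    memory, descending. Lines that do not parse are skipped.
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # first line is the header
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, rss_kib, command = parts
        if not (pid.isdigit() and rss_kib.isdigit()):
            continue
        rows.append((int(pid), int(rss_kib), command.strip()))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


if __name__ == "__main__":
    output = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, rss_kib, command in top_by_resident_memory(output):
        print(f"{pid:>8} {rss_kib / 1024:>10.1f} MiB  {command}")
```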
Grab a screenshot of the top listing and share it using a file-sharing website. Dropbox is usually a good choice.
If you're running Solr on Windows, you can use the program named "Resource Monitor" to get something very similar. In that program, click on the Memory tab, click the "Working Set" column until it's sorted descending, and grab a screenshot. If necessary, expand the columns so all the numbers can be seen clearly.