As you may know, Sematext runs a service we internally call SPM – Scalable Performance Monitoring, a currently-still-free SaaS for monitoring performance of Solr, HBase, and soon a few other technologies we often help our clients with. One of the things we monitor for Solr and other search technologies is the size of the index. We monitor it by periodically checking its size, number of documents in it, number of deleted documents, number of index segments, files, etc.
Recently, we had an internal discussion about how to best report the index size when the index changes over time and decided we’d ask people who run Solr (or ElasticSearch or Sensei or…) – you – what you would like to see in this report.
For example, imagine that in some 5-minute time period (say 10:00 AM to 10:05 AM) we check the index 5 times (in reality we do it much more frequently) and each time we find the index has a different number of documents in it: 10, 15, 20, 25, and finally 30 documents. Now imagine this data as a graph showing the number of indexed documents over time, but with the smallest time period shown being a 5-minute interval.
At this point the question we have for you is: How many documents should this graph report for our example 10:00 – 10:05 AM period above? Should it show the minimum – 10? Average (mean) – 20? Median – 20? Maximum – 30? Something else? Minimum, average, and maximum – 10, 20, 30?
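To make the options concrete, here is a minimal Python sketch that rolls the five example samples up into each candidate value. The function name `summarize` and the extra "last" value (the most recent sample, one possible "something else") are our own illustrative assumptions, not a description of how SPM actually aggregates data.

```python
from statistics import mean, median


def summarize(samples):
    """Roll up the raw document counts from one interval into candidate values.

    Hypothetical helper for illustration only; not SPM's actual aggregation.
    """
    if not samples:
        raise ValueError("need at least one sample")
    return {
        "min": min(samples),
        "avg": mean(samples),
        "median": median(samples),
        "max": max(samples),
        "last": samples[-1],  # assumption: the most recent sample as another option
    }


# The five checks from the 10:00 - 10:05 AM example above.
samples = [10, 15, 20, 25, 30]
summary = summarize(samples)
print(summary)
```

In this example the index only grows, so the average and median coincide at 20 and the last sample equals the maximum. With an index that also shrinks within the interval (say, after a batch of deletes), these values would diverge, which is exactly why the choice matters.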
Any feedback and suggestions you give us regarding this will be greatly appreciated – thanks!