Recently, Andy Jackson from UK Web Archive discovered a ginormous Pit Of Pain with Solr distributed faceting, where some response times reached 10 minutes. The culprit is *facet.limit=100* (the number of returned values for each facet is 100), as the secondary fine-counting of facet terms triggers a mini-search for each term that has to be checked. With the 9 facets UK Web Archive uses, that is 9*100 = 900 mini-searches in the worst case. Andy has done a great write-up on their setup and his experiments: Historical UKWA Solr Query Performance Benchmarking.
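
For reference, the kind of request discussed here looks roughly like the sketch below (hypothetical host, collection and field names; not UK Web Archive’s actual setup):

```python
# Sketch of a faceted Solr request with 9 facet fields and facet.limit=100.
# Host, collection and field names are made up for illustration.
from urllib.parse import urlencode

params = [
    ("q", "query"),
    ("rows", "0"),
    ("facet", "true"),
    ("facet.limit", "100"),  # top-100 values returned per facet
]
for field in ("crawl_year", "domain", "public_suffix", "content_type",
              "content_language", "author", "links_domain", "server",
              "content_encoding"):  # 9 facets -> up to 9*100 fine-counts
    params.append(("facet.field", field))

print("http://localhost:8983/solr/collection1/select?" + urlencode(params))
```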

The shape of the pit can be explained by the probability of the need for fine-counts: When there are fewer than 1K hits, chances are that all shards have delivered all matching terms with count > 0 and thus need not be queried again (clever merging). When there are more than 1M hits, chances are that the top-100 terms in each facet are nearly the same for all shards, so that only a few of the terms need fine-counting. Between those two numbers, chances are that a lot of the terms are not present in all initial shard results and thus require fine-counting.
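
To get a feel for why the pit sits in the middle, here is a toy simulation (my sketch, not Solr’s actual merge logic, which among other things over-requests terms from each shard) of how many of the merged top-100 terms would need fine-counting:

```python
# Toy model of distributed facet merging: each shard reports its local
# top-LIMIT terms; a merged top term needs fine-counting on every shard
# that omitted it without having delivered all of its matching terms.
import random
from collections import Counter

LIMIT, SHARDS = 100, 20

def finecounts_needed(total_hits):
    shard_tops = []  # (reported term set, did the shard report ALL terms?)
    merged = Counter()
    for _ in range(SHARDS):
        # Skewed (Zipf-ish) draws stand in for a real term distribution.
        counts = Counter("term%d" % random.paretovariate(1.2)
                         for _ in range(total_hits // SHARDS))
        shard_tops.append(({t for t, _ in counts.most_common(LIMIT)},
                           len(counts) <= LIMIT))
        merged.update(counts)
    return sum(any(t not in top and not exhausted
                   for top, exhausted in shard_tops)
               for t, _ in merged.most_common(LIMIT))

for hits in (1_000, 100_000, 10_000_000):  # expect none, many, few
    print(hits, finecounts_needed(hits))
```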

While the indexes at Statsbiblioteket and UK Web Archive are quite comparable (12TB vs. 16TB, built with nearly the same analysis chain), the setups differ with regard to hardware as well as facet setup. Still, it would be interesting to see if we can reproduce the Pit Of Pain™ with standard Solr faceting on our 6 facet fields and facet.limit=100.

Sorta, kinda? We do not get the low response times below 100 hits, and 10 minutes of testing only gave 63 searches, but with the right squinting of eyes, the Pit Of Pain (visualized as a hill to trick the enemy) is visible from ~1K to 1M hits. The high response times below 100 hits are due to a bad programming decision on my side; expect yet another blog post. As for the pit itself, let’s see how it changes when the limit goes down.

Getting a little crowded with all those dots, so here’s a quartile plot instead.

Again, please ignore results below 100 hits. I will fix it! Promise! But other than that, it seems pretty straightforward: High limits have a severe performance penalty, which seems to be more or less linear in the limit requested (hand waving here).

The burning question is of course how it looks with sparse faceting. Technically, distributed sparse faceting avoids the mini-searches in the fine-counting phase, but still requires each term to be looked up in order to resolve its ordinal (which is used as an index into the internal sparse faceting counter structure). Such a lookup does take time, something like 0.5ms on average on our current setup, so sparse faceting is not immune to large facet limits. Let’s keep the y-axis max of 20 seconds for comparison with standard Solr.
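
As a mental model, fine-counting a single term with sparse faceting looks something like this toy sketch (not the actual sparse faceting code):

```python
# Sparse fine-counting sketch: resolve the term's ordinal in the sorted
# term dictionary, then read the counter already filled during the first
# phase - no mini-search required.
import bisect

term_dict = ["amber", "beech", "cedar", "dahlia", "elm"]  # sorted terms
counters  = [3, 0, 7, 2, 5]  # per-ordinal counts from the first phase

def fine_count(term):
    ordinal = bisect.bisect_left(term_dict, term)  # the ~0.5ms lookup
    if ordinal < len(term_dict) and term_dict[ordinal] == term:
        return counters[ordinal]
    return 0

print(fine_count("cedar"))  # -> 7
```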

There does appear to be a pit too! Switching to quartiles and zooming in:

*[Quartile plot: sparse fine-counting, 10 minute test runs, 12.5TB index / 4.2B documents, facet.limit 1000, 100, 50 & 5]*

This could use another round of tests, but it seems that the pit is present from 10K to 1M hits, fairly analogous to Solr fc faceting. The performance penalty of high limits also matches, just an order of magnitude lower. With a worst case of 6*100 fine-counts (at ~10^5 hits) on each shard and an average lookup time of ½ms, having a mean total response time around 1000ms seems reasonable. Everything checks out and we are happy.

### Update 20140912

The limit for each test was increased to 1 hour or 1000 searches, whichever came first, and the tests were repeated with facet.limit of 1K, 10K and 100K. The party stopped early with an OutOfMemoryError for 10K, and since raising the JVM heap size would skew all previous results, what we got is what we have.

Quite similar to the Solr fc faceting test with facet.limit=100 at the beginning of this post, but with the Pit Of Pain moved a bit to the right and a worst case of 3 minutes. Together with the other tested limits and quartiled, we have:

Looking at the Pit Of Pain in isolation, we have the median response times in ms:

| facet.limit | 10^4 hits | 10^5 hits | 10^6 hits | 10^7 hits |
|------------:|----------:|----------:|----------:|----------:|
| 1000 | 24559 | 70061 | 141660 | 95792 |
| 100 | 9498 | 16615 | 12876 | 11582 |
| 50 | 9569 | 9057 | 7668 | 6892 |
| 5 | 2469 | 2337 | 2249 | 2168 |

Without cooking the numbers too much, we can see that the worst increase switching from limit 50 to 100 is for 10^5 hits: 9057ms -> 16615ms or **1.83 times**, with the expected increase being **2** (50 -> 100). Likewise the worst increase from limit 100 to 1000 is for 10^6 hits: 12876ms -> 141660ms or **11.0 times**, with the expected increase being **10** (100 -> 1000). In other words: Worst-case median response times (if such a thing makes sense) for distributed fc faceting with Solr scale *linearly in the facet.limit*.
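
For the skeptical, the factors can be verified directly against the medians above:

```python
# Checking the "linear in facet.limit" claim against the Solr medians.
medians = {  # facet.limit -> medians in ms for 10^4, 10^5, 10^6, 10^7 hits
    1000: [24559, 70061, 141660, 95792],
    100:  [9498, 16615, 12876, 11582],
    50:   [9569, 9057, 7668, 6892],
}
for lo, hi in ((50, 100), (100, 1000)):
    worst = max(h / l for l, h in zip(medians[lo], medians[hi]))
    print(f"limit {lo} -> {hi}: worst factor {worst:.2f}, expected {hi // lo}")
    # limit 50 -> 100: worst factor 1.83, expected 2
    # limit 100 -> 1000: worst factor 11.00, expected 10
```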

Repeating with sparse faceting and skipping right to the quartile plot (note that the y-axis max dropped by a factor of 10):

Looking at the Pit Of Pain in isolation, we have the median response times in ms:

| facet.limit | 10^4 hits | 10^5 hits | 10^6 hits | 10^7 hits |
|------------:|----------:|----------:|----------:|----------:|
| 1000 | 512 | 2397 | 3311 | 2189 |
| 100 | 609 | 960 | 698 | 939 |
| 50 | 571 | 635 | 395 | 654 |
| 5 | 447 | 215 | 248 | 588 |

The worst increase switching from limit 50 to 100 is for 10^6 hits: 395ms -> 698ms or **1.76 times**, with the expected increase being **2**. Likewise the worst increase from limit 100 to 1000 is also for 10^6 hits: 698ms -> 3311ms or **4.7 times**, with the expected increase being **10**. In other words: Worst-case median response times for distributed sparse faceting *appear* to scale *better than linearly in the facet.limit*.

Re-thinking this, it becomes apparent that there are two parts to facet fine-counting: a base overhead and an overhead for each term. Assuming the base overhead is the same when the number of hits is the same, we can calculate the base to 408ms and the overhead per term to 0.48ms for sparse (remember we have 6 facets, so facet.limit=1000 means a worst case of fine-counting 6000 terms). If that holds, setting facet.limit=10K would give a worst-case median response time of around 30 seconds.
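
The arithmetic, presumably fitted on the 10^6-hits medians (which reproduce the stated numbers):

```python
# Fit response = base + terms * per_term on two sparse measurements,
# then extrapolate to facet.limit=10K. Numbers are the 10^6-hits medians.
FACETS = 6

def fit(limit_a, ms_a, limit_b, ms_b):
    per_term = (ms_b - ms_a) / (FACETS * (limit_b - limit_a))
    return ms_a - FACETS * limit_a * per_term, per_term

base, per_term = fit(100, 698, 1000, 3311)
print(f"base={base:.0f}ms, per_term={per_term:.2f}ms")  # base=408ms, per_term=0.48ms

worst_case_10k = base + FACETS * 10_000 * per_term
print(f"facet.limit=10K: ~{worst_case_10k / 1000:.0f}s")  # ~29s
```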
