Hierarchical faceting in Solr

Solr already has SOLR-64, which does hierarchical faceting, and SOLR-792, which does pivot faceting. A few minutes ago, I uploaded SOLR-2412, which also does hierarchical faceting. So what’s the big idea?

SOLR-2412 is a fairly thin wrapper around LUCENE-2369. LUCENE-2369 was designed with these explicit trade-offs:

* Slow startup
* Low memory overhead
* Fast response

The archetypal usage scenario is a large index containing one or more rich hierarchies that is batch-updated every night (see Hierarchical faceting – working code for more details). At the risk of misrepresenting them, SOLR-64 and SOLR-792 were created from a feature standpoint, with performance characteristics being secondary.

Feature-wise, SOLR-2412 (let’s call it Exposed faceting from now on) differs markedly from pivot faceting (SOLR-792) at this time, as neither of the two solutions can do what the other one does. However, I feel confident that Exposed faceting can be tweaked to do pivot faceting later on. The main reason to use Exposed over SOLR-792 would be to get a different set of trade-offs.
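To make the feature difference more concrete, here is a minimal Python sketch of what pivot faceting computes: counts for the values of one field, each broken down by the values of a second field. This is only a conceptual illustration, not the SOLR-792 implementation; the function name `pivot_counts`, the sample documents, and the assumption that both fields are single-valued are all made up for the example.

```python
from collections import Counter, defaultdict

def pivot_counts(docs, outer_field, inner_field):
    """Count documents per outer value, broken down by inner value.

    Hypothetical helper: assumes every document has exactly one value
    for each of the two fields.
    """
    pivot = defaultdict(Counter)
    for doc in docs:
        pivot[doc[outer_field]][doc[inner_field]] += 1
    return pivot

# Made-up sample documents with two independent, flat fields.
docs = [
    {"category": "book", "language": "da"},
    {"category": "book", "language": "en"},
    {"category": "book", "language": "en"},
    {"category": "dvd", "language": "en"},
]

pivot = pivot_counts(docs, "category", "language")
for category, languages in sorted(pivot.items()):
    print(category, sum(languages.values()), dict(languages))
```

The nesting here comes from combining two separate fields at query time, whereas hierarchical faceting works on paths that are stored as such in the documents.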

Compared to SOLR-64, Exposed faceting’s features differ primarily in supporting multiple paths per document: a product belonging to multiple categories, multiple locations for a bus route, and so on.
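As a rough illustration of what multiple paths per document means for the counts, the following Python sketch tallies documents per hierarchy node. It is a naive in-memory illustration of the idea, not how LUCENE-2369 works internally. The function name `count_hierarchical_facets`, the `/` separator, the sample paths, and the choice to count a document only once per node are all assumptions made for the example.

```python
from collections import Counter

def count_hierarchical_facets(docs, separator="/"):
    """Count documents per hierarchy node.

    Hypothetical helper: `docs` is a list where each entry is the list
    of paths belonging to one document.
    """
    counts = Counter()
    for paths in docs:
        nodes = set()
        for path in paths:
            parts = path.split(separator)
            # Every prefix of a path is a node in the hierarchy.
            for depth in range(1, len(parts) + 1):
                nodes.add(separator.join(parts[:depth]))
        # Using a set means a document counts once per node, even when
        # two of its paths share a prefix (an assumption of this sketch).
        counts.update(nodes)
    return counts

# Made-up sample: the first and third documents each have two paths.
docs = [
    ["electronics/audio/headphones", "gifts/for-him"],
    ["electronics/audio/speakers"],
    ["electronics/video", "electronics/audio/headphones"],
]

counts = count_hierarchical_facets(docs)
for node in sorted(counts):
    print(node, counts[node])
```

With a single path per document, the inner loop over `paths` would run exactly once for each document; the multi-path case is where the shared-prefix question comes up at all.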

The next step is to create a test bed for doing performance measurements on Exposed vs. Solr’s different faceting implementations. Naturally the hoped-for outcome is that Exposed is markedly better under the defined trade-offs.

About Toke Eskildsen

IT-Developer at statsbiblioteket.dk with a penchant for hacking Lucene/Solr.

5 Responses to Hierarchical faceting in Solr

  1. justin says:

    Are any of these three ways to do hierarchical faceting already included with Solr? Or, how do you plug these into a running Solr instance?

  2. justin says:

    (Translated from Swedish:) Hi! You come from Denmark and I am studying the Swedish language! I wonder if you can understand Swedish too?

  3. Toke Eskildsen says:

    As far as I remember, pivot faceting (SOLR-792) is only on trunk and will be part of Solr 4. If you want to try it, you need to do a checkout and a compile. This is a lot easier than it sounds. Something like the following should work on a Linux-like box, and should be adaptable to Windows:

    svn checkout http://svn.apache.org/repos/asf/lucene/solr/trunk solr-trunk
    cd solr-trunk/solr
    ant run-example

    Single-path hierarchical faceting (SOLR-64) requires patching, and I don’t think it patches against trunk, so this is more tricky. I know it patches against @921562, so a patched Solr can be built by doing

    svn checkout http://svn.apache.org/repos/asf/lucene/solr/trunk@921562 solr-64
    cd solr-64
    wget "https://issues.apache.org/jira/secure/attachment/12438443/SOLR-64.patch"
    patch -p0 < SOLR-64.patch
    ant run-example

    Multi-path hierarchical faceting might patch against trunk, but since trunk is a moving target, I recommend @1066767 to be sure. There are instructions at https://issues.apache.org/jira/browse/SOLR-2412

    And yes, as a Dane, I find Swedish fairly understandable. I understand almost everything when I read it, but not very much when I hear it. However, please keep comments in English for the sake of other readers.

  4. abhay says:

    Hi Toke,
    This patch seems to be really nice for what I’m looking into doing.

    Is this patch production ready? Is there an example of how I would use it?

    I saw one of your posts on the Solr forum suggesting it’s experimental, but that was an old post, so I wanted to know if I can use this in production.

    I use Solr 4.0 from trunk.

  5. Toke Eskildsen says:

    Production ready? Well, since you’re bold enough to use Solr 4.0 from trunk, I’d say yes 🙂

    We’re using Lucene 4.0 trunk + the mentioned faceting patch in production, so the basic functionality seems fairly solid. While the Solr code is just a simple wrapper around that and seems to work fine, I cannot vouch for the stability of that part as we don’t use it actively.

    Now, Solr (and Lucene) trunk is a moving target, so the patch only works for a given range of checkouts. If you are still interested, I’ll try to bring the patch up to speed with the newest trunk, or with a specific checkout if you have one that you deem reliable.

    It might take a few weeks though, there’s a vacation coming up.
