Just one last status update before I rush off to buy the last Christmas presents. Toke just converted Summa’s internal Lucene document builder to use the new XMLStreamReader found in Java 6, instead of using some XPath/DOM magic we had been using hitherto when indexing in Summa (yeah, we know DOM parsing is silly here, but it had proven stable and “OK” for a very long time).
This provided an overall indexing speedup of a factor of 8. I think that is Toke’s way of saying “merry Christmas everybody”.
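For the curious, here is roughly what pulling fields out of a record with an XMLStreamReader looks like. This is only a minimal sketch and not Summa's actual document builder: the class name `StreamFieldExtractor`, the method `extractFields` and the `<field name="...">` record layout are all made up for illustration. The point is that the reader walks the XML once, event by event, and never builds a DOM tree in memory.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Summa: collects name/text pairs from
// <field name="...">text</field> elements in a single forward pass.
public class StreamFieldExtractor {

    public static Map<String, String> extractFields(String xml) throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // next() advances the cursor and returns the event type.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "field".equals(reader.getLocalName())) {
                    // A null namespace URI means the attribute namespace is not checked.
                    String name = reader.getAttributeValue(null, "name");
                    // getElementText() reads the text content and leaves the
                    // cursor on the matching END_ELEMENT. It assumes the element
                    // holds text only and throws if it meets a child element.
                    String text = reader.getElementText();
                    fields.put(name, text);
                }
            }
        } finally {
            reader.close();
        }
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<doc><field name=\"title\">Hello</field>"
                + "<field name=\"author\">Toke</field></doc>";
        System.out.println(extractFields(xml));
    }
}
```

In a real indexer each name/text pair would presumably be handed to a Lucene document instead of a map, but that part depends on the Lucene version and is left out here.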
Tags: Summa, optimization, indexing, lucene, xml, streamparser, java