Monthly Archives: February 2017

SolrWayback Machine

Another ‘google innovation week’ at work has produced the SolrWayback Machine. It works similar to the Internet Archive: Wayback Machine (https://archive.org/web/) and can be used to show harvested web content (Warc files).  The Danish Internet Archive has over 20billion harvested … Continue reading

Posted in Uncategorized | 1 Comment

juxta – image collage with metadata

Creating large collages of images to give a bird’s eye view of a collection seems to be gaining traction. Two recent initiatives: The New York Public Library has a very visually pleasing presentation of public domain digitizations, but with a … Continue reading

Posted in Uncategorized | Leave a comment

Automated improvement of search in low quality OCR using Word2Vec

This abstract has been accepted for Digital Humanities in the Nordic Countries 2nd Conference, http://dhn2017.eu/ In the Danish Newspaper Archive[1] you can search and view 26 million newspaper pages. The search engine[2] uses OCR (optical character recognition) from scanned pages … Continue reading

Posted in Uncategorized | Leave a comment