Category Archives: Digitizing

Large-scale digitization projects – quality and contractors

Our large-scale project to digitize newspapers from microfilm is on the verge of going into production. Being technical lead on ingesting the digitized data and controlling its quality has been a roller coaster of excitement, disillusionment, humility, admiration, despair, confidence, … Continue reading

Posted in Digitizing | 3 Comments

Quack – an ALTO viewer

Never heard of ALTO (Analyzed Layout and Text Object)? Fear not, it is a very simple representation of structured text from physical pages, and I mean that in a positive way. Its primary use is to hold OCRed text. A … Continue reading

Posted in Digitizing, eskildsen, open source, Uncategorized | 1 Comment

Hardware guesstimation

We are currently in the starting phase of two new search-oriented projects at Statsbiblioteket. The frontenders are playing with wireframes and the backenders are being queried for hardware requirements. Estimating hardware requirements is tricky, though. This post is about our newspaper … Continue reading

Posted in Digitizing, Low-level, Solr | 1 Comment

Searching and presenting digitized newspapers

As a break from the more tech-heavy postings, this will be a loosely structured post with general observations and a few ideas. Going digital: We have kilometers of paper at our library and a vision that says to digitize it … Continue reading

Posted in Digitizing, Statsbiblioteket | 1 Comment