The potential of newspapers as a source for family history research is enormous. And thanks to a digitisation project begun in 2011 by Findmypast in partnership with the British Library, our access to British newspapers is better than it has ever been. At the time of writing, almost 75 million pages of newsprint have been scanned, nearly a quarter of a million of them added in the past seven days, according to the website of the British Newspaper Archive.
The database is one of the biggest that you’ll use in the course of your family history research, and it can reveal facts and stories about your ancestors that you would almost certainly never have discovered otherwise. But we need to start with a word of caution. The database has not been populated by employing human beings to read and transcribe the text; instead, a process known as OCR (optical character recognition) has been used to ‘read’ the newsprint and interpret it as letters, words and sentences.
As a business model this is entirely understandable; the cost of employing real people to do the work would be prohibitive. The good news is that the technology behind the OCR process is very good – and improving all the time – and, provided that the original scanned image is of good enough quality, the end result is usually a fairly accurate ‘transcription’. As the print in more recent newspapers tends to be clearer, we are, as a general rule, likely to get better results with papers printed in the 20th century than with those from the 19th century or earlier.