
One million pages and counting

December 21, 2009 | August 7th, 2013

The millionth page on the Australian Newspaper Digitisation Program was made publicly available on 14 December 2009, marking a project milestone. That page, from a 1901 edition of the Sydney Morning Herald, also contained the 10 millionth article. The National Library of Australia has advised that a staggering 40 million articles will be available by 2011.

Digitisation started in 2007, with 4.4 million pages targeted for digitisation over 4 years, to be completed and publicly accessible as full-text articles by June 2010. So far, 3 million of the identified 4.4 million pages have been scanned from microfilm into digital images. Of those 3 million scanned pages, 1 million have been converted into full-text articles by the OCR process and are publicly available. The remaining pages will be made available between now and June 2010.

The 1 million publicly available pages amount to 10 million articles, with coverage dates of 1803-1954.
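
For readers who want to check the proportions behind these figures, here is a minimal Python sketch. It uses only the numbers quoted above; the variable and function names are illustrative and are not part of any National Library system.

```python
# Figures quoted in this article (as at December 2009).
TARGET_PAGES = 4_400_000      # pages targeted for digitisation
SCANNED_PAGES = 3_000_000     # pages scanned from microfilm so far
PUBLIC_PAGES = 1_000_000      # pages converted to full text and publicly available
PUBLIC_ARTICLES = 10_000_000  # articles contained in the public pages


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


scanned_share = percent(SCANNED_PAGES, TARGET_PAGES)  # share of target scanned
public_share = percent(PUBLIC_PAGES, TARGET_PAGES)    # share of target public
articles_per_page = PUBLIC_ARTICLES / PUBLIC_PAGES    # average articles per page
remaining_pages = TARGET_PAGES - PUBLIC_PAGES         # pages still to be released

print(f"Scanned: {scanned_share:.1f}% of target")
print(f"Public: {public_share:.1f}% of target")
print(f"Average articles per page: {articles_per_page:.0f}")
print(f"Pages still to be made public: {remaining_pages:,}")
```

On these figures, roughly two thirds of the target has been scanned, a little under a quarter is public, and each page carries about 10 articles on average.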

Public users have enhanced the data significantly since August 2008 by correcting 8.13 million lines of text in 368,390 articles, which really improves searching. In addition, 5,061 comments and 230,384 tags have been added to articles; these will be used for search and retrieval in the 2010 version of Trove.

The first 70 years of the Sydney Morning Herald (1831-1901) are now publicly available. Note that some issues of this title are missing. These are being sourced in hard copy from locations in Australia and will be added to the public service in 2010. So don’t worry if you spot a missing issue: the National Library knows about it and it will appear in the service soon.