OVERVIEW
Project Retrospect will endeavor to digitize the entire back catalog of the Met Grotto News ("MGN").
GOALS
The goals of Project Retrospect are:
- Produce an immutable electronic archive of MGN
- Produce electronic editions of MGN suitable for download via the Internet
- Make the contents of the entire MGN archive searchable
- Make yearly electronic archives of MGN available on offline media to Met Grotto members for a nominal fee
RATIONALE
As Met Grotto marches on through time, many of its most important contributions are documented in its monthly newsletter, the Met Grotto News. Beginning in January 2001 (Volume 52, Number 1), this newsletter was made available monthly in Portable Document Format (PDF). Prior to this issue, the monthly newsletter was mailed in hard copy to each Grotto member. A copy of each issue was also placed into the Grotto Library for archival purposes.
As the Library's collection grew, several years of MGN were combined into a single binder, making storage easy. However, because the Library has always been housed in a private storage facility (usually a member's home), accessing the wealth of knowledge MGN represents has not always been easy. In addition, environmental factors have degraded the source documents, a trend that will render the earliest issues unreadable within 10 years. Project Retrospect will ensure that the MGN archive is preserved indefinitely.
Another consideration is the historical importance of the newsletters. Met Grotto, in its 50+ years of existence, has made many significant contributions to the caving community in the fields of exploration and survey, beginner instruction, and writings of general interest. Rather than let these become distant memories, Project Retrospect will make them available to the current generation of Grotto members and, we hope, to cavers around the world. A searchable archive will let cavers draw upon MGN for research into current projects and techniques.
TEAM
Team members: Seth Perlman, Benjamin Peikes
ACTIVITY LOG
January 2002
- Investigated document scanning techniques (appropriate DPI, image formats)
  - Tested at 150dpi-300dpi
  - TIFF, JPEG, GIF
- Investigated optical character recognition (OCR) and PDF conversion options
  - Adobe Capture
    - Will accept many input formats (TIFF, JPEG, GIF, etc.)
    - Resultant PDF file size independent of input file format
    - Resultant PDF disregards print page size (i.e., 8.5"x11") if JPEG is used; this data is preserved with TIFF
    - Requires source images to be at least 200dpi for paper Capture function to work
    - OCR functionality is not very accurate, even with source files at 300dpi
  - DocMorph
    - Will accept many input formats (TIFF, JPEG, GIF, etc.)
    - Resultant PDF file size independent of input file format
    - Resultant PDF disregards print page size (i.e., 8.5"x11") with all input formats (ignores TIFF header data)
    - OCR works at lower image DPI; tested at 150dpi and result was very accurate
- Created test PDF of February 1977 MGN: 10 pages, 8.5MB
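
The scanning figures in the log above can be related with some simple arithmetic. The following is a rough sketch only, not part of the project's tooling: the function names are hypothetical, the uncompressed size assumes 8-bit grayscale scans, and a megabyte is taken as 1,000,000 bytes. It shows the pixel dimensions of an 8.5"x11" page at the resolutions tested, and the average size per page of the February 1977 test PDF.

```python
# Illustrative arithmetic only; names and the grayscale assumption are hypothetical.
LETTER_INCHES = (8.5, 11.0)  # print page size mentioned in the log


def pixel_dimensions(dpi, page_inches=LETTER_INCHES):
    """Return (width, height) in pixels for a page scanned at the given DPI."""
    width_in, height_in = page_inches
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_megabytes(dpi, bits_per_pixel=8, page_inches=LETTER_INCHES):
    """Raw (uncompressed) image size in MB; assumes 8-bit grayscale by default."""
    width_px, height_px = pixel_dimensions(dpi, page_inches)
    return width_px * height_px * bits_per_pixel / 8 / 1_000_000


def megabytes_per_page(total_mb, pages):
    """Average file size per page of a multi-page PDF."""
    return total_mb / pages


for dpi in (150, 200, 300):
    width_px, height_px = pixel_dimensions(dpi)
    raw_mb = uncompressed_megabytes(dpi)
    print(f"{dpi} dpi: {width_px} x {height_px} px, about {raw_mb:.1f} MB raw")

# February 1977 test PDF from the log: 10 pages, 8.5MB
print(f"Test PDF: {megabytes_per_page(8.5, 10):.2f} MB per page")
```

Because pixel count grows with the square of the resolution, a 300dpi scan holds four times as many pixels as a 150dpi scan of the same page. This arithmetic also shows why print page size matters: a converter that ignores the resolution stored in the image header cannot recover the 8.5"x11" dimensions from the pixel counts alone.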