[This message is being sent to everyone on the Five College archives
project distribution list.]
Something old, something new, something borrowed, something (Mount
Holyoke) blue:
1. "Something old" is a Kurzweil scanner, inherited courtesy of the
Foreign Language Resource Center at UMass. It is about 10 years old and
can still do good optical character recognition. The price (free) was
hard to resist. I'll be experimenting with it to see how it stacks up
against Caere's OmniPage Pro 8.0.
2. "Something new": The project web site has recently been restructured
somewhat, and I've been working on making navigation around the site more
efficient. I'd appreciate hearing feedback on the results. (The home
page URL -- clio.fivecolleges.edu -- is still valid, but if you have
bookmarks for pages other than that one on your browser, you will want to
update them.)
3. "Something borrowed": one of Peter Carini's student workers is
preparing the item-level inventory for the correspondence series in the
Mary Lyon Collection. Work on this collection (ca. 2,400 scanned items)
has been underway for the last few weeks. Project assistants Lauren
Cunniffe and Melissa Watterworth have processed about 30 folders of
material. HTML for displaying the images has been postponed until we make
decisions on how best to guide users through the collection pages.
(Incidentally, last week the MHC Archives received an e-mail from a
distant researcher who was unaware of this project but wondered if any
Mary Lyon Collection material might be available on the Internet. People
are knocking at the door and we're hustling to make the place
4. Something (MHC) blue: Digitization of the Mount Holyoke Journal
& Journal Memoranda, a collection of over 2,400 items, is nearly complete.
We are finishing the Memoranda series and now are returning to some items
that didn't scan satisfactorily. This has been a very challenging
collection. In some respects, however, the results have been encouraging.
I would like to hear what you think.
5. New & Blue, I guess: I've been happy with progress in the optical
character recognition (OCR) of the book we are scanning, Stow's HISTORY OF
MOUNT HOLYOKE SEMINARY DURING ITS FIRST HALF-CENTURY, 1837-1887. This is
the only published monograph we are digitizing in the project. We decided
to go the extra mile and use OCR, rather than simply scan the book as page
images, since (a) it is a standard history and text searching would be a
big benefit, (b) a spare copy is available that can be disbound, (c) the
type is generally crisp and easily recognized in OCR, and (d) it's a
valuable exercise. Currently 10 of 22 chapters are available online, with
several more added each week. We are using OmniPage Pro 8.0 for the OCR.
It can save text in a variety of formats (MS Word, in this case), and
its accuracy has been higher than expected.
Since I have been getting questions about our scanning method, here is a
brief description: the Mary Lyon Collection, like most of the collections
before it, is being scanned as black-and-white (bitonal) TIFF images, from
which grayscale GIF images are derived for display on the web. Bitonal
TIFFs are smaller in size than grayscale TIFFs and thus save server space;
it is an easy matter to convert to grayscale using Photoshop. In most
cases, color scanning is not warranted.
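
For anyone who wants to see the arithmetic behind the space savings, here
is a rough sketch in Python. It is an illustration only, not our actual
workflow. The page size and the 300 dpi resolution are assumed figures, the
sizes are for uncompressed pixel data (real TIFFs are usually compressed,
so actual savings will differ), and the block-averaging function is just
one simple way to derive gray values from a bitonal image; it is not a
description of what Photoshop does internally.

```python
def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Raw pixel-data size in bytes, with each row padded to a whole byte."""
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


def bitonal_to_gray(bitmap, factor):
    """Shrink a bitonal bitmap by averaging factor x factor blocks.

    bitmap is a list of rows holding 0 (black) or 1 (white).
    Returns rows of gray values from 0 (black) to 255 (white).
    Any leftover rows or columns that do not fill a block are dropped.
    """
    out_height = len(bitmap) // factor
    out_width = len(bitmap[0]) // factor
    gray = []
    for by in range(out_height):
        row = []
        for bx in range(out_width):
            white = sum(
                bitmap[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            row.append(round(255 * white / (factor * factor)))
        gray.append(row)
    return gray


# Assumed example: an 8.5 x 11 inch page scanned at 300 dpi.
width, height = 2550, 3300
bitonal_size = uncompressed_bytes(width, height, 1)    # 1 bit per pixel
grayscale_size = uncompressed_bytes(width, height, 8)  # 8 bits per pixel
print("bitonal bytes:  ", bitonal_size)
print("grayscale bytes:", grayscale_size)
print("ratio: %.1f" % (grayscale_size / bitonal_size))

# A tiny 2 x 4 bitmap reduced to a single row of two gray pixels.
print(bitonal_to_gray([[0, 0, 1, 1], [0, 1, 1, 1]], 2))
```

The point of the second function is that a reduced-size display image
made from a bitonal scan can carry intermediate gray values, since each
displayed pixel summarizes a block of black and white ones.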
As always, I welcome your comments and questions. Visit the site and take
a spin around, and let me know if you think we're creating something truly
Peter Nelson firstname.lastname@example.org
Five College Archives Digital Access Project
c/o Mount Holyoke College Archives http://clio.fivecolleges.edu
South Hadley, MA 01075 (413) 538-3020