Car Collections: The Greenstone Prototype/Sample Collections

Bob Schmitt

July, 2012

Return to Main Page

Summary:

This webpage is a user guide for the Greenstone Car Library collections, available on DVD or through the carlibrary.org website.

The DVD labeled "Car-Collections" is a self-running program (Greenstone software, Windows PC only) that contains three sample "collections" intended to demonstrate the ability of this software to organize data, particularly data related to car collections.  These collections, selected from the opening screen, are:

  • Petersen Automotive Museum data (370 documents, images, webpages)

  • Frazer Nash data (100 documents, images)

  • Fabulous Fifties Newsletters (48 documents)

There are four additional collections on carlibrary.org.  Whether you use the DVD or the website, this guide should help you find your way!

After selecting any "collection", you can further search for documents, photos or webpages using the full-text search window or a preset "browser" to display the content based on manually-set categories (metadata), such as "Title" or "Description".

Put in the DVD and you should see this screen:

Click "Enter Library" and this will appear:

You can choose any of the three "collections".  If you choose the Frazer Nash logo, this will appear:

From this screen, you can do a full-text search or browse the Titles, Descriptions or Subjects categories.  If you choose "car.location, car.model, (etc)" from the pull-down in the "Search in" box, you will see this:


If you search on "1952", you will get this result:

These are the 42 results that have been categorized under "1952".  You can further click on any item to see the full image or document.  The third item is a PDF file.  When the icon is clicked, you will see the full magazine article.
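The difference between a full-text search and a search limited to one metadata field can be sketched in a few lines of Python.  This is only an illustration of the idea, not Greenstone's own code; the `search` function, the record layout and the sample file names are all assumptions made up for this example.

```python
# Hypothetical records: each has extracted text plus manually-set metadata.
docs = [
    {"file": "a.jpg", "text": "",
     "metadata": {"car.make": "Frazer Nash", "car.year": "1952"}},
    {"file": "b.pdf", "text": "Race report mentioning the 1952 season",
     "metadata": {"car.make": "Frazer Nash", "car.year": "1950"}},
    {"file": "c.jpg", "text": "",
     "metadata": {"car.make": "Frazer Nash", "car.year": "1937"}},
]


def search(records, query, field=None):
    """Return records matching query in all text, or in one metadata field."""
    needle = query.lower()
    hits = []
    for record in records:
        if field is None:
            # Full-text: look in the document text and every metadata value.
            parts = [record.get("text", "")]
            parts += [str(value) for value in record["metadata"].values()]
            haystack = " ".join(parts)
        else:
            # Fielded: look only in the chosen metadata field.
            haystack = str(record["metadata"].get(field, ""))
        if needle in haystack.lower():
            hits.append(record)
    return hits


everywhere = search(docs, "1952")
by_year = search(docs, "1952", field="car.year")
print([r["file"] for r in everywhere])
print([r["file"] for r in by_year])
```

In this sketch the full-text search finds both the photo tagged with the year and the article that merely mentions it, while the fielded search returns only the item whose "car.year" is 1952.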

Go back in your web-browser interface to the original library screen and click "subjects" in the menu bar.  You will get this:

The icon on the left is a "bookshelf", indicating that one or more images or documents have been categorized under one of these "subjects".  The number in parentheses shows how many documents are in the "bookshelf".
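As a rough picture of what a "bookshelf" is, the Python sketch below groups a few records by their "subject" metadata and prints each subject with its count in parentheses.  It is not how Greenstone builds its browsers; the function name, the file names and the second subject are hypothetical, chosen only to show the grouping.

```python
from collections import defaultdict

# Hypothetical records; one document may carry more than one subject.
documents = [
    {"file": "fn_001.jpg", "subjects": ["Frazer Nash, Mille Miglia"]},
    {"file": "fn_002.pdf", "subjects": ["Frazer Nash, Mille Miglia",
                                        "Example subject B"]},
    {"file": "fn_003.jpg", "subjects": ["Example subject B"]},
]


def build_bookshelves(records):
    """Map each subject to the list of files categorized under it."""
    shelves = defaultdict(list)
    for record in records:
        for subject in record["subjects"]:
            shelves[subject].append(record["file"])
    return dict(shelves)


shelves = build_bookshelves(documents)
for subject in sorted(shelves):
    # The count in parentheses mirrors the number shown beside each bookshelf.
    print(f"{subject} ({len(shelves[subject])})")
```

A document with two subjects appears on two shelves, which is why the counts can add up to more than the number of documents in the collection.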

Click "Frazer Nash, Mille Miglia", just past halfway down, and you will see this:

You can click on any of the small images or the Acrobat icon to see the full document or image.  If you click on the "page" to the left of these, you may see Greenstone's text of the same document, if available.  Next to the first image is the name of the image file, the car "make" and the original photo date.  These are metadata items.  The file name and photo date are automatically extracted from the image or document.

You can go back to the main screen again and enter the "Fabulous Fifties" newsletter collection or the Petersen Automotive Museum collection and search or browse in a similar manner.

At this time, my indexing and categorization in these collections are not consistent or complete, but these samples should nevertheless demonstrate Greenstone's potential.  As we learn more, our categorization skills will improve!  More important, it is highly desirable that a community of users consider and settle on standards for "titles", "descriptions", and "subjects".  "Car.make", "car.manufacturer", and "car.year" are much easier to understand!

Contents of the Collections

1.  The Fabulous Fifties collection consists of 48 newsletters and special announcements from this "non-club" organization.  They are in PDF (text and image) format and can be searched in full-text by Greenstone.

2.  The Petersen collection includes a few static web pages, photos of their current exhibits and some digital assets in a group titled Digital Library Collection.  A list of the video interviews done by Bill Pollack is part of these digital assets.

The Digital Library Collection also includes early auto periodicals.  Eight years ago Bob Norton (gone from us too early) embarked on a massive scanning project of early west coast racing newsletters and magazines.  Less than 25% of his work is included in this Petersen sample collection.  Bob made a full 28-page index of the 9 volumes of "MotoRacing" (Index.pdf), which is text-rich and searchable and is included in this Greenstone sample collection.  The other PDF files in the Norton segment are "image only" and not text-searchable.  However, these PDF images can be put through OCR software to generate text.  One page of one copy of MotoRacing was processed using ABBYY FineReader and it worked very well, with about 95% accuracy.

3.  The Frazer Nash collection includes a complete list of the prewar (chain-drive) and postwar Frazer Nash cars.  The details of these cars, held in an Excel file, are now "metadata" that makes the addition of other photos and documents much easier and more consistent.  There are also approximately 100 Frazer Nash images and documents in this collection, still only a fraction of my personal files.
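To show why a car list kept as a spreadsheet makes adding new items easier and more consistent, here is a minimal Python sketch.  It assumes the list has been exported to CSV; the column names, the "chassis" key and the placeholder rows are invented for illustration and are not taken from the actual Excel file or from Greenstone.

```python
import csv
import io

# Hypothetical CSV export of the car list (placeholder values only).
CAR_LIST_CSV = """chassis,car.make,car.model,car.year
X001,Frazer Nash,Example Model A,1950
X002,Frazer Nash,Example Model B,1952
"""


def load_car_index(csv_text):
    """Read the car list and index each row by its chassis identifier."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["chassis"]: row for row in reader}


def metadata_for(file_name, chassis, car_index):
    """Build the metadata for a new photo by copying the car's known details."""
    record = dict(car_index[chassis])  # copy so the index is not modified
    record["file"] = file_name
    return record


car_index = load_car_index(CAR_LIST_CSV)
new_item = metadata_for("new_photo.jpg", "X002", car_index)
print(new_item)
```

Because every new photo takes its make, model and year from the same row, two photos of the same car cannot end up with different spellings or dates.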

Email me with any questions!  Bob Schmitt, rgschmitt@gmail.com

Greenstone Digital Library Background

Greenstone is designed to organize digital media into a "library".  It originated at the University of Waikato in Hamilton, New Zealand, which still has an active development team.  It follows widely-accepted digital library standards and is both very professional and powerful.

The "Main Page" has further information on Greenstone and steps to implement a digital library using it.
