Car Collections: A Greenstone Sample
November 11, 2011
The DVD labeled "RGS-Collections" is a self-running program (Greenstone software) which contains three sample "collections" intended to demonstrate this software's ability to organize data, related particularly to car collections. These separate files, selected from the opening screen, are:
After selecting any "collection", you can further search for documents, photos or webpages using the full-text search window or a preset "browser" to display the content based on manually-set categories (meta-tags), such as "Title" or "Description".
Put in the DVD and you should see this screen:
Click "Enter Library" and this will appear:
You can choose any of the three "collections". If you choose the Frazer Nash logo, this will appear:
From this screen, you can do a full-text search or browse the Titles, Descriptions or Subjects categories. If you make the pull-down choice "car.location, car.model, (etc)" in the "Search in" box, you will see this:
If you search on "1952", you will get this result:
These are all the results that have been categorized under "1952". You can further click on any item to see the full image or document.
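For readers who want to picture what a "Search in" field search does, here is a minimal sketch in Python. It is not Greenstone's own code: the `search_fields` function is hypothetical, and the record values are invented for illustration. Only the "car.xxx" field names come from the sample collection.

```python
# Hypothetical records: the field names follow the "car.xxx" pattern used in
# the sample collection, but every value below is invented for illustration.
RECORDS = [
    {"Title": "Example photo A", "car.model": "Example Model 1",
     "car.year": "1952", "car.location": "Example Museum"},
    {"Title": "Example document B", "car.model": "Example Model 2",
     "car.year": "1949", "car.location": "Example Garage"},
    {"Title": "Example photo C", "car.model": "Example Model 1",
     "car.year": "1952", "car.location": "Example Track"},
]


def search_fields(records, term, fields):
    """Return the records where `term` appears in any of the named fields.

    The match is a case-insensitive substring test; a real search engine
    would use an index rather than scanning every record.
    """
    term = term.lower()
    return [
        rec for rec in records
        if any(term in rec.get(field, "").lower() for field in fields)
    ]


# Search the chosen fields for "1952", as in the walkthrough above.
matches = search_fields(RECORDS, "1952", ["car.location", "car.model", "car.year"])
for rec in matches:
    print(rec["Title"])
```

The point of the sketch is only that a field search looks inside the chosen meta-tags rather than the full text, so the results depend on how carefully each item was tagged.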
Go back in your web-browser interface to the original library screen and click "titles" in the menu bar. You will get this:
The icon on the left is a "bookshelf", indicating one or more images or documents have been categorized under one of these "titles".
Click "Bristol and Frazer Nash Documents", third from the top, and you will see this:
You can click on any of the small images or the Acrobat icon to see the full document. If you click on the "page" to the left of these, you may see Greenstone's text of the same document, if available.
You can go back to the main screen again and enter the "Fabulous Fifties" newsletter collection or the Petersen Automotive Museum collection and search or browse in a similar manner.
At this stage, the indexing and categorization in these collections are not consistent or complete, but these samples should nevertheless demonstrate Greenstone's potential. As we learn more, our categorization skills will improve! More important, we want to form a community of users to settle on standards for "titles", "descriptions", and "subjects". "Car.make", "car.manufacturer", and "car.year" are much easier to understand!
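One way a group of users could check their tagging against an agreed standard is sketched below in Python. This is only an illustration under assumptions: the list of required fields, the `missing_fields` and `report_gaps` helpers, and the sample records are all hypothetical, not part of Greenstone or of the sample collections.

```python
# Assumed standard: every item should carry these three meta-tags.
# The field names come from the text above; treating them as "required"
# is an assumption made for this sketch.
REQUIRED_FIELDS = ("car.make", "car.manufacturer", "car.year")


def missing_fields(record, required=REQUIRED_FIELDS):
    """List the required meta-tags that are absent or blank in one record."""
    return [f for f in required if not str(record.get(f, "")).strip()]


def report_gaps(records):
    """Map each incompletely tagged record's title to its missing meta-tags."""
    gaps = {}
    for rec in records:
        missing = missing_fields(rec)
        if missing:
            gaps[rec.get("Title", "(untitled)")] = missing
    return gaps


# Invented sample records, for illustration only.
sample = [
    {"Title": "Example photo A", "car.make": "Example Make",
     "car.manufacturer": "Example Works", "car.year": "1952"},
    {"Title": "Example document B", "car.make": "Example Make",
     "car.year": "  "},
]

for title, missing in report_gaps(sample).items():
    print(f"{title}: missing {', '.join(missing)}")
```

A check like this does not decide what the standard should be; it only shows where the tagging falls short of whatever the community agrees on.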
Email me with any questions! Bob Schmitt, email@example.com
Greenstone Digital Library Background
Greenstone is designed to organize digital media into a "library". It originated at the University of Waikato in Hamilton, New Zealand, where it has an active development team. It follows widely accepted digital library standards and is both professional and powerful.
Making a Greenstone Digital Collection
To make a Greenstone collection:
The Greenstone.org website has workshop courses and tutorials to download. These are well-designed courses and explain how to use existing meta-tags or add custom meta-tags. For the sample data, I've added the "car.xxx" meta-tags.
Greenstone has functions to import nearly any type of file, change the web-browser interface appearance and change how the search and browse results are displayed. The Librarian Interface makes this customization "easier", but expect to need some technical skills to get full use of the Greenstone functions.
Greenstone is very robust; see the examples of completed collections at http://www.greenstone.org/examples. The largest collection, with a reported 1,000,000+ images, is an extensive archive of New Zealand newspapers, both as images and full text.
A similar open-source program, DSpace, was developed by MIT and HP for academic use. The DSpace site shows over 300 institutions using this system, primarily in the U.S. Greenstone has a tutorial to show how to move a digital collection from DSpace to Greenstone and vice-versa. Comparisons of both systems on the Internet seem to confer no advantage to either and note that meta-tag classification in either collection is preserved.
Appendix - Contents of the Collections
1. The Fabulous Fifties collection consists of 48 newsletters and special announcements from this "non-club" organization. Most are in PDF format and can be searched in full-text by Greenstone.
2. The Petersen collection includes several of their web pages, photos of their current exhibits and some digital assets in a group titled Digital Library Collection. A list of the video interviews done by Bill Pollack is part of these digital assets.
The Digital Library Collection also includes early auto periodicals. Eight years ago Bob Norton (gone from us too early) embarked on a massive scanning project of early west coast racing newsletters and magazines. Less than 25% of his work is included in the Petersen collection. He made a full 28-page index of the 9 volumes of "MotoRacing" (Index.pdf), which is text-rich and searchable, including in this Greenstone sample collection. The other PDF files in the Norton segment are images only and are not text-searchable. However, such PDF images can be put through OCR software to generate text. One page of one copy of MotoRacing was processed using ABBYY FineReader and it worked very well, with about 95% accuracy.
3. In the Frazer Nash collection is a (redacted) Excel list of the Frazer Nash Club members and their cars. The "cars" file was expanded to include all the postwar cars. There are also approximately 100 Frazer Nash images and documents in this collection, many of the postwar cars.
A car collector or the manager of a(ny) collection with a need to organize the data on the vehicles in the collection should:
About Bob Schmitt: I am a car hobbyist and mostly retired from my former occupation as a technology contracts manager (tech-contracts.com). My car hobby is currently centered on a Frazer Nash car which was restored in Arizona and New Zealand and is currently in the Classic Car Museum in Nelson, NZ (my website for the car is FrazerNash-USA.com). Long before my career in law and contract management, I completed grad school in Information Science at the University of Hawaii (in the days of punch cards!). However, when a law school opened at UH and I had unused GI Bill time, I took a path away from computers for a long time.
I've been involved with several database and scanning projects for contracts with my last three (large corporate) employers; the results did not seem to justify the costs.
A few years ago, I began to help Bill Pollack with his video interview project (with people more or less well-known in the car/racing world) for the Petersen Automotive Museum (Los Angeles). Initially I digitized video tapes of Bill's completed interviews and later helped with his new interviews. We now use high-definition video, which I process to create standard DVDs from the hi-def format. I've also collected and organized the results of a scanning project of '50s-'60s motoring journals done by retired engineer Bob Norton.