Creating A Database of Ship Citations
![Page 1: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/1.jpg)
CREATING A DATABASE OF
SHIP CITATIONS: THE CHALLENGES ENCOUNTERED
IN SHIPINDEX.ORG
The Charleston Conference, 3 Nov 2010
Peter McCracken, Co-Founder & Director of Content
and Business Development, ShipIndex.org
![Page 2: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/2.jpg)
What kinds of ships are these?
Bark (or barque); Ship; Brigantine; Barquentine; Topsail Schooner; Schooner
![Page 3: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/3.jpg)
Serials :: Ships
Publication pattern (or format?) :: Vessel type
Serial title :: Ship name
ISSN :: IMO
Ship research :: Any other historical research
![Page 4: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/4.jpg)
Ships :: Other historical research
Problems with ships are the same as problems
with personal names, geographic descriptors,
etc.
Can also apply to concepts, as well as things
Also ‘non-unique’ items, like a car model
![Page 5: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/5.jpg)
Data challenges – personal names
Innumerable works by “Anonymous”
Names are often shortened
Pablo Picasso’s full name was Pablo Diego José
Francisco de Paula Juan Nepomuceno María de
los Remedios Cipriano de la Santísima Trinidad
Ruiz y Picasso
Names have strange limitations
Some must be unique – Consider Michael J. Fox
Some are very common – Consider Adam Smith
![Page 6: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/6.jpg)
Data challenges – geographic names
Numerous variations: Köln; Cologne; Keulen;
Colonia; Colònia; Kolín nad Rýnem; Cwlen;
Κολωνία; Kolonjo; كولونيا; Кьолн; Ķelne; Кёльн
Name changes
Hot Springs, NM -> Truth or Consequences, NM
Halfway, OR -> Half.com, OR
Clark, TX -> DISH, TX
St. Petersburg -> Petrograd -> Leningrad ->
St. Petersburg (“Petersburg,” or “Piter”)
![Page 7: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/7.jpg)
A “meaning-less” identifier
Regardless of the topic, some meaning-less
identifier can provide significant assistance
“Meaning-less” in the sense of a one-to-many
relationship between the identifier and the
data
The identifier doesn’t change, but the data can
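A minimal Python sketch of this idea, assuming a hypothetical in-memory registry. The identifier value, the `Record` class, the `attach` helper, and the citation strings are invented for illustration. One opaque identifier stays fixed while the names and citations attached to it grow or change.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Data attached to one opaque identifier; all of it may change over time."""
    names: list[str] = field(default_factory=list)
    citations: list[str] = field(default_factory=list)


# Hypothetical registry: the key carries no meaning, it only points at data.
registry: dict[str, Record] = {}


def attach(identifier: str, name: str, citation: str) -> None:
    """Add a name variant and a citation under a new or existing identifier."""
    record = registry.setdefault(identifier, Record())
    if name not in record.names:
        record.names.append(name)
    record.citations.append(citation)


# Two spellings of the same place resolve to one identifier.
attach("000-00001-8", "Köln", "Hypothetical gazetteer, p. 12")
attach("000-00001-8", "Cologne", "Hypothetical atlas, p. 40")
```

Because nothing about the entity is encoded in the key, a renaming or a newly found spelling only adds data under the key; the key itself never needs to be reissued.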
![Page 8: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/8.jpg)
Overview of ShipIndex.org
A database of citations –
>1.42 million citations, from >200 resources
>140,000 citations are freely available
Changes how one does maritime research
Far more content can be researched more quickly
Opens up maritime research to everyone
No need for inside knowledge on where to start searching
Uncovers many hidden resources
Locates free, but hidden, web resources
![Page 9: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/9.jpg)
Maritime access points
Vessel name
Vessel number
IMO numbers are new; hull numbers change
Captain name
Captains change between voyages, and die during them
Rig or vessel type
Ships are rebuilt; definitions change; “ship”
ALSO: Port of registration; crew members; others
![Page 10: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/10.jpg)
Vessel names – this is easy!
“What does the
stern say?”
1872, American Lloyd’s Register of American and Foreign
Shipping
1867, American Lloyd’s Register of American and Foreign
Shipping
![Page 11: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/11.jpg)
Sources of errors – primary sources
Mistakes in primary sources are very common,
and forgivable
Digitized version of Lloyd’s List of 1812
Ships called “Adolph & Fredericka”
![Page 12: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/12.jpg)
Sources of errors – transcribers, indexers, OCR operators, etc.
Transcription errors are very easy to make –
whether through incorrect assumptions, or
just mistakes
“Earnets” for “Earnest”; “Elizaneth” for
“Elizabeth”, etc.
Some files are much tougher to manage than
others
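One possible way to catch slips like these is approximate string matching against a list of known names. The sketch below uses Python's standard `difflib` module; the `KNOWN_NAMES` list, the `suggest_names` helper, and the 0.8 cutoff are assumptions for illustration, not a description of how ShipIndex.org handles such errors.

```python
import difflib

# Hypothetical authority list of known vessel names.
KNOWN_NAMES = ["Earnest", "Elizabeth", "Mary", "Maria", "Union"]


def suggest_names(transcribed: str, cutoff: float = 0.8) -> list[str]:
    """Return known names that closely resemble a possibly mistyped name."""
    return difflib.get_close_matches(transcribed, KNOWN_NAMES, n=3, cutoff=cutoff)


earnest_guess = suggest_names("Earnets")
elizabeth_guess = suggest_names("Elizaneth")
```

Suggestions of this kind would still need human review: a close match can also be a genuinely different vessel name, such as "Mary" and "Maria".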
![Page 13: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/13.jpg)
More challenges
How do we locate Elizabeth? Or Mary?
Elizabeth = 1899 citations
Mary = 2614 citations
Top ten ship names, for no good reason: Mary, Maria, Elizabeth, Anna, Union, Victoria, Hope, Flora, Emma, America
Try to limit result sets?
by time period
by vessel rig (maybe?)
by location(?)
by nationality
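The narrowing described above could be sketched as a filter over citation records. Everything here is hypothetical: the `Citation` fields, the `narrow` helper, and the sample data. One design choice is assumed and worth stating: a citation with a missing field is kept rather than dropped, since many sources do not record rig or nationality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Citation:
    """One hypothetical citation; None marks a field the source did not record."""
    ship_name: str
    source: str
    year: Optional[int] = None
    rig: Optional[str] = None
    nationality: Optional[str] = None


def narrow(citations, name, year_range=None, rig=None, nationality=None):
    """Filter by name, then by whichever optional criteria are supplied."""
    results = []
    for c in citations:
        if c.ship_name != name:
            continue
        if year_range and c.year is not None:
            if not (year_range[0] <= c.year <= year_range[1]):
                continue
        if rig and c.rig is not None and c.rig != rig:
            continue
        if nationality and c.nationality is not None and c.nationality != nationality:
            continue
        results.append(c)
    return results


sample = [
    Citation("Mary", "Register A", year=1867, rig="Bark", nationality="US"),
    Citation("Mary", "Register B", year=1872, rig="Schooner", nationality="GB"),
    Citation("Mary", "Index C"),  # no metadata recorded, so it is never excluded
    Citation("Elizabeth", "Register A", year=1867, rig="Ship", nationality="US"),
]

hits = narrow(sample, "Mary", year_range=(1860, 1870), rig="Bark")
```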
![Page 14: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/14.jpg)
Changing vessel names
What do we do when a vessel changes its
name?
A person researching a vessel wants to know the
life of a ship; at present they need to know its
previous or subsequent names
This can only be done when we have unique
vessel identifiers – otherwise, how do you know
which Elizabeth became Hogwarts Belle?
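A short sketch of why the identifier matters here, using invented data: two vessels share the name Elizabeth, and only the identifier tells us which one was renamed. The `NamePeriod` class, the helper functions, the years, and the identifier values are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NamePeriod:
    """A name a vessel carried and the years it was in use (end=None: still in use)."""
    vessel_id: str
    name: str
    start: int
    end: Optional[int] = None


# Hypothetical data: two different vessels named Elizabeth; only one was renamed.
PERIODS = [
    NamePeriod("000-00001-8", "Hogwarts Belle", 1861, None),
    NamePeriod("000-00001-8", "Elizabeth", 1850, 1861),
    NamePeriod("000-00002-6", "Elizabeth", 1855, 1870),
]


def life_of_ship(vessel_id: str) -> list[str]:
    """Return every name used by one vessel, in chronological order."""
    periods = sorted(
        (p for p in PERIODS if p.vessel_id == vessel_id),
        key=lambda p: p.start,
    )
    return [p.name for p in periods]


def vessels_named(name: str) -> list[str]:
    """Return the identifiers of all vessels that ever carried this name."""
    return sorted({p.vessel_id for p in PERIODS if p.name == name})
```

A search for "Elizabeth" returns two identifiers, while a search for "Hogwarts Belle" returns one; following that identifier back through `life_of_ship` recovers the earlier name without the researcher having to know it in advance.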
![Page 15: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/15.jpg)
Existing vessel identifiers
Hull Identification Number – Only US; any powered boat
USCG Documentation Number – Only US; >5 net tons
IMO Number – Assigned by Lloyd’s/Fairplay; international; passenger ships >100 gross tons, and cargo ships >300 gross tons; mandatory from 1996
Naval Identifiers – e.g., PT-109, CV-42, BB-18, DD-793, D118, etc.
Lloyd’s numbers, and many more…
![Page 16: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/16.jpg)
Unique historical vessel identifiers
Need an easy way to differentiate between
“Mary,” “Mary,” and “Mary”
Needs to be unique and unchanging (unlike
name, naval identifier, etc.)
Identifier itself has no meaning – no
indication within it of size, nationality, etc.
Identifier is quickly & automatically assigned
Identification is coordinated with multiple
organizations
![Page 17: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/17.jpg)
Creating an identifier
Could be done through a standards-creation
process, via NISO or another organization
Or informally, with publicly-defined
guidelines, such as (just as examples):
Nine-digit number; ddd-ddddd-c (c=check digit)
Allow individuals to easily request identifiers for
their vessels or their citations
Need ability to easily combine/split/modify
User-managed is likely most cost-effective solution
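The ddd-ddddd-c pattern above does not say how the check digit would be computed. As one possibility only, the sketch below assumes the widely used Luhn algorithm over the eight leading digits; the function names are hypothetical and a real scheme might choose a different checksum.

```python
import re

# Shape of the example pattern: three digits, five digits, one check digit.
IDENTIFIER_SHAPE = re.compile(r"([0-9]{3})-([0-9]{5})-([0-9])")


def luhn_check_digit(payload: str) -> int:
    """Compute a Luhn check digit for a string of decimal digits."""
    total = 0
    # Walk from the rightmost payload digit, doubling every second digit.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def format_identifier(serial: int) -> str:
    """Format a serial number (0 to 99,999,999) as ddd-ddddd-c."""
    if not 0 <= serial <= 99_999_999:
        raise ValueError("serial must fit in eight digits")
    payload = f"{serial:08d}"
    return f"{payload[:3]}-{payload[3:]}-{luhn_check_digit(payload)}"


def is_valid(identifier: str) -> bool:
    """Check both the ddd-ddddd-c shape and the trailing check digit."""
    match = IDENTIFIER_SHAPE.fullmatch(identifier)
    if match is None:
        return False
    payload = match.group(1) + match.group(2)
    return luhn_check_digit(payload) == int(match.group(3))
```

Assigning serial numbers sequentially keeps the identifier free of meaning, and the check digit lets a citation database reject most single-digit typos before they are stored.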
![Page 18: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/18.jpg)
Creating an identifier
Must have buy-in from many groups
Should be easy to implement
Should be easy to use; available to many
individuals and resources
Pre-populate as much as possible, open
editing to all
Maintain advisory group to address concerns,
disagreements, etc.
![Page 19: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/19.jpg)
Defining <ShipIdentifier>
<OtherIdentifiers>
<IdentifierType>
<IdentifierNumber>
<ShipName>
<DateNameStartedInUse>
<DateNameEndedInUse>
<PreviousShipName>
<SubsequentShipName>
<RigType> - defined list of types, & “other”
<VoyageIdentifier> - multiple
![Page 20: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/20.jpg)
More <ShipIdentifier>
<MilitaryUsage?> - yes/no/unclear
<Nationality>
<ServiceBranch>
<HullIdentifier>
<VesselMeasurements>
<MeasurementType> - list of options
<MeasurementValue>
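To make the element list concrete, here is a sketch that builds one `<ShipIdentifier>` record with Python's standard `xml.etree.ElementTree`. Several things are assumptions: the nesting of elements, carrying the identifier as an `id` attribute, writing `<MilitaryUsage?>` as `MilitaryUsage` (a question mark is not allowed in an XML element name), and every value, all of which are placeholders.

```python
import xml.etree.ElementTree as ET


def add_child(parent: ET.Element, tag: str, text: str = "") -> ET.Element:
    """Append a child element with optional text and return it."""
    child = ET.SubElement(parent, tag)
    if text:
        child.text = text
    return child


def build_ship_record() -> ET.Element:
    """Build one illustrative record; every value below is a placeholder."""
    ship = ET.Element("ShipIdentifier", {"id": "000-00001-8"})

    other = add_child(ship, "OtherIdentifiers")
    add_child(other, "IdentifierType", "USCG Documentation Number")
    add_child(other, "IdentifierNumber", "0000000")

    add_child(ship, "ShipName", "Elizabeth")
    add_child(ship, "DateNameStartedInUse", "1850")
    add_child(ship, "DateNameEndedInUse", "1861")
    add_child(ship, "SubsequentShipName", "Hogwarts Belle")
    add_child(ship, "RigType", "Bark")

    # <VoyageIdentifier> may repeat, one per voyage.
    for voyage_id in ("V-0001", "V-0002"):
        add_child(ship, "VoyageIdentifier", voyage_id)

    add_child(ship, "MilitaryUsage", "no")
    add_child(ship, "Nationality", "US")

    measurements = add_child(ship, "VesselMeasurements")
    add_child(measurements, "MeasurementType", "gross tons")
    add_child(measurements, "MeasurementValue", "300")
    return ship


record = build_ship_record()
xml_text = ET.tostring(record, encoding="unicode")
```

A vessel with several successive names would need the name fields grouped per name; this flat layout is only the simplest reading of the list.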
![Page 21: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/21.jpg)
Defining <VoyageIdentifier>
<ShipIdentifier>
<Captain>
<Crew> - multiple positions, multiple names
<CrewPosition>
<CrewmemberName>
<OtherVoyageIdentifiers>
<OtherVoyageDatabase>
<OtherVoyageDbId>
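The voyage record is what links people to vessels. The sketch below models it with plain Python dataclasses; the class, the helper functions, the person names, and the external database name are all hypothetical, and the field layout is only one reading of the element list above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Voyage:
    """One voyage; fields loosely mirror the elements listed above."""
    voyage_id: str
    ship_id: str   # <ShipIdentifier>
    captain: str   # <Captain>
    # <CrewPosition> mapped to one or more <CrewmemberName> values
    crew: dict[str, list[str]] = field(default_factory=dict)
    # <OtherVoyageDatabase> mapped to <OtherVoyageDbId>
    other_ids: dict[str, str] = field(default_factory=dict)


def voyages_by_ship(voyages: list[Voyage]) -> dict[str, list[str]]:
    """Group voyage identifiers under the ship that made them."""
    grouped = defaultdict(list)
    for voyage in voyages:
        grouped[voyage.ship_id].append(voyage.voyage_id)
    return dict(grouped)


def ships_for_person(voyages: list[Voyage], person: str) -> set[str]:
    """Find every ship on which a person served as captain or crew."""
    found = set()
    for voyage in voyages:
        names = {voyage.captain}
        for members in voyage.crew.values():
            names.update(members)
        if person in names:
            found.add(voyage.ship_id)
    return found


VOYAGES = [
    Voyage("V-0001", "000-00001-8", "A. Example", {"Mate": ["B. Sample"]}),
    Voyage("V-0002", "000-00001-8", "B. Sample", {"Cook": ["C. Placeholder"]},
           {"Hypothetical Voyage DB": "12345"}),
    Voyage("V-0003", "000-00002-6", "B. Sample"),
]
```

Keeping the captain on the voyage rather than on the ship reflects the earlier point that captains change between voyages.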
![Page 22: Creating A Database of Ship Citations](https://reader034.fdocuments.net/reader034/viewer/2022051314/55a331771a28ab77618b458d/html5/thumbnails/22.jpg)
Expanding to other fields
Makes discovery more manageable
Makes linking possible
Use the same concept for other areas of
research, linking everything together
People
Places
Manufactured items
Artwork
Everything