MarcEdit: Doing more, but faster Terry Reese Gray Family Chair for Innovative Library Services...


Page 1: MarcEdit: Doing more, but faster Terry Reese Gray Family Chair for Innovative Library Services Terry.reese@oregonstate.edu.


Page 2:

• Making your metadata work for you
• Finding ways to use MarcEdit to merge and manipulate existing metadata in various formats
  • i.e., working with XML formats, delimited formats, Excel, Access
• Dealing with data in multiple character sets as we transition to a Unicode world

• Learning how to automate repetitive tasks, and understand what editing functions are available to you

• Leveraging webservices like OCLC WorldCat to provide automatic classifications

Page 3:

METADATA MANIPULATION

Page 4:

MARC Tools Portal

Page 5:

Marc Tools

• Built-in functions
• MarcBreaker – Tool used to convert MARC records to the MarcEdit mnemonic format
• MarcMaker – Tool used to convert MarcEdit mnemonic format to MARC
• MARC=>MARC21XML – converts MARC to MARC21XML
  • Automatically converts data from MARC-8 to UTF-8
• MARC21XML=>MARC – converts MARC21XML to MARC
  • Doesn’t automatically convert data from UTF-8 to MARC-8 – will leave data in UTF-8
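To make the mnemonic format concrete, here is a minimal Python sketch that reads one mnemonic line into its tag, indicators and subfields. It is not MarcEdit code: `parse_mnemonic_line` is a hypothetical helper, and it assumes lines shaped like the `=650 \0$aGeology$zOregon` examples in these slides, with a backslash standing for a blank indicator.

```python
import re

# Hypothetical helper, not part of MarcEdit: splits one mnemonic line into parts.
MNEMONIC_LINE = re.compile(r"^=(?P<tag>\w{3})\s+(?P<body>.*)$")

def parse_mnemonic_line(line):
    match = MNEMONIC_LINE.match(line)
    if match is None:
        raise ValueError(f"not a mnemonic field line: {line!r}")
    tag, body = match.group("tag"), match.group("body")
    if tag == "LDR" or tag < "010":
        # Leader and control fields (001-009) have no indicators or subfields.
        return {"tag": tag, "data": body}
    # First two characters are the indicators; a backslash marks a blank.
    indicators = body[:2].replace("\\", " ")
    # Each "$" starts a subfield: one code character followed by its data.
    subfields = [(chunk[0], chunk[1:]) for chunk in body[2:].split("$")[1:] if chunk]
    return {"tag": tag, "indicators": indicators, "subfields": subfields}

field = parse_mnemonic_line(r"=650  \0$aGeology$zOregon")
```

The sketch ignores escaped characters inside subfield data, which a real converter would have to handle.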

Page 6:

MARC Character Conversions

• Supports moving between any known Windows character set and MARC-8.

• Can be run from the Breaker/Maker – or as its own standalone utility

Page 7:

MARCSplit/MARCJoin

• Utility used for splitting large MARC record sets into smaller files

• Utility used for joining large sets of MARC data to a single file
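The idea behind splitting and joining can be sketched in a few lines of Python. This is an illustration only, not how MARCSplit/MARCJoin are implemented: it relies on the fact that each binary MARC record ends with the record terminator byte 0x1D, and the function names are made up for the example.

```python
RECORD_TERMINATOR = b"\x1d"  # end-of-record byte in binary MARC (ISO 2709)

def split_records(data):
    """Return the individual records found in a block of raw MARC bytes."""
    return [chunk + RECORD_TERMINATOR
            for chunk in data.split(RECORD_TERMINATOR) if chunk]

def chunk_records(records, size):
    """Group records into batches of at most `size` records, one blob per output file."""
    return [b"".join(records[i:i + size]) for i in range(0, len(records), size)]

def join_files(blobs):
    """Concatenate the contents of several MARC files into one."""
    return b"".join(blobs)

# Placeholder bytes stand in for real records here.
sample = b"rec1\x1drec2\x1drec3\x1d"
batches = chunk_records(split_records(sample), 2)
```

Because binary MARC files are just records written back to back, joining is plain concatenation and splitting only needs to find the record boundaries.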

Page 8:

Batch Record Processor

• Allows MarcEdit to process “lots” of files.

• Files can be processed against an entire folder’s contents or by file type

• Can utilize any built-in or derived XML Function transformation

Page 9:

MarcEdit and bad records

• Two MARC breaking algorithms
  • Strict MARC algorithm
  • Loose breaking algorithm
• Loose algorithm can heal MARC records (sometimes)
  • Structural errors
  • Missing field or record markers
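The slides do not describe how the two algorithms work internally, so the following Python sketch only illustrates the general distinction, under the assumption that a strict reader trusts the record length stored in the first five bytes of each leader while a loose reader ignores it and cuts on the record terminator byte. `read_strict` and `read_loose` are hypothetical names.

```python
RECORD_TERMINATOR = b"\x1d"  # end-of-record byte in binary MARC

def read_strict(data):
    """Trust the 5-digit record length at the start of each leader; fail on any mismatch."""
    records, pos = [], 0
    while pos < len(data):
        length_field = data[pos:pos + 5]
        if len(length_field) < 5 or not length_field.isdigit():
            raise ValueError(f"bad record length at byte {pos}")
        length = int(length_field)
        record = data[pos:pos + length]
        if length < 5 or len(record) != length or not record.endswith(RECORD_TERMINATOR):
            raise ValueError(f"record at byte {pos} does not match its stated length")
        records.append(record)
        pos += length
    return records

def read_loose(data):
    """Ignore the stated lengths and cut wherever a record terminator appears."""
    return [chunk + RECORD_TERMINATOR
            for chunk in data.split(RECORD_TERMINATOR) if chunk.strip()]
```

A record whose leader states the wrong length stops the strict reader but still comes out of the loose reader, which is the kind of structural error the loose approach can sometimes recover from.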

Page 10:

Delimited text translator

• Delimited Text Translator
  • Translates tab, comma, pipe, Excel (Office 2000-2007), and Access (Office 2000-2007) files into MARC
  • Can save translation maps
  • Can create constant data

Page 11:

Delimited text translator Options

• Wizard-like interface
• Supports Unicode data (in Excel or delimited file)
• Joining (relating) fields
• Editing global 008/LDR

Page 12:

Delimited Text Translator: Mapping format

• Map to: Field + subfield
• Indicators: Indicator values
• Term Punct.: Trailing punctuation
• Arguments – Joining defined items (select and right click on items)

• Ability to save templates

Page 13:

Common Joining techniques

• When would I mark a field as repeatable?
• By default, when the Delimited Text Translator encounters two like subfields on the same field, it creates a new field. For example:
column 1: This is a note
column 2: This is a note 2
If I mapped column 1 to 500$a and column 2 to 500$a, by default, MarcEdit would generate the following output:
=500 \\$aThis is a note
=500 \\$aThis is a note 2
• However...

Page 14:

Common Joining techniques

• When would I mark a field as repeatable?
• If I need multiple like subfields on the same field (for example, in a subject field), I would mark the field as repeatable:
column 1: Geology
column 2: Oregon
column 3: Corvallis
If these fields were not marked as repeatable, the output would look like:
=650 \0$aGeology$zOregon
=650 \0$zCorvallis
However, if these fields were marked as repeatable, the output would look like:
=650 \0$aGeology$zOregon$zCorvallis
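The joining rule from the last two slides can be modelled in a short Python sketch. It is a reading of the examples above, not MarcEdit's code: `build_fields` is a hypothetical function, the column-to-subfield mapping (column 1 to 650$a, columns 2 and 3 to 650$z) is inferred from the output shown, and the generated lines use two spaces after the tag, as mnemonic files are normally written.

```python
def build_fields(columns, repeatable=False):
    """columns: list of (tag, indicators, subfield_code, value) in column order."""
    fields = []  # each entry: [tag, indicators, [(code, value), ...]]
    for tag, indicators, code, value in columns:
        # Find the most recently created field with the same tag, if any.
        target = next((f for f in reversed(fields) if f[0] == tag), None)
        has_code = target is not None and any(c == code for c, _ in target[2])
        # A like subfield on the same field starts a new field unless repeatable.
        if target is None or (has_code and not repeatable):
            target = [tag, indicators, []]
            fields.append(target)
        target[2].append((code, value))
    return ["={}  {}{}".format(tag, ind, "".join(f"${c}{v}" for c, v in subs))
            for tag, ind, subs in fields]

subject_columns = [
    ("650", r"\0", "a", "Geology"),
    ("650", r"\0", "z", "Oregon"),
    ("650", r"\0", "z", "Corvallis"),
]
split_output = build_fields(subject_columns)                    # two 650 fields
joined_output = build_fields(subject_columns, repeatable=True)  # one 650 field
```

With `repeatable=False` the second `$z` starts a new 650, and with `repeatable=True` all three subfields land on one field, matching the two outputs on the slide.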

Page 16:

MarcEdit Crosswalking model

[Diagram: crosswalking model with the nodes MARC21XML, EAD, FGDC, MODS, MARC, and Dublin Core]

Page 17:

Finding and Contributing Crosswalks

• In MarcEdit 5.6, an option was added to allow users to search for crosswalks

• Currently, these are crosswalks I or LC have created
• Hopefully, community members will submit crosswalks for inclusion into the registry

Page 18:

MarcEdit: Crosswalks for everyone

Page 19:

Harvesting Metadata

• MarcEdit includes a built-in OAI harvester
• Allows for direct XML=>MARC translations

• Allows for custom modification of XSLT translation tables.

Page 20:

Harvesting Metadata

• Required data
  • Host name: i.e., http://ir.library.oregonstate.edu/request/oai
  • Metadata Type
    • Natively supports MARCXML, Dublin Core, OAIMARC and MODS
• Options to support conditional harvests, raw data harvests, and resumptive harvests.
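For readers who have not seen an OAI-PMH request, this Python sketch builds the kind of `ListRecords` URL a harvester sends to the host. It shows the protocol, not MarcEdit's harvester: `list_records_url` is a hypothetical helper, the mapping of "conditional" harvests to the `from`/`until`/`set` arguments and of "resumptive" harvests to `resumptionToken` is an assumption, and the metadata prefixes a repository accepts (for example `oai_dc`) vary by server.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix=None, from_date=None,
                     until_date=None, set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (nothing is sent over the network)."""
    if resumption_token:
        # OAI-PMH treats resumptionToken as an exclusive argument:
        # it replaces metadataPrefix and the selective-harvest arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        if until_date:
            params["until"] = until_date
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)

first_page = list_records_url("http://ir.library.oregonstate.edu/request/oai",
                              metadata_prefix="oai_dc", from_date="2012-01-01")
```

Each response that is cut short carries a resumption token, and passing that token back through `resumption_token` requests the next batch.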

Page 21:

RECORD EDITING

Page 22:

MarcEditor

Page 23:

MarcEditor Properties

• Templates
• Fonts
• Encodings
• Preview Settings

Page 24:

Configuring New Paging

• Set in the Options dialog

Page 25:

Paging Example

• If you load the full file, or turn the preview mode off

Page 26:

Editing MARC

• MarcEditor
  • Supports a number of global editing functions:
    • Edit Subsets of records
    • Find/Replace functionality
    • Globally Add/Delete MARC fields
    • Globally Edit Subfield data
    • Conditionally add/remove field data
    • Globally Edit Indicator data
    • Globally Swap field data
    • Record Deduplication
    • Record Sorting
    • Call Number Generator
    • Macros

Page 27:

Editing MARC – Find/Replace

• Works like the normal Find/Replace in most text editors.

• Unlike most text editors, Replace supports UTF-8 (when working with UTF-8 files) and regular expressions.
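As an illustration of what a regular-expression replace buys you, here is a small Python sketch that strips a trailing period from every field with a given tag in mnemonic text. The pattern is written in Python's regex syntax, and MarcEdit's own dialect may differ in details such as how groups are referenced in the replacement; `strip_trailing_period` is a hypothetical helper.

```python
import re

def strip_trailing_period(text, tag):
    """Remove a final period from every line that starts with =<tag>."""
    # ^ and $ match at each line because of re.MULTILINE;
    # group 1 captures everything on the line except the final period.
    pattern = re.compile(r"^(=" + re.escape(tag) + r" +.*?)\.$", re.MULTILINE)
    return pattern.sub(r"\1", text)

sample = "=245  10$aGeology of Oregon.\n=650  \\0$aGeology$zOregon.\n"
cleaned = strip_trailing_period(sample, "650")
```

Anchoring the pattern on the tag keeps the change to the 650 fields; the period at the end of the 245 is left alone.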

Page 28:

Editing MARC – Find All

• Find all function was designed for use with the Paging mode

• Allows users to find any text across all pages

• Generates a jump list that can be used to find individual records for edit

Page 29:

Jump to

• Jump to…record:
  • Allows you to jump to any record
• Jump to…page:
  • Allows you to jump to any page

Page 30:

Editing MARC – Global Add/Delete Field

• Globally add fields to all MARC records
  • Allows users to set insertion position.
• Globally delete fields
  • Allows global delete
  • Allows conditional delete

• Supports Regular Expressions

Page 31:

Editing MARC – Modifying subfield data

• Allows for the modification of variable MARC field subfield data (MARC fields 010 and above)

• Allows for the modification of control field data by position or range of positions

• Allows users to prepend and append data to subfields.

• Allows users to change subfield tagging.
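Editing a control field "by position or range of positions" amounts to overwriting a fixed slice of the field's data. The Python sketch below shows that operation on an 008 field, where positions 35-37 hold the language code; `set_positions` is a hypothetical helper and the sample 008 value is made up for the example.

```python
def set_positions(field_data, start, new_value):
    """Overwrite the characters beginning at 0-based position `start` with `new_value`."""
    end = start + len(new_value)
    if end > len(field_data):
        raise ValueError("replacement runs past the end of the field")
    # Keep everything before and after the range so the field length never changes.
    return field_data[:start] + new_value + field_data[end:]

# Made-up 40-character 008 value: date entered, date type, date, place, filler, language.
field_008 = "120501s2012    oru" + " " * 17 + "eng" + " d"
updated = set_positions(field_008, 35, "spa")  # positions 35-37 now hold "spa"
```

Control fields are fixed-length, so the helper replaces characters in place rather than inserting them.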

Page 32:

Editing MARC – Modifying subfield data

• Allows users to insert new subfields and define subfield placement.

• Allows users to move field data from one field to another.
• Supports:
  • UTF-8 with UTF-8 files
  • Regular Expressions
  • Adding new subfields.

Page 33:

Editing MARC – Modifying subfield data

Page 34:

Editing MARC – Swapping Fields

• Swap parts of MARC Fields or entire MARC fields
• Define field, indicator and subfields to move.
• Can move field data and delete the original field or clone the field data and move the clone to the new location.

• Can add data to an existing field.

Page 35:

Character Conversions within the MarcEditor

• MarcEditor allows users to convert character data between different character sets.

Page 36:

Fixing Boo-boos

• MarcEdit’s Special Undo
  • Allows you to step back one global change.

Page 37:

Sorting Fields

• MarcEdit provides multiple sorting types:
  • Control Number
    • Sorts record position within the file
  • Title
    • Sorts record position within the file
  • Author
    • Sorts record position within the file
  • Call Number
    • Sorts record position within the file
  • 0xx Fields
    • Sorts the 0xx fields within individual records (does *not* change record position within a file)
  • All Fields
    • Sorts all fields within individual records (does *not* change record position within a file)
  • Custom Sort
    • Sorts all defined fields within individual records (does *not* change record position within a file)
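The two families of sorts above, reordering records within the file versus reordering fields within a record, can be contrasted in a short Python sketch. It works on records held as lists of mnemonic lines; `sort_fields` and `sort_records` are hypothetical names, and MarcEdit's own sort rules (for example how it normalizes titles) are not described in the slides.

```python
def sort_fields(record_lines):
    """Order the fields of one record by tag, keeping the leader first."""
    leader = [line for line in record_lines if line.startswith("=LDR")]
    fields = [line for line in record_lines if not line.startswith("=LDR")]
    # sorted() is stable, so repeated tags keep their original relative order.
    return leader + sorted(fields, key=lambda line: line[1:4])

def sort_records(records, tag="001"):
    """Order whole records by the data of the first field with the given tag."""
    def key(record_lines):
        for line in record_lines:
            if line.startswith("=" + tag):
                return line[4:].strip()
        return ""  # records without the field sort first
    return sorted(records, key=key)
```

`sort_fields` never changes where a record sits in the file, while `sort_records` never changes the order of fields inside a record.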

Page 38:

Record Deduplication

• MarcEdit provides a simple dedup tool that can:
  • Dedup on a defined control field (any field)
  • Dedup on a transaction field (or using an additional transaction field)
• Output
  • Removes all duplications and saves the duplications to a file
  • Prints just unique items within the file (i.e., those without a duplicate pair)
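The two output modes can be sketched in Python as a grouping step followed by different selections. This is an illustration under simple assumptions, not the dedup tool's logic: records are plain dictionaries, the match key is whatever `key_func` returns, and the first record seen for a key is the one kept.

```python
from collections import defaultdict

def dedup(records, key_func):
    """Group records by key and return (kept, duplicates, unique_only)."""
    groups = defaultdict(list)
    for record in records:
        groups[key_func(record)].append(record)
    # One record per key survives; the rest would be saved to a duplicates file.
    kept = [group[0] for group in groups.values()]
    duplicates = [record for group in groups.values() for record in group[1:]]
    # Records whose key occurs exactly once, i.e. those without a duplicate pair.
    unique_only = [group[0] for group in groups.values() if len(group) == 1]
    return kept, duplicates, unique_only

records = [{"001": "ocm1", "245": "A"}, {"001": "ocm2", "245": "B"}, {"001": "ocm1", "245": "A copy"}]
kept, duplicates, unique_only = dedup(records, key_func=lambda r: r["001"])
```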

Page 39:

Field Counts

• Field Count

• Provides a quick count of fields

• Report of subfields used within a particular field

• Detailed reports of all fields/subfields used within a fileset.
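A field/subfield count of this kind is easy to picture as a tally over mnemonic text. The Python sketch below is an approximation for illustration: `count_fields` is a hypothetical function, and it assumes one field per line in the mnemonic format shown earlier.

```python
from collections import Counter

def count_fields(mnemonic_text):
    """Return (tag counts, (tag, subfield code) counts) for a block of mnemonic text."""
    tags, subfields = Counter(), Counter()
    for line in mnemonic_text.splitlines():
        if not line.startswith("="):
            continue  # skip blank lines between records
        tag = line[1:4]
        tags[tag] += 1
        # Only variable fields (010 and above) carry subfields.
        if tag != "LDR" and tag >= "010":
            for chunk in line.split("$")[1:]:
                if chunk:
                    subfields[(tag, chunk[0])] += 1
    return tags, subfields
```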

Page 40:

Material Type Report

• Material Type Report

• Reports number of records by material type

• Breaks down material type by sub-types

• Utilizes the Leader, 008 and GMD to determine format types

Page 41:

Task Automation Tool

• Stacking Operations

• Task automation provides a way for non-programmers to create defined task lists that can then be executed automatically

• The difference between a task and a macro is that MarcEdit tasks essentially function as if the user were calling specific functions within MarcEdit.

• Anything that you can do in the MarcEditor, you can automate as a task.
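Conceptually, a task is an ordered list of editing operations that is replayed against a file. The Python sketch below models that idea only; it says nothing about how MarcEdit stores or runs tasks, and `run_task`, `replace_text` and `append_field` are hypothetical names. It assumes mnemonic text with a blank line between records.

```python
def replace_text(old, new):
    """Step: plain find/replace across the whole file."""
    return lambda text: text.replace(old, new)

def append_field(field_line):
    """Step: add one field to the end of every record."""
    def step(text):
        records = [r for r in text.split("\n\n") if r.strip()]
        return "\n\n".join(r.rstrip("\n") + "\n" + field_line for r in records) + "\n"
    return step

def run_task(task, text):
    """Apply each step in order, feeding the output of one into the next."""
    for step in task:
        text = step(text)
    return text

task = [replace_text("Old", "New"), append_field("=949  \\\\$aBATCH")]
result = run_task(task, "=245  10$aOld title\n\n=245  10$aOld news\n")
```

Stacking works because every step takes the file text and returns the edited text, so steps can be reordered, cloned or removed without touching the others.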

Page 42:

Task Automation

• Managing Tasks
  • Task management works like macro management
  • You can
    • Create new tasks
    • Clone tasks
    • Rename tasks
    • Delete tasks
    • Edit tasks

Page 43:

Task Automation Demo

• Additional Information:
  • YouTube:
    • Introduction to task automation: http://www.youtube.com/watch?v=gmqTGfTubU4
    • Introduction to new task automation functions: http://www.youtube.com/watch?v=fnorN0MFFN0

Page 44:

OCLC Classify Service

• MarcEdit can leverage OCLC WorldCat to generate call numbers automatically for files
• Fields used:
  • 001
  • 010$a$z
  • 020$a$z
  • 022$a$z
  • 024$a$z
  • 1xx$a
  • 776$w$z

Page 45:

OCLC Classify Service

Page 46:

FUTURE DEVELOPMENT

Page 47:

MarcEdit 5.9+

• AACR2->RDA macros
  • Low-hanging conversions to support batch data processing
• Merge Record Enhancements
  • Adding more data points and customized merge fields
• More Automation support
  • Ability to turn Edit shortcuts into Automation tasks
• Batch OAI Harvesting
  • Create jobs that you can schedule and have automatically run for you
• Batch Set Holdings
  • Using either crappy Z39.50 or OCLC’s yet to be publicly released API for holdings settings.

Page 48:

Getting Help

• Call/write me: Terry.reese@oregonstate.edu
• Ask the list:
  • MarcEdit ListServ
    • http://listserv.gmu.edu/cgi-bin/wa?A0=marcedit-l

Page 49:

Questions