Managing Libraries with Creative Data Mining Learning to Use Your Library’s Data Warehouse to Understand and Improve the Services You Provide Ted Koppel The Library Corporation Computers in Libraries 2005 Session B203, March 17, 2005

Managing Libraries with Creative Data Mining

Learning to Use Your Library’s Data Warehouse to Understand and Improve the Services You Provide

Ted Koppel, The Library Corporation

Computers in Libraries 2005

Session B203, March 17, 2005

The Plan

• What is data mining and why is it useful?

• Who else does it?

• Does it make sense for libraries?

• Are libraries already doing data mining?

• What data can libraries mine?

• How much sophistication do I need?

What is Data Mining?

• Collection and analysis of one’s own data in order to make better business decisions.

• More than simple data storage

• Business intelligence technology for discerning unknown patterns from large databases

• Uses statistics, artificial intelligence, various modeling techniques

• Related to, but different from, bibliomining

Value and Importance

• By identifying patterns and predicting future trends …

– Make decisions based on facts, not guesswork

– Develop sensible processes

– Reduce costs or increase services by efficient use of resources

• Serve the customer better

‘High Level’ planning

• Remember: GIGO (garbage in, garbage out).

• Define the data mining goals

• Data collection

• Data organization and normalization

• Analysis

• Analysis

• Analysis

• Reiteration

Who is Data Mining now?

• Manufacturing – process control

• Banks and financial institutions – “full service”

• Government and law – fraud, abuse

• Sports – RHP versus LHB? Sucker for a curve ball?

• Service industries – almost all CRM systems

• Retail: product stock and placement

• Travel: airline overbooking

• Las Vegas: guest tracking for comps and benefits

• Groceries: affinity cards

• Internet: GoogleAds

Nuggets Found by Mining

• Chase Bank: minimum balance versus other bank business

• Home Depot hurricane planning

• WalMart (UK) diapers and beer (actually a hoax, but an informative one)

• Casino security in Las Vegas - fraud

Implementer Level Tools

• Oracle® Data Mining Suite

• Microsoft SQL Server 2000

• SPSS and similar

• STATISTICA (StatSoft)

• Open Source:

– Cornell Univ. Himalaya Data Mining Tools

– WEKA: Waikato Environment for Knowledge Analysis (Univ. of Waikato, NZ)

Looking for the Dog that Doesn’t Bark

• NORA – Non Obvious Relationship Awareness

– Examines third-level and deeper relationships between datasets

• ANNA – Anonymized Data

– Double-blind application/offshoot of NORA that deals with personal attributes anonymously

Vocabulary Lesson

• Bagging (averaging)

• Boosting (building models in sequence, each giving more weight to the cases earlier models got wrong)

• Drilling down

• Stacking (combining predictions from different models)

• Predictive mining (using X to predict Y)

• Data mining process models:

– CRISP-DM = Cross-Industry Standard Process for Data Mining

– SEMMA = Sample, Explore, Modify, Model, Assess
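
As a rough illustration of the "bagging (averaging)" entry above, the sketch below draws bootstrap resamples from a small list of weekly circulation counts, fits a trivial model (the sample mean) on each resample, and averages the resulting predictions. The data, the function name `bagged_estimate`, and the choice of the mean as the "model" are assumptions made for illustration; none of them come from the talk.

```python
import random
from statistics import mean

def bagged_estimate(values, n_models=25, seed=0):
    """Average the predictions of simple models fit on bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    predictions = []
    for _ in range(n_models):
        # Bootstrap: sample with replacement, same size as the original data.
        sample = [rng.choice(values) for _ in values]
        # The "model" here is just the sample mean (an assumption for brevity).
        predictions.append(mean(sample))
    # Bagging step: average the individual predictions.
    return mean(predictions)

weekly_circs = [120, 135, 128, 160, 142, 138, 155]  # hypothetical counts
estimate = bagged_estimate(weekly_circs)
```

In practice each resample would train a real predictive model (a decision tree, for example) rather than a mean, but the resample-then-average structure is the same.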

Value to Libraries as a Tool

• Citizens demand more/better service at a time of reduced funding.

• Anticipate USER behavior

• Anticipate STAFF behavior

• Service hours and staffing needs, facilities planning

• Collection development – anticipating customer needs

Do Libraries Use DM?

• Association of Research Libraries (ARL) SPEC Kit 274 (2003) – Mento and Rapple

– 124 surveys, 65 responses

– 40% already doing some data mining

– 90% had plans

• Major areas of activity

– Research and Collection Support

– Administration

– Repository management (future)

ARL Member Benefits Seen

• Serials cancellation projects

• Collection Development tuning

• Budget allocation by material use

• Workflow analysis

• Weeding

• OPAC and Web presence usability and redesign

• Hacking and break-in analysis (defensive data mining)

Other Library Data Mining

• Kun Shan University of Technology (Taiwan)

– ABAMDM Model = Acquisition Budget Allocation Model based on Data Mining

– More material use → more money

– Compared:

    • Circulation

    • Collection size

    • Department size

    • # of courses

    • # students/faculty per department
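
The idea that heavier material use earns a department a larger share of the acquisitions budget can be sketched as a simple proportional split. This is not the ABAMDM model itself, which weighs several factors; it is a deliberately simplified version that uses circulation alone. The department names, figures, and the function `allocate_by_use` are hypothetical.

```python
def allocate_by_use(budget, circ_by_dept):
    """Split a budget in proportion to each department's circulation count."""
    total = sum(circ_by_dept.values())
    if total == 0:
        raise ValueError("no circulation recorded; cannot allocate by use")
    return {dept: budget * circs / total for dept, circs in circ_by_dept.items()}

# Hypothetical annual circulation per department.
circ_by_dept = {"Engineering": 5000, "Business": 3000, "Design": 2000}
shares = allocate_by_use(100_000, circ_by_dept)
# Engineering receives 50,000; Business 30,000; Design 20,000.
```

A fuller model would blend in the other compared factors (collection size, number of courses, enrollment), for instance as a weighted sum, before normalizing.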

Other Library Data Mining (2)

• OCLC’s ACAS (Automated Collection Analysis System) (recently upgraded!)

– Analyzes bibliographic records by call number ranges (LC 4-digit, Dewey tens for example)

– Subdivides by years and aggregated years

– Subdivides by branch / collection

– “Collection conspectus” as a way to:

    • Compare library collections

    • Identify collection deficiencies
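
Grouping bibliographic records by call number range and by aggregated years, as described above, can be sketched in a few lines. The example uses Dewey "tens" and publication decades. It is not how ACAS is implemented; the records and the helper names `dewey_tens` and `decade` are made up for illustration.

```python
from collections import Counter

def dewey_tens(call_number):
    """Return the Dewey 'tens' range label, e.g. '510-519' for '512.7'."""
    main_class = int(call_number.split(".")[0])
    start = main_class // 10 * 10
    return f"{start:03d}-{start + 9:03d}"

def decade(pub_year):
    """Aggregate a publication year into its decade, e.g. 1998 -> '1990s'."""
    return f"{pub_year // 10 * 10}s"

# Hypothetical (call number, publication year) pairs from bib records.
records = [
    ("512.7", 1998),
    ("516.3", 2003),
    ("004.6", 2001),
    ("005.13", 2004),
    ("813.54", 1987),
]

# Count titles per (Dewey range, decade) cell of the conspectus-style grid.
counts = Counter((dewey_tens(cn), decade(yr)) for cn, yr in records)
```

Cells with low counts, or with only older decades represented, are candidates for the "collection deficiencies" the slide mentions.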

Other Library Data Mining (3)

• Univ. of Florida with FCLA

– Decision Support System for acquisitions activities

– Extracted from NOTIS bib files; saved to DB2

– Screen scraped Acq files

– Created large database of bib and in-process records which allowed querying:

    • Circ history of approval versus firm orders?

    • $ spent on titles that never circulate

    • Do originally-cataloged items circulate? More or less than copy cataloged items?

    • How many items circulate more than “n” times?

– Assesses collection development and tech service activity
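
Questions like those listed above map naturally onto SQL. The sketch below uses Python's built-in `sqlite3` module with an in-memory database purely so that it is self-contained; the Florida system used DB2, and the `items` table, its columns, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per purchased item.
conn.execute(
    "CREATE TABLE items (title TEXT, order_type TEXT, cost REAL, circs INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        ("Title A", "approval", 40.0, 0),
        ("Title B", "approval", 25.0, 3),
        ("Title C", "firm", 60.0, 7),
        ("Title D", "firm", 35.0, 0),
        ("Title E", "firm", 20.0, 12),
    ],
)

# Dollars spent on titles that never circulated.
never_circ_cost = conn.execute(
    "SELECT SUM(cost) FROM items WHERE circs = 0"
).fetchone()[0]

# How many items circulated more than n times?
n = 5
heavy_use = conn.execute(
    "SELECT COUNT(*) FROM items WHERE circs > ?", (n,)
).fetchone()[0]

# Average circulation of approval-plan versus firm-order items.
avg_by_type = dict(
    conn.execute("SELECT order_type, AVG(circs) FROM items GROUP BY order_type")
)
```

The same three queries would run with little change against any relational warehouse that holds cost, order type, and circulation counts in one table or view.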

Libraries are fountains of data

Everything is countable (example: Circulation transaction)

• Book: branch location, media type, pubdate, size, color, thickness, #circs, cost, vendor, holds

• User: age, location, language, sex, zipcode, phone#, school, loan history, delinquencies

• Extractable: census tract, curriculum, holds, circ history, repairs

Multiply this by 10 million times a year!

Expand to:

• Acquisitions information (book attributes, vendor history and performance, fund history, requester and department, etc.)

• OPAC searching and navigation (databases, searches, not founds)

• Metasearch usage (databases, usage)

• Reference desk interactions (who, what, how long?). VRD by extension

• Resource sharing (NCIP, ILL)

• In-house usage transactions

• Physical plant: elevator, restroom, copier use

Crunch (Data) Creatively

• Unlikely variables give interesting data

• Ideas:

– Sex of user versus color of book

– Call # range vs. age of item vs. circulation ratio by avg. $ paid per item

– Story hour attendance vs. Adult circ vs. Fines collected

– Best sellers cost vs. Trade books by cost per circ

– Etc.
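
The best sellers versus trade books comparison above comes down to a cost-per-circulation ratio for each group. A minimal sketch, with made-up purchase prices and circulation counts and a hypothetical helper `cost_per_circ`:

```python
def cost_per_circ(items):
    """Total dollars spent divided by total circulations for a group of items.

    `items` is a list of (cost, circs) pairs. Returns infinity when the
    group has never circulated, so such groups sort as the most expensive.
    """
    total_cost = sum(cost for cost, _ in items)
    total_circs = sum(circs for _, circs in items)
    return total_cost / total_circs if total_circs else float("inf")

# Hypothetical (purchase cost, lifetime circulations) per title.
best_sellers = [(28.0, 40), (26.0, 35), (30.0, 25)]
trade_books = [(18.0, 6), (22.0, 4), (20.0, 10)]

best_seller_ratio = cost_per_circ(best_sellers)
trade_book_ratio = cost_per_circ(trade_books)
```

With these sample figures the best sellers cost more per title but far less per circulation, which is the kind of pattern the slide suggests looking for.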

If you can count it, you can analyze it

But remember: QUALITY and CONSISTENCY

The Library Corporation

• Library Automation vendor for over 30 years

• Family-owned, customer focused

• Library•Solution®

• Library•Solution™ for Schools

• CARL•Solution®

• CARL•X™

Library•Solution Reports

• Utilizes ReportNet software

• Drag and Drop Report Design

• Completely Web-based

• Fitted to Library•Solution data framework

• Zero footprint on workstations

• Central reporting with enhanced distribution

• Multiple export formats

• Charts, tables, etc.

• Powerful

Using Library Data Outside the Library

• City, County, RCOG, State Planning and Development Authorities

– Require solid statistics about population, educational level, etc.

– Quality of Life and capital budget services planning

• Preserve user anonymity but share trends

• Input to GIS systems for real time projection of future library needs
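
One way to "preserve user anonymity but share trends" is to drop patron identifiers and release only aggregate counts, withholding any group too small to hide an individual. The sketch below applies that small-cell suppression idea to loans by zipcode. The technique, the threshold of 5, the sample data, and the function name `loans_by_zip` are assumptions for illustration and are not taken from the talk.

```python
from collections import Counter

def loans_by_zip(transactions, min_count=5):
    """Count loans per zipcode, discarding user IDs and suppressing small cells.

    `transactions` is an iterable of (user_id, zipcode) pairs. Zipcodes with
    fewer than `min_count` loans are omitted from the shared result.
    """
    counts = Counter(zipcode for _user_id, zipcode in transactions)
    return {z: c for z, c in counts.items() if c >= min_count}

# Hypothetical loan transactions: eight from one zipcode, two from another.
transactions = [(f"u{i}", "25401") for i in range(8)]
transactions += [("u99", "25425"), ("u100", "25425")]

trend = loans_by_zip(transactions)
```

Only the aggregated `trend` dictionary would be handed to a planning authority or loaded into a GIS layer; the raw transaction list stays inside the library.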

Applying GIS in the Library Market

• Library.Decision product

• Works with ILS vendors including TLC

• Focus collections development

• Strengthen advocacy planning; undertake cardholder development campaigns

• Support grant applications

• Site new facilities

• Calculate service indicators

• Evaluate service delivery in relation to the unique needs of your community

In closing …

• Libraries are producing data every minute of every day

• You need:

– Some tools

– Some creativity

– Some analytical ability

Knowledge is Power!

Acknowledgements

• Nicholson and Stanton, Gaining strategic advantage through bibliomining. At www.bibliomining.com

• Banerjee, Is Data Mining Right for your library? Computers in Libraries, Nov. 98

• Kao, Chang, and Lin. Decision Support for the Academic Library…, Information Processing and Management 39(2003)

• Fabris. Advanced Navigation. CIO May 1998

• Library Administration and Management (journal) Winter 1996, section on Data Mining

Thank You

• Contact information

Ted Koppel

The Library Corporation

[email protected]

(800)624-0559