BISAC-to-Thema for International Sales

BISAC-to-Thema Mapping for International Sales MARCH 2015 PRESENTED FOR BISG BY MICHAEL OLENICK


Agenda

• What Is Thema?

• How Was It Developed?

• How Will It Be Used?

• Implications for BISAC Users

• What Does It Look Like?

• The BISAC-to-Thema mapping


Thema: What Is It?

• First global subject category system for the book trade.

• Developed and updated with input from several national groups.

• Made for all members of the supply chain to use for physical and digital data.

• Thema classification has two parts: subject headings and qualifiers.

• Codes are variable length and hierarchical.

• Qualifiers may have a national extension and can be used with any Thema subject (adult, children’s, fiction, non-fiction, etc.).

Subject Heading: DNBS Biography: sport

Qualifier (modifies the subject heading): 1KBB-US-NAKC New York City

How Thema Was Developed

• BIC (UK schema): Version 1.0 released July 1998. Current Version 2.1 released Nov 2010.

• iBIC: Proposed as an internationalized version of BIC in late 2011/early 2012. Never released, but led to Thema.

• Thema: Project launched at Frankfurt in Oct 2012 and Version 1.0 released at Frankfurt in Oct 2013. Current Version 1.1 released Nov 2014.

Who Maintains Thema?

• Managed, published, and promoted by EDItEUR, the trade standards body for the global book, e-book and serials supply chain, through the Thema International Steering Committee.

• Steering Committee is composed of representatives from various national stakeholder groups.

• BISAC Subject Committee created a subcommittee known as the Thema Working Group (WG).

• WG serves as the US National Group and the chairperson serves as the US representative to the Steering Committee.


How Will Thema Increase Sales?


• Facilitate international transactions.

• Increase understanding in international markets.

• Reduce subject code confusion.

• Increase discoverability.

Implications for BISAC Users

• Reduces mappings to standards not used in the US.

• Thema and BISAC will operate in parallel.

• No timeline for BISAC being deprecated.

• BISAC 2014-to-Thema 1.1 mapping is almost ready for distribution and will be available free to BISG members and free to non-members with purchase of the BISAC Subject Codes List.

• Mapping is only as specific as BISAC subjects are.


Differences Between BISAC and Thema

• BISAC subjects do not have qualifiers.

• BISAC subjects are meant to be used in conjunction with Merchandising Themes and Regional Themes.

• BISAC codes do not indicate the sequence in which the subjects are presented (BISAC literals are hierarchical but codes are not).

• If the list is reorganized, codes remain the same even if the literal changes (unless a subject is moved to another section).



For example, sometimes subjects that formerly appeared on their own are moved into a “tree”:

HIS000000 HISTORY / General
HIS001000 HISTORY / Africa / General
HIS001010 HISTORY / Africa / Central
HIS001020 HISTORY / Africa / East
HIS001030 HISTORY / Africa / North
HIS001040 HISTORY / Africa / South / General
HIS047000 HISTORY / Africa / South / Republic of South Africa
HIS001050 HISTORY / Africa / West
HIS038000 HISTORY / Americas (North, Central, South, West Indies)


Sometimes subjects are taken out of a tree and now appear on their own:

ART015030 ART / European
ART057000 ART / Film & Video
ART013000 ART / Folk & Outsider Art
ART058000 ART / Graffiti & Street Art
ART015000 ART / History / General
ART015050 ART / History / Prehistoric & Primitive
ART015060 ART / History / Ancient & Classical
ART015070 ART / History / Medieval
ART015080 ART / History / Renaissance
ART015090 ART / History / Baroque & Rococo
ART015120 ART / History / Romanticism
ART015100 ART / History / Modern (late 19th Century to 1945)
ART015110 ART / History / Contemporary (1945-)

What Does Thema Look Like?


Code  Heading
L     Law
LA    Jurisprudence & general issues
LN    Laws of specific jurisdictions & specific areas of law
LNJ   Entertainment & media law
LNJD  Defamation law, slander & libel

The variable-length codes are hierarchical and determine the order in which the subjects are presented. The text of the subject or qualifier does not contain all parts of the hierarchy and is meant to stand on its own when output.
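A minimal sketch of what "hierarchical" means in practice, assuming plain letter-only subject codes like the law examples above (no national extension): each shorter prefix of a code is a broader level, and ordinary string sorting reproduces the presentation order. The function name `thema_ancestors` is illustrative only and not part of Thema.

```python
def thema_ancestors(code: str) -> list[str]:
    """Return the broader codes implied by a subject code, broadest first.

    Assumes a plain subject code with no national extension, where every
    shorter prefix is a parent level (L > LN > LNJ > LNJD).
    """
    return [code[:i] for i in range(1, len(code))]


codes = ["LNJD", "L", "LNJ", "LA", "LN"]

# Sorting the code strings gives the order in which the subjects are listed.
ordered = sorted(codes)

print(ordered)
print(thema_ancestors("LNJD"))
```

Because the heading text does not repeat its parents ("Defamation law, slander & libel" never mentions "Law"), the code itself is the only place the hierarchy can be read from.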

Subject Headings (2619)

Top Level Subjects (20)

A The Arts
C Language & Linguistics
D Biography, Literature & Literary studies
F Fiction & Related items
G Reference, Information & Interdisciplinary subjects
J Society & Social Sciences
K Economics, Finance, Business & Management
L Law
M Medicine & Nursing
N History & Archaeology
P Mathematics & Science
Q Philosophy & Religion
R Earth Sciences, Geography, Environment, Planning
S Sports & Active outdoor recreation
T Technology, Engineering, Agriculture
U Computing & Information Technology
V Health, Relationships & Personal development
W Lifestyle, Hobbies & Leisure
X Graphic novels, Comic books, Cartoons
Y Children’s, Teenage & Educational


Qualifiers: Six Different Types

Geographical Qualifiers (1407)

USE: to indicate the geographical scope or applicability of book content (such as the location of a travel guide, the setting of a novel, the jurisdiction to which laws apply, etc.).

DO NOT USE: to indicate the literary tradition of the work or to indicate the location or nationality of the author, publisher, etc. (e.g., literature of Peru).

Examples:

1D Europe
1DF Central Europe
1DFG Germany
1DFG-DE-F East Germany
1DFG-DE-FS Saxony
1DFG-DE-FSA Leipzig
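The national extensions in the geographical examples (1DFG-DE-F, 1DFG-DE-FS, 1DFG-DE-FSA) all have the shape BASE-CC-SUFFIX. The sketch below splits a qualifier on that assumption. The shape is inferred from the examples in this deck rather than taken from the Thema documentation, and the names `Qualifier` and `split_qualifier` are hypothetical.

```python
from typing import NamedTuple, Optional


class Qualifier(NamedTuple):
    base: str               # international part, e.g. "1DFG"
    country: Optional[str]  # country part of the national extension, e.g. "DE"
    local: Optional[str]    # national suffix, e.g. "FSA"


def split_qualifier(code: str) -> Qualifier:
    """Split a qualifier of the assumed form BASE or BASE-CC-SUFFIX."""
    parts = code.split("-")
    if len(parts) == 1:
        return Qualifier(parts[0], None, None)
    if len(parts) == 3:
        return Qualifier(parts[0], parts[1], parts[2])
    raise ValueError(f"unexpected qualifier shape: {code!r}")


print(split_qualifier("1DFG-DE-FSA"))
print(split_qualifier("1DFG"))
```

A recipient that does not recognize a national extension could, under this assumption, fall back to the `base` part (here, Germany).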

Language Qualifiers (281)

USE: to indicate the language(s) to which the book content applies (such as a linguistics or literary studies work, phrasebook or dictionary, etc.).

DO NOT USE: to indicate the language of the text itself or to indicate the literary tradition of the work (e.g., Spanish literature).

Examples:

2H African languages
2HC Niger-Congo languages
2HCB Bantu languages
2HCBB Central Bantu languages
2HCBBC Chichewa (Chewa)


Time Period Qualifiers (187)

USE: to indicate the time period range of book content (e.g., a history book, memoir, historical fiction, etc.).

DO NOT USE: to indicate the year of an annual, travel guide, conference proceedings, etc. or to indicate the first publication date of a work (e.g., of classic fiction or literature).

Examples:

3M c 1500 onwards to present day
3MP 20th century, c 1900 to c 1999
3MPB Early 20th century c 1900 to c 1950
3MPBG Inter-war period c 1919 to c 1939
3MPBGJ c 1930 to c 1939
3MPBGJ-ES-B Spain: Civil war (1936–1939)


Educational Purpose Qualifiers (130)

USE: to indicate the curriculum, examination, or level for which educational material is specifically designed (such as school textbooks, study aids, etc.).

DO NOT USE: to indicate educational institutions as the subject of a book.

Examples:

4C For all educational levels
4CX For adult education
4G For international curricula & examinations
4L For language learning courses & examinations
4LE ELT examinations & certificates
4T For specific educational purposes
4TY For home learning
4Z For specific national educational curricula
4Z-US-F For LSAT (Law School Admission Test) (USA)


Interest Age & Special Interest Qualifiers (84)

USE: to indicate a variety of characteristics relating to content: the reading age or level; special events or holidays; groups of people that book content is related to, and/or, in some cases, specifically intended for (e.g. for women, for/about religious & ethnic groups, etc); and to indicate explicit content.

Examples:

5A Interest age / level
5AN Interest age: from c 12 years
5H Holidays, events & seasonal interest
5HK Special events
5HKU Engagement / Wedding / Marriage
5PB Relating to ethnic minorities & groups
5PB-US-H Relating to Latin / Hispanic American people
5X Contains explicit material


Style Qualifiers (166)

USE: To indicate the particular style of artistic or creative expression covered by book content – such as books on art, architecture, music, literary studies – or exemplified by fiction & literature texts.

Examples:

6B Styles (B)
6BA Baroque
6BB Barbizon school
6BC Bauhaus
6BD Berliner Sezession
6BF Biedermeier
6BG Beat style


Qualifiers in Action


Geographic

Code         Heading
1K           The Americas
1KB          North America (USA & Canada)
1KBB         United States of America, USA
1KBB-US-S    US South
1KBB-US-SC   US South: East South Central States
1KBB-US-SCA  Alabama

Time Period

Code         Heading
3M           c 1500 onwards to present day
3MP          20th century, c 1900 to c 1999
3MPQ         Later 20th century c 1950-1999
3MPQS        c 1960 to c 1969
3MPQS-US-P   USA: Civil Rights Movement

Additional Information

• Columns 1 and 2 contain the code and subject and are always populated.

• Column 3 contains Notes and Column 4 contains Related codes (as appropriate).

Examples:

Code  Subject                                Notes                           Related
AMR   Architecture: interior design          Class here: professional works  WJK
WJK   Interior design, decor & style guides                                  AMR

Technical Considerations

• Only a Subject Category is mandatory. Qualifiers are optional and should not be used without at least one subject.

• First Subject Category listed is the primary subject.

• Qualifiers are not attached to specific subjects but to the record.

• Thema codes can be sent as part of any ONIX 2.* and ONIX 3.* using standard ONIX practice for subject classification metadata.

• There is no upper limit to the number of Subjects or Qualifiers that may be assigned.

• It is expected that a maximum of 10 of each type would cover all reasonable circumstances.

• Data element lengths are specified in the next chart.
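As a rough illustration of these rules, the sketch below builds one subject element per Thema code for a single record. It is a sketch under assumptions, not a reference implementation: the element names (`Subject`, `MainSubject`, `SubjectSchemeIdentifier`, `SubjectCode`) are the ONIX 3.0 reference names as I understand them and should be checked against the ONIX specification, the scheme codes 93 and 94 come from the summary chart in this deck, and `build_subjects` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

THEMA_SUBJECT_SCHEME = "93"  # Thema subject category, per the summary chart


def build_subjects(subjects, qualifiers):
    """Return ONIX-style <Subject> elements for one record.

    subjects   -- Thema subject codes, primary subject first
    qualifiers -- (scheme_code, qualifier_code) pairs, e.g. ("94", "1KBB-US-NAKC")
    """
    if not subjects:
        # Qualifiers should not be used without at least one subject.
        raise ValueError("at least one Thema subject category is required")

    elements = []
    for position, code in enumerate(subjects):
        subject = ET.Element("Subject")
        if position == 0:
            # The first subject listed is the primary subject.
            ET.SubElement(subject, "MainSubject")
        ET.SubElement(subject, "SubjectSchemeIdentifier").text = THEMA_SUBJECT_SCHEME
        ET.SubElement(subject, "SubjectCode").text = code
        elements.append(subject)

    # Qualifiers sit alongside the subjects: they belong to the record,
    # not to any one subject.
    for scheme_code, code in qualifiers:
        qualifier = ET.Element("Subject")
        ET.SubElement(qualifier, "SubjectSchemeIdentifier").text = scheme_code
        ET.SubElement(qualifier, "SubjectCode").text = code
        elements.append(qualifier)
    return elements


for element in build_subjects(["DNBS"], [("94", "1KBB-US-NAKC")]):
    print(ET.tostring(element, encoding="unicode"))
```

The example record reuses the "Biography: sport" subject and the New York City qualifier shown earlier in the deck.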


Summary of Elements

Element                         Code begins  May contain  Length [current longest code]  ONIX Code (from codelist 26)  Mandatory / Optional
Categories                      A-Y          A-Z 1-9      1-8 [6]                        93                            Mandatory
Geographical Qualifiers         1            1 A-Z -      2-19 [13]                      94                            Optional
Language Qualifiers             2            2 A-Z -      2-19 [6]                       95                            Optional
Time Period Qualifiers          3            3 A-Z -      2-19 [11]                      96                            Optional
Educational Purpose Qualifiers  4            4 A-Z -      2-19 [9]                       97                            Optional
Interest Qualifiers             5            5 A-Z -      2-19 [8]                       98                            Optional
Style Qualifiers                6            6 A-Z -      2-19 [3]                       99                            Optional
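One way to read the chart is as a set of shape rules that a data feed could check before sending codes out. The sketch below encodes that reading. It is an interpretation of the "Code begins", "May contain" and "Length" columns, not an official validator, it does not check whether a code actually exists in Thema, and `classify` is a hypothetical name.

```python
import re

# Leading digit -> (element name, ONIX code), taken from the chart.
QUALIFIER_TYPES = {
    "1": ("Geographical qualifier", "94"),
    "2": ("Language qualifier", "95"),
    "3": ("Time period qualifier", "96"),
    "4": ("Educational purpose qualifier", "97"),
    "5": ("Interest qualifier", "98"),
    "6": ("Style qualifier", "99"),
}

# Categories: begin with A-Y, may contain A-Z and 1-9, length 1-8.
SUBJECT_PATTERN = re.compile(r"[A-Y][A-Z1-9]{0,7}")


def classify(code: str) -> tuple[str, str]:
    """Return (element name, ONIX code) for a well-formed code, else raise."""
    if SUBJECT_PATTERN.fullmatch(code):
        return ("Subject category", "93")
    first = code[:1]
    if first in QUALIFIER_TYPES:
        # Qualifiers: 2-19 characters drawn from the leading digit, A-Z and "-".
        if re.fullmatch(rf"{first}[{first}A-Z-]{{1,18}}", code):
            return QUALIFIER_TYPES[first]
    raise ValueError(f"not a well-formed Thema code: {code!r}")


print(classify("LNJD"))
print(classify("1DFG-DE-FSA"))
```

Note that a single-character qualifier such as "6" fails the length rule, which is consistent with the 2-19 range in the chart.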

Notes on Implementation

Currently available at http://www.editeur.org/151/thema/

• Overview

• Version 1.1 headings and codes (Excel, in English. Also available as PDF, Word, XML and JSON)

• German version 1.1 translation (Excel, PDF, Word)

• Norwegian version 1.1 translation (Excel)

• Interactive category browser (in many languages)

• Instructions

• Mapping from BIC 2.1 to Thema 1.0

• Link to the BISAC 2013 to Thema 1.0 mapping (available through BISG)

• Mapping tool for BISAC 2013 to Thema 1.0 (provided by BookNet Canada in association with BISG)


• Sunrise date: 12/31/2013 (the date when stakeholders could start to send out Thema data).

• This was a “not expected before” date (not a “must be done by” date).

• Usage already seen in countries that had no national system in place, or had one that was not widely recognized internationally.

• US usage is growing.


Who Did the Mapping?

• New version created by Thema Working Group over meetings from 12/3/14 to 2/13/15 and approved by BISAC Subject Committee on 2/19/15.

• BISAC 2014 contained roughly 75 new subjects and nearly 50 revisions and inactivations. Thema 1.1 contained over 200 new subjects and qualifiers and roughly 175 revisions (with no inactivations).

• Members of the group were assigned BISAC sections to update and the results were discussed and agreed upon by the full WG.

• International Steering Committee chair was consulted to clarify issues and is currently reviewing the results to ensure proper interpretation of Thema usage. The meetings also generated suggestions for new Thema headings, which were passed along (US suggestions resulted in over 100 changes or additions to the previous release of Thema).


What Does the Mapping Look Like?


Methodology for Creating the Mapping

• Find the best equivalent Thema subject(s) and qualifier(s) with the understanding that the results may not apply to every title assigned a given BISAC code.

• Treat each BISAC subject on its own. No consideration was given to conditional logic (e.g., “if this BISAC code is assigned with this other BISAC code, adjust the mapping accordingly”).

• No further specificity than is present in the BISAC literal is presumed by the mapping. There is no indication in the mapping that more specific subjects are available if one chooses to assign Thema subjects directly.

• Multiple Thema codes used as needed to convey the information contained in one BISAC subject (up to four subject codes and four qualifier codes assigned to each BISAC code).


• There are times when all the codes will apply and times when not all the codes will apply. The policy of the mapping was that it was better to use multiple, specific codes as opposed to one general code – although each mapping was taken on a case-by-case basis.

• If at all possible, a BISAC subject is not mapped to two Thema codes where one is higher up in the hierarchy than the other. However, the hierarchy of Thema may not reflect the hierarchy of BISAC.

• When multiple Thema codes were used, the first was the most representative of the BISAC subject. Note that as a general Thema rule, secondary subject codes can be used as “qualifiers” to the primary code, although there is no specific way to communicate that this is happening (and the actual qualifiers are preferred when appropriate).
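The rule about not mapping one BISAC subject to two Thema codes in the same branch can be checked mechanically, since a broader code is a prefix of its narrower codes. A minimal sketch, assuming the codes are plain strings; `hierarchy_overlaps` is a hypothetical helper, not part of the mapping deliverable.

```python
def hierarchy_overlaps(codes):
    """Return (broader, narrower) pairs where one code is a prefix of another."""
    pairs = []
    for broader in codes:
        for narrower in codes:
            if broader != narrower and narrower.startswith(broader):
                pairs.append((broader, narrower))
    return pairs


# A mapping entry that mixes a general code with its own children is flagged.
print(hierarchy_overlaps(["AV", "AVN", "AVP"]))

# Sibling codes in the same branch are fine.
print(hierarchy_overlaps(["DNBF", "AVN", "AVP"]))
```

An empty result means no code in the entry sits above another one in the Thema hierarchy.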


• “ANT025000 ANTIQUES & COLLECTIBLES / Performing Arts” is mapped to “WC Antiques & collectables” and “AT Performing arts”. The second code effectively “qualifies” the first code – on its own, AT would not be an expected mapping for a book about antiques, but it adds more information to an otherwise very general mapping.

• “BIO004000 BIOGRAPHY & AUTOBIOGRAPHY / Composers & Musicians” is mapped to “DNBF Biography: arts & entertainment”, “AVN Composers & songwriters”, “AVP Musicians, singers, bands & groups” even though a biography of a singer would not get all the Thema subjects if assigned manually. However, it was thought that the more general subject “AV Music” was too general, so the decision was made to use the two, more specific subjects (DNBF is the primary mapping because it is the best match to the BIO nature of the BISAC code).


• “BUS050010 BUSINESS & ECONOMICS / Personal Finance / Budgeting” is mapped to “VSB Personal finance” and “KJMV1 Budgeting & financial management”. In this case, both subjects are needed to convey the full BISAC intent – the first subject pertains to the “tree” while the second to the “branch”.

• “TRV009130 TRAVEL / Europe / Spain & Portugal” is mapped to “WT Travel & holiday”, “1DSE Spain”, and “1DSP Portugal”. This illustrates the same principle but with qualifiers. Mapping to “1DS Southern Europe” was deemed to be less helpful than mapping to the specific countries, even though a given book would generally just cover one or the other country.
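In data terms, the mapping is a lookup from one BISAC code to an ordered list of Thema codes. The sketch below holds the four examples from these slides in a dictionary and separates subjects from qualifiers by their first character (letters for subjects, digits for qualifiers). The dictionary layout and the function name `map_bisac` are assumptions for illustration. The distributed mapping file may be organized differently.

```python
# Entries copied from the examples on these slides; order matters because
# the first subject is the most representative one.
BISAC_TO_THEMA = {
    "ANT025000": ["WC", "AT"],
    "BIO004000": ["DNBF", "AVN", "AVP"],
    "BUS050010": ["VSB", "KJMV1"],
    "TRV009130": ["WT", "1DSE", "1DSP"],
}


def map_bisac(bisac_code):
    """Return (subjects, qualifiers) for one BISAC code.

    Raises KeyError if the BISAC code is not in the table.
    """
    codes = BISAC_TO_THEMA[bisac_code]
    subjects = [c for c in codes if c[0].isalpha()]    # subject codes start A-Y
    qualifiers = [c for c in codes if c[0].isdigit()]  # qualifiers start 1-6
    return subjects, qualifiers


subjects, qualifiers = map_bisac("TRV009130")
print(subjects, qualifiers)
```

Each BISAC code is looked up on its own, with no conditional logic across codes, which matches the methodology described earlier.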


• In Thema, fiction codes may be used with nonfiction codes (as opposed to BISAC best practices) but only if they convey something central to the work that is not conveyed otherwise. However, while this was the case with the previous mapping, many new fiction codes were added to Thema (e.g., “FXP Narrative theme: Politics”), so we were able to avoid doing this for this mapping.

• There are no restrictions about combining Y* codes regarding either fiction/nonfiction or across the age/format indicated by the subject. Since one of the main revisions for Thema 1.1 was the change in the wording of several Thema children’s subjects from “Children’s / Teenage general non-fiction:” to “Children’s / Teenage general interest:”, the JUV and JNF sections of BISAC were heavily updated.


• When it is clear that the meaning is the same, the most specific subject was used as the first mapping even if it is in a Thema hierarchy that does not mirror BISAC’s hierarchy.

• For example, “PSY045070 PSYCHOLOGY / Movements / Cognitive Behavioral Therapy (CBT)” was mapped first to “MKMT6 Cognitive behavioural therapy” even though it is in Thema’s “Psychotherapy” tree rather than in Thema’s equivalent of BISAC’s “Movements” tree. There is no other subject in BISAC or Thema related to this topic, so clearly the meaning and expected usage are the same.


• The context of the BISAC subject was taken into account, especially for topics that are covered in more than one section. For example, “FAM048000 FAMILY & RELATIONSHIPS / Autism Spectrum Disorders” was mapped to “VFJB Coping with illness & specific conditions” first and “MKJA Autism & Asperger’s Syndrome” second, because this conveys the overall context of the FAM section and because the more clinical aspects are covered by “PSY022020 PSYCHOLOGY / Psychopathology / Autism Spectrum Disorders” (which was mapped to “MKJA Autism & Asperger’s Syndrome” only).

• Single-letter subject codes, while not explicitly prohibited from usage by Thema, were avoided if at all possible (they were used very sparingly and there is a recommendation on the table to consider remapping these). Single-letter qualifiers are expressly prohibited and were not used.


Benefits & Drawbacks of Using a Mapping

• A subject mapping is a broad-stroke method of getting a lot of codes on a lot of data quickly. This applies to any mapping, not just to this one.

• A mapping cannot replicate the accuracy of manual assignment because two things have to be correct rather than one (the original assignment and the accuracy of the mapping). And even if both things are “correct” you may still get a result that could be improved.

• A mapping will work best when going from a more specific to a more general schema. It will work fairly well when going from schemas of roughly equal specificity (as is the case here). But it should be noted that BISAC and Thema have different levels of specificity in different areas.

• The results should be reviewed to see if they work for your particular titles with an eye towards improving specificity and eliminating duplicate and superfluous codes.


• If your company already uses a mapping from a proprietary schema to BISAC, then you may want to consider mapping directly from that schema to Thema. Each extra mapping step diminishes accuracy.

• We can only take you so far in the process – you need to decide on a company-by-company basis how much work is needed beyond just applying the mapping.

• For example, a history of relations between France and Germany could be assigned “HIS013000 HISTORY / Europe / France” and “HIS014000 HISTORY / Europe / Germany”, which would result in the Thema codes NHD; NHD; 1DDF; 1DFG, whereas in fact you want your final results to be NHD; 1DDF; 1DFG (recalling that qualifiers are not appended to subjects but modify the entire record).
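The cleanup in the France and Germany example can be sketched as an order-preserving merge. The per-code results used below (NHD plus 1DDF for HIS013000, NHD plus 1DFG for HIS014000) are inferred from the combined result quoted on the slide, and `merge_thema` is a hypothetical helper.

```python
def merge_thema(code_lists):
    """Combine the mapped Thema codes for one title, dropping exact repeats.

    Input order is preserved, so the first subject remains the primary one.
    Subjects (starting with a letter) are listed before qualifiers
    (starting with a digit), since qualifiers apply to the whole record.
    """
    seen = set()
    merged = []
    for codes in code_lists:
        for code in codes:
            if code not in seen:
                seen.add(code)
                merged.append(code)
    subjects = [c for c in merged if c[0].isalpha()]
    qualifiers = [c for c in merged if c[0].isdigit()]
    return subjects + qualifiers


mapped = [
    ["NHD", "1DDF"],  # assumed result for HIS013000 HISTORY / Europe / France
    ["NHD", "1DFG"],  # assumed result for HIS014000 HISTORY / Europe / Germany
]
print(merge_thema(mapped))
```

This only removes exact duplicates. Deciding whether a remaining code is superfluous for a particular title still needs the review described above.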


• “SCI089000 SCIENCE / Life Sciences / Neuroscience” is mapped to “PSAN Neurosciences”. This is the preferred mapping despite the fact that there are five Thema subjects available under PSAN*. Multiple Thema subjects are used when no one code covers the BISAC subject fully (or when that one code is too general to be useful). In this case, PSAN covers the topic as fully as can be inferred from the BISAC subject and is not too general to be useful. There is no way to indicate in the mapping that a book on, say, developmental neuroscience could be assigned “PSAN2 Developmental neuroscience” via direct assignment.

• If your company has a proprietary subject for developmental neuroscience, you will be better off mapping that subject directly to PSAN2 rather than mapping indirectly from proprietary-to-BISAC-to-Thema (although, again, assigning Thema codes directly is preferred to any mapping).
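The loss of specificity from chaining two mappings can be shown with a toy example. The house code "NEURO-DEV" and both proprietary tables are hypothetical. Only the SCI089000-to-PSAN entry and the PSAN2 code come from the slide.

```python
# Hypothetical house schema with a subject for developmental neuroscience.
PROPRIETARY_TO_BISAC = {"NEURO-DEV": "SCI089000"}

# From the slide: BISAC has only the general neuroscience subject.
BISAC_TO_THEMA = {"SCI089000": ["PSAN"]}

# Hypothetical direct mapping from the house schema to Thema.
PROPRIETARY_TO_THEMA = {"NEURO-DEV": ["PSAN2"]}


def via_bisac(house_code):
    """Two-step route: house schema -> BISAC -> Thema."""
    return BISAC_TO_THEMA[PROPRIETARY_TO_BISAC[house_code]]


def direct(house_code):
    """One-step route: house schema -> Thema."""
    return PROPRIETARY_TO_THEMA[house_code]


print(via_bisac("NEURO-DEV"))
print(direct("NEURO-DEV"))
```

The two-step route can never return anything more specific than what the intermediate BISAC subject expresses, which is why the direct route keeps PSAN2 and the chained route does not.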


Conclusion

• Thema is gaining rapid acceptance in the global marketplace.

• It would be beneficial for titles to have Thema subjects.

• While it is recommended, it would be a daunting task for publishers, retailers, and data aggregators to manually assign Thema codes to all their data.

• Considering that a majority of titles in the US data stream have already been classified with BISAC subjects, a mapping from BISAC-to-Thema can serve as either a starting point or an endpoint to assigning Thema subjects.


Questions?

Michael Olenick, Business Analyst

[email protected]
