Linked Media and Data Using Apache Marmotta
Linking Media and Data using Apache Marmotta
Keynote at LIME 2014 Workshop
Sebastian Schaffert and Thomas Kurz
Contents
Motivation: The Red Bull Content Pool
Background: Linked Media Principles
Media Fragments and Media Ontology
Implementation: Linked Media Framework
Red Bull Use Case
ConnectMe Use Case
Standardising: The Linked Data Platform
Introducing Apache Marmotta
Querying for Multimedia Fragments: SPARQL-MM
2009
2011
2013
2014
Motivation: The Red Bull Content Pool
Linked Media (2009)
Linked Media = Linked People + Linked Content + Linked Data
Motivation: The Red Bull Content Pool
online archive containing video and image material related to extreme sports events organised by Red Bull
business-to-business portal where journalists can get material for further broadcasting (mostly for free)
material comes with metadata in the form of tables in Word documents:
interview transcriptions (with time interval start/end second)
scene descriptions (with time interval start/end second)
music cue sheets (copyright information about background music tracks)
Motivation: The Red Bull Content Pool (2009)
Motivation: The Red Bull Content Pool
Problems:
videos consist of series of scenes with many different persons
scanning through a video to find a particular scene is a huge amount of work
metadata is valuable but not really exploited for searching videos and while playing videos
Can we help Markus?
Name: Markus
Occupation: sports journalist
Company: RegioTV Pinzgau
Objective: create report about cliff diving
Requires: videos, background info, contacts
How can we help Markus?
efficient and precise search in the Red Bull Content Pool
compact and relevant display of background information
contacts (e.g. website, email) of athletes, other journalists, etc.
fast and successful creation of the report
Background: Linked Media Principles
Linked Media Principles (2009)
Linked Data is read-only
i.e. focus was on publication of big datasets, not the interaction
with data
a system for managing media assets needs to be capable of updating
resources and their metadata
Linked Data is data-only
i.e. a resource is represented either as RDF metadata for machines
or as HTML tables for humans, but in all cases it is metadata and
not content
a system for managing media assets needs to be capable of managing
both media content and metadata about that content
Linked Media Principles (2009)
extend Linked Data for updates using REST principles (HTTP):
GET: returns a resource (as in Linked Data)
POST: creates a new resource and uploads content or metadata
PUT: updates content or metadata of a resource
DELETE: removes a resource and all associated information
extend Linked Data for arbitrary media formats using MIME:
controlled by Accept: (in case of GET) and Content-Type: (in case of PUT/POST) HTTP headers
header value: MIME type (e.g. text/turtle or image/jpeg) and type of relationship (e.g. rel=content or rel=meta)
accessing a resource with GET or PUT redirects to the actual representation specified by MIME type and relationship
Linked Media Principles (2009)
Example 1: Retrieve HTML table representation of resource metadata
GET http://data.redlink.io/resource/1234
Accept: text/html; rel=meta
Example 2: Retrieve HTML content of resource
GET http://data.redlink.io/resource/1234
Accept: text/html; rel=content
Example 3: Update resource metadata
PUT http://data.redlink.io/resource/1234
Content-Type: text/turtle; rel=meta
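The negotiation rule behind these examples can be sketched in a few lines of Python. This is only an illustration of the principle, not code from the Linked Media Framework: the function names `linked_media_headers` and `parse_negotiation` are hypothetical, and falling back to `rel=content` when no relationship is given is an assumption made for the sketch.

```python
from typing import Dict, Tuple


def linked_media_headers(method: str, mime_type: str, rel: str) -> Dict[str, str]:
    """Build the negotiation header for one Linked Media request.

    GET negotiates via Accept, PUT/POST declare what they send via
    Content-Type; the value carries the MIME type plus rel=content|meta.
    """
    if rel not in ("content", "meta"):
        raise ValueError(f"unknown relationship: {rel}")
    value = f"{mime_type}; rel={rel}"
    method = method.upper()
    if method == "GET":
        return {"Accept": value}
    if method in ("PUT", "POST"):
        return {"Content-Type": value}
    # DELETE carries no representation, so there is nothing to negotiate.
    return {}


def parse_negotiation(value: str) -> Tuple[str, str]:
    """Split a header value such as 'text/turtle; rel=meta' into its parts."""
    mime_type, _, params = value.partition(";")
    rel = "content"  # assumption of this sketch: default when rel is absent
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "rel" and val:
            rel = val
    return mime_type.strip(), rel


if __name__ == "__main__":
    print(linked_media_headers("GET", "text/html", "meta"))
    print(parse_negotiation("text/turtle; rel=meta"))
```

A server following the principles would use the parsed pair to decide which representation of the resource to redirect to.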
Background: Media Fragments URI & Ontology for Media Resources
Media Fragments URI
media content currently treated as black box binary content:
interaction only via plugin or special browser support
linking to a subsequence of a video not possible
Media Fragments URI: use the fragment part of a URI to encode temporal and spatial subsequences
Examples:
Identify the sequence from second 3 to second 10 of the video:
http://data.redlink.io/resource/cliff_diving.ogg#t=3,10
Identify the spatial box 320x240 at x=160 and y=120 of the video:
http://data.redlink.io/resource/cliff_diving.ogg#xywh=160,120,320,240
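To make the two fragment dimensions concrete, the following Python sketch extracts them from a URI using only the standard library. It is a simplified reading of the syntax: it handles plain seconds for `t` (optionally prefixed with `npt:`) and pixel values for `xywh`, and ignores clock-time formats and percent units. The function name `parse_media_fragment` is hypothetical.

```python
from typing import Dict
from urllib.parse import parse_qs, urlsplit


def parse_media_fragment(uri: str) -> Dict[str, tuple]:
    """Return the temporal and spatial dimensions found in a fragment URI."""
    dims = parse_qs(urlsplit(uri).fragment)
    result: Dict[str, tuple] = {}

    if "t" in dims:
        value = dims["t"][-1]  # this sketch takes the last occurrence
        if value.startswith("npt:"):
            value = value[len("npt:"):]
        start_s, _, end_s = value.partition(",")
        start = float(start_s) if start_s else 0.0   # "t=,10" starts at 0
        end = float(end_s) if end_s else None        # "t=3" runs to the end
        result["t"] = (start, end)

    if "xywh" in dims:
        value = dims["xywh"][-1]
        if value.startswith("pixel:"):
            value = value[len("pixel:"):]
        x, y, w, h = (int(part) for part in value.split(","))
        result["xywh"] = (x, y, w, h)

    return result


if __name__ == "__main__":
    base = "http://data.redlink.io/resource/cliff_diving.ogg"
    print(parse_media_fragment(base + "#t=3,10"))
    print(parse_media_fragment(base + "#xywh=160,120,320,240"))
```

Both dimensions can be combined in one fragment with `&`, for example `#t=3,10&xywh=160,120,320,240`, in which case the sketch returns both entries.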
Ontology for Media Resources
common data model for representing video metadata:
identification
creation (hasCreator, hasPublisher, ...)
content description (hasLanguage, hasGenre, hasKeyword,...)
rights and distribution (hasPermissions, hasTargetAudience, ...)
technical properties (hasCompression, hasFormat, ...)
fragments (hasFragment, hasChapter, ...)
mapping tables from the most popular video metadata formats to the Ontology for Media Resources (EXIF, MPEG-7, TV-Anytime, YouTube, ID3)
Combining Media Fragments and Media Ontology
use Media Fragment URIs to uniquely identify fragments of media content:
browser compatibility
Linked Data compatibility
use Ontology for Media Resources to describe these fragments:
RDF compatibility
rich description graph with SPARQL querying
Combining Media Fragments and Media Ontology
@prefix ma: <...> .
@prefix rdfs: <...> .
@prefix foaf: <...> .
@prefix dct: <...> .

<...> a ma:MediaResource;
    rdfs:label "A sports video";
    ma:locator <...>;
    ma:hasFragment <...>;
    ma:hasFragment <...>.
<...> ma:locator <...>;
    dct:subject <...>.
<...> ma:locator <...>;
    dct:subject <...>.
<...> foaf:name "Connor Macfarlane".
<...> foaf:name "Lewis Jones".
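As an illustration of the pattern on this slide, the Python sketch below generates a complete Turtle description of a video with two fragments. The namespace IRIs are the published ones for the Ontology for Media Resources, RDFS, FOAF and Dublin Core Terms. Every `http://example.org/...` IRI, the `Fragment` class and the `describe_video` function are assumptions made for the example and do not come from the talk. Labels and names are assumed to contain no double quotes.

```python
from dataclasses import dataclass
from typing import List

PREFIXES = {
    "ma": "http://www.w3.org/ns/ma-ont#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dct": "http://purl.org/dc/terms/",
}


@dataclass
class Fragment:
    iri: str          # IRI naming the fragment resource (hypothetical)
    locator: str      # Media Fragment URI pointing into the video
    person_iri: str   # IRI of the person shown in the fragment
    person_name: str


def describe_video(video_iri: str, locator: str, label: str,
                   fragments: List[Fragment]) -> str:
    """Serialise a video and its fragments as Turtle text."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{video_iri}> a ma:MediaResource ;")
    lines.append(f'    rdfs:label "{label}" ;')
    lines.append(f"    ma:locator <{locator}>" + (" ;" if fragments else " ."))
    for i, frag in enumerate(fragments):
        end = " ." if i == len(fragments) - 1 else " ;"
        lines.append(f"    ma:hasFragment <{frag.iri}>{end}")
    # Each fragment points at its Media Fragment URI and names its subject.
    for frag in fragments:
        lines.append(f"<{frag.iri}> ma:locator <{frag.locator}> ;")
        lines.append(f"    dct:subject <{frag.person_iri}> .")
    for frag in fragments:
        lines.append(f'<{frag.person_iri}> foaf:name "{frag.person_name}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    video = "http://example.org/video.ogv"
    print(describe_video(
        "http://example.org/resource/video1", video, "A sports video",
        [Fragment("http://example.org/fragment/1",
                  video + "#t=3,10&xywh=0,0,320,240",
                  "http://example.org/person/connor", "Connor Macfarlane"),
         Fragment("http://example.org/fragment/2",
                  video + "#t=3,10&xywh=320,0,320,240",
                  "http://example.org/person/lewis", "Lewis Jones")]))
```

Keeping the fragment resource separate from its locator means the description can carry further properties while the locator stays a plain Media Fragment URI that a browser can resolve.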
Combining Media Fragments and Media Ontology
Implementation: The Linked Media Framework
Behind the Scenes: Linked Media Framework
Linked Data Server with updates and uniform management of content and metadata => particularly well-suited for multimedia content and metadata!
Linked Media Principles for resource-centric access to content and metadata
SPARQL Query and SPARQL Update 1.1 for structural updating and querying
Modules for Reasoning, Semantic Search, Linked Data Caching, Versioning, and Social Media
Specialised on Linked Media and Linked Enterprise Content
Code, Installer, Screencasts and more: http://code.google.com/p/lmf/
Linked Media Framework (Architecture)
LMF Semantic Search
Faceted Search over Content and Metadata with Solr-compatible API
RDF Path Language for configurable Metadata Indexing
Multiple Cores with different configurations to adapt to different search requirements
LMF Reasoning
Rule-based reasoning over triples in the LMF triple store to represent implicit knowledge
Reason maintenance makes it possible to describe justifications for inferences
adapted version of sKWRL rule language:
more efficient implementation,
improved reason maintenance
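The idea of rule-based reasoning with reason maintenance can be sketched as naive forward chaining that remembers which triples supported each inference. This is a minimal Python illustration, not the sKWRL implementation: rules are plain tuples rather than sKWRL syntax, only one justification is stored per inferred triple, and the property `ex:shows` is invented for the example.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Rule = Tuple[List[Triple], Triple]  # (body patterns, head); "?x" is a variable


def match(pattern: Triple, triple: Triple,
          binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def infer(base: Set[Triple], rules: List[Rule]) -> Dict[Triple, List[Triple]]:
    """Forward-chain until nothing new appears; map inference -> support."""
    known = set(base)
    justifications: Dict[Triple, List[Triple]] = {}
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            partial = [({}, [])]  # (binding so far, triples used so far)
            for pattern in body:
                partial = [(b2, support + [t])
                           for b, support in partial
                           for t in known
                           for b2 in [match(pattern, t, b)]
                           if b2 is not None]
            for binding, support in partial:
                new_triple = tuple(binding.get(term, term) for term in head)
                if new_triple not in known:
                    known.add(new_triple)
                    justifications[new_triple] = support
                    changed = True
    return justifications


def dependents(removed: Triple,
               justifications: Dict[Triple, List[Triple]]) -> Set[Triple]:
    """Inferred triples that lose their support when `removed` is deleted."""
    gone = {removed}
    changed = True
    while changed:
        changed = False
        for triple, support in justifications.items():
            if triple not in gone and any(s in gone for s in support):
                gone.add(triple)
                changed = True
    return gone - {removed}
```

With a rule such as "a video shows every person who is the subject of one of its fragments", `infer` yields the implicit triple together with the two base triples that justify it, and `dependents` tells which inferences to retract when one of those base triples is deleted.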
LMF Linked Data Caching
transparently retrieves linked resources from the Linked Data cloud when needed (e.g. LD Path or SPARQL query)
powerful component for integrating with other information systems exposing their data as Linked Media or Linked Data
adapters for services offering their data in proprietary formats (e.g. YouTube, Vimeo)
LMF Classification and Sentiment Analysis
support for statistical text classification; different classifiers can be trained with sample texts for arbitrary categories
suggest most likely category for a text according to similarity with training data
analyse text for positive or negative sentiment (German and English)
LMF Social Media Integration
allows linking to social media resources, e.g. Facebook or Google accounts, videos, interests
allows authentication and data import from selected social media services (Facebook, YouTube, generic RSS)
LMF Versioning
keeps history of updates in the Linked Media Framework
provides information for trust and provenance of data, e.g. annotations added to the system
Use Case: Red Bull Semantic Search Prototype
Media Fragment Search
Spatial and Temporal Fragments
Use Case: LIME Media Player (ConnectMe Project)
LIME Player: Interaction with Fragments
Standardisation: The Linked Data Platform
Linked Data Platform: Introduction
recommendation draft of the LDP working group at W3C:
support for read/write Linked Data
support for RDF and non-RDF resources
can be used as an alternative for Linked Media Principles:
advantage of standardisation and wide adoption
considerably more complex standard and protocol
URL: http://www.w3.org/TR/ldp/
Linked Data Platform: Concepts
access and interaction according to REST webservice principles:
GET: returns description of a resource
POST: creates a new resource
PUT: replaces the description of a resource
DELETE: removes the description of a resource
Linked Data Platform Resources (LDP-R):
RDF resources (LDP-RS): RDF description of a resource
non-RDF resources (LDP-NR): arbitrary (media) content
Linked Data Platform Containers (LDP-C):
collection of LDP resources, e.g. students, professors, lectures
basic container (LDP-BC): simple collection of resources with common URI prefix
direct container (LDP-DC): collection with explicit membership (as triple)
indirect container (LDP-IC): collection with implicit membership (based on content)
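The interplay of containers, RDF resources and non-RDF resources can be illustrated with a small in-memory model in Python. This is a sketch under simplifying assumptions, not a conforming LDP server and not Marmotta code: the class names, the slug-collision rule and the set of MIME types treated as RDF are all choices made for the example. Only the `ldp:` namespace IRI is taken from the W3C vocabulary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumption of this sketch: these MIME types count as RDF representations.
RDF_TYPES = {"text/turtle", "application/ld+json"}


@dataclass
class Resource:
    content_type: str
    body: bytes

    @property
    def kind(self) -> str:
        """LDP-RS for RDF descriptions, LDP-NR for arbitrary content."""
        return "LDP-RS" if self.content_type in RDF_TYPES else "LDP-NR"


@dataclass
class BasicContainer:
    uri: str  # expected to end with "/" so members share the prefix
    members: Dict[str, Resource] = field(default_factory=dict)

    def post(self, slug: str, content_type: str, body: bytes) -> str:
        """Create a member and return its URI (a server would send Location)."""
        name, n = slug, 1
        while self.uri + name in self.members:  # avoid overwriting a member
            n += 1
            name = f"{slug}-{n}"
        location = self.uri + name
        self.members[location] = Resource(content_type, body)
        return location

    def get(self, location: str) -> Optional[Resource]:
        return self.members.get(location)

    def put(self, location: str, content_type: str, body: bytes) -> None:
        """Replace an existing member; this sketch does not create on PUT."""
        if location not in self.members:
            raise KeyError(location)
        self.members[location] = Resource(content_type, body)

    def delete(self, location: str) -> None:
        self.members.pop(location, None)

    def describe(self) -> str:
        """Turtle description of the container itself."""
        lines = ["@prefix ldp: <http://www.w3.org/ns/ldp#> .", ""]
        if self.members:
            contained = ", ".join(f"<{m}>" for m in self.members)
            lines.append(f"<{self.uri}> a ldp:BasicContainer ;")
            lines.append(f"    ldp:contains {contained} .")
        else:
            lines.append(f"<{self.uri}> a ldp:BasicContainer .")
        return "\n".join(lines)
```

Posting a Turtle document and a JPEG image to the same container yields one LDP-RS and one LDP-NR member, both listed under `ldp:contains` in the container description.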
LDP Basic Containers (LDP-BC)
collection of LDP resources
identification via common URI prefix, e.g.
http://example.com/container1/a
http://example.com/container1/b
can contain both RDF and non-RDF resources at the same time
container is itself an RDF resource
description as RDF:
@base <...> .
@prefix dcterms: <...> .
@prefix ldp: <...> .

<...> a ldp:BasicContainer;
    dcterms:title "A very simple container";
    ldp:contains <...>, <...>, <...> .
Introducing Apache Marmotta
Apache Marmotta
a simplification of the Linked Media Framework taking core components:
Linked Data Server with SPARQL 1.1
Linked Data Cache
Versioning, Reasoning
no search, no content analysis
reference implementation of the Linked Data Platform and participation in W3C working group
highly modular and extensible to build custom Linked Data applications (both client and server)
http://marmotta.apache.org
Apache Marmotta: Architecture
Querying Multimedia Fragments
SPARQL-MM: Introduction
extension of SPARQL with specific multimedia functions and relations, implemented in Apache Marmotta

            Relation Function      Aggregation Function
Spatial     mm:rightBeside         mm:spatialIntersection
            mm:spatialOverlaps     mm:spatialBoundingBox
Temporal    mm:after               mm:temporalIntersection
            mm:temporalOverlaps    mm:temporalIntermediate
Combined    mm:overlaps            mm:boundingBox
            mm:contains            mm:intersection

A list of all functions can be found at:
https://github.com/tkurz/sparql-mm/blob/master/sparql-mm/functions.md
SPARQL-MM: A sample query
Give me the spatio-temporal snippet that shows Lewis Jones right beside Connor Macfarlane.

PREFIX foaf: <...>
PREFIX mm: <...>
PREFIX ma: <...>
PREFIX dct: <...>

SELECT (mm:boundingBox(?l1,?l2) AS ?two_guys) WHERE {
    ?f1 ma:locator ?l1; dct:subject ?p1.
    ?p1 foaf:name "Lewis Jones".
    ?f2 ma:locator ?l2; dct:subject ?p2.
    ?p2 foaf:name "Connor Macfarlane".
    FILTER mm:rightBeside(?l1,?l2)
    FILTER mm:temporalOverlaps(?l1,?l2)
}

SPARQL-MM: A sample query
mm:boundingBox(?l1,?l2)
SPARQL-MM: Demo
DEMO!
Conclusions
semantic media asset management requires management and interaction with both content and metadata:
Linked Media Principles (2009) were a first approach to extend Linked Data with support for semantic media asset management
Linked Data Platform (W3C working draft) supersedes Linked Media Principles, as it covers the same aspects and more
semantic media asset management requires specific media access and querying:
Media Fragments URI (W3C) to identify media fragments
Ontology for Media Resources (W3C) to describe media fragments
SPARQL-MM to query media fragment descriptions
Thanks for your attention!
Dr. Sebastian Schaffert
Chief Technology Officer
Redlink