Rijksmuseum presentation
Trusting user-contributed data in Cultural Heritage Domain
Archana Nottamkandath (work done with Davide Ceolin & Wan Fokkink)
VU University Amsterdam
COMMIT/SEALINC
1
Context
• COMMIT/SEALINC project
• Museums have collections which can be annotated with (external) user-contributed information to make the collection easier to search
Tulips
Butterfly
Portrait
Can we trust user-provided content directly? – Apparently not!
Stella is Gay
wwwapartmentvermeercom
Evaluation costs Resources
• Is expensive manual labor
• Costs a lot of time
• Requires adherence to museum policies
– Museum X [Accept, Not sure, Reject]
– Museum Y [Foreign, Judgmental, Strong reject, Strong accept] ..
Need for automated trust analysis
• Algorithms evaluate annotations automatically or semi-automatically
(a) Flower (b) 19th century (c) Sunshine (d) Vermeer (e) Bronze
Automated Trust analysis algorithms
• Requirements
– High accuracy (accurately predict evaluations most of the time)
– Minimum input from cultural heritage professionals
– Scalable and efficient (w.r.t. resources and time)
– Works with different cultural heritage data
Definition
• Trustworthy annotation
– Is relevant to the image
– Enhances or reinstates existing knowledge
– Is acceptable under the museum's policies for publication on its website
Existing workflow
[Diagram: the user used the Accurator Interface]
[Diagram: User_name: Jones contributed the tags Tulips, Roses, Night Sky, Van Gogh, Buddhist, Portrait, Monument, Asian, War memorial]
How to determine trust from a user contributing annotations to the system?
How to determine trust from the Annotation Process?
How to determine trust from contributed data?
How to determine trust from users?[1]
• Evaluate a subset of the user's tags
[Diagram: User_name: Jones contributed the tags Tulips, Roses, Night Sky, Van Gogh, Buddhist, Portrait, Monument, Asian, War memorial; the museum evaluates the train set]
Train set: Tulips, Van Gogh, Buddhist, Monument
Test set: Roses, Night sky, Van Gogh, Asian, War Memorial
• A user who is an expert on one topic might be an expert on similar topics
How to determine trust from users?[1]
User_name: Jones
Train set: Tulips, Van Gogh, Buddhist, Monument
Test set: Roses, Night sky, Van Gogh, Asian, War Memorial
Expert on: Tulips
Possibly expert on (with a certain probability): Roses, Lilies
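The idea of carrying expertise over to similar topics can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method of [1]: the museum verdicts on the train set, the similarity values, and the names `train_evaluations`, `similarity`, and `estimate_trust` are all hypothetical. The sketch scores a new tag by the similarity-weighted share of the user's accepted train tags and falls back to a neutral prior when nothing similar has been evaluated.

```python
from typing import Dict, Tuple

# Hypothetical museum verdicts on the user's train set (True = accepted).
train_evaluations: Dict[str, bool] = {
    "Tulips": True,
    "Van Gogh": True,
    "Buddhist": False,
    "Monument": True,
}

# Hypothetical topic similarity in [0, 1]; unlisted pairs count as 0.
similarity: Dict[Tuple[str, str], float] = {
    ("Roses", "Tulips"): 0.9,
    ("Lilies", "Tulips"): 0.8,
    ("Roses", "Buddhist"): 0.1,
}


def sim(a: str, b: str) -> float:
    """Symmetric lookup of the similarity between two tags."""
    if a == b:
        return 1.0
    return similarity.get((a, b), similarity.get((b, a), 0.0))


def estimate_trust(new_tag: str, evaluations: Dict[str, bool],
                   prior: float = 0.5) -> float:
    """Similarity-weighted share of accepted train tags for new_tag."""
    weights = {tag: sim(new_tag, tag) for tag in evaluations}
    total = sum(weights.values())
    if total == 0:
        return prior  # no similar evaluated tag: stay neutral
    accepted = sum(w for tag, w in weights.items() if evaluations[tag])
    return accepted / total


roses_trust = estimate_trust("Roses", train_evaluations)
night_sky_trust = estimate_trust("Night Sky", train_evaluations)
```

With these made-up numbers, "Roses" inherits most of its score from the accepted tag "Tulips", while "Night Sky" has no similar evaluated tag and keeps the prior.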
Determine trust from users[2]
• User profile: [Experience, education, country, gender, income, museum visits…]
Steve.museum dataset
Determine trust from users[2]
• Predict user reputation using machine learning
• [Feature1, Feature2, ..] -> Category of user
– [21 yrs, Female, Bachelors, Australia] -> Excellent
– [60 yrs, Male, PhD, America] -> Good
– [56 yrs, Female, Masters, Croatia] -> Bad
– [30 yrs, Male, Bachelors, Mexico] -> ?
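As a rough illustration of the mapping from profile features to a user category, here is a minimal nearest-neighbour sketch in Python. The slides do not say which learning algorithm was used, so the classifier, the distance function, and the names `TRAINING`, `distance`, and `predict_user_category` are assumptions made for this example; only the three labeled profiles come from the slide.

```python
from typing import List, Tuple

# (age in years, gender, education, country)
Profile = Tuple[int, str, str, str]

# Labeled examples copied from the slide.
TRAINING: List[Tuple[Profile, str]] = [
    ((21, "Female", "Bachelors", "Australia"), "Excellent"),
    ((60, "Male", "PhD", "America"), "Good"),
    ((56, "Female", "Masters", "Croatia"), "Bad"),
]


def distance(a: Profile, b: Profile) -> float:
    """Assumed distance: scaled age gap plus count of categorical mismatches."""
    age_gap = abs(a[0] - b[0]) / 100.0
    mismatches = sum(1 for x, y in zip(a[1:], b[1:]) if x != y)
    return age_gap + mismatches


def predict_user_category(profile: Profile) -> str:
    """Return the label of the closest labeled profile (1-nearest neighbour)."""
    _, label = min(TRAINING, key=lambda example: distance(profile, example[0]))
    return label


unknown_user: Profile = (30, "Male", "Bachelors", "Mexico")
predicted = predict_user_category(unknown_user)
```

The prediction for the unlabeled profile depends entirely on the assumed distance and on three examples, so it says nothing about the results reported in the talk.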
How to determine trust from Annotation process?
• Time of day, Day of week, Day of month etc. affect user quality
• Typing speed affects user quality
– Typing fast might indicate higher confidence
Tulips, Van Gogh, Buddhist, Monument
Rich Lady, Plant, Leonardo, Bronze plate
How to determine trust from Annotation process?
• Predict tag quality using machine learning
• [Feature1, Feature2, ....] -> Category of Tag
– [10:00, Monday, June, 3s] -> Excellent
– [12:00, Wednesday, 15s] -> Good
– [23:56, Friday, April, 80s] -> Bad
– [06:00, Thursday, March, 70s] -> ?
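The process features in these vectors can be derived from two timestamps per annotation. The Python sketch below assumes that the system logs when the user started typing and when the tag was submitted; the function name `process_features` and the example date are hypothetical.

```python
from datetime import datetime
from typing import List

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]


def process_features(started: datetime, submitted: datetime) -> List[str]:
    """Build [time of day, day of week, month, typing duration] for one tag."""
    typing_seconds = int((submitted - started).total_seconds())
    return [
        started.strftime("%H:%M"),
        WEEKDAYS[started.weekday()],   # weekday() returns 0 for Monday
        MONTHS[started.month - 1],
        f"{typing_seconds}s",
    ]


# Hypothetical annotation event: typing started at 10:00:00, submitted 3 s later.
features = process_features(
    datetime(2014, 6, 2, 10, 0, 0),
    datetime(2014, 6, 2, 10, 0, 3),
)
print(features)  # ['10:00', 'Monday', 'June', '3s']
```

A vector like this can then be passed, together with a museum verdict, to whichever classifier is chosen.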
How to determine trust from Annotation process?
• Why is this important?
– Useful for anonymous users who did not fill in their profile information
How to determine trust from data?
• Contributed data itself has features; use machine learning to predict the quality of a tag
– Length
– Specificity
– Presence in vocabularies
– Times already contributed
– Noun
Tulips, Van Gogh, Buddhist, Monument
[6, specific, yes, English, 10, no…] -> Good
[7, specific, yes, Dutch, 1, yes…] -> Bad
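Some of these tag features can be computed directly from the tag text. The following Python sketch covers only length, presence in a vocabulary, and times already contributed; specificity, language, and noun detection need linguistic resources and are left out. The function name `tag_features`, the vocabulary, and the list of earlier tags are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Union


def tag_features(tag: str, vocabulary: Set[str],
                 earlier_tags: Iterable[str]) -> List[Union[int, bool]]:
    """Return [length, present in vocabulary, times already contributed]."""
    normalized = tag.strip().lower()
    # Count earlier contributions case-insensitively.
    counts = Counter(t.strip().lower() for t in earlier_tags)
    return [len(normalized), normalized in vocabulary, counts[normalized]]


# Hypothetical controlled vocabulary and contribution history.
vocabulary = {"tulips", "monument"}
earlier_tags = ["Tulips", "tulips ", "Roses"]

print(tag_features("Tulips", vocabulary, earlier_tags))  # [6, True, 2]
```

Lowercasing and trimming before counting is a design choice of this sketch; a real system might normalize tags differently.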
Goals achieved
• Requirements
– High accuracy (accurately predict evaluations most of the time)
– Minimum input from cultural heritage professionals
– Scalable and efficient
– Works with different cultural heritage data
Goal 1: High Accuracy
Steve dataset results
– High accuracy (accurately predict evaluations most of the time)
• Predicted quality of a tag based on user profile with accuracy from 68% to 72%
Goal 2: Minimum input from Cultural Heritage Institutions
• Algorithms require a minimum of 5 evaluated tags per user to make predictions
• Working to minimize or eliminate this requirement
Goal 3: Scalable and efficient
• Reduced computation time while maintaining accuracy on the Steve dataset
Goal 4: Works with different cultural heritage data
• Steve Museum dataset
• Waisda? dataset
– Video tagging game
• SEALINC Media experiments at CWI
Future Work
• Employ our experiences and algorithms to analyze the data from Accurator
• Employ trust scores for ranking in search
• Identify techniques to visualize trust