1. Relationships to linguistic features
Understanding Interpersonal Variations in Word Meanings via Review Target Identification
Top-20: ery bready ark slight floral toasty tangy updated citrusy soft deep mainly grassy aroma doughy dissipating grass ot great earthy
Bottom-20: reminds cask batch oil reminded beyond canned conditioned double abv hope horse oats rye brewery blueberry blueberries maple bells old
2. Word list sorted by semantic variation
Proposal:
1. Pretrain model by MTL with metadata identification (auxiliary tasks)
・Stabilize the training for the extreme multi-class classification
2. Fine-tune personalized word embeddings only for the target task
・Prevent the word embeddings from learning selection bias
・Reduce memory usage caused by the explosion of parameter counts in proportion to the number of reviewers
[Figure labels: reviewer-specific matrix, reviewer-universal word embedding]
Experiments: Review target identification
Analysis: What kind of words have strong semantic variations?
[Target]: Words used by at least 30% of reviewers (excluding stop words)
[Proposed metrics]: Semantic variation
\[
\frac{1}{|U(w_i)|} \sum_{u_j \in U(w_i)} \left(1 - \cos\left(e^{u_j}_{w_i}, \bar{e}_{w_i}\right)\right)
\]
Words related to the five senses and adjectives have large semantic variations
[Premise]: Target identification performance reflects whether the model can capture semantic variations
Related Work:
・Annotation bias: How annotators label
  ex) Some reviewers tend to give extreme ratings
・Selection bias: How reviewers choose the review targets
  ex) Write reviews on products of a specific manufacturer
・Semantic variation (our target): How people use the words
Approach: Induce personalized word embeddings through review-target (objective label) identification
→ By solving a task with an objective target, the model can capture only the semantic variation
We express different meanings with the same word or the same meaning with different words
Motivation: Clarify what kinds of words differ in meaning across individuals
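[Figure: "Cold" used by two people for different temperature ranges (℉ values lost in extraction); the same color described as "Yellow" and as "Gold"]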
3. Relationships between the words
Intersection between clusters means that the same meaning is expressed in different ways
Pearson correlation: -0.07
Pearson correlation: 0.43
[Figure labels: grainy, bready, doughy, toasty, biscuity, crackery, grassy]
[Baseline]: Select the majority label in the training set
[Proposed]: Four different settings
・Whether pretraining by MTL is applied (MTL or ----)
・Whether the fine-tuning for personalization is employed (PRS or ----)
RateBeer dataset: target = Beer; metadata = Brewery, Style, ABV
Yelp dataset: target = Service; metadata = Location, Category

Model          Beer       Brewery    Style      ABV      Service    Location   Category
               [Acc.(%)]  [Acc.(%)]  [Acc.(%)]  [RMSE]   [Acc.(%)]  [Acc.(%)]  [Micro F1(%)]
Baseline       0.08       1.51       6.19       2.321    0.05       27.00      31.5
---- / ----    15.74      n/a        n/a        n/a      6.75       n/a        n/a
MTL / ----     16.16      19.98      49.00      1.428    9.71       70.33      57.8
---- / PRS     16.69      n/a        n/a        n/a      7.15       n/a        n/a
MTL / PRS      17.56      20.81      49.78      1.406    10.72      83.14      57.7
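The [Baseline] row corresponds to always predicting the most frequent training label. A minimal sketch of that idea for the classification columns only, assuming labels are plain strings; the function names and toy labels are hypothetical, and how the poster's baseline handles the ABV regression column is not stated:

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return the label that occurs most often in the training set."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(pred_label, test_labels):
    """Accuracy (%) when the same label is predicted for every test item."""
    hits = sum(label == pred_label for label in test_labels)
    return 100.0 * hits / len(test_labels)

# Hypothetical toy labels, not taken from RateBeer or Yelp.
train = ["ipa", "stout", "ipa", "lager", "ipa"]
test = ["ipa", "stout", "lager", "ipa"]

pred = majority_baseline(train)
print(pred, accuracy(pred, test))
```

With thousands of candidate targets, such a baseline is expected to score very low, which is consistent with the Beer and Service columns.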
・Prevents smooth communication in our daily lives
・Causes problems for computers when solving NLP tasks
Our method could successfully capture semantic variations
* The same trend was also seen on Yelp dataset
・Proposed a method of inducing personalized word emb. via review target identification
・Showed the effectiveness of the personalized word emb. on the target identification task
・Clarified that the meanings of words related to the five senses vary strongly across individuals
Summary
Daisuke Oba¹, Shoetsu Sato¹, Naoki Yoshinaga², Satoshi Akasaki¹, Masashi Toyoda²
¹ The University of Tokyo, ² Institute of Industrial Science, The University of Tokyo
Output: Review target
In the target identification task ...
・The output label is automatically given without an annotator → no annotation bias
・The same reviewer selects the same target only once in the dataset → no selection bias
Input: Review
This beer has high drinkability estimate
dataset                  # reviews    # reviewers  metadata
RateBeer [McAuley+,13]   2,695,615    3,670        Style, Brewery, Alcohol by volume (ABV)
Yelp                     426,816      2,414        Location, Category
Frequent words tend to have strong semantic variations
・In the beer/food/restaurant domain, expressions depending on individual senses or experiences frequently appear
Meaning change to other synsets does not happen
・The small domain dataset restricts the actual meanings
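The relationships to linguistic features are reported as Pearson correlations. A minimal sketch of how such a correlation could be computed between per-word semantic variation and one word-level feature; the toy numbers and the choice of log frequency as the feature are assumptions for illustration, and the extracted text does not say which reported value belongs to which feature:

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical per-word values, not results from the poster.
variation = [0.10, 0.20, 0.30, 0.40]   # semantic variation of four words
log_freq = [1.0, 2.0, 2.5, 4.0]        # assumed feature: log word frequency

print(round(pearson(variation, log_freq), 3))
```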
$U(w_i)$: the set of reviewers who used the word $w_i$
$e^{u_j}_{w_i}$: personalized word emb. for $w_i$ of the reviewer $u_j$
$\bar{e}_{w_i}$: the avg. of $e^{u_j}_{w_i}$ for $U(w_i)$
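The semantic variation metric can be sketched from these definitions: average, over the reviewers who used a word, one minus the cosine similarity between each personalized embedding and their mean. The function name and the toy arrays below are hypothetical, and the sketch assumes one embedding row per reviewer in U(w_i):

```python
import numpy as np

def semantic_variation(personal_embs: np.ndarray) -> float:
    """Mean of (1 - cos) between each reviewer's embedding and the average.

    personal_embs has shape (n_reviewers, dim): one personalized embedding
    of the same word per reviewer who used it.
    """
    mean_emb = personal_embs.mean(axis=0)
    norms = np.linalg.norm(personal_embs, axis=1) * np.linalg.norm(mean_emb)
    cos = personal_embs @ mean_emb / norms
    return float(np.mean(1.0 - cos))

# Toy inputs (hypothetical): identical embeddings vs. orthogonal ones.
identical = np.tile([1.0, 2.0, 3.0], (4, 1))
spread = np.array([[1.0, 0.0], [0.0, 1.0]])

print(semantic_variation(identical))
print(semantic_variation(spread))
```

When every reviewer has the same embedding, the value is zero up to rounding; the more the embeddings point in different directions, the larger it gets.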
Personalized word embeddings are expressed via transformation of reviewer universal word embeddings using reviewer specific parameters
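One way to read this is as a per-reviewer linear map applied to a shared embedding table. The sketch below assumes a square reviewer-specific matrix initialized to the identity; the exact form of the transformation, the initialization, and all names and sizes are assumptions for illustration, not details confirmed by the poster:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, n_reviewers = 5, 4, 3  # hypothetical toy sizes

# Reviewer-universal word embeddings, shared by all reviewers.
universal = rng.normal(size=(vocab_size, dim))

# One reviewer-specific matrix per reviewer (assumed dim x dim, identity at start).
reviewer_mats = np.stack([np.eye(dim) for _ in range(n_reviewers)])

def personalize(word_id: int, reviewer_id: int) -> np.ndarray:
    """Transform the universal embedding of a word with one reviewer's matrix."""
    return reviewer_mats[reviewer_id] @ universal[word_id]

# With identity matrices, every reviewer still shares the universal meaning.
same = np.allclose(personalize(2, 0), personalize(2, 1))

# Changing only reviewer 1's matrix shifts that reviewer's embedding alone.
reviewer_mats[1] += 0.1 * rng.normal(size=(dim, dim))
differs = not np.allclose(personalize(2, 0), personalize(2, 1))
print(same, differs)
```

In this reading, the universal table is shared and only the small per-reviewer matrices differ between reviewers.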
Prior work attempted to consider personal biases to improve task performance
・They model various types of biases altogether
Personalize word embeddings in review target identification based on multi-task learning (MTL) and fine-tuning
Visualizing the word “bready” with its closest words using principal component analysis (PCA)
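A visualization of this kind could be produced by picking the nearest words by cosine similarity and projecting the embeddings to two dimensions with PCA. The sketch below uses random stand-in vectors and hypothetical helper names; the poster's figure is built from learned personalized embeddings, which are not reproduced here:

```python
import numpy as np

def closest_words(query, vocab, embs, k=3):
    """Return the k words whose embeddings have the highest cosine similarity to the query."""
    q = embs[vocab.index(query)]
    sims = embs @ q / (np.linalg.norm(embs, axis=1) * np.linalg.norm(q))
    order = np.argsort(-sims)
    return [vocab[i] for i in order if vocab[i] != query][:k]

def pca_2d(x):
    """Project the rows of x onto their first two principal components."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(0)
vocab = ["bready", "doughy", "toasty", "grassy", "biscuity", "crackery", "grainy"]
embs = rng.normal(size=(len(vocab), 8))  # random stand-ins, not learned embeddings

neighbours = closest_words("bready", vocab, embs)
coords = pca_2d(embs)  # one 2-D point per word, ready for a scatter plot
print(neighbours, coords.shape)
```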