HCI and Smartphone Data at Scale

©2009 Carnegie Mellon University : 1 HCI and Smartphone Data at Scale IBM Research July 29, 2013 Shah Amini Justin Cranshaw Afsaneh Doryab Jialiu Lin Jun-Ki Min Jason Wiese Jason Hong Norman Sadeh Joy Zhang John Zimmerman Computer Human Interaction: Mobility Privacy Security


Transcript of HCI and Smartphone Data at Scale

1. HCI and Smartphone Data at Scale
  • IBM Research, July 29, 2013
  • Shah Amini, Justin Cranshaw, Afsaneh Doryab, Jialiu Lin, Jun-Ki Min, Jason Wiese, Jason Hong, Norman Sadeh, Joy Zhang, John Zimmerman
  • Computer Human Interaction: Mobility Privacy Security

2. Smartphones are Pervasive (© 2013 Carnegie Mellon University)
  • 50% penetration in the US mid-2012
  • In 2013Q1, majority of phones sold worldwide
  • 50 billion apps downloaded on each of Apple and Android
3. Smartphones are Intimate
  Mobile phones and millennials (Pew 2012):
  • 75% use in bed before going to sleep
  • 83% sleep with their mobile phones
  • 90% check first thing in the morning
  • Half use them while eating
  • A third use them in the bathroom (!)
  • A fifth check them every ten minutes
4. Smartphone Data is Intimate
  • Who we know (contact list)
  • Who we call (call log)
  • Who we text (SMS log)
5. Smartphone Data is Intimate
  • Where we go (GPS, foursquare)
  • Photos (some geotagged)
  • Sensors (accel, sound, light)
6. The Opportunity
  • We are creating a worldwide sensor network with these smartphones
  • We can now capture and analyze human behavior at unprecedented fidelity and scale
7. Three Threads of Research
  • Augmented Social Graph: Create richer computational models of our social relationships with others
  • Urban Analytics: Create viz and models of cities based on geotagged social media
  • CrowdScanning Apps: Crowdsourcing and other techniques to analyze privacy behaviors of apps
8. Modeling Social Relationships
  • If you were in a jail in Mexico, which of the 500+ friends in your phone contact list and on Facebook would come and get you out?
9. Modeling Social Relationships
  • Can we use smartphone data to build a richer augmented social graph?
  • Models tie strength, group, role
10. Why Better Models?
  • Secure invitations: Who is this person friending me?
  • Communication triage
  • Better info finding (weak ties)
  • Configuration of privacy policies: Tie strength strongly correlated with what personal info people are willing to share (Wiese et al, Ubicomp 2011)
  • Early detection of depression: Less communication with strong ties, less mobility, lots of fast food, insomnia
11. Ongoing Work: Sleep Data
  • Sleep data (self-reported ground truth)
  • Sensor data
12. Using Call Log, SMS, Contacts
13. Using Call Log, SMS, Contacts
14. Using Call Log, SMS, Contacts
15. User Study on Relationships
  • 40 participants: 13 male and 27 female (age 19-50); 55% student, 35% employed, 10% unemployed
  • Data collection
    Phone: Contact list, call & SMS logs
    Facebook: Friend list from Facebook backup
    Self-report for 70 contacts: Demographics, group, closeness (1 to 5; 5 = feel very close)
16. Life Facets
  • Can classify life facets {work, social, home} at 90.1%
  • If at least one communication
  • Just contact list, call log, SMS log
  • Correlations
  • Min et al, Mining Smartphone Data to Classify Life-Facets of Social Relationships, CSCW 2013
17. Ongoing Work: Tie Strength
  • However, tie strength much harder to predict: 74.6% for {low, med, high}
  • We thought this would be easy
  • Other modes of communication: Skype, IM, email, face-to-face
  • Stage of relationship / maintenance communication
18. Three Threads of Research
  • Augmented Social Graph: Create richer computational models of our social relationships with others
  • Urban Analytics: Create viz and models of cities based on geotagged social media
  • CrowdScanning Apps: Crowdsourcing and other techniques to analyze privacy behaviors of apps
19. The Problem
  • Today's methods for gathering data about cities are slow, expensive, and limited
  • Ex.
    Travel Behavioral Inventory for traffic flows: every 10-20 years and 100s of people
  • US Census 2010 cost $13 billion
  • Quality of life surveys (sociology, city govts) go door-to-door and interview people
  • Some approaches today:
    Call Data Records, but granularity is an issue
    Deploy a custom app, but scale and utility are issues
20. The Vision: Urban Analytics
  • Goal: Use smartphones + social media + machine learning to offer new and useful insights about a city in a manner that is cheap, fast, and highly scalable
21. Livehoods, Our First Urban Analytics Tool
  • The character of an urban area is defined not just by the types of places found there, but also by the people that make it part of their daily life
  • Cranshaw et al, The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City, ICWSM 2012.
22. What comes to mind when you picture your neighborhood?
23. The Image of a Neighborhood
  • You're probably not imagining this.
24. The Image of a Neighborhood
  • What you're imagining probably looks a lot more like this.
  • "Every citizen has had long associations with some part of his city, and his image is soaked in memories and meanings." (Kevin Lynch, The Image of the City)
25. Studying Perceptions: Cognitive Maps
  • Kevin Lynch, 1960
  • Stanley Milgram, 1977
26. Two Perspectives
  • Politically constructed: Neighborhoods have fixed borders defined by the city government.
  • Socially constructed: Neighborhoods are organic, cultural artifacts. Borders are blurry, imprecise, and may be different to different people.
27. Two Perspectives
  • Socially constructed: Neighborhoods are organic, cultural artifacts. Borders are blurry, imprecise, and may be different to different people.
  • Can we discover automated ways of identifying the organic boundaries of the city?
  • Can we extract local cultural knowledge from social media?
  • Can we build a collective cognitive map from data?
28. Livehoods Data Source
  • Crawled 18m check-ins from foursquare
    Claims 20m users
    People who linked their foursquare accts to Twitter
  • Spectral clustering based on geographic and social proximity
29. If you watch check-ins over time, you'll notice that groups of like-minded people tend to stay in the same areas.
30. We can aggregate these patterns to compute relationships between check-in venues.
31. These relationships can then be used to identify natural borders in the urban landscape.
32. We call the discovered clusters Livehoods, reflecting their dynamic character.
  [Figure labels: Livehood 1, Livehood 2]
33. Try it out at livehoods.org
34. Evaluation
  • Interviewed 27 locals: residents, urban planners, businesses
  • Asked them to draw their mental maps of areas first
  • Then showed them our maps and solicited feedback
35. South Side Pittsburgh
36. South Side Pittsburgh
37. South Side Pittsburgh
  • Carson Street runs along the length of South Side, and is densely packed with bars, restaurants, tattoo parlors, and clothing and furniture shops. It is the most popular destination for nightlife.
38. South Side Pittsburgh
  • South Side Works is a recently built, mixed-use outdoor shopping mall, containing nationally branded apparel stores and restaurants, upscale condominiums, and corporate offices.
39. South Side Pittsburgh
  • There is a small, somewhat older strip mall that contains the only supermarket (grocery) in South Side.
    It also has a liquor store, an auto-parts store, a furniture rental store and other small chain stores.
40. South Side Pittsburgh
  • The rest of South Side is predominantly residential, consisting of mostly smaller row houses.
41. South Side Pittsburgh
42. South Side Pittsburgh: Livehoods Found in South Side
  [Map labels: LH6, LH7, LH8, LH9]
  • I'll show evidence in support of the Livehoods clusters in South Side, and will describe the forces that people highlighted.
43. South Side Pittsburgh: Demographic Differences (LH8 vs LH9)
  • "Ha! Yes! See, here is my division! Yay! Thank you algorithm! ... I definitely feel where the South Side Works, and all of that is, is a very different feel."
44. South Side Pittsburgh: Architecture & Urban Design (LH7 vs LH8)
  • "from an urban standpoint it is a lot tighter on the western part once you get west of 17th or 18th [LH7]."
45. South Side Pittsburgh: Safety (LH7 vs LH8)
  • "Whenever I was living down on 15th Street [LH7] I had to worry about drunk people following me home, but on 23rd [LH8] I need to worry about people trying to mug you... so it's different. It's not something I had anticipated, but there is a distinct difference between the two areas of the South Side."
46. South Side Pittsburgh: Demographic Differences (LH6)
  • "There is this interesting mix of people there I don't see walking around the neighborhood. I think they are coming to the Giant Eagle [grocery store] from lower income neighborhoods... I always assumed they came from up the hill."
47. South Side Pittsburgh
  • "I always assumed they came from up the hill."
48. Bezerkeley, CA
49. Other Potential Urban Analytics
50. Three Threads of Research
  • Augmented Social Graph: Create richer computational models of our social relationships with others
  • Urban Analytics: Create viz and models of cities based on geotagged social media
  • CrowdScanning Apps: Crowdsourcing and other techniques to analyze privacy behaviors of apps
51. What are your apps really doing?
  • Shares your location, gender, unique phone ID, phone# with advertisers
  • Uploads your entire contact list to their server (including phone #s)
52. Many Smartphone Apps Have Unusual Permissions
  App                      Permissions Used
  Tiny Flashlight + LED    Internet Access, phone#
  Backgrounds              Contact List
  Dictionary               Location
  Bible Quotes             Location
53. Android
  • What do these permissions mean?
  • Why does app need this permission?
  • When does it use these permissions?
54. CrowdScanning Core Ideas
  • Idea 1: find the gap between what people expect an app to do and what it actually does
  • Lin et al, Expectation and Purpose: Understanding Users' Mental Models of Mobile App Privacy through Crowdsourcing. Ubicomp 2012.
55. Nissan Maxima Gear Shift
56. Privacy as Expectations
  • Apply this same idea of mental models for privacy
  • Compare what people expect an app to do vs what an app actually does
  • Emphasize the biggest gaps, misconceptions that many people had
  [Diagram labels: App Behavior (what an app actually does); User Expectations (what people think the app does)]
57. Crowdsourcing Privacy
  • Idea 2: use crowdsourcing to do this (crowdsource privacy)
  • Few people read privacy policies
    We want to install the app
    Reading policies not part of main task
    Complexity of these policies (the pain!!!)
    Clear cost (time) for unclear benefit
  • Crowdsourcing can mitigate these problems
58. [Two example app privacy summaries]
  First summary:
  • 10% users were surprised this app wrote contents to their SD card.
  • 25% users were surprised this app sent their approximate location to dictionary.com for searching nearby words.
  • 85% users were surprised this app sent their phone's unique ID to mobile ads providers.
  • 0% users were surprised this app could control their audio settings.
  • See all
  Second summary:
  • 90% users were surprised this app sent their precise location to mobile ads providers.
  • 95% users were surprised this app sent their approximate location to mobile ads providers.
  • 95% users were surprised this app sent their phone's unique ID to mobile ads providers.
  • 0% users were surprised this app can control camera flashlight.
59. Our Study on App Privacy
  • Showed crowd workers screenshots and description of app (from Google Play)
    56 of top 100 Android Apps
  • Showed permissions one at a time
    Only those related to privacy
  • Expectation Condition
    Why they think the app uses permission
    How comfortable they were with it
  • Purpose Condition
    We gave an explanation (based on our analysis)
    Asked how comfortable they were with it
60. Results for Location Data (N=20 per app, Expectations Condition)
  App                          Comfort Level (-2 to 2)
  Maps                          1.52
  GasBuddy                      1.47
  Weather Channel               1.45
  Foursquare                    0.95
  TuneIn Radio                  0.60
  Evernote                      0.15
  Angry Birds                  -0.70
  Brightest Flashlight Free    -1.15
  Toss It                      -1.2
61. Showing Purpose Lowers Concerns
  • All differences statistically significant
  • Big increases for dictionary, Shazam, Air Control Lite, and others (> 1.0)
  Permission          Comfort w/ Purpose    Comfort w/o Purpose
  Device ID           0.47 ( =0.30)         -0.10 ( =0.41)
  Contact List        0.66 ( =0.22)          0.16 ( =0.54)
  Network Location    0.90 ( =0.53)          0.65 ( =0.55)
  GPS Location        0.72 ( =0.62)          0.35 ( =0.73)
62. Ongoing Work
  • Scaling up analysis
    600k+ apps on Android market
  • Static & dynamic analysis + clustering to build models of apps
  • Ex.
    Games that use location data -1.3
  • Gort tool
63. Summary
  • Smartphones offer big opportunity to understand human behavior at unprecedented fidelity and scale
  • Augmented Social Graph
  • Urban Analytics
  • CrowdScanning
64. Thanks!
  • More info at cmuchimps.org or email [email protected]
  • Special thanks to: Army Research Office, National Science Foundation, Alfred P. Sloan Foundation, DARPA, Google, CMU Cylab
  • Join our community for researchers at: www.reddit.com/r/pervasivecomputing
65.
66. Using features such as location entropy significantly improves performance over shallow features such as number of co-locations
67.
68. Using Location Data to Infer Friendships
  • 2.8m location sightings of 489 users of Locaccino friend finder in Pittsburgh
  • Place entropy for inferring social quality of a place
  • #unique people seen in a place
  • 0.0002 x 0.0002 lat/lon grid, ~30m x 30m
  • Cranshaw et al, Bridging the Gap Between Physical Location and Online Social Networks, Ubicomp 2010
69. Insert graph here
  • Describe entropy
70. Inferring Friendships
  • 67 different machine learning features
    Location diversity (and entropy)
    Intensity and Duration
    Specificity (TF-IDF)
    Graph structure (overlap in friends)
  • 92% accuracy in predicting friend/not
  • Location entropy improves performance over shallow features like #co-locations
71. Most Unexpected Uses (N=20 per app, Expectations Condition)
  • Found strong correlation between expectations & comfort level (r=0.91)
  Apps using Contact List    Comfort Level (-2 to 2)
  Backgrounds HD Wallpaper   -1.35
  Pandora                    -0.70
  GO Launcher EX             -0.75
72. Insert graph here
  • Describe entropy
  • Co-location data to infer friendship
  • Using place entropy, accuracy of 92%
  • Can also infer number of friends
73. Topic Modeling (LDA)
74. Brooklyn Queens Expressway
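
Sketch for the place-entropy slides (68-72): the slides name "place entropy" over a 0.0002 x 0.0002 lat/lon grid but do not give a formula. The code below is a minimal sketch that assumes the usual Shannon entropy of the per-user share of sightings in each grid cell. The function names (cell_of, place_entropy) and the input layout (user_id, lat, lon) are hypothetical and not taken from the talk.

```python
import math
from collections import Counter, defaultdict

CELL = 0.0002  # grid cell size in degrees, as stated on slide 68


def cell_of(lat, lon, cell=CELL):
    """Map a coordinate to an integer grid-cell index (hypothetical helper)."""
    return (math.floor(lat / cell), math.floor(lon / cell))


def place_entropy(sightings, cell=CELL):
    """Return {grid cell: entropy in bits} for (user_id, lat, lon) sightings.

    Assumption: entropy is computed over each user's share of the
    sightings in a cell. A cell visited by one person scores 0; a cell
    shared evenly by many people scores high.
    """
    per_cell = defaultdict(Counter)
    for user, lat, lon in sightings:
        per_cell[cell_of(lat, lon, cell)][user] += 1

    entropy = {}
    for c, users in per_cell.items():
        total = sum(users.values())
        entropy[c] = -sum(
            (n / total) * math.log2(n / total) for n in users.values()
        )
    return entropy
```

Under this assumption a cell seen only by one user has entropy 0, which matches the slides' reading of entropy as a signal of how "social" a place is: two people who are co-located in a low-entropy place are more likely to know each other than two people who overlap in a busy public one.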
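
Sketch for the Livehoods slides (28-31): the slides say venues are clustered with "spectral clustering based on geographic and social proximity" and that check-in patterns are aggregated into "relationships between check-in venues". The code below is one possible way to build such a venue-to-venue affinity and is an assumption, not the authors' confirmed method: social proximity is taken as the cosine similarity of per-user check-in counts, and it is only computed between a venue and its k geographically nearest venues. All names (venue_vectors, cosine, affinity, k) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def venue_vectors(checkins):
    """checkins: iterable of (user_id, venue_id).
    Returns {venue: Counter(user -> number of check-ins)}."""
    vecs = defaultdict(Counter)
    for user, venue in checkins:
        vecs[venue][user] += 1
    return vecs


def cosine(a, b):
    """Cosine similarity between two sparse user-count vectors."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def affinity(checkins, coords, k=2):
    """Return {(venue, venue): similarity} for geographically close pairs.

    coords maps venue -> (x, y). Plain Euclidean distance is used here as
    a rough stand-in for real geographic distance (an assumption).
    """
    vecs = venue_vectors(checkins)
    venues = sorted(vecs)
    result = {}
    for v in venues:
        nearest = sorted(
            (w for w in venues if w != v),
            key=lambda w: math.dist(coords[v], coords[w]),
        )[:k]
        for w in nearest:
            s = cosine(vecs[v], vecs[w])
            result[(v, w)] = s
            result[(w, v)] = s  # keep the affinity symmetric
    return result
```

The resulting sparse, symmetric affinity could then be handed to any spectral clustering routine that accepts a precomputed affinity matrix; that step is left out here to keep the sketch dependency-free.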