Big Data and Big Cities
Transcript of Big Data and Big Cities
![Page 1: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/1.jpg)
Big Data and Big Cities: The Promises and Limitations of Improved Measures of City Life
Ed Glaeser, Scott Kominers, Mike Luca and Nikhil Naik
![Page 2: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/2.jpg)
Outline
• Part 1: Big Data and Urban Questions
  – Too much big think on big data
• Part 2: Measuring City Life when data is missing
  – Look forward to my lecture later
  – Measuring the impact of water in Zambia
• Part 3: Using Big Data to Improve City Services
  – Modest model on tournaments vs. consultants
  – Report on a hygiene tournament in Boston
![Page 3: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/3.jpg)
Big Data and Big Questions about Cities
• How does urban development impact the economy?
  – Shocks to people vs. shocks to place
• How does the physical city interact with social outcomes?
  – Measuring the physical city with big data
• How much do people value urban amenities?
  – Measuring amenities and better contingent valuation
• How can public policy improve the quality of urban space?
  – Merging government actions with physical measures
![Page 4: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/4.jpg)
Examples of Big Data
• Much finer geographic records (the IRS data)
• Similar data from private providers (CoreLogic)
• Novel data sets on traditional outcomes (Zoona)
• Novel data sets on relatively new things (Yelp)
• Completely different data on things we had barely thought about before (Google Street View)
![Page 5: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/5.jpg)
What’s It Good For
• Big data does not intrinsically solve any of the causal inference issues that we have long worried about.
• It does make it possible to measure more things (hygiene, streetscapes) in more places in more ways.
• IRS records provide the mother-of-all-panel sets, which is particularly useful for spatial interventions
  – The right way to judge empowerment zones, for example, would be to use the panel structure
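One way to read "use the panel structure" is to follow the same units before and after an intervention and compare their changes with those of a comparison group (a difference-in-differences). The sketch below is only an illustration under that assumption: the row layout, the function name `did_from_panel`, and the numbers are hypothetical and do not come from IRS data or from the talk.

```python
from statistics import mean

def did_from_panel(rows):
    """Difference-in-differences from a two-period panel.

    rows: iterable of (unit_id, treated, period, outcome), where period is
    "pre" or "post". Hypothetical layout, chosen for illustration only.
    """
    pre, post, treated = {}, {}, {}
    for unit, is_treated, period, outcome in rows:
        (pre if period == "pre" else post)[unit] = outcome
        treated[unit] = is_treated

    # The panel lets us difference within each unit, which removes any
    # fixed characteristics of that unit.
    def changes(flag):
        return [post[u] - pre[u] for u in treated
                if treated[u] == flag and u in pre and u in post]

    # Treated units' average change minus comparison units' average change.
    return mean(changes(True)) - mean(changes(False))

# Made-up outcomes for four units, two inside a zone and two outside.
example_rows = [
    (1, True, "pre", 10), (1, True, "post", 14),
    (2, True, "pre", 20), (2, True, "post", 22),
    (3, False, "pre", 10), (3, False, "post", 11),
    (4, False, "pre", 30), (4, False, "post", 31),
]
effect = did_from_panel(example_rows)
```

As the slide notes, the panel does not by itself solve the causal inference problem: this comparison is informative only if the comparison units are a credible counterfactual.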
![Page 6: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/6.jpg)
Led Astray By “Bigger” Data (.3)
![Page 7: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/7.jpg)
Measuring the Previously Unmeasurable
• We are used to having public sources for data on the most basic economic outcome: income
• This is typically not true in the developing world, especially sub-Saharan Africa.
• Especially false for a usable panel
• Example #1: Zoona data, water and health
• Example #2: Google Street View: essentially night lights on steroids
![Page 8: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/8.jpg)
Zoona in Zambia
![Page 9: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/9.jpg)
![Page 10: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/10.jpg)
![Page 11: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/11.jpg)
![Page 12: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/12.jpg)
![Page 13: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/13.jpg)
Measuring Streetscapes (with Nikhil Naik)
![Page 14: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/14.jpg)
Crowdsourcing City Government: Using Tournaments to Improve Inspection Accuracy
Edward L. Glaeser, Andrew Hillis
Scott Duke Kominers, Michael Luca
![Page 15: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/15.jpg)
Big Data: Consumer review websites
• Part of crowdsourcing movement.
![Page 16: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/16.jpg)
Yelp
• Luca 2011: high ratings increase revenue for independent restaurants
• Chevalier and Mayzlin 2006: Barnes & Noble, Amazon and online book orders
• Ghose et al. 2011: TripAdvisor and hotel reservations
![Page 17: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/17.jpg)
![Page 18: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/18.jpg)
Yelp Search
![Page 19: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/19.jpg)
Restaurant’s Yelp Page
![Page 20: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/20.jpg)
Some background
• Los Angeles in 1997…
  – Posting →
    • higher scores
    • lower rates of foodborne illness
    • Jin and Leslie (2003)
  – Major success story of disclosure
• NYC in 2010
• Yet, a lot has changed…
![Page 21: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/21.jpg)
The Rise of Tournaments
• Now, organizations can outsource large-scale prediction problems via open tournaments!
  – e.g., 210 prediction tournaments on Kaggle, with prizes ranging from $0 to $500,000.
![Page 22: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/22.jpg)
The Rise of Tournaments
• Now, organizations can outsource large-scale prediction problems via open tournaments!
  – Returns are not just cash: also recognition, job interviews, certification, satisfaction, and learning.
![Page 23: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/23.jpg)
An Economic Design Question
(When) can open tournaments help solve public problems?
![Page 24: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/24.jpg)
Theory
• Tournaments make sense when:
  – the probability of a breakthrough, $\varphi$, is high;
  – when the baseline low-skilled outcome, $q$, is not that bad;
  – and when the best outcome, $q_{\max}$, is particularly good.
• Wage inequality makes tournaments more appealing.
• Tournaments are unattractive for ensuring $\bar{q}$.
→ Tournaments may be becoming more attractive!
![Page 25: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/25.jpg)
Theory → Practice
→ Tournaments may be becoming more attractive!
→ We ran one!
![Page 26: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/26.jpg)
Conjecture
• Inspection and disclosure policies can be enhanced by working with social media:
  – Social media is a potential platform for disclosure
  – Optimal disclosure is a function of what people are saying on social media
• Design disclosure of hygiene violations through the Yelp platform
• Use Yelp review text to guide inspections.
  – Inspections are fairly random, but they don’t have to be!
![Page 27: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/27.jpg)
Why restaurant hygiene inspections?
• Data and technology have changed
  – Policy has remained the same
• Disclosure side
  – Market with very little information
  – Early success story of disclosure (Jin and Leslie 2003), so known potential impact
• Ideal setting for information design questions
  – What conditions cause posting to work?
  – What are the behavioral factors underlying customer response?
• Scope for improving policy
  – Dai and Luca 2016
![Page 28: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/28.jpg)
Hygiene Inspections
• Process and scoring vary (sometimes a lot) by city
• In SF:
  – Restaurants inspected roughly twice per year
  – Violations classified as major (lots of rats) and minor (a rat)
  – Final score between 0 and 100
• In Boston:
  – Restaurants inspected at least once per year
  – Violations classified as minor, major, and severe
  – Until now, no grades
• Goal:
  – Identify risks
  – Shut down worst offenders, enforce cleanup
![Page 29: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/29.jpg)
Essentially a prediction problem
• Which restaurant is most likely to have a violation?
• By targeting inspections, can be more efficient:
  – Identify more risks, or,
  – Reduce number of inspections
• E.g.: 1 random annual inspection for each restaurant, plus targeted
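The "one random annual inspection plus targeted" idea can be sketched as a simple scheduling rule. This is a minimal illustration, not the city's procedure: the function name `plan_inspections`, the `extra_budget` parameter, and the risk scores are all hypothetical.

```python
import random

def plan_inspections(predicted_risk, extra_budget, seed=0):
    """Return (baseline, targeted) inspection lists.

    predicted_risk: dict mapping restaurant id -> predicted violation score
    (hypothetical model output). Every restaurant gets one baseline visit in
    random order; the extra_budget additional visits go to the restaurants
    with the highest predicted risk.
    """
    rng = random.Random(seed)
    baseline = list(predicted_risk)
    rng.shuffle(baseline)  # random ordering of the guaranteed annual visit
    targeted = sorted(predicted_risk, key=predicted_risk.get, reverse=True)
    return baseline, targeted[:extra_budget]

# Made-up scores for four restaurants.
risk = {"a": 0.2, "b": 5.0, "c": 1.5, "d": 0.0}
baseline, targeted = plan_inspections(risk, extra_budget=2)
```

Keeping a random baseline visit for everyone means every restaurant is still inspected, so the targeted visits add to coverage rather than replace it.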
![Page 30: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/30.jpg)
Treatment: Inspection Results on Yelp
![Page 31: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/31.jpg)
Are hygiene scores predictable?
• Yelp reviewers provide lots of new information, but…
• Potential pitfalls:
  – Fake reviews
  – Selection
  – Hygiene may not factor into reviews
![Page 32: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/32.jpg)
Distribution of Hygiene Scores
![Page 33: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/33.jpg)
Hygiene Scores by Restaurant Price
![Page 34: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/34.jpg)
Yelp Ratings Predict Hygiene Scores
![Page 35: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/35.jpg)
Updating the Inspection Process
• Layering on use of text, can predict roughly 85% of restaurants into top/bottom half of scores (Kang, Kuznetsova, Luca, and Choi 2013)
• Related pilots
![Page 36: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/36.jpg)
Tournament:
• Cosponsored with Yelp
• Supported by City of Boston
• Combined Yelp data with Boston inspection results:
  – Objective to predict violations.
  – Weights chosen by city (minor=1, major=2, severe=5).
  – Evaluated using RMSLE
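To make the scoring concrete, here is a minimal sketch of a weighted violation score and the root mean squared logarithmic error (RMSLE). The weights are the ones on the slide and the RMSLE formula is the standard one. Whether the tournament applied RMSLE to the combined score or to each severity level separately is not stated here, so applying it to the combined score is an assumption, and the function names are illustrative.

```python
import math

# Severity weights as listed on the slide.
WEIGHTS = {"minor": 1, "major": 2, "severe": 5}

def weighted_violations(counts):
    """Collapse per-severity violation counts into one weighted score."""
    return sum(WEIGHTS[level] * n for level, n in counts.items())

def rmsle(predicted, actual):
    """Root mean squared logarithmic error between paired score lists.

    Standard definition: sqrt(mean((log(1 + p) - log(1 + a)) ** 2)).
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(math.log1p(p) - math.log1p(a)) ** 2
               for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up inspection: 3 minor, 1 major, 2 severe -> 3*1 + 1*2 + 2*5 = 15.
score = weighted_violations({"minor": 3, "major": 1, "severe": 2})
```

Because the error is taken on a log scale, missing by a few points on a restaurant with many violations costs less than missing by the same amount on a nearly clean one.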
![Page 37: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/37.jpg)
Tournament: Rewards

| Place | Prize Amount |
|-------|--------------|
| 1st   | $3,000       |
| 2nd   | $1,000       |
| 3rd   | $1,000       |

Prize money provided by Yelp
![Page 38: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/38.jpg)
![Page 39: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/39.jpg)
Competition Process
![Page 40: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/40.jpg)
Target: Inspection Violations
![Page 41: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/41.jpg)
Target: Inspection Violations
![Page 42: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/42.jpg)
Target: Inspection Violations
![Page 43: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/43.jpg)
Target: Inspection Violations
![Page 44: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/44.jpg)
Target: Inspection Violations
![Page 45: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/45.jpg)
Results
• >500 signups
• Development phase:
  – ~55 completed at least one entry
  – ~450 sets of predictions
• Evaluation phase:
  – 23 submitted final algorithms
  – During this time, Boston inspected 364 restaurants
![Page 46: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/46.jpg)
The Winner
![Page 47: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/47.jpg)
The Runner Up
![Page 48: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/48.jpg)
Gains for Boston: ~40%
To catch 3,604 weighted violations, inspect this many restaurants:
![Page 49: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/49.jpg)
Gains for Boston
If choosing the 364 restaurants with the highest predicted violations, expect to obtain total violations:
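Both "gains" comparisons come down to walking through restaurants in a chosen order and accumulating the violations actually found. The sketch below shows that bookkeeping under stated assumptions: the function names are illustrative and the numbers are made up for five restaurants. They are not the Boston results.

```python
def order_by_prediction(predicted, actual):
    """Actual weighted violations, ordered from highest to lowest prediction."""
    pairs = sorted(zip(predicted, actual), key=lambda pa: pa[0], reverse=True)
    return [a for _, a in pairs]

def inspections_needed(actual_in_order, target):
    """Inspections required, in the given order, for the running total of
    weighted violations to reach target; None if it is never reached."""
    total = 0
    for n, found in enumerate(actual_in_order, start=1):
        total += found
        if total >= target:
            return n
    return None

def top_k_total(predicted, actual, k):
    """Total weighted violations found when inspecting the k restaurants
    with the highest predictions."""
    return sum(order_by_prediction(predicted, actual)[:k])

# Made-up data: model predictions and what inspectors actually found.
predicted = [9, 1, 6, 2, 0]
actual = [10, 0, 4, 3, 1]

with_model = inspections_needed(order_by_prediction(predicted, actual), 14)
without_model = inspections_needed(actual, 14)  # listing order as a stand-in baseline
found_in_top_2 = top_k_total(predicted, actual, 2)
```

In this toy example the model reaches the target in 2 inspections instead of 3. The first slide's question fixes the target and counts inspections; the second fixes the number of inspections and counts violations.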
![Page 50: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/50.jpg)
Ongoing work
• Launching a trial
  – Starts this month
• Incorporating into day-to-day inspections
• Ongoing challenges:
  – Other city goals?
  – Gameability?
  – Transferability?
![Page 51: Big Data and Big Cities](https://reader033.fdocuments.net/reader033/viewer/2022042722/588811571a28ab083c8b4e67/html5/thumbnails/51.jpg)
Epilogue
• Results of the algorithm were given to inspectors to improve accuracy.
• Then we looked at how they did using their own best practices vs. the fancy algorithm vs. a really simple algorithm.
• The fancy algorithm does help, but the simple algorithm gets most of the way there.
• In some things, getting the basics right is far more important than too much fancy math.