Innovation in the Programmable Web: Characterizing the Mashup Ecosystem
– Shuli Yu –
School of Information Systems,
Singapore Management University
The Mashup Ecosystem
• A “mashup is a web application that combines data from more than one source into a single integrated tool” (Wikipedia, 2008)
[Diagram: APIs are integrated into a mashup; other labels: Developers, Individual consumers, Enterprises]
Research on mashups
• Mashups: Unit-level characteristics
– Comparing the technologies and architecture and examining how they can be improved (Jackson and Wang, 2007; Liu, Hui, Sun and Liang, 2007)
– Classification schemes
• Industry verticals (Wikipedia, 2008)
• Involvement in the application stack (Hinchcliffe, 2006)
• Specific stakeholders
– Usage of mashups in particular domains
• Cartography (Pietroniro and Ficheter, 2007), libraries in healthcare (Cho, 2007) and digital journals (Kulathuramaiyer, 2007)
– Copyright and policy implications of remixing content (O’Brien and Fitzgerald, 2006; Goodman and Moed, 2006)
Research Agenda
• Characterize the mashup ecosystem
– Describe how the network has evolved
• Growth
• Network metrics
– Determine what makes an API successful
[Diagram: 2-mode network linking APIs and mashups (m)]
Data source: ProgrammableWeb
Research Approach
• Network structure
– Relationship between APIs and mashups
• Attributes: Possible success factors
– Date created
– Category
– Rating
[Diagram: 2-mode network linking APIs and mashups (m)]
Growth
[Chart: Cumulative API and Mashup Growth. Series: APIs, Mashups. X-axis: Dec-05 to Dec-07 in 3-month steps. Y-axis: 0 to 3000.]
Network snapshots @ 1 month intervals: Sep 2005 to Dec 2007
Growth
[Chart: Growth rate of APIs and Mashups (number of new APIs or mashups per month). Series: APIs, Mashups. X-axis: Dec-05 to Dec-07 in 3-month steps. Y-axis: 0 to 200.]
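The monthly growth rate in the chart above can be derived from the cumulative counts by differencing consecutive snapshots. A minimal Python sketch, assuming made-up cumulative totals (not the ProgrammableWeb data) and a hypothetical helper name `monthly_new`:

```python
def monthly_new(cumulative):
    """Return the number of new items between consecutive monthly snapshots."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

# Hypothetical cumulative API counts at four consecutive monthly snapshots.
cumulative_apis = [10, 25, 45, 80]
new_apis_per_month = monthly_new(cumulative_apis)
print(new_apis_per_month)  # [15, 20, 35]
```

The same helper would apply unchanged to the cumulative mashup counts.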
2-mode matrix of APIs and mashups
[Matrix: APIs by mashups (m)]
Network snapshots @ 6 month intervals: Dec 2005 to Dec 2007
Visualizing the 2-mode Mashup and API Network: Layout by node-repulsion with equal edge length bias
Note: Square nodes denote APIs and circle nodes denote mashups
[Network diagrams: Dec 2005, Dec 2006, Dec 2007]
API Tier 1:
• Google Maps
API Tier 2:
• All popular APIs here
• Social/community, Search
API Tier 3:
• Less popular?
• News feeds, online retail, music
[Network diagram, Dec 2007: Selected APIs and their corresponding mashups]
Affiliation matrix of APIs
[Diagram: 2-Mode Network of APIs and mashups (m) alongside the derived API Affiliation Network]
[Matrix: APIs by APIs]
Network snapshots @ 3 month intervals: Dec 2005 to Dec 2007
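The step from the 2-mode network to the API affiliation matrix can be sketched as a 1-mode projection: two APIs are linked by the number of mashups that use both. This is a minimal Python illustration under assumptions. The mashup ids `m1` to `m4` and their API lists are invented toy data, and the function name is hypothetical. It is not the tooling used for the slides.

```python
# Toy 2-mode data (hypothetical): each mashup lists the APIs it uses.
mashups = {
    "m1": {"Google Maps", "Flickr"},
    "m2": {"Google Maps", "Flickr", "GeoNames"},
    "m3": {"Google Maps"},
    "m4": {"Yahoo Maps", "GeoNames"},
}

def affiliation_matrix(mashups):
    """Return {api: {other_api: number of mashups using both}}."""
    apis = sorted({api for used in mashups.values() for api in used})
    matrix = {a: {b: 0 for b in apis} for a in apis}
    for used in mashups.values():
        for a in used:
            for b in used:
                # The diagonal ends up holding the number of mashups per API.
                matrix[a][b] += 1
    return matrix

aff = affiliation_matrix(mashups)
print(aff["Google Maps"]["Flickr"])  # 2 shared mashups (m1 and m2)
```

In matrix terms this is the product of the API-by-mashup incidence matrix with its transpose.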
Visualizing the API Affiliation Network: Layout by principal component analysis
Note: Node sizes are proportional to their degree (number of links to other APIs)
[Network diagrams: Dec 2005, June 2006, Dec 2006, June 2007, Dec 2007]
API Affiliation Network Metrics
• Degree
– Network connectivity over time
• Small Worlds
– Clustering coefficient
– Path length
• Scale Free
– Degree frequency distribution
Degree
• Degree: Number of other APIs that are connected to a particular API via one or more mashups → Steady increase before a plateau
• Normalized degree: Mean degree divided by the maximum possible degree, expressed as a percentage → Constant throughout
[Chart: Degree over time. Series: Freeman Degree, Normalized Degree. X-axis: Dec-05 to Dec-07 in 3-month steps. Y-axis: 0 to 10.]
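The two degree measures defined above can be computed as follows. This is a minimal Python sketch on an invented four-API affiliation table. The API labels A to D and the function names are hypothetical, and the maximum possible degree is assumed to be n - 1 for a network of n APIs.

```python
# Hypothetical affiliation counts: aff[a][b] = number of mashups using both a and b.
aff = {
    "A": {"B": 2, "C": 1, "D": 0},
    "B": {"A": 2, "C": 1, "D": 0},
    "C": {"A": 1, "B": 1, "D": 1},
    "D": {"A": 0, "B": 0, "C": 1},
}

def degree(aff):
    """Degree of each API: how many other APIs share at least one mashup with it."""
    return {a: sum(1 for b, shared in row.items() if b != a and shared > 0)
            for a, row in aff.items()}

def normalized_mean_degree(aff):
    """Mean degree as a percentage of the maximum possible degree (n - 1)."""
    n = len(aff)
    deg = degree(aff)
    mean_degree = sum(deg.values()) / n
    return 100.0 * mean_degree / (n - 1)

print(degree(aff))  # {'A': 2, 'B': 2, 'C': 3, 'D': 1}
```

Here the mean degree is 2 out of a possible 3, so the normalized value is about 66.7 percent.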
Small Worlds
• Clustering Coefficient (CC)
– Extent to which nodes in a graph tend to form a unified group with many internal connections but few connections leading out of the group
• Characteristic Path Length (CPL)
– Average distance required to pass from node to node (Ravid and Rafaeli, 2004)
• Small World networks have
– High degree of clustering
– Short path lengths
[Figure: Regular, Small World and Random networks]
Source: Complex Science for a Complex World, Figure 5.6. http://epress.anu.edu.au/cs/mobile_devices/ch05s03.html
Small Worlds
• To classify a network as Small World, compare the CC and CPL with a random network of similar density:
– High degree of clustering: CCsw >> CCrandom → CCsw/CCrandom > 1
– Short path lengths: CPLsw ≈ CPLrandom → CPLsw/CPLrandom ≈ 1
Date    CCsw   CCrandom  CCsw/CCrandom  CPLsw  CPLrandom  CPLsw/CPLrandom
Dec-07  0.414  0.0144    28.7766        2.282  2.9832     0.7649
Sep-07  0.428  0.0172    24.8784        2.240  2.8557     0.7844
Jun-07  0.448  0.0199    22.4619        2.237  2.7680     0.8082
Mar-07  0.458  0.0229    19.9782        2.223  2.7038     0.8222
Dec-06  0.418  0.0176    23.7756        2.243  3.2205     0.6965
Sep-06  0.395  0.0189    20.9526        2.206  3.3870     0.6513
Jun-06  0.399  0.0230    17.3393        2.228  3.3729     0.6606
Mar-06  0.500  0.0174    28.7805        2.284  4.7222     0.4837
Dec-05  0.320  0.0125    25.5517        2.355  31.9221    0.0738
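The two quantities compared above can be computed on any undirected network. The sketch below is a minimal pure-Python illustration on an invented four-node graph. It assumes CC is the average local clustering coefficient, with nodes that have fewer than two neighbours contributing 0, and that CPL is the mean shortest-path length over connected pairs. The exact definitions used by the software behind the slides may differ.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected network: a triangle A-B-C with D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

def clustering_coefficient(adj):
    """Average over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # assumed convention: such nodes contribute 0
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

cc = clustering_coefficient(adj)       # 7/12, about 0.583
cpl = characteristic_path_length(adj)  # 4/3, about 1.333
```

The small-world check then divides each value by the corresponding value for a random network of similar density, as in the table.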
Small Worlds
• To classify a network as Small World, compare the CC and CPL with a random network of similar density:
– High degree of clustering: CCsw >> CCrandom → CCsw/CCrandom > 1
– Short path lengths: CPLsw ≈ CPLrandom → CPLsw/CPLrandom ≤ 1
[Chart: CC/CCrandom over time. X-axis: Dec-05 to Dec-07 in 3-month steps. Y-axis: 0 to 35.]
[Chart: CPL/CPLrandom over time. X-axis: Dec-05 to Dec-07 in 3-month steps. Y-axis: 0 to 1.]
Degree distribution: Scale Free
• Scale free networks have power law degree distributions:
– Frequency = b0 × Degree^(-b1)
– A few highly connected hub nodes, compared to a large number of less connected nodes
– Network structure and dynamics are independent of network size
[Chart: Degree distribution of API-Affiliation network (Dec 2007), log-log axes. X-axis: Degree (log), 1 to 1000. Y-axis: Frequency (log), 0.1 to 100. Fitted line: y = 32.165x^(-0.7602), R² = 0.7245.]
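A power law of the form Frequency = b0 × Degree^(-b1) becomes a straight line on log-log axes, so one common way to estimate b0 and b1 is ordinary least squares on the logged values. The Python sketch below illustrates that approach on synthetic data. It is an assumption about how such a fit could be obtained, not a statement of the method behind the chart, and the function name is hypothetical.

```python
import math

def fit_power_law(degrees, frequencies):
    """Least-squares fit of log10(freq) = log10(b0) - b1 * log10(degree).

    Returns (b0, b1, r_squared) for the model freq = b0 * degree ** (-b1).
    """
    xs = [math.log10(d) for d in degrees]
    ys = [math.log10(f) for f in frequencies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 10 ** intercept, -slope, 1.0 - ss_res / ss_tot

# Synthetic data that follows an exact power law (not the ProgrammableWeb data).
degrees = [1, 2, 4, 8, 16]
frequencies = [32 * d ** -0.75 for d in degrees]
b0, b1, r2 = fit_power_law(degrees, frequencies)  # close to 32, 0.75 and 1
```

On real degree counts, zero frequencies have to be dropped before taking logarithms.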
Degree distribution: 2-mode
• The frequency distribution of the 2-mode API-mashup network can also be analyzed
– Frequency = b0 × Degree^(-b1)
– Degree: Number of mashups created from APIs
– Frequency: Number of APIs with a particular degree
• In this case, would the 2-mode API-mashup distribution fit a Power Law or Long Tail distribution?
Degree distribution: 2-mode
• Possible types of distributions
– Power law
• Small occurrences are common and large instances are rare → a large number of APIs with only a few mashups, compared to a small number of APIs with many mashups
• Similar to markets that are dominated by a few popular products, e.g. a brick-and-mortar bookstore that sells large quantities of bestselling novels
– Long Tail (Anderson, 2004)
• Large number of low-frequency occurrences that, when aggregated, outweigh the initial portion of high-frequency occurrences
• Common in online retail: product selection is not limited by physical storage restrictions, logistics and holding costs, and consumers can easily find specific products by searching online or acting on recommendations (Brynjolfsson, Hu and Simester, 2007) → Overall high volume of sales from niche products
• The mashup ecosystem is entirely virtual and has the above characteristics
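One rough way to probe the long-tail question is to compare the volume contributed by the most popular APIs (the head) with everything else (the tail). The Python sketch below uses invented mashup counts. The helper name `tail_share` and the head cut-off are illustrative assumptions, not a test taken from the slides.

```python
def tail_share(mashup_counts, head_size):
    """Fraction of all API-mashup links contributed by APIs outside the top `head_size`."""
    ranked = sorted(mashup_counts, reverse=True)
    total = sum(ranked)
    return sum(ranked[head_size:]) / total

# Hypothetical mashup counts per API; the single largest API is treated as the head.
counts = [100, 20, 10, 10, 5, 5]
share = tail_share(counts, head_size=1)  # tail holds 50 of 150 links
```

In this toy case the tail holds one third of the links, so it does not outweigh the head. A share above one half would point toward a long tail under this criterion.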
Degree distribution: 2-mode
• The frequency distribution of the 2-mode API-mashup network can also be analyzed
– Frequency = b0 × Degree^(-b1)
– Degree: Number of mashups created from APIs
– Frequency: Number of APIs with a particular degree
[Chart: Degree distribution of APIs in 2-mode network, Dec 2007. X-axis: Number of Mashups, 0 to 1400. Y-axis: API Frequency, 1 to 101.]
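The degree and frequency definitions above amount to a simple tally: count the mashups per API, then count how many APIs share each mashup count. A minimal Python sketch with invented API names and counts:

```python
from collections import Counter

# Hypothetical input: number of mashups built on each API.
mashups_per_api = {
    "api_a": 40, "api_b": 5, "api_c": 2, "api_d": 2,
    "api_e": 1, "api_f": 1, "api_g": 1,
}

def degree_frequency(mashups_per_api):
    """Map each degree (mashup count) to the number of APIs with that degree."""
    return dict(sorted(Counter(mashups_per_api.values()).items()))

freq = degree_frequency(mashups_per_api)
print(freq)  # {1: 3, 2: 2, 5: 1, 40: 1}
```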
Degree distribution: 2-mode
• Degree distribution (logged on both scales)
– Note: Fitting a line could result in a slope that is too shallow (Adamic, 2000)
[Chart: Degree distribution of APIs in 2-mode network, Dec 2007, log-log axes. X-axis: Number of Mashups, 1 to 10000. Y-axis: API Frequency, 1 to 100.]
Degree distribution: 2-mode
• Cumulative frequency distribution (logged on both scales)
– Fit a line to this instead → Likely a power law distribution: the tail is not long enough
[Chart: Cumulative frequency distribution of APIs at Dec 2007, log-log axes. X-axis: x (Number of mashups), 1 to 10000. Y-axis: Cumulative frequency of APIs with =< x mashups, 0.1 to 1000. Fitted line: y = 401.05x^(-0.8618), R² = 0.9865.]
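A cumulative distribution smooths the sparse high-degree end before a line is fitted. The Python sketch below assumes the cumulative count is the number of APIs with at least x mashups, as in Adamic's ranking tutorial. A fit that decreases in x, such as x^(-0.8618), suggests this reading, but it remains an assumption. The data and function name are invented.

```python
from collections import Counter

def cumulative_frequency(mashup_counts):
    """Map each observed x to the number of APIs with at least x mashups."""
    freq = Counter(mashup_counts)
    result = {}
    running = 0
    # Walk from the largest degree down, accumulating API counts.
    for x in sorted(freq, reverse=True):
        running += freq[x]
        result[x] = running
    return dict(sorted(result.items()))

# Hypothetical mashup counts, one entry per API.
counts = [1, 1, 1, 2, 2, 5, 40]
cumulative = cumulative_frequency(counts)
print(cumulative)  # {1: 7, 2: 4, 5: 2, 40: 1}
```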
Degree distribution: 2-mode
• Frequency = b0 × Degree^(-b1): Exponent decreases over time
– Fewer APIs with a lot of mashups (high degree) but more APIs with fewer mashups (low degree) in December 2007 compared to December 2005
– Connections becoming more mesh-like (more evenly distributed) and less hub-and-spoke-like (unevenly distributed) → Tail getting longer
[Chart: Degree distribution of APIs in 2-mode network. Series: Dec-07, Dec-05, with logarithmic trendlines Log. (Dec-07) and Log. (Dec-05). X-axis: Degree (log), 1 to 10000. Y-axis: Frequency, -20 to 100.]
Degree distribution: 2-mode
• Why does the distribution change? Could be due to the effect of several forces:
– Number of APIs with fewer mashups (low degree) could be increasing at a rapid rate
• Easy to join network; competitors actively promote APIs
– Number of APIs with more mashups (high degree) could be increasing at a slow rate
• Total number of APIs that have many mashups is constrained
– Both: Combination of the above two forces
Factors predicting API success
• Measure of API success: Many mashups
• Possible factors
– Time: First mover advantage
– Category: Certain categories have an advantage
– Market concentration on entry: Monopolized vs dispersed
– Rating: Higher rating indicates that API has something special
• Other factors
– Technology compatibility (data format, protocols, authentication); licensing structure and fees
Ranking of Top 20 APIs
Possible time and category advantages?
Top 20 APIs: Number of mashups
Market Concentration
• Herfindahl Index
– 0: Not concentrated, market share evenly distributed
– 1: Highly concentrated and monopolized market
[Chart: Overall Herfindahl Index over time. X-axis: Dec-05 to Dec-07 in 3-month steps. Y-axis: 0.2 to 0.38.]
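The Herfindahl Index is the sum of squared market shares. The Python sketch below assumes that an API's market share is its mashup count divided by the total across all APIs in the market. The counts are invented and the function name is hypothetical.

```python
def herfindahl(mashup_counts):
    """Sum of squared market shares; 1.0 means a single API holds the whole market."""
    total = sum(mashup_counts)
    return sum((count / total) ** 2 for count in mashup_counts)

# Hypothetical mashup counts for the APIs in one market.
print(herfindahl([10]))          # 1.0: one API holds everything
print(herfindahl([1, 1, 1, 1]))  # 0.25: four APIs with equal shares
concentrated = herfindahl([50, 30, 20])  # 0.25 + 0.09 + 0.04, about 0.38
```

With n equally sized APIs the index equals 1/n, so it approaches 0 only as the number of APIs grows.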
Market Concentration
• Category effects: Herfindahl Index
– Top 8 categories with the most APIs at Dec 2007 (>20 APIs)
[Chart: Herfindahl Index by Category. Series: Messaging, Photos, Mapping, Music, Shopping, Reference, Internet, Search. X-axis: Dec-05 to Dec-07 in 3-month steps. Y-axis: 0 to 1.]
Factors predicting API success
• Category effects: Herfindahl Index
– Shopping, Reference and Music
• Start off with one API monopolizing the market → it loses market share to newer entrants, resulting in an eventually less concentrated market space
– Internet and Messaging
• Begin with a few APIs sharing the market space → one of the existing firms increases concentration → that firm loses share, so the market becomes less concentrated
– Photos, Mapping and Search
• Consistent market share structure from Dec 05 to Dec 07, but at different levels
• Photos: Highly monopolized by Flickr, with few other dominant APIs
• Mapping: Google Maps had the largest share, but other APIs like MS Virtual Earth, Yahoo Maps and GeoNames also had significant numbers of mashups
• Search: Highly dispersed across the various text, image and other search APIs from Google and Yahoo
Regression Model: Time Series
Mashups_t = α
+ β0 Mashups_{t-1} + β1 OverallHerfindahl_{t-1} + β2 CategoryHerfindahl_{t-1} + β3 Rating
+ β4 Mapping + β5 Shopping + β6 Search + β7 Internet + β8 Music + β9 Reference + β10 Photos + β11 Messaging
Category terms: top 8 categories with the most APIs (> 20 APIs)
Model summary:
R     R Square  Adjusted R Square  Std. Error of the Estimate
.996  .991      .991               4.837
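The lagged term in the model above pairs each API's mashup count at time t with its count at t-1. The Python sketch below fits only the restricted model Mashups_t = α + β0 Mashups_{t-1} by ordinary least squares. It omits the Herfindahl, rating and category terms, uses invented series, and the function names are hypothetical, so it illustrates the lag structure rather than reproducing the reported regression.

```python
def lagged_pairs(series):
    """Return (previous, current) pairs from one API's monthly mashup counts."""
    return list(zip(series, series[1:]))

def fit_lag1(series_by_api):
    """OLS fit of current = alpha + beta0 * previous, pooled over all APIs."""
    pairs = [pair for series in series_by_api.values() for pair in lagged_pairs(series)]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    beta0 = sxy / sxx
    alpha = mean_y - beta0 * mean_x
    return alpha, beta0

# Hypothetical cumulative mashup counts per API at consecutive snapshots.
series_by_api = {
    "api_a": [1, 3, 7, 15],
    "api_b": [2, 5, 11],
}
alpha, beta0 = fit_lag1(series_by_api)  # every pair lies on current = 1 + 2 * previous
```

Extending this to the full model would require a multiple-regression solver and one column per additional term.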
Regression Model: Time Series
Coefficients (Model 1; dependent variable: CurrentMashups)
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
Variable            B        Std. Error  Beta  t        Sig.
(Constant)          -10.130  1.444             -7.014   .000
PreviousMashups     1.166    .002        .995  523.283  .000
Previous*HerfTotal  31.034   4.353       .014  7.130    .000
Previous*HerfInCat  .254     .375        .001  .676     .499
Rating              .150     .085        .003  1.753    .080
CatSearch           .608     .470        .003  1.293    .196
CatMapping          .907     .338        .005  2.686    .007
CatShopping         .679     .474        .003  1.432    .152
CatInternet         .027     .445        .000  .062     .951
CatMusic            .041     .505        .000  .082     .935
CatReference        -.005    .463        .000  -.011    .991
CatPhotos           .995     .555        .003  1.793    .073
CatMessaging        .262     .560        .001  .467     .640
Discussion of findings
• Steady growth of mashups and APIs
→ Not booming
• Structural changes:
– Few APIs with many mashups; many APIs with few or no mashups (long tail); over time, fewer APIs with a lot of mashups but more APIs with fewer mashups
– Overall, the market is less concentrated, but the exact pattern of concentration depends on the specific category
→ Difficult to become an established player, especially late in the game
Discussion of findings
• Connections between different APIs reaching a plateau → The same popular APIs are connected with each other
→ Suggests compatibility limitations between APIs (functional, technology, licensing constraints)
• First mover advantage → Release APIs early on
• Importance of category and function → Certain categories might be better
Conclusion
• Mashup ecosystem still in its infancy
– Patterns exist → but difficult to predict and generalize
• Future Research
– Case studies of individual categories or APIs
– Comparison between certain groups
Acknowledgements
Thanks to
• Jason Woodard – for your guidance and support throughout this project, you made the process really enjoyable and I’ve learnt so much from you!
• John Musser – for making the project possible by generously allowing us to access data from www.programmableweb.com
• Darshan Santani – for helping immensely with data extraction
References
• Adamic, Lada A. 2000. Zipf, Power-law, Pareto - A ranking tutorial. Information Dynamics Lab, HP Labs. http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html#ap1
• Anderson, Chris. 2004. The Long Tail. Wired, http://www.wired.com/wired/archive/12.10/tail_pr.html
• Cho, Allan. 2007. An introduction to mashups for health librarians. Journal of the Canadian Health Libraries Association / Journal de l'Association des bibliothèques de la santé du Canada 28:19-22.
• Goodman, Elizabeth, and Andrea Moed. 2006. Community in Mashups: The Case of Personal Geodata. http://mashworks.net/images/5/59/Goodman_Moed_2006.pdf
• Hinchcliffe, Dion. 2006a. Is IBM making enterprise mashups respectable? Enterprise Web 2.0. http://blogs.zdnet.com/Hinchcliffe/?p=49 (Accessed February 19, 2008).
• Jackson, Collin, and Helen J. Wang. 2007. Subspace: secure cross-domain communication for web mashups. In Proceedings of the 16th international conference on World Wide Web, 611-620, Banff, Alberta, Canada: ACM
• Kilkki, Kalevi. 2007. A practical model for analyzing long tails. First Monday 12, no. 5. http://firstmonday.org/issues/issue12_5/kilkki/index.html
• Kulathuramaiyer, Narayanan. 2007. Mashups: Emerging Application Development Paradigm for a Digital Journal. Journal of Universal Computer Science 13, no. 4:531-542.
• Liu, Xuanzhe, Yi Hui, Wei Sun, and Haiqi Liang. 2007. Towards Service Composition Based on Mashup. 332-339
• Mashup (web application hybrid) - Wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)
• O'Brien, Damien S, and Brian F Fitzgerald. 2006. Mashups, remixes and copyright law. http://eprints.qut.edu.au/archive/00004239/
• Ravid, Gilad, and Sheizaf Rafaeli. 2004. Asynchronous discussion groups as Small World and Scale Free Networks. http://www.firstmonday.dk/issues/issue9_9/ravid/index.html