Dbpedia Statistics
8/10/2019 Dbpedia Statistics
DBpedia Usage Report
usage data as of 2014-11-15
covering DBpedia 3.3 (2009) to 3.9 (2013)
a periodic report on the DBpedia SPARQL endpoint
and associated Linked Data deployment
Some of the statistics in this document were previously published as part of:
DBpedia - A Large-scale, Multilingual Knowledge Base Extracted from Wikipedia
by Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick van Kleef, Sören Auer, Christian Bizer
published 2015-01-0!
http://dbpedia.org/sparql
http://www.semantic-web-journal.net/content/dbpedia-large-scale-multilingual-knowledge-base-extracted-wikipedia-0
Infrastructure
Virtuoso "Anytime Query" Functionality
HTTP Statistics
HTTP Logs
Number of Hits
Number of Visits
General Trends
Hits per Endpoint
Hits per Statement Type
Hits per Keyword
DISTINCT
FILTER
Functions like CONCAT, CONTAINS, ISIRI
Use of GEO objects
GROUP BY
LIMIT / OFFSET
OPTIONAL
ORDER BY
UNION
Query Clause Patterns
Additional Topics of Interest
Memory Fragmentation
Virtuoso (Mis-)Configuration
Infrastructure
DBpedia comprises:
• Virtuoso Universal Server Instance(s) -- handling SPARQL endpoint and Linked Data Deployment, supporting negotiable RDF and other document formats
• A Physical Computer -- hosted in OpenLink Software's data center
As the size of the DBpedia dataset has increased, and its use by the Linked Data community has grown, Virtuoso software and computing hardware have been migrated to increasingly more powerful (virtual) machines, as outlined in the table below:
DBpedia Version  Virtuoso Version                 Processors                   Allocated Processor Cores  RAM
3.3 - 3.4        6.x Row Store, 4-node Cluster    AMD Opteron 8220, 2.8 GHz    4                          32 GB
3.5 - 3.7        6.x Row Store, 4-node Cluster    Intel Xeon E5520, 2.27 GHz   8                          48 GB
3.8              6.x Row Store, 4-node Cluster    Intel Xeon E5-2630, 2.3 GHz  8                          64 GB
3.9              7.x Column Store, Single-Server  Intel Xeon E5-2630, 2.3 GHz  8                          64 GB
Prior to DBpedia 3.9, we used the Virtuoso 6.x Row Store Engine, in a four-node Shared-Nothing Cluster configuration. As of DBpedia 3.9, we moved to the newer Virtuoso 7.x Column Store Engine, operating in Single-Server (i.e., one node, no clustering) mode.
The Virtuoso 6.x Row Store Engine Cluster provides parallelization of query execution, even when the cluster nodes are on the same machine, and horizontal scale-out. A Virtuoso Cluster (Row Store or Column Store, v6 or v7) can be grown to satisfy desired response times for given RDF dataset collections.
The Virtuoso 7.x Column Store Engine provides similar parallelization to the 6.x Cluster setup, but its vectored execution model does so with a Single-Server setup. In addition, it leverages column-wise storage and key compression for highly compact working sets.
DBpedia's Virtuoso configuration (following some revisions discussed later in this document) now includes:
• Query Cost Estimation Timeout of 120 seconds. This is the query plan optimization threshold that comes into play during the early stages of solution construction.
• Query Execution Timeout of 120 seconds. This is the query solution preparation threshold. If the timeout stops execution before the solution is complete -- i.e., if the solution is partial -- this is signified to the query client via HTTP response headers.
• Maximum SPARQL query solution (a.k.a. result set) size of 10,000 rows. This is the maximum number of solution rows (SELECT queries) or statements (CONSTRUCT or DESCRIBE queries) returned per query solution retrieval round-trip.
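A client can react to a partial solution by inspecting the response headers. The following is a minimal sketch, not the report's own tooling: the report only says "HTTP response headers", so the header names `X-SQL-State` and `X-SPARQL-MaxRows`, the state value `S1TAT`, and the function name `looks_partial` are all assumptions that should be checked against a live response from the endpoint.

```python
from typing import Mapping

# Assumed header names and state code; the report does not name them.
SQL_STATE_HEADER = "x-sql-state"
MAX_ROWS_HEADER = "x-sparql-maxrows"
TIMEOUT_STATE = "S1TAT"


def looks_partial(headers: Mapping[str, str]) -> bool:
    """Return True if the headers suggest a timed-out or truncated solution."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    timed_out = lowered.get(SQL_STATE_HEADER, "").strip().upper() == TIMEOUT_STATE
    truncated = MAX_ROWS_HEADER in lowered
    return timed_out or truncated
```

A caller that sees `True` could narrow the query or page through the solution with LIMIT and OFFSET rather than treating the response as complete.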
Virtuoso "Anytime Query" Functionality
The "Anytime Query" is a core feature of Virtuoso that enables it to handle the challenges inherent in providing a publicly accessible interface for ad-hoc querying, at Web scale. This feature allows any SPARQL- and HTTP-protocol-savvy user agent (a/k/a client) to issue long-running and/or large-solution queries, of which the complete solutions would exceed configured query timeout and/or result set limits, and to receive partial solutions conforming to those thresholds, while also enabling the use of LIMIT and OFFSET to create windows (a/k/a cursors) that slide through the set of data that constitutes the query's complete solution. Note: Even while paging through a partial query solution, Virtuoso continues to work towards a complete solution in the background.
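The sliding-window idea can be sketched on the client side as a generator of LIMIT/OFFSET variants of one base query. This is an illustrative sketch only; `windowed_queries` and the example query are hypothetical, and the page size of 10,000 simply mirrors the result set limit mentioned above.

```python
from typing import Iterator


def windowed_queries(base_query: str, page_size: int, pages: int) -> Iterator[str]:
    """Yield copies of base_query with LIMIT/OFFSET appended, one per window."""
    if page_size <= 0 or pages < 0:
        raise ValueError("page_size must be positive and pages non-negative")
    for page in range(pages):
        offset = page * page_size
        yield f"{base_query.rstrip()}\nLIMIT {page_size}\nOFFSET {offset}"


if __name__ == "__main__":
    # ORDER BY keeps the row order stable from one window to the next.
    base = "SELECT ?s WHERE { ?s a ?type } ORDER BY ?s"
    for query in windowed_queries(base, page_size=10000, pages=3):
        print(query, end="\n\n")
```

Without an ORDER BY clause SPARQL does not guarantee a row order, so consecutive windows may overlap or skip rows.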
HTTP Statistics
HTTP Logs
The HTTP server log files used in this report exclude traffic generated by:
• IP addresses that were temporarily rate limited after their burst period
• IP addresses that were banned after misuse
• Applications, Spiders, and other crawlers that were blocked after frequently hitting the rate limiter or generally claimed too many resources
The system uses a combination of firewall rules and ACLs (Access Control Lists) to quickly drop such connections, so legitimate users of dbpedia.org can connect and perform their lookup. To save time, these dropped connections are not recorded in the log files.
The data was extracted from reports generated by Webalizer v2.2x (http://webalizer.linux-mirror.org/).
Number of Hits
In the table below, the Duration (Days) column represents the number of days for which logs were analyzed, which may not have been all days that DBpedia version was live. A "hit" is any request from an HTTP client.
DBpedia Version  Duration (Days)  Total Hits Logged for Period  Average Hits per Day  Median Hits per Day  Standard Deviation  Maximum Hits on a Single Day
3.3              129              94,661,592                    733,811               711,306              188,991             1,319,539
3.4              153              185,519,930                   1,212,549             1,165,893            351,226             2,371,657
3.5              252              282,898,279                   1,122,612             1,035,444            386,454             2,908,720
3.6              165              219,178,587                   1,328,355             1,286,750            256,945             2,495,031
3.7              285              594,338,675                   2,085,399             1,930,728            1,057,398           8,873,473
3.8              196              570,440,410                   2,910,410             2,717,775            1,085,640           7,678,490
3.9              350              1,062,399,840                 3,035,428             2,865,828            1,523,222           10,894,488
The increasing popularity of DBpedia is clearly visible in this graph.
Number of Visits
In the table below, the Duration (Days) column represents the number of days for which logs were analyzed, which may not have been all days that DBpedia version was live. A "hit" is any request from an HTTP client.
DBpedia Version  Duration (Days)  Total Visits Logged for Period  Average Visits per Day  Median Visits per Day  Standard Deviation  Maximum Visits on a Single Day
3.3              129              1,257,751                       9,750                   9,775                  2,036               13,126
3.4              153              1,733,296                       11,329                  11,473                 1,546               14,198
3.5              252              4,181,263                       16,592                  16,902                 2,969               23,129
3.6              165              3,212,697                       19,471                  17,664                 5,691               56,064
3.7              285              6,832,027                       23,972                  22,262                 10,741              127,869
3.8              196              3,302,887                       16,851                  16,711                 2,960               27,416
3.9              350              7,709,151                       22,026                  18,197                 9,071               52,227
Again, a graph of this data clearly shows DBpedia's increasing popularity.
The sudden drop in visits-per-day between the 3.7 and the 3.8 datasets is explained by the combination of a few factors:
• Some applications started to use their own private DBpedia endpoint
• Other applications that had been abusing the DBpedia endpoint were blocked
• Language-specific DBpedia endpoints emerged and took on some of the burden
The average hits per day were unaffected by the decrease in visits per day.
General Trends
Hits per Endpoint
The DBpedia server does not provide only a SPARQL endpoint, but also serves as a Linked Data Hub, returning resources in a number of different formats.
For each dataset we selected a number of days' worth of log files at random and processed those in order to show the various endpoints called.
Endpoint                3.3         3.4         3.5         3.6         3.7         3.8         3.9
/class              138,652     220,760     136,831     176,658     308,895     248,948     332,269
/data             1,354,774   2,681,181   2,229,723   2,546,992   3,713,946   4,246,415   6,744,332
/fct                  2,023      10,826      11,741      18,146       4,033     155,972   1,042,182
/ontology            80,254     168,370     141,923     178,250     115,740      96,670      62,921
/page             2,634,040   4,702,418   1,809,556   1,687,188   3,787,481   7,377,123   5,063,329
/property           230,950     311,492     137,293     175,670     175,811     128,434      70,887
/resource         2,975,842   4,080,218   2,554,086   2,309,011   4,436,109   7,143,015   3,978,892
/sparql           2,090,387   4,177,231   2,265,576   9,112,052  15,902,113  15,475,379  14,287,304
other               111,683     188,033     146,790      82,106     265,590     270,324     848,087
total             9,618,605  16,541,529   9,422,519  16,286,073  28,709,718  35,142,280  32,331,203
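The per-endpoint tally described above can be sketched as follows. This is an illustration under stated assumptions, not the script used for the report: it assumes the logs are in Common Log Format, takes the path prefixes from the table, and uses hypothetical helper names `endpoint_of` and `count_endpoints`.

```python
import re
from collections import Counter
from typing import Iterable

# Path prefixes taken from the table; anything else is counted as "other".
ENDPOINTS = ("/class", "/data", "/fct", "/ontology", "/page",
             "/property", "/resource", "/sparql")

# Assumes Common Log Format, e.g. ... "GET /path HTTP/1.1" 200 123
REQUEST_RE = re.compile(r'"(?:[A-Z]+) (\S+) HTTP/[\d.]+"')


def endpoint_of(path: str) -> str:
    """Map a request path to one of the endpoint buckets."""
    for prefix in ENDPOINTS:
        if path == prefix or path.startswith((prefix + "/", prefix + "?")):
            return prefix
    return "other"


def count_endpoints(lines: Iterable[str]) -> Counter:
    """Count hits per endpoint over an iterable of log lines."""
    counts: Counter = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:  # lines without a parsable request are skipped
            counts[endpoint_of(match.group(1))] += 1
    return counts
```

Matching on `prefix + "/"` or `prefix + "?"` avoids counting an unrelated path such as `/pages` under `/page`.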
Hits per Statement Type
Next we focused on the calls to the /sparql endpoint and counted the number of statements per type. As the log files only record the full SPARQL query on a GET request, all the PUT requests are counted as unknown.
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                  55,970     269,481     360,010     326,029     158,761     677,219   1,747,369
CONSTRUCT            55,042      31,057      13,998      11,436      22,304      37,751     250,965
DESCRIBE             10,818       7,612       4,499       6,522      62,007     111,292     474,614
SELECT            1,890,769   3,662,725   1,657,659   8,030,204  11,204,239  13,515,570  11,255,726
unknown              77,788     206,356     229,410     737,861   4,454,802   1,133,547     558,630
total             2,090,387   4,177,231   2,265,576   9,112,052  15,902,113  15,475,379  14,387,304
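A classification of this kind can be sketched as below. It is only an approximation of what the report describes: it assumes the query travels in a `query` URL parameter on GET requests, treats every non-GET request as unknown, and skips a prologue of comments, Virtuoso `DEFINE` pragmas, `BASE` and `PREFIX` declarations with a simplified pattern. The function name `statement_type` is hypothetical.

```python
import re
from urllib.parse import parse_qs, urlsplit

FORMS = ("ASK", "CONSTRUCT", "DESCRIBE", "SELECT")

# One prologue item: a comment, a DEFINE pragma, BASE, or PREFIX (simplified).
PROLOGUE_RE = re.compile(
    r"""\s*(?:\#[^\n]*\n
          |DEFINE\s+\S+\s+(?:"[^"]*"|<[^>]*>|\S+)
          |BASE\s*<[^>]*>
          |PREFIX\s+\S*:\s*<[^>]*>)""",
    re.IGNORECASE | re.VERBOSE,
)


def statement_type(method: str, request_target: str) -> str:
    """Return ASK, CONSTRUCT, DESCRIBE, SELECT, or unknown for one log entry."""
    if method.upper() != "GET":
        return "unknown"  # the query text is not in the log for other methods
    params = parse_qs(urlsplit(request_target).query)
    query = params.get("query", [""])[0]
    pos = 0
    while True:  # skip prologue items one at a time
        item = PROLOGUE_RE.match(query, pos)
        if not item:
            break
        pos = item.end()
    word = re.match(r"\s*([A-Za-z]+)", query[pos:])
    if word and word.group(1).upper() in FORMS:
        return word.group(1).upper()
    return "unknown"
```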
Hits per Keyword
Finally we analyzed each SPARQL query and counted the use of some common keywords and constructions.
DISTINCT
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                       0           0           0           0           0           0           0
CONSTRUCT                 5           0           0           0           0           7           0
DESCRIBE                  0           0           0           0           0           1           0
SELECT              368,943     416,036     286,540   1,560,674   1,486,796   3,433,444  10,875,024
FILTER
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                     717         381       2,518       3,027       1,446      15,498   1,736,374
CONSTRUCT            42,981      29,833       9,900       1,384       6,282      21,081         970
DESCRIBE                 14          17          35          24          59          11          23
SELECT              864,479     500,840     527,005   2,028,392   3,267,746   4,883,180   6,571,309
Functions like CONCAT, CONTAINS, ISIRI
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                     698         172          21          39          65         477         161
CONSTRUCT            42,965      29,767       9,762         536       3,981      20,363       1,237
DESCRIBE                  5          14          35          21          48           0           9
SELECT              166,715     252,112     389,645   1,712,885   2,861,807   3,493,900   4,891,144
Use of GEO objects
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                       0           0           0           0           0           0          21
CONSTRUCT                20           3           0         573       2,373         670          39
DESCRIBE                  0           0           0           0           0           0           0
SELECT              524,254     254,639     656,466     536,008   1,036,452   1,247,246     588,137
GROUP BY
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                       0           0           0           0           0           0           0
CONSTRUCT                 0           0           0           0           0           0           0
DESCRIBE                  0           0           0           0           0           0           0
SELECT                   40         198          52         959         578      21,731     100,035
LIMIT / OFFSET
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                       0           0           0           0           2           0           0
CONSTRUCT                19           9          74         688      12,900      19,635      50,632
DESCRIBE                  2          43           3           0          37       2,406          99
SELECT               86,001     239,198     191,987     842,976     857,958   1,056,462   2,269,709
OPTIONAL
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                      76           0           0           0       1,165           1           0
CONSTRUCT                12         185       8,951       4,782         731      18,407      38,787
DESCRIBE                  0           0           0           0           0           0           6
SELECT              578,765     872,706     784,562   2,108,278   1,871,918   2,317,782   3,912,584
ORDER BY
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                       0           0           0           0           0           0           0
CONSTRUCT                 0           1           1          33         141          20          20
DESCRIBE                  1           1           0           0           0           5           1
SELECT               43,556      45,831      32,007     260,521     129,376     210,002     160,031
UNION
Statement Type          3.3         3.4         3.5         3.6         3.7         3.8         3.9
ASK                   1,268      16,990           8          76      34,495       4,533       3,804
CONSTRUCT                 2         110         580         964         894       1,614          42
DESCRIBE                  2         584       1,377         351           0           0           4
SELECT               61,625     115,527     436,808     906,685   1,358,623   2,786,082   3,769,670
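A keyword tally of the kind shown in the tables above can be approximated with regular expressions. This is a rough sketch and not the report's method: plain pattern matching will also count keywords that occur inside string literals or IRIs, and the names `PATTERNS`, `keywords_in`, and `tally` are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Set

# Case-insensitive keyword patterns; labels mirror the table headings.
PATTERNS = {
    "DISTINCT": r"\bDISTINCT\b",
    "FILTER": r"\bFILTER\b",
    "GROUP BY": r"\bGROUP\s+BY\b",
    "LIMIT / OFFSET": r"\b(?:LIMIT|OFFSET)\b",
    "OPTIONAL": r"\bOPTIONAL\b",
    "ORDER BY": r"\bORDER\s+BY\b",
    "UNION": r"\bUNION\b",
}
COMPILED = {name: re.compile(p, re.IGNORECASE) for name, p in PATTERNS.items()}


def keywords_in(query: str) -> Set[str]:
    """Return the set of tracked keywords that occur in one query."""
    return {name for name, rx in COMPILED.items() if rx.search(query)}


def tally(queries: Iterable[str]) -> Counter:
    """Count, per keyword, how many queries use it at least once."""
    counts: Counter = Counter()
    for query in queries:
        counts.update(keywords_in(query))
    return counts
```

Each query contributes at most one count per keyword, which matches a "queries using the keyword" reading of the tables rather than a count of occurrences.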
Query Clause Patterns
It is also interesting to look at the percentages of the sample set including each SPARQL query clause. For instance, the GROUP BY clause was apparently an overlooked feature, until DBpedia 3.9 was released, while the ORDER BY clause remains infrequently used (even in situations where OFFSET and LIMIT are used for paging).
Query Clause            3.3         3.4         3.5         3.6         3.7         3.8         3.9
DISTINCT              19.51       11.36       17.29       19.44       13.27       25.40       16.66
FILTER                45.72       13.67       31.79       25.26       29.17       36.13       58.38
Functions              8.82        6.88       23.51       21.33       25.54       25.85       43.46
GEO objects           27.73        6.95       39.60        6.67        9.25        9.23        5.23
GROUP BY               0.00        0.01        0.00        0.01        0.01        0.16       43.46
LIMIT / OFFSET         4.55        6.53       11.58       10.50        7.66        7.82       20.16
OPTIONAL              30.61       23.83       47.33       26.25       16.71       17.15       34.76
ORDER BY               2.30        1.25        1.93        3.24        1.15        1.55        1.42
UNION                  3.26        3.15       26.35       11.29       12.13       20.61       33.39
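The report does not state which denominator these percentages use. Most of the figures appear consistent with dividing the number of SELECT queries that contain a clause by the total number of SELECT queries for the same DBpedia version; this is an inference from the published counts, not something the report says. A small worked check for DBpedia 3.3, using counts from the keyword and statement type tables (the helper name `share_percent` is hypothetical):

```python
def share_percent(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`, rounded to two decimals."""
    return round(100.0 * count / total, 2)


# DBpedia 3.3: SELECT queries using the clause vs. all SELECT queries
SELECT_TOTAL_3_3 = 1_890_769
print(share_percent(368_943, SELECT_TOTAL_3_3))  # DISTINCT -> 19.51
print(share_percent(578_765, SELECT_TOTAL_3_3))  # OPTIONAL -> 30.61
```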
Additional Topics of Interest
Memory Fragmentation
In recent times (leading up to the preparation of this report), a particular user agent was issuing the following query repeatedly, every hour:
DEFINE output:format "CSV"
DEFINE sql:signal-void-variables 1
DEFINE input:default-graph-uri <http://dbpedia.org>
SELECT ?p, ?lname, ?occupation, ?gender,
       ( group_concat ( ?altName ; separator="|" ) AS ?altNames )
WHERE {
  ?p a dbpedia-owl:Person ; rdfs:label ?name .
  BIND ( CONCAT ( ?name, "@", LANG ( ?name ) ) AS ?lname )
  OPTIONAL { ?p rdfs:label ?altName }
  OPTIONAL { ?p dbpedia-owl:occupation ?occupation }
  OPTIONAL { ?p dbpedia-owl:gender ?gender }
}
GROUP BY ?p ?lname ?occupation ?gender ?altName
OFFSET 1189778 LIMIT 10000
It may not be obvious at a quick glance, but the full solution of this query would never have more records than the requested 1,189,778 record OFFSET -- so it would never return any records to the client, never mind approaching or exceeding the requested LIMIT of 10,000. Nonetheless, the query had to be processed each time it was received.
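The effect can be illustrated with a plain list standing in for the query's complete solution. This is a toy model only: the solution size of 1,000 rows is made up, and `window` is a hypothetical helper that emulates OFFSET and LIMIT over an already materialized result.

```python
from typing import List


def window(rows: List[int], offset: int, limit: int) -> List[int]:
    """Emulate OFFSET/LIMIT over a fully materialized solution."""
    return rows[offset:offset + limit]


solution = list(range(1000))  # stand-in solution; the size is hypothetical
page = window(solution, offset=1_189_778, limit=10_000)
print(len(page))  # 0, because the offset lies beyond the end of the solution
```

The slice itself is cheap, but the rows before the offset still have to be produced before the server can tell that nothing is left to return.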
Query patterns like this led Virtuoso, or more accurately, the standard glibc memory allocator (malloc), to create a fragmented memory state (at the operating system level) en route to memory exhaustion and inevitable invocation of Linux's out-of-memory (OOM) process killer.
The resultant memory fragmentation couldn't be addressed with malloc. Therefore, Linux builds of Virtuoso now incorporate the TLSF (Two-Level Segregate Fit) allocator.
Note: This allocator is still in a testing phase, but so far it appears to be having the desired effect; i.e., these kinds of queries are effectively controlled by restrictions in the INI file [SPARQL] section (MaxQueryCostEstimationTime and MaxQueryExecutionTime), combined with Virtuoso's "Anytime Query" functionality.
http://stackoverflow.com/questions/3770457/what-is-memory-fragmentation
http://www.gii.upv.es/tlsf/
Virtuoso (Mis-)Configuration
A number of INI file parameters were inadvertently set to values inappropriate to the DBpedia service and its host environment. These values also contributed to the overall instability in recent months.
The inappropriate INI file settings included:
• DefaultIsolation = READ COMMITTED
This is not appropriate for a read-only database that provides ad hoc querying over the web, because it introduces significant overhead due to ACID implications.
• ResultSetMaxRows = 50000
This encouraged queries constructed with wholesale data extraction in mind, at the expense of other users. The current setting of 10,000 may be further reduced in the near future.
Current settings are shown below.
[Parameters]
...
DefaultIsolation = 1 ; READ_UNCOMMITTED, was READ_COMMITTED