Overview of how data on the Web of Data can be consumed (first and foremost Linked Data) and implications for the development of usage mining approaches.
References:
Elbedweihy, K., Mazumdar, S., Cano, A. E., Wrigley, S. N., & Ciravegna, F. (2011). Identifying Information Needs by Modelling Col...
Overview of how data on the Web of Data can be consumed (first and foremost Linked Data) and implications for the development of usage mining approaches.
References:
Elbedweihy, K., Mazumdar, S., Cano, A. E., Wrigley, S. N., & Ciravegna, F. (2011). Identifying Information Needs by Modelling Collective Query Patterns. COLD, 782.
Elbedweihy, K., Wrigley, S. N., & Ciravegna, F. (2012). Improving Semantic Search Using Query Log Analysis. Interacting with Linked Data (ILD 2012), 61.
Raghuveer, A. (2012). Characterizing machine agent behavior through SPARQL query mining. In Proceedings of the International Workshop on Usage Analysis and the Web of Data, Lyon, France.
Arias, M., Fernández, J. D., Martínez-Prieto, M. A., & de la Fuente, P. (2011). An empirical study of real-world SPARQL queries. arXiv preprint arXiv:1103.5043.
Hartig, O., Bizer, C., & Freytag, J. C. (2009). Executing SPARQL queries over the web of linked data (pp. 293-309). Springer Berlin Heidelberg.
Verborgh, R., Hartig, O., De Meester, B., Haesendonck, G., De Vocht, L., Vander Sande, M., ... & Van de Walle, R. (2014). Querying datasets on the web with high availability. In The Semantic Web–ISWC 2014 (pp. 180-196). Springer International Publishing.
Verborgh, R., Vander Sande, M., Colpaert, P., Coppens, S., Mannens, E., & Van de Walle, R. (2014, April). Web-Scale Querying through Linked Data Fragments. In LDOW.
Luczak-Rösch, M., & Bischoff, M. (2011). Statistical analysis of web of data usage. In Joint Workshop on Knowledge Evolution and Ontology Dynamics (EvoDyn2011), CEUR WS.
Luczak-Rösch, M. (2014). Usage-dependent maintenance of structured Web data sets (Doctoral dissertation, Freie Universität Berlin, Germany), http://edocs.fu-berlin.de/diss/receive/FUDISS_thesis_000000096138.
Size: 5.32 MB
Language: en
Added: Apr 21, 2016
Slides: 51 pages
Slide Content
Web of Data Usage Mining
Markus Luczak-Roesch
@mluczak | http://markus-luczak.de
What you should learn:
• describe the architectural differences between content
negotiation and Linked Data queries;
• develop applications that use different strategies to
consume Linked Data;
• develop usage mining methods that exploit the atomic parts
of the SPARQL query language.
Linked Data principles
1. Use URIs as names for
“Things” (resources).
2. Use HTTP URIs to allow the
access to resources on the
Web.
3. On resource access, deliver
meaningful information
conforming to Web standards
(RDF, SPARQL).
4. Set RDF links to resources
published by other parties to
allow the discovery of more
resources.
http://dbpedia.org/resource/Berlin
"
"
"
"
yago-res:BerlinS
owl:sameAsP
dbpedia:Berlin O
h"p://www.w3.org/DesignIssues/LinkedData.html
Content Negotiation
Linked Data principles
1. Use URIs as names for
“Things” (resources).
2. Use HTTP URIs to allow the
access to resources on the
Web.
3. On resource access, deliver
meaningful information
conforming to Web standards
(RDF, SPARQL).
4. Set RDF links to resources
published by other parties to
allow the discovery of more
resources.
h"p://www.w3.org/DesignIssues/LinkedData.html
http://dbpedia.org/resource/Berlin
"
"
"
"
yago-res:BerlinS
owl:sameAsP
dbpedia:Berlin O
Content Negotiation
Linked Data principles
1. Use URIs as names for
“Things” (resources).
2. Use HTTP URIs to allow the
access to resources on the
Web.
3. On resource access, deliver
meaningful information
conforming to Web standards
(RDF, SPARQL).
4. Set RDF links to resources
published by other parties to
allow the discovery of more
resources.
h"p://www.w3.org/DesignIssues/LinkedData.html
http://dbpedia.org/resource/Berlin
"
"
"
"
yago-res:BerlinS
owl:sameAsP
dbpedia:Berlin O
Content Negotiation
Linked Data principles
1. Use URIs as names for
“Things” (resources).
2. Use HTTP URIs to allow the
access to resources on the
Web.
3. On resource access,
deliver meaningful
information conforming to
Web standards (RDF,
SPARQL).
4. Set RDF links to resources
published by other parties to
allow the discovery of more
resources.
h"p://www.w3.org/DesignIssues/LinkedData.html
http://dbpedia.org/resource/Berlin
"
"
"
"
yago-res:BerlinS
owl:sameAsP
dbpedia:Berlin O
Content Negotiation
Linked Data principles
1. Use URIs as names for
“Things” (resources).
2. Use HTTP URIs to allow the
access to resources on the
Web.
3. On resource access, deliver
meaningful information
conforming to Web standards
(RDF, SPARQL).
4. Set RDF links to resources
published by other parties
to allow the discovery of
more resources.
h"p://www.w3.org/DesignIssues/LinkedData.html
http://dbpedia.org/resource/Berlin
"
"
"
Linked Data Infrastructure
Image source: Tom Heath and ChrisVan Bizer (2011) Linked Data: Evolving the Web into a Global Data Space (1st ediVon). Synthesis Lectures on the SemanVc Web: Theory and Technology, 1:1, 1-136. Morgan & Claypool.
Consuming Linked Data
• stateless
• request-response
t
Client Server
request
response
TCP life cycle
derived from R. Tolksdorf
Open connection
Close connection
Consuming Linked Data
GET / HTTP/1.1
User-Agent: Mozilla/5.0 … Firefox/10.0.3
Host: markus-luczak.de:80
Accept: */*
HTTP/1.1 200 OK
Server: Apache/2.0.49
Content-Language: en
Content-Type: text/html
Content-length: 2990
<!DOCTYPE html>
<html xml:lang="en"
…
Client
Server
derived from R. Tolksdorf
Server
Consuming Linked Data
Representation 1
index.html
Representation 2
index.rdf
Information
Resource
http://example.com/content/index
Client
HTTP GET
Consuming Linked Data
• Discover URIs
– Lookup services
• http://rkbexplorer.com
– Web of Data search engines
• http://sindice.com
• http://ws.nju.edu.cn/falcons/objectsearch/index.jsp
Consuming Linked Data
• Discover additional data for the resource at hand
• follow links („follow your nose“)
– rdfs:seeAlso
– owl:sameAs
• Co-Reference services
– http://sameas.org
• Web of Data search engines
Linked Data
Source: http://wifo5-03.informatik.uni-mannheim.de/bizer/pub/LinkedDataTutorial/
The server can trace this usage.
Linked Data is queryable
?s
“Markus Luczak-Roesch“
foaf:name
h"p://markus-luczak.de#me
?o
foaf:knows
SPARQL-recap
• Basic principle: pattern matching
– describe pattern
– query RDF triple set („RDF graph“)
– matching subset comes into results
?s
http://dbpedia.org/resource/Berlin
SPARQL queries on the Web
• RESTful service endpoint
GET /sparql?query=PREFIX+rdf… HTTP/1.1
Host: dbpedia.org
h"p://www.w3.org/TR/rdf-sparql-XMLres/ h"p://www.w3.org/TR/rdf-sparql-json-res/
Querying Linked Data
dbp:Klaus_Wowereit
dbp:Berlin
dbp:birthPlace
dbp:Reinhard_Mey
http://www.markus-luczak.de/me
dbp:birthPlace
Querying Linked Data
• distribution of data creates challenges for querying them
• Query approaches
– follow-up queries ß application-dependent, proprietary
– query a central data repository (e.g. LOD cache) ß trivial
– federated queries ß more interesting
• idea: query a mediator that distributes the sub-queries and returns
aggregated result (as of SPARQL 1.1)
– link traversal ß very interesting
• idea: follow links in the results retrieved from a source to expand the data
dynamically
Dataset
User Client/ApplicaVon
Query Pa"ern Access
Resource Centered
Access
HTTP
Query Processing
Graph CreaVon and
Content NegoVaVon
GET /resource/resA
GET /sparql?query=SELECT…
applicaVon/rdf+xml, …
Evaluate and perform query, create result set
Process and select result
text/xml, …
Data Publisher
Data Consumer
Data Publisher
Data Consumer
h"p://www.flickr.com/photos/therichbrooks/4040197666/, CC-BY 2.0, h"ps://creaVvecommons.org/licenses/by/2.0/
A game of pairs with SPARQL
SPARQL queries are self-descriptive data
themselves
{
?s1 foaf:name “Markus Luczak-Roesch”.
?s1 rdf:type dbp:Person
}
TP
TP
BGP
Statistical analysis
(a)SWC (b)DBpedia (c)LGD
Abbildung 20:Nutzung der Konzepte der Multi-Ontologien (Kanten sind ausgeblendet),
Quelle: eigene Darstellung
dieser Datensets besitzt noch ein großes Verbesserungspotential. Beispielsweise sind die
M¨oglichkeiten gegeben, eine h¨ohere Anzahl an speziellen Konzepten zu nutzen. Eben-
so k¨onnen theoretisch mehr Konzepte aus anderen Bereichen als Personen, Orte und
Dokumente verwendet werden.
Experiment 3In diesem Experiment wurde evaluiert, welche Ergebnisse mit dem ent-
wickelten ubKCE Algorithmus in Abh¨angigkeit der Gewichtung der Aspekte Dichte und
Nutzung erhalten werden. Damit soll es m¨oglich sein, die Frage zu beantworten, ob sich
Schl¨usselkonzepte¨uberhaupt auf Basis von Nutzungsdaten ermitteln lassen. Dies kann
eindeutig mit Ja beantwortet werden. Schon anhand der Ergebnisse von Experiment 2
ließ sich erkennen, dass viele der KCE - Schl¨usselkonzepte ebenfalls am st¨arksten von
Nutzern des Datensets verwendet werden. Je nach gew¨ahlter Gewichtung der Aspekte
variiert jedoch die
¨
Ubereinstimmung der Ergebnisse von ubKCE mit den von KCE er-
mittelten Schl¨usselkonzepten. So werden beispielsweise bei einer Gewichtung von 100%
Nutzung zu 0% Dichte im Allgemeinen andere Schl¨usselkonzepte als bei einem 30:70
Verh¨altnis ermittelt. Bez¨uglich der
¨
Ubereinstimmung zu KCE kann gesagt werden, dass
diese im Allgemeinen steigt, je h¨oher die Gewichtung der Dichte in ubKCE ist. Dies ist
nicht verwunderlich, da der Dichte-Aspekt aus KCE¨ubernommen wurde.
[27] gab zur Evaluation des KCE - Algorithmus an, dass es ab einer 70%
¨
Ubereinstimmung
der KCE - Ergebnisse zu den von Experten ermittelten Schl¨usselkonzepten nicht mehr
m¨oglich ist zu erkennen, ob ein bestimmtes Ergebnis von einem Menschen oder vom
Algorithmus bestimmt wurde. Um eine Gewichtung zu finden, die m¨oglichst gut mit
dem KCE - Ergebnis¨ubereinstimmt, jedoch gleichzeitig so stark wie m¨oglich auf dem
Nutzungsaspekt beruht, werden auch im Rahmen der vorliegenden Arbeit diese 70%
¨
Ubereinstimmung angestrebt. Nach Aussage von [27]bedeutetdiese
¨
Ubereinstimmung,
dass man nicht unterscheiden k¨onnte, ob ein Ergebnis von einem Experten, von KCE
oder von ubKCE stammt.
Durch die unterschiedlichen Gewichtungen der Aspekte kann die Zusammenfassung der
91
Source: Masterthesis of Markus Bischoff
Estimating the effects of change
Usage-dependent maintenance of structured Web data sets
to be added to the DBpedia 3.4 data set conforming to our approach
16
.
Table 7.14: Recommended predicates to be added to the data set and the estimated
e↵ects of change.
Primitive to addE↵ects of changeExists in data set
dbp:manufacturer 0.004505372 x
dbp:firstFlight 0.004505372 x
dbp:introduced 0.004505372 x
dbp:nationalOrigin0.004505372
dbo:thumbnail 0.021986718 x
dbo:director 0.025047524
dbp:director 0.02503915 x
dbp:abstract 0.025797024 x
dbo:starring 0.034066643
dbp:starring 0.034066643 x
dbp:stars 0.034066643 x
skos:Concept 0.040946128 x
skos:broader 0.04116386 x
dbp:redirect 0.066441677 x
Since this change recommendation is only additive it is clear that no negative
e↵ects are estimated. However, it is possible to estimate the positive potential of a
change and consequently to prioritize the changes to be performed in case of conflict-
ing or contradicting recommendations.
More complex and also subtractive change recommendations may emerge from
additive ones. This is typified by the recommendation to add dbo:director and
dbp:director for example to the data set which appear to be contradicting. Hence,
they should be either matched to each other by an owl:equivalentProperty relation
or one of the two should be eliminated.
7.3.3 Further data set analysis
Our case study has shown how the usage-dependent data set maintenance approach
performs in the context of a cross-domain data set like DBpedia. We will now present
results from our studies with SWDF and LGD as two di↵erent domain-specific data
sets.
16
To save space we apply the following namespace prefixes in addition to the ones defined b efore:
dbo:http://dbpedia.org/ontology/,dbp:http://dbpedia.org/property/.
178
Usage-dependent maintenance of structured Web data sets
Figure 7.20: Visualization of association rules computed by application of the unknown predicates restriction in the
context of the LGD log file (size: support, color: lift).
184
LGD
Linked Data
Source: http://wifo5-03.informatik.uni-mannheim.de/bizer/pub/LinkedDataTutorial/
The server can trace this usage.
SPARQL
7. Evaluation
The visualization shows how primitives on the left hand side (LHS) of a rule imply
particular ones on the right hand side (RHS) and which likelihood such an associa-
tion has. In our specific case this allows us to analyze which primitives are queried
together frequently in failing queries. We spot two characteristic usage patterns: (1)
the properties and classes queried in the context ofhttp://dbpedia.org/ontology/
Aircraft; (2) the properties and classes queried in the context of an object variable.
These can be further analyzed by exporting the association rules to GraphML and vi-
sualizing the network by use of a network visualization and analysis tool like Gephi
15
for example. Figure 7.13 depicts one filtered network representation for our example
case. Nodes with a degree lower than 5 are filtered out (k-core network withk= 5)
to derive a well-arranged visualization of the most important primitives in failing
queries. Nodes represent LHS and RHS of the computed rules. Edges point from the
LHS to the RHS of the particular rules.
^"SUHGBYDULDEOHKWWSGESHGLDRUJSURSHUW\QDPH`
^"SUHGBYDULDEOHKWWS[POQVFRPIRDIGHSLFWLRQ`
^KWWSGESHGLDRUJRQWRORJ\$LUFUDIW`
^KWWSGESHGLDRUJSURSHUW\DEVWUDFWKWWSGESHGLDRUJSURSHUW\QDPH`
^KWWSGESHGLDRUJSURSHUW\DEVWUDFWKWWS[POQVFRPIRDIGHSLFWLRQ`
^KWWSGESHGLDRUJSURSHUW\ILUVW)OLJKW`
^KWWSGESHGLDRUJSURSHUW\LQWURGXFHG`
^KWWSGESHGLDRUJSURSHUW\PDQXIDFWXUHU`
^KWWSGESHGLDRUJSURSHUW\QDPHKWWSGESHGLDRUJSURSHUW\W\SH`
^KWWSGESHGLDRUJSURSHUW\QDPHKWWS[POQVFRPIRDIGHSLFWLRQ`
^KWWSGESHGLDRUJSURSHUW\QDWLRQDO2ULJLQ`
^KWWSGESHGLDRUJSURSHUW\W\SHKWWS[POQVFRPIRDIGHSLFWLRQ`
Figure 7.13: Filtered visualization of the association rule network (k-core 5 filter
applied to reduce nodes with degree lower than 5).
Table 7.14 lists the an exemplary set of primitives which would be recommended
15
http://gephi.org/
177
{
?s1 foaf:name “Markus Luczak-Roesch”.
?s1 rdf:type dbp:Person
}
h"p://markus-luczak.de#me
“Markus Luczak-Roesch“
rdf:type
foaf:name
foaf:Person
✔
✗
✗
query applied to dataset
The server can trace detailed usage.
Linked Data Fragments
Querying Datasets on the Web with High Availability 5
generic requests
high client effort
high server availability
specific requests
high server effort
low server availability
data
dump
Linked Data
document
sparql
result
triplepattern
fragments
various types of
Linked Data Fragments
Fig. 1:Allhttptriple interfaces offer Linked Data Fragments of a dataset. They differ
in the specificity of the data they contain, and thus the effort needed to create them.
3.2 Formal definitions
As a basis for our formalization, we use the following concepts of therdfdata
model [16] and thesparqlquery language [12]. We writeU,B,L,andVto
denote the sets of alluris, blank nodes, literals, and variables, respectively.
Then,T=(U[B)⇥U⇥(U[B[L)is the (infinite) set of allrdftriples.Any
tupletp2(U[V)⇥(U[V)⇥(U[L[V)is atriple pattern.Anyfinitesetof
such triple patterns is abasic graph pattern(bgp). Any more complexsparql
graph pattern,typicallydenotedbyP,combinestriplepatterns(orbgps) using
specific operators [12, 20]. The standard (set-based) query semantics forsparql
defines thequery resultof such a graph patternPover a set ofrdftriples
G✓Tas a set that we denote by[[P]]Gand that consists of partial mappings
µ:V!(U[B[L), which are calledsolution mappings.Anrdftripletis
amatching triplefor a triple patterntpif there exists a solution mappingµ
such thatt=µ[tp], whereµ[tp]denotes the triple (pattern) that we obtain by
replacing the variables intpaccording toµ.
For the sake of a more straightforward formalization, in this paper, we as-
sume without loss of generality that every datasetGpublished via some kind of
fragments on the Web is a finite set of blank-node-freerdftriples; i.e.,G✓T
⇤
whereT
⇤
=U⇥U⇥(U[L).Eachfragmentofsuchadatasetcontainstriples
that somehow belong together; they have been selected based on some condition,
which we abstract through the notion of a selector:
Definition 1 (selector).Aselectoris a partial functions:2
T
!{true,false}.
A more concrete type of this abstract notion are triple pattern selectors, which
select triples that match a certain triple pattern:
Definition 2 (triple pattern selector).Given a triple patterntp, thetriple
pattern selectorfortpis the selectorstpthat, for any singleton set{t}✓2
T
,is
defined by
stp({t})=
(
trueiftis a matching triple fortp,
falseelse.
When publishing data on the Web, we should equip its representations with
hypermedia controls [1, 8, 9]. We encounter them on a daily basis when browsing
htmlpages; they are usually present as hyperlinks or forms. What all these
controls have in common is that, given some (possibly empty) input, they result
in our browser performing a request for a specificurl.
Definition 3 (control).Acontrolis a function that maps from some set toU.
Pre-print of a paper accepted to the International Semantic Web Conference 2014 (ISWC 2014).
The final publication is available at link.springer.com.
xxx.xxx.xxx.xxx - - [17/Oct/2014:07:43:02 +0000]
"GET /2014/en?subject=&predicate=&object=dbpedia%3AAustin HTTP/1.1" 200
1309 "http://fragments.dbpedia.org /2014/en" …
10 Ruben Verborgh et al.
Data:(predecessorIp,bgpB={tp1,...,tpn}withn!2,startpage!0)
1I nil;c the triple pattern control in the control setC0of!0;
2FunctionBasicGraphPatternIterator.next()
3 µ nil;
4 whileµ=nildo
5 whileI=nildo
6 µp Ip.next();
7 returnnilifµp=nil;
8 " {!
i
1|!
i
1=httpGETfirst fragment page usingurlc(µp[tpi])};
9 ✏ isuch thatcnt!
i
1
=min({cnt!
1
1
,...,cnt!
n
1
});
10 I✏ TriplePatternIterator(StartIterator(),µp[tp✏],!
✏
1);
11 I BasicGraphPatternIterator (I✏,{µ[tp]|tp2B\{tp✏}},!
✏
1);
12 µ I.next();
13 returnµ[µp;
Algorithm 1:For all mappingsµpof a predecessorIp,abgpiterator for
apatternB={tp1,...,tpn}creates a triple pattern iteratorI✏for the least
frequent patterntp✏,passedtoabgpiterator for the remainder ofP.
fetches the first page of the correspondingldf. This page contains thecntmeta-
data, which tells us how many matches the dataset has for each triple pattern.
The pattern is then decomposed by evaluating it usinga)atriplepatterniter-
ator for the triple pattern with the smallest number of matches, andb)anew
bgpiterator for the remainder of the pattern. This results in adynamicpipeline
foreachof the mappings of its predecessor, as visualized in Fig. 2. Each pipeline
is optimizedlocallyfor a specific mapping, reducing the number of requests.
To evaluate asparqlquery over a triple pattern fragment collection, we pro-
ceed as follows. For eachbgpof the query, abgpiterator is created. Dedicated
iterators are necessary for othersparqlconstructs such asUNIONandOPTIONAL,
but their implementation need not beldf-specific; they can reuse the triple
pattern fragmentbgpiterators. The predecessor of the first iterator is a start
iterator. We continuously pull solution mappings from the last iterator in the
pipeline and output them as solutions of the query, until the last iterator re-
sponds withnil. This pull-based process is able to deliver results incrementally.
...
B
00
={ Drago_Ibler a Architect. }
Alen_Peternac
Drago_Ibler
Juraj_Neidhardt
...
?personbirthPlaceZagreb.
B
0
={?person a Architect. ?person birthPlace Zagreb. }
Zagreb
Budapest
Rome
...
?citysubject
Capitals_in_Europe.
B={?person a Architect. ?person birthPlace ?city. ?city subject Capitals _in_Europe. }
Fig. 2:Abgpiterator decomposes abgpB={tp1,...,tpn}into a triple pattern
iterator for an optimaltpiand, for each resulting solution mappingµoftpi,creates
abgpiterator for the remaining patternB
0
={tp|tp=µ[tpj]^tpj2B}\{µ[tpi]}.
Pre-print of a paper accepted to the International Semantic Web Conference 2014 (ISWC 2014).
The final publication is available at link.springer.com.
Querying Datasets on the Web with High Availability 9
4.2 Dynamic iterator pipelines
Acommonapproachtoimplementqueryexecutionindatabasesystemsisthrough
iteratorsthat are typically arranged in a tree or apipeline, based on which query
results are computed recursively [10]. Such a pipelined approach has also been
studied for Linked Data query processing [13, 15]. In order to enableincremental
results and allow the straightforward addition ofsparqloperators, we imple-
ment a triple pattern fragments client using iterators.
The previous algorithm, however, cannot be implemented by astaticiterator
pipeline. For instance, consider a query for architects born in European capitals:
SELECT ?person ?city WHERE {
?person a dbpedia-owl:Architect. #tp1
?person dbpprop:birthPlace ?city. #tp2
?city dc:subject dbpedia:Capitals _in_Europe. # tp3
} LIMIT 100
Suppose the pipeline begins by finding?citymappings fortp3. It then needs
to choose whether it will next considertp1ortp2. The optimal choice, however,
differs depending on the value of?city:
–Fordbpedia:Paris,thereare±1,900 matches fortp2,and±1,200 matches
fortp1, so there will be lesshttprequests if we continue withtp1.
–Fordbpedia:Vilnius,thereare164matchesfortp2,and±1,200 matches for
tp1, so there will be lesshttprequests if we continue withtp2.
With a static pipeline, we would have to choose the pipeline structure in advance
and subsequently reuse it.
In order to generate an optimized pipeline for each (sub-)query, we propose
a divide-and-conquer strategy in which a query is decomposeddynamicallyinto
subqueries depending on partial solution mappings. The main function of an
iterator isnext(), which either returns a mapping ornilif no mappings are left.
We first introduce a trivialstart iterator, which outputs the empty map-
pingµ0on the first call tonext(),andnilon all subsequent calls.
Next, we implement a previously definedtriple pattern iterator[15] for triple
pattern fragments. This iteratorItpis initialized with a predecessor iteratorIp,
atriplepatterntp,andapage!0of an arbitrary triple pattern fragment of a col-
lectionF. The iterator then extends mappings from its predecessor by reading
triples from theldfcorresponding to triple patterntp. Theurlof thisldfis re-
trieved through the collection control in the start page!0.EachcalltoItp.next()
results in mappings fortpinF,dependingonthepredecessor’smappings.
To solvebgpsofsparqlqueries, we introduce a triple pattern fragment
bgpiterator.Suchabgpiteratoris initialized with a predecessorIp,abgpB=
{tp1,...,tpn},andanarbitrarytriplepatternfragmentpage!0of a collectionF.
For an empty pattern (n=0), abgpiterator is equal to a start iterator. For
apatternlengthn=1,itisconstructedbycreatingatriplepatterniterator
for(Ip,tp1,!0).Forn!2,abgpiterator uses Algorithm 1.
bgpiterators evaluate abgpby recursively decomposing it into smaller itera-
tors. For each triple pattern in thebgpmapped by each result ofIp,theiterator
Pre-print of a paper accepted to the International Semantic Web Conference 2014 (ISWC 2014).
The final publication is available at link.springer.com.
Wikidata
• API access to
• items
• edit history
• items’ discussions
• items’ access statistics
• and more
• Linked Data interface
• MediaWiki API
• Wikidata Query
• SPARQL
• Linked Data Fragments
Access to more than
“just” usage.
Thank you very much!
@mluczak | http://markus-luczak.de
h"p://www.flickr.com/photos/therichbrooks/4040197666/, CC-BY 2.0, h"ps://creaVvecommons.org/licenses/by/2.0/
References
• Luczak-Rösch, M., & Bischoff, M. (2011). Statistical analysis of web of data usage. In Joint Workshop on Knowledge Evolution and
Ontology Dynamics (EvoDyn2011), CEUR WS.
• Luczak-Rösch, M. (2014). Usage-dependent maintenance of structured Web data sets (Doctoral dissertation, Freie Universität Berlin,
Germany), http://edocs.fu-berlin.de/diss/receive/FUDISS_thesis_000000096138.
• Elbedweihy, K., Mazumdar, S., Cano, A. E., Wrigley, S. N., & Ciravegna, F. (2011). Identifying Information Needs by Modelling Collective
Query Patterns. COLD, 782.
• Elbedweihy, K., Wrigley, S. N., & Ciravegna, F. (2012). Improving Semantic Search Using Query Log Analysis. Interacting with Linked Data
(ILD 2012), 61.
• Raghuveer, A. (2012). Characterizing machine agent behavior through SPARQL query mining. In Proceedings of the International
Workshop on Usage Analysis and the Web of Data, Lyon, France.
• Arias, M., Fernández, J. D., Martínez-Prieto, M. A., & de la Fuente, P. (2011). An empirical study of real-world SPARQL queries. arXiv
preprint arXiv:1103.5043.
• Hartig, O., Bizer, C., & Freytag, J. C. (2009). Executing SPARQL queries over the web of linked data (pp. 293-309). Springer Berlin
Heidelberg.
• Verborgh, R., Hartig, O., De Meester, B., Haesendonck, G., De Vocht, L., Vander Sande, M., ... & Van de Walle, R. (2014). Querying
datasets on the web with high availability. In The Semantic Web–ISWC 2014 (pp. 180-196). Springer International Publishing.
• Verborgh, R., Vander Sande, M., Colpaert, P., Coppens, S., Mannens, E., & Van de Walle, R. (2014, April). Web-Scale Querying through
Linked Data Fragments. In LDOW.