Searching Linked Data with Spinque

spinque 523 views 20 slides Jan 30, 2015

About This Presentation

Spinque @ Search Engines Amsterdam (SEA)
http://www.meetup.com/SEA-Search-Engines-Amsterdam/events/216345662/

Spinque is a CWI spin-off that builds on research into the integration of databases and information retrieval. We build tailor-made search engines over connected datasets. With th...


Slide Content

Searching Linked Data with Spinque
Michiel Hildebrand, Wouter Alink, Roberto Cornacchia, Arjen de Vries
Search Engines Amsterdam, January 30, 2015

Background: Information Retrieval and DB integration
  Cornacchia et al. Flexible and efficient IR using Array Databases. VLDB Journal '08
  Mühleisen et al. Column Stores as an IR Prototyping Tool. ECIR'14 & SIGIR'14
Concept: Search by Strategy
  Alink et al. Searching CLEF-IP by strategy. CLEF'09
  PatOlympics, 2010 and 2011
Product: Tailored access to connected datasets
  Koninklijke Bibliotheek, Wageningen Universiteit, Beeld&Geluid, Elsevier, Heineken, ...

Heterogeneous Data: SQL, CSV, XML, HTML, OAI, JSON
Complex information needs
Hang Li et al. A new approach to intranet search based on information extraction. CIKM'05

Heterogeneous University Data
  Financial administration (ERP)
  Contract administration (CMS)
  Contract documents (CMS attachments)
  Publication database (Institutional Repository)
  Publication documents (Institutional Repository PDFs)
  Employee database (address lists, ERP+CMS)
  Companies (CMS + ERP + document mentions)
  Subsidy database (CMS)
  Departments (address lists, CMS)
  Web addresses (extracted from documents)
  Topic (assigned to publications)
  Research programmes (dependent on funding scheme)
Complex information needs
  What funding schemes are the primary source of income?
    Can we move to Europe when Dutch funding dries up?
  Who has active relations with partner X?
    “Valorisation”; new national funding requirements
  What industry sectors do we depend upon?
    How many projects in smart cities? Green energy? Cloud computing? Etc.
  How are strategic decisions implemented?
    Has the objective “move from Telecom toward ICT” been achieved, and how does it develop over time?

Heterogeneous University Data
  Harvest and link data, model as a graph
Complex information needs
  Search by Strategy

Project by topic
  Search in attachments of projects
  Search for project contracts (by metadata)
  Traverse from attachments to projects & combine results
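The "project by topic" strategy can be illustrated with a small sketch. This is not Spinque's implementation: the toy data, the term-overlap scorer `text_score`, the function `projects_by_topic`, and the equal weighting of the two branches are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: attachment texts, contract metadata, and the
# attachment -> project links along which results are traversed.
ATTACHMENTS = {
    "att1": "smart cities sensor network proposal",
    "att2": "green energy storage report",
    "att3": "smart grid and green energy pilot",
}
CONTRACTS = {
    "projA": {"title": "Smart cities living lab"},
    "projB": {"title": "Offshore wind"},
    "projC": {"title": "Green energy grid"},
}
ATTACHMENT_OF = {"att1": "projA", "att2": "projC", "att3": "projC"}


def text_score(query, text):
    """Fraction of query terms found in the text (stand-in for a real ranker)."""
    terms = query.lower().split()
    words = set(text.lower().split())
    return sum(t in words for t in terms) / len(terms)


def projects_by_topic(query, weight_attachments=0.5):
    """Rank projects by combining attachment search and contract metadata search."""
    # Branch 1: search attachment text, then traverse to the owning project,
    # keeping the best attachment score per project.
    best_attachment = defaultdict(float)
    for att, text in ATTACHMENTS.items():
        proj = ATTACHMENT_OF[att]
        best_attachment[proj] = max(best_attachment[proj], text_score(query, text))

    # Branch 2: search contract metadata, then combine both branches
    # with a weighted sum (an assumed combination rule).
    scores = {}
    for proj, meta in CONTRACTS.items():
        meta_score = text_score(query, meta["title"])
        combined = (weight_attachments * best_attachment[proj]
                    + (1 - weight_attachments) * meta_score)
        if combined > 0:
            scores[proj] = combined
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(projects_by_topic("smart cities"))
```

Both branches produce ranked lists over different object types; the traversal step maps attachment scores onto projects so that the two lists can be merged into one project ranking.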

Topic expert
  Search objects about topic
  Expand with neighbours in and out
  Return related persons
  Ranked by tf-idf on relations
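A minimal sketch of the "topic expert" strategy over a toy graph follows. The slide does not define "tf-idf on relations", so the weighting here is one possible reading and an assumption: tf counts a person's relations to matching objects, and an idf-like factor damps persons who are related to many objects overall. The graph, node names, and the function `topic_experts` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy graph: typed nodes and directed, labelled edges.
NODES = {
    "pub1": {"type": "publication", "text": "column stores for information retrieval"},
    "pub2": {"type": "publication", "text": "ranking in information retrieval"},
    "pub3": {"type": "publication", "text": "query optimisation in databases"},
    "proj1": {"type": "project", "text": "information retrieval for patents"},
    "alice": {"type": "person"},
    "bob": {"type": "person"},
    "carol": {"type": "person"},
}
EDGES = [  # (source, relation, target)
    ("alice", "author_of", "pub1"),
    ("alice", "author_of", "pub2"),
    ("bob", "author_of", "pub2"),
    ("bob", "author_of", "pub3"),
    ("proj1", "has_member", "alice"),
    ("proj1", "has_member", "carol"),
    ("carol", "author_of", "pub3"),
]


def neighbours(node):
    """Neighbours over outgoing and incoming edges."""
    outgoing = [t for s, _, t in EDGES if s == node]
    incoming = [s for s, _, t in EDGES if t == node]
    return outgoing + incoming


def topic_experts(query):
    terms = set(query.lower().split())
    # Step 1: search objects about the topic (all query terms must occur).
    hits = [n for n, d in NODES.items() if terms <= set(d.get("text", "").split())]
    # Steps 2 and 3: expand to neighbours in and out, keep only persons.
    tf = Counter(p for h in hits for p in neighbours(h)
                 if NODES[p]["type"] == "person")
    # Step 4: rank with an assumed tf-idf-style weight on relations.
    n_objects = sum(1 for d in NODES.values() if d["type"] != "person")
    scores = {}
    for person, count in tf.items():
        degree = len(neighbours(person))  # all relations of this person
        scores[person] = count * math.log(1 + n_objects / degree)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    for person, score in topic_experts("information retrieval"):
        print(person, round(score, 3))
```

In this toy graph the person linked to the most matching objects ranks first; persons with the same number of matching relations and the same overall degree tie and are ordered by name.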

Norbert Fuhr, Thomas Rölleke. A Probabilistic Relational Algebra for the Integration of Information Retrieval and Database Systems (1994)

[Architecture diagram. Data sources: SQL, CSV, HTML, OAI, XML. Components: indexing pipeline, strategy editor, compiler, API, applications.]

Search by Strategy: (visual) modelling of search processes
Rank. Everything. Always.: all-round probabilistic search
Many strategies, one data model: many search engines, one index

Components Supporting the Open Data Exploitation

[Architecture diagram. Data sources: SQL, CSV, HTML, OAI, XML. Components: indexing pipeline, strategy editor, compiler, API, applications.]

Application front-end: 400 lines of JavaScript

Application back-end: 3 search strategies
  autocompletion
  location search
  location + text search

API Builder for Open Data?
Supporting (search) application developers
  Gregory Grefenstette. Search-based applications. 2010
  Jamie Callan. Search Engine Support For Software Applications. CIKM 2010 Keynote
Who builds search strategies?
  Developers are not IR specialists
  Neither are domain specialists
How to handle schema-mess in a heterogeneous dataspace?

Happy alignments are all alike; every unhappy alignment is unhappy in its own way
Jacco van Ossenbruggen, 2012 (improvisation on Anna Karenina, Leo Tolstoy, 1877)

Alignment strategies
  Interactive vocabulary alignment. Jacco van Ossenbruggen, Michiel Hildebrand, Victor de Boer. TPDL 2011
Coming soon: Spinque Alignment Service
  Beeld&Geluid, Naturalis, Rijksdienst Cultureel Erfgoed (RCE)

www.spinque.com [email protected] comsode.eu