High Performance Methods for Linked Open Data Connectivity Analytics
Abstract
1. Introduction
- we show why plain SPARQL is not enough for performing such measurements,
- we exploit the aforementioned catalogs to construct in parallel (using MapReduce [10] techniques) semantically enriched indexes for entities, classes, properties and literals, as well as an entity-based triples index, which together enable assessing the connectivity of a specific entity and give immediate access to the available information for that entity,
2. Background and Related Work
2.1. Background
2.1.1. RDF and Linked Data
2.2. MapReduce Framework
2.3. Related Work
2.3.1. Measurements at LOD Scale
2.3.2. Global RDF Cross-Dataset Services
2.3.3. Indexes for RDF Datasets by using Parallel Frameworks
3. Requirements and Problem Statement
3.1. Notations
3.2. Requirements
3.3. Problem Statement
4. Why Plain SPARQL Is Not Enough
5. Global Semantics-Aware Indexing for the LOD Cloud Datasets
5.1. Partitioning the Different Sets of Elements
5.2. Equivalence Relationships
- Entity Equivalence Catalog (EntEqCat): Each entity is assigned a unique ID, so the catalog is a binary relation between entity URIs and identifiers. This catalog is exploited for replacing each URI that occurs in a triple with an identifier. To construct it, we read the URIs of each dataset (marked in bold in Figure 1) and the owl:sameAs relationships (see the upper right side of Figure 1), and we compute the transitive, symmetric and reflexive closure of the owl:sameAs relationships to find the classes of equivalence. Finally, all the entities belonging to the same class of equivalence are assigned the same identifier, e.g., see the EntEqCat in the running example of Figure 1.
- Property Equivalence Catalog (PropEqCat): Each property is assigned a unique ID, so the catalog is a binary relation between property URIs and identifiers. As we shall see, this catalog is used for replacing the property of each triple with an identifier. To construct it, we read the properties of each dataset (underlined in Figure 1) and the owl:equivalentProperty relationships (see the upper right side of Figure 1), and we compute the closure of these relationships to produce the classes of equivalence for properties. Afterwards, all the properties belonging to the same class of equivalence are assigned the same identifier, e.g., see the PropEqCat of our running example in Figure 1.
- Class Equivalence Catalog (ClEqCat): Each class is assigned a unique ID, so the catalog is a binary relation between class URIs and identifiers. We exploit this catalog for replacing each class occurring in triples with an identifier. To construct it, we read the classes (marked in italics in Figure 1) and the owl:equivalentClass relationships, and we compute their closure to find the classes of equivalence. Finally, all the classes that refer to the same thing take the same identifier. The resulting ClEqCat for our running example can be seen in Figure 1.
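The closure computation shared by the three catalogs can be illustrated with a small in-memory sketch. It uses a union-find structure, which is a substitution chosen for brevity and not the parallel construction of the paper; the function name `build_equivalence_catalog`, the `E1`, `E2`, ... identifier scheme and the example URIs are all hypothetical.

```python
def build_equivalence_catalog(uris, equivalence_pairs, prefix="E"):
    """Map every URI to the identifier of its class of equivalence.

    equivalence_pairs holds (uri_a, uri_b) pairs, e.g. owl:sameAs statements.
    Union-find yields the reflexive, symmetric and transitive closure.
    """
    parent = {u: u for u in uris}

    def find(u):
        parent.setdefault(u, u)  # URIs seen only in pairs are added here
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in equivalence_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    catalog, class_ids = {}, {}
    for u in sorted(parent):
        root = find(u)
        if root not in class_ids:
            class_ids[root] = f"{prefix}{len(class_ids) + 1}"
        catalog[u] = class_ids[root]
    return catalog


# Hypothetical example: three URIs for one entity, one URI for another.
uris = ["d1:Aristotle", "d2:Aristotle", "d3:Aristotle", "d1:Plato"]
pairs = [("d1:Aristotle", "d2:Aristotle"), ("d2:Aristotle", "d3:Aristotle")]
catalog = build_equivalence_catalog(uris, pairs)
```

The same function could serve for PropEqCat and ClEqCat by passing owl:equivalentProperty or owl:equivalentClass pairs and a different `prefix`.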
5.3. Creation of Semantics-Aware RDF Triples
Algorithm 1: Creation of Real World Triples.
Input: All triples and equivalence catalogs, EntEqCat, PropEqCat, and ClEqCat
Output: Real World Triples
1 function Mapper ()
2   forall do
3     if then
4       if then
5         emit ; // PropEqCat used and literal converted
6       else if then
7         emit ; // PropEqCat and ClEqCat used
8       else if then
9         emit ; // PropEqCat used
10    else if then
11      emit
12
13 function SubjectReducer (URI key, )
14   forall do
15     if () then
16
17       store ; // All conversions finished.
18     else
19       emit ; // Object Replacement Needed
20   emit
21
22 function ObjectReducer (URI key,
23   forall do
24
25     store ; // All conversions finished.
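The effect of Algorithm 1 can be sketched on a single machine as follows. This is an illustration under stated assumptions, not the authors' MapReduce job: we assume, based on the surviving comments, that the property is always replaced through PropEqCat, that literal objects are converted, that class objects are replaced through ClEqCat, and that entity subjects and objects are replaced through EntEqCat. The function names, the `o_kind` tag and the literal normalisation in `convert_literal` are hypothetical.

```python
def convert_literal(literal):
    # Hypothetical conversion: the text only states that literals are converted.
    return literal.strip().lower()


def to_real_world_triples(triples, ent_eq, prop_eq, cl_eq):
    """Replace URIs with catalog identifiers.

    triples: iterable of (subject, property, object, o_kind), where o_kind is
    "literal", "class" or "entity" (an assumed tagging of the object).
    URIs missing from a catalog are kept unchanged.
    """
    result = []
    for s, p, o, o_kind in triples:
        s_id = ent_eq.get(s, s)        # subject replacement
        p_id = prop_eq.get(p, p)       # PropEqCat used
        if o_kind == "literal":
            o_id = convert_literal(o)  # literal converted
        elif o_kind == "class":
            o_id = cl_eq.get(o, o)     # ClEqCat used
        else:
            o_id = ent_eq.get(o, o)    # object replacement
        result.append((s_id, p_id, o_id))
    return result


# Hypothetical catalogs and triples from two datasets d1 and d2.
ent_eq = {"d1:Aristotle": "E1", "d2:Aristotle": "E1",
          "d1:Stagira": "E2", "d2:Stagira": "E2"}
prop_eq = {"d1:birthPlace": "P1", "d2:bornIn": "P1", "rdf:type": "P2"}
cl_eq = {"d1:Philosopher": "C1", "d2:Philosopher": "C1"}
triples = [
    ("d1:Aristotle", "d1:birthPlace", "d1:Stagira", "entity"),
    ("d2:Aristotle", "d2:bornIn", "d2:Stagira", "entity"),
    ("d1:Aristotle", "rdf:type", "d1:Philosopher", "class"),
    ("d2:Aristotle", "rdfs:label", " Aristotle ", "literal"),
]
rw_triples = to_real_world_triples(triples, ent_eq, prop_eq, cl_eq)
```

In a distributed setting the subject and object lookups are joins against EntEqCat on different keys, which is presumably why the algorithm separates a SubjectReducer from an ObjectReducer; with all catalogs in memory both replacements collapse into one pass.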
5.4. Constructing Semantics-Aware Indexes
5.4.1. Entity-Triples Index
Algorithm 2: Construction of Entity-Triples Index.
Input: Real World Triples
Output: Entity-Triples Index
1 function Entity-Triples Index-Mapper ()
2   forall do
3     if then
4       emit
5     if then
6       emit
7
8 function Entity-Triples Index-Reducer ()
9
10   forall do
11     if () then
12
13     else
14       if () then
15
16       else
17
18   store
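A possible in-memory reading of the Entity-Triples Index is sketched below, assuming that an entity's entry collects every real world triple in which the entity occurs as subject or as object, together with the dataset of origin. The tuple layout `(s, p, o, dataset)`, the `entity_ids` argument and the function name are hypothetical, and the grouping a reducer would perform is done here with a dictionary.

```python
from collections import defaultdict


def build_entity_triples_index(rw_triples, entity_ids):
    """Group real world triples by the entities they mention.

    rw_triples: iterable of (s, p, o, dataset) tuples with identifiers
    already replaced; entity_ids: set of identifiers that denote entities.
    """
    index = defaultdict(list)
    for s, p, o, dataset in rw_triples:
        triple = (s, p, o, dataset)
        index[s].append(triple)          # entity occurs as subject
        if o in entity_ids and o != s:
            index[o].append(triple)      # entity occurs as object
    return dict(index)


# Hypothetical real world triples coming from datasets d1 and d2.
rw_triples = [
    ("E1", "P1", "E2", "d1"),
    ("E1", "P1", "E2", "d2"),
    ("E1", "P3", "aristotle", "d2"),
]
et_index = build_entity_triples_index(rw_triples, entity_ids={"E1", "E2"})
```

Looking up `et_index["E1"]` then returns all available triples for that entity without scanning the whole collection.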
5.4.2. Semantically Enriched Indexes for Specific Sets of Elements
- Entity Index: it is a function from real world entities to P(D), the subsets of the datasets D, i.e., for each different real world entity, this index stores all the datasets where it occurs (see the Entity Index in Figure 1).
- Property Index: it is a function from real world properties to P(D), i.e., it stores all the datasets where each different real world property occurs (see the Property Index in Figure 1).
- Class Index: it is a function from real world classes to P(D), i.e., it stores the datasets where a real world class occurs (see the Class Index in Figure 1).
- Literals Index: it is a function from converted literals to P(D), i.e., it stores all the datasets where a converted literal occurs (see the Literals Index in Figure 1).
Algorithm 3: Creation of a Semantically-Enriched Inverted Index for any set of elements.
Input: Real World Triples
Output: An inverted index for a set of specific elements
1 function Inverted Index-Mapper ()
2   forall do
3     /* For constructing the Entity Index, include lines 4-7 */
4     if then
5       emit
6     if then
7       emit
8     /* For constructing the Property Index, include lines 9-10 */
9     if then
10      emit
11    /* For constructing the Class Index, include lines 12-13 */
12    if then
13      emit
14    /* For constructing the Literals Index, include lines 15-16 */
15    if then
16      emit
17
18 function Inverted Index-Reducer ()
19
20   forall do
21
22   store
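The four indexes above share one shape: an element mapped to the set of datasets where it occurs. A minimal sketch of that shape follows, assuming the input has already been reduced to (element, dataset) occurrences; the function names and the example identifiers are hypothetical. The helper `count_common` shows how such an index could answer a commonality question for a subset of datasets.

```python
from collections import defaultdict


def build_inverted_index(occurrences):
    """Map each element to the set of datasets where it occurs.

    occurrences: iterable of (element, dataset) pairs, e.g. one pair per
    entity, property, class or literal occurrence in a real world triple.
    """
    index = defaultdict(set)
    for element, dataset in occurrences:
        index[element].add(dataset)
    return dict(index)


def count_common(index, datasets):
    """Count the elements that occur in every dataset of the given subset."""
    wanted = set(datasets)
    return sum(1 for posting in index.values() if wanted <= posting)


# Hypothetical entity occurrences in three datasets.
occurrences = [
    ("E1", "d1"), ("E1", "d2"), ("E1", "d3"),
    ("E2", "d1"), ("E2", "d2"),
    ("E3", "d3"),
]
entity_index = build_inverted_index(occurrences)
common_d1_d2 = count_common(entity_index, ["d1", "d2"])
common_all = count_common(entity_index, ["d1", "d2", "d3"])
```

Scanning the whole index once per subset, as `count_common` does, is only practical for small inputs; it is meant to show what is being measured, not how to measure it at scale.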
6. Lattice-Based Connectivity Measurements for any Measurement Type
7. Experimental Evaluation
7.1. Comparative Results
7.2. Connectivity Measurements for LOD Cloud Datasets
7.2.1. Conclusions about the Connectivity at LOD Scale
8. Discussion
Author Contributions
Funding
Conflicts of Interest
References
- Dong, X.L.; Berti-Equille, L.; Srivastava, D. Data fusion: Resolving conflicts from multiple sources. In Handbook of Data Quality; Springer: Berlin, Germany, 2013; pp. 293–318.
- Mountantonakis, M.; Tzitzikas, Y. How Linked Data can Aid Machine Learning-Based Tasks. In Proceedings of the International Conference on Theory and Practice of Digital Libraries, Thessaloniki, Greece, 18–21 September 2017; Springer: Berlin, Germany, 2017; pp. 155–168.
- Ristoski, P.; Paulheim, H. RDF2VEC: RDF graph embeddings for data mining. In Proceedings of the International Semantic Web Conference, Kobe, Japan, 17–21 October 2016; Springer: Berlin, Germany, 2016; pp. 498–514.
- Mountantonakis, M.; Tzitzikas, Y. On Measuring the Lattice of Commonalities Among Several Linked Datasets. Proc. VLDB Endow. 2016, 9, 1101–1112.
- Mountantonakis, M.; Tzitzikas, Y. Scalable Methods for Measuring the Connectivity and Quality of Large Numbers of Linked Datasets. J. Data Inf. Qual. 2018, 9.
- Paton, N.W.; Christodoulou, K.; Fernandes, A.A.; Parsia, B.; Hedeler, C. Pay-as-you-go data integration for linked data: opportunities, challenges and architectures. In Proceedings of the 4th International Workshop on Semantic Web Information Management, Scottsdale, AZ, USA, 20–24 May 2012; p. 3.
- Christophides, V.; Efthymiou, V.; Stefanidis, K. Entity Resolution in the Web of Data. Synth. Lect. Semant. Web 2015, 5, 1–122.
- Ermilov, I.; Lehmann, J.; Martin, M.; Auer, S. LODStats: The data web census dataset. In Proceedings of the International Semantic Web Conference, Kobe, Japan, 17–21 October 2016; Springer: Berlin, Germany, 2016; pp. 38–46.
- Prud’hommeaux, E.; Seaborne, A. SPARQL Query Language for RDF. W3C Recommendation, 15 January 2008.
- Dean, J.; Ghemawat, S. MapReduce: Simplified data processing on large clusters. Commun. ACM 2008, 51, 107–113.
- Antoniou, G.; Van Harmelen, F. A Semantic Web Primer; MIT Press: Cambridge, MA, USA, 2004.
- Rietveld, L.; Beek, W.; Schlobach, S. LOD lab: Experiments at LOD scale. In Proceedings of the International Semantic Web Conference, Bethlehem, PA, USA, 11–15 October 2015; Springer: Berlin, Germany, 2015; pp. 339–355.
- Fernández, J.D.; Beek, W.; Martínez-Prieto, M.A.; Arias, M. LOD-a-lot. In Proceedings of the International Semantic Web Conference, Vienna, Austria, 21–25 October 2017; pp. 75–83.
- Nentwig, M.; Soru, T.; Ngomo, A.C.N.; Rahm, E. LinkLion: A Link Repository for the Web of Data. In The Semantic Web: ESWC 2014 Satellite Events; Springer: Berlin, Germany, 2014; pp. 439–443.
- Schmachtenberg, M.; Bizer, C.; Paulheim, H. Adoption of the linked data best practices in different topical domains. In The Semantic Web–ISWC 2014; Springer: Berlin, Germany, 2014; pp. 245–260.
- Auer, S.; Demter, J.; Martin, M.; Lehmann, J. LODStats-an Extensible Framework for High-Performance Dataset Analytics. In Knowledge Engineering and Knowledge Management; Springer: Berlin, Germany, 2012; pp. 353–362.
- Giménez-García, J.M.; Thakkar, H.; Zimmermann, A. Assessing Trust with PageRank in the Web of Data. In Proceedings of the 3rd International Workshop on Dataset PROFIling and fEderated Search for Linked Data, Anissaras, Greece, 30 May 2016.
- Debattista, J.; Lange, C.; Auer, S.; Cortis, D. Evaluating the Quality of the LOD Cloud: An Empirical Investigation. Accepted for publication in Semant. Web J. 2017.
- Debattista, J.; Auer, S.; Lange, C. Luzzu—A Methodology and Framework for Linked Data Quality Assessment. J. Data Inf. Qual. (JDIQ) 2016, 8, 4.
- Mountantonakis, M.; Tzitzikas, Y. Services for Large Scale Semantic Integration of Data. ERCIM NEWS, 25 September 2017; 57–58.
- Vandenbussche, P.Y.; Atemezing, G.A.; Poveda-Villalón, M.; Vatant, B. Linked Open Vocabularies (LOV): A gateway to reusable semantic vocabularies on the Web. Semant. Web 2017, 8, 437–452.
- Valdestilhas, A.; Soru, T.; Nentwig, M.; Marx, E.; Saleem, M.; Ngomo, A.C.N. Where is my URI? In Proceedings of the 15th Extended Semantic Web Conference (ESWC 2018), Crete, Greece, 3–7 June 2018.
- Mihindukulasooriya, N.; Poveda-Villalón, M.; García-Castro, R.; Gómez-Pérez, A. Loupe-An Online Tool for Inspecting Datasets in the Linked Data Cloud. In Proceedings of the International Semantic Web Conference (Posters & Demos), Bethlehem, PA, USA, 11–15 October 2015.
- Glaser, H.; Jaffri, A.; Millard, I. Managing Co-Reference on the Semantic Web; Web & Internet Science: Southampton, UK, 2009.
- Käfer, T.; Abdelrahman, A.; Umbrich, J.; O’Byrne, P.; Hogan, A. Observing linked data dynamics. In Proceedings of the Extended Semantic Web Conference, Montpellier, France, 26–30 May 2013; Springer: Berlin, Germany, 2013; pp. 213–227.
- Käfer, T.; Umbrich, J.; Hogan, A.; Polleres, A. Towards a dynamic linked data observatory. In Proceedings of the LDOW at WWW, Lyon, France, 16 April 2012.
- McCrae, J.P.; Cimiano, P. Linghub: A Linked Data based portal supporting the discovery of language resources. In Proceedings of the SEMANTiCS (Posters & Demos), Vienna, Austria, 15–17 September 2015; Volume 1481, pp. 88–91.
- Vandenbussche, P.Y.; Umbrich, J.; Matteis, L.; Hogan, A.; Buil-Aranda, C. SPARQLES: Monitoring public SPARQL endpoints. Semant. Web 2016, 8, 1049–1065.
- Yumusak, S.; Dogdu, E.; Kodaz, H.; Kamilaris, A.; Vandenbussche, P.Y. SpEnD: Linked Data SPARQL Endpoints Discovery Using Search Engines. IEICE Trans. Inf. Syst. 2017, 100, 758–767.
- Papadaki, M.E.; Papadakos, P.; Mountantonakis, M.; Tzitzikas, Y. An Interactive 3D Visualization for the LOD Cloud. In Proceedings of the International Workshop on Big Data Visual Exploration and Analytics (BigVis’2018 at EDBT/ICDT 2018), Vienna, Austria, 26–29 March 2018.
- Ilievski, F.; Beek, W.; van Erp, M.; Rietveld, L.; Schlobach, S. LOTUS: Adaptive Text Search for Big Linked Data. In Proceedings of the International Semantic Web Conference, Kobe, Japan, 17–21 October 2016; pp. 470–485.
- Fernández, J.D.; Martínez-Prieto, M.A.; Gutiérrez, C.; Polleres, A.; Arias, M. Binary RDF representation for publication and exchange (HDT). Web Semant. Sci. Serv. Agents World Wide Web 2013, 19, 22–41.
- Erling, O.; Mikhailov, I. Virtuoso: RDF support in a native RDBMS. In Semantic Web Information Management; Springer: Berlin, Germany, 2010; pp. 501–519.
- Aranda-Andújar, A.; Bugiotti, F.; Camacho-Rodríguez, J.; Colazzo, D.; Goasdoué, F.; Kaoudi, Z.; Manolescu, I. AMADA: Web data repositories in the amazon cloud. In Proceedings of the 21st ACM International Conference on Information and knowledge management, Maui, HI, USA, 29 October–2 November 2012; pp. 2749–2751.
- Papailiou, N.; Konstantinou, I.; Tsoumakos, D.; Koziris, N. H2RDF: adaptive query processing on RDF data in the cloud. In Proceedings of the 21st International Conference on World Wide Web, Lyon, France, 16–20 April 2012; pp. 397–400.
- Punnoose, R.; Crainiceanu, A.; Rapp, D. Rya: A scalable RDF triple store for the clouds. In Proceedings of the 1st International Workshop on Cloud Intelligence, Istanbul, Turkey, 31 August 2012; p. 4.
- Schätzle, A.; Przyjaciel-Zablocki, M.; Dorner, C.; Hornung, T.D.; Lausen, G. Cascading map-side joins over HBase for scalable join processing. arXiv 2012, arXiv:1206.6293v1.
- Kaoudi, Z.; Manolescu, I. RDF in the clouds: A survey. VLDB J. 2015, 24, 67–91.
- Tzitzikas, Y.; Lantzaki, C.; Zeginis, D. Blank node matching and RDF/S comparison functions. In Proceedings of the International Semantic Web Conference, Crete, Greece, 31 May 2012; Springer: Berlin, Germany, 2012; pp. 591–607.
- Rastogi, V.; Machanavajjhala, A.; Chitnis, L.; Sarma, A.D. Finding connected components in map-reduce in logarithmic rounds. In Proceedings of the 2013 IEEE 29th International Conference on Data Engineering (ICDE), Brisbane, Australia, 8–11 April 2013; pp. 50–61.
- Jech, T. Set Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013.
- Okeanos Cloud Computing Service. Available online: http://okeanos.grnet.gr (accessed on 29 May 2018).
- DBpedia. Available online: http://dbpedia.org (accessed on 29 May 2018).
- Yago. Available online: http://yago-knowledge.org (accessed on 29 May 2018).
- Freebase. Available online: http://developers.google.com/freebase/ (accessed on 29 May 2018).
- Wikidata. Available online: http://www.wikidata.org (accessed on 29 May 2018).
- The British Library. Available online: http://bl.uk (accessed on 29 May 2018).
- Bibliothèque Nationale de France. Available online: http://www.bnf.fr (accessed on 29 May 2018).
- The Virtual International Authority File. Available online: http://viaf.org (accessed on 29 May 2018).
- JRC-Names. Available online: http://ec.europa.eu/jrc/en/language-technologies/jrc-names (accessed on 29 May 2018).
- OpenCyc. Available online: http://www.cyc.com/opencyc/ (accessed on 29 May 2018).
- ImageSnippets. Available online: http://www.imagesnippets.com/ (accessed on 29 May 2018).
- VIVO Wustl. Available online: http://old.datahub.io/dataset/vivo-wustl (accessed on 29 May 2018).
- Food and Agriculture Organization of the United Nations. Available online: http://www.fao.org/ (accessed on 29 May 2018).
- VIVO Scripps. Available online: http://vivo.scripps.edu/ (accessed on 29 May 2018).
- Mountantonakis, M.; Minadakis, N.; Marketakis, Y.; Fafalios, P.; Tzitzikas, Y. Quantifying the connectivity of a semantic warehouse and understanding its evolution over time. Int. J. Semant. Web Inf. Syst. (IJSWIS) 2016, 12, 27–78.
- Alexander, K.; Cyganiak, R.; Hausenblas, M.; Zhao, J. Describing Linked Datasets with the VoID Vocabulary; W3C Interest Group Note; W3C: Cambridge, MA, USA, 2011.
- Library of Congress Linked Data Service. Available online: http://id.loc.gov/ (accessed on 29 May 2018).
- Deutschen National Bibliothek. Available online: http://www.dnb.de (accessed on 29 May 2018).
- Radatana. Available online: http://data.bibsys.no/ (accessed on 29 May 2018).
- GeoNames Geographical Database. Available online: http://www.geonames.org/ (accessed on 29 May 2018).
- Linked Movie Data Base (LMDB). Available online: http://linkedmdb.org/ (accessed on 29 May 2018).
Tool | Number of RDF Datasets | URI Lookup | Keyword Search | Connectivity | Dataset Discovery | Dataset Visualization | Dataset Querying | Dataset Evolution |
---|---|---|---|---|---|---|---|---|
LODsyndesis [4,5,20] | 400 | ✔ | ✔ | ✔ | ✔ | |||
LODLaundromat [12] | >650,000 (documents) | ✔ | ✔ | ✔ | ||||
LOD-a-Lot [13] | >650,000 (documents) | ✔ | ||||||
LODStats [8,16] | 9960 | ✔(Schema) | ✔ | ✔ | ||||
LODCache | 346 | ✔ | ✔(via SPARQL) | ✔ | ||||
LOV [21] | 637 (vocabularies) | ✔(Schema) | ✔ | ✔ | ||||
WIMU [22] | >650,000 (documents) | ✔ | ||||||
Loupe [23] | 35 | ✔ | ||||||
sameAs.org [24] | >100 | ✔ | ||||||
Datahub.io | 1270 | ✔ | ✔ | |||||
LinkLion [14] | 476 | ✔ | ✔ | |||||
DyLDO [25,26] | 86,696 (documents) | ✔ | ||||||
LODCloud [15] | 1184 | ✔ | ✔ | ✔ | ||||
Linghub [27] | 272 | ✔ | ✔ | ✔ | ||||
SPARQLES [28] | 557 | ✔ | ✔ | |||||
SpEnD [29] | 1487 | ✔ | ✔ |
Element | Datasets Where an Element Occurs |
---|---|
Entity u | |
Property p | |
Class c | |
Literal l | |
Triple t |
Formula | Which to Use. |
---|---|
Domain | |D| | |Triples| | |Entities| | |Literals| | |Unique Sub.| | |Unique Obj.| |
---|---|---|---|---|---|---|
Cross-Domain (CD) | 24 | 971,725,722 | 199,359,729 | 216,057,389 | 125,753,736 | 308,124,541 |
Publications (PUB) | 94 | 666,580,552 | 127,624,700 | 155,052,015 | 120,234,530 | 271,847,700 |
Geographical (GEO) | 15 | 134,972,105 | 40,185,923 | 25,572,791 | 20,087,371 | 47,182,434 |
Media (MED) | 8 | 74,382,633 | 16,480,681 | 9,447,048 | 14,635,734 | 20,268,515 |
Life Sciences (LF) | 18 | 74,304,529 | 10,050,139 | 10,844,398 | 9,464,532 | 18,059,307 |
Government (GOV) | 45 | 59,659,817 | 6,657,014 | 7,467,560 | 10,978,458 | 14,848,668 |
Linguistics (LIN) | 85 | 20,211,506 | 3,825,012 | 2,808,717 | 2,946,076 | 6,381,618 |
User Content (UC) | 14 | 16,617,837 | 7,829,599 | 901,847 | 3,904,463 | 8,708,650 |
Social Networks (SN) | 97 | 3,317,666 | 762,323 | 853,416 | 506,525 | 1,512,842 |
All | 400 | 2,021,772,367 | 412,775,120 | 429,005,181 | 308,419,818 | 691,140,591 |
Index/Catalog | Execution Time (96 Machines) | Size on Disk |
---|---|---|
Equivalence Catalogs | 9.35 min | 24 GB |
Real World Triples | 33.5 min | 82.4 GB |
Entity-Triples Index | 17 min | 70.3 GB |
URI Indexes | 13.2 min | 6 GB |
Literals Index | 8.5 min | 16 GB |
All | 81.55 min | 198.7 GB |
Measurement | Time for 45 Pairs | Time for 120 Triads |
---|---|---|
Common Entities | 44.9 min | 87.55 min |
Common URIs (without closure) | 15 min | 29.1 min |
Common Triples | 50 min | 92 min |
Common Triples (without closure) | 1.45 min | 7 min |
Common Literals | 6.8 min | 15 min |
Connectivity Measurement | Direct Counts List Size (% of Index Size) | Number of Subsets Measured | Execution Time |
---|---|---|---|
Common RW Entities | 21,781 (0.006%) | 18,531,752 | 51 s |
Common RW Triples | 4700 (0.0002%) | 1,776,136 | 3 s |
Common Literals | 318,978 (0.09%) | 4,979,482 (pairs and triads) | 328 s |
Category | Value |
---|---|
owl:sameAs Triples | 44,853,520 |
owl:sameAs Triples Inferred | 73,146,062 |
RW Entities having at least two URIs | 26,124,701 |
owl:equivalentProperty Triples | 8157 |
owl:equivalentProperty Triples Inferred | 935 |
RW Properties having at least two URIs | 4121 |
owl:equivalentClass Triples | 4006 |
owl:equivalentClass Triples Inferred | 1164 |
RW Classes having at least two URIs | 2041 |
Category | Exactly in 1 Dataset | Exactly in 2 Datasets | ≥ 3 Datasets |
---|---|---|---|
RW Entities | 339,818,971 (92.27%) | 21,497,165 (5.83%) | 6,979,109 (1.9%) |
Literals | 336,915,057 (88.88%) | 29,426,233 (7.77%) | 12,701,841 (3.35%) |
RW Triples | 1,811,576,438 (99.2%) | 10,300,047 (0.56%) | 4,348,019 (0.24%) |
RW Properties | 246,147 (99.37%) | 569 (0.23%) | 997 (0.4%) |
RW Classes | 542,549 (99.68%) | 1096 (0.2%) | 605 (0.11%) |
RW Subject-Object Pairs | 1,622,784,858 (97.18%) | 37,962,509 (2.27%) | 9,241,676 (0.55%) |
Category | Connected Pairs | Connected Triads | Disconnected Datasets (of 400) |
---|---|---|---|
Real World Entities | 9075 (11.3%) | 132,206 (1.24%) | 87 (21.75%) |
Literals | 62,266 (78%) | 4,917,216 (46.44%) | 3 (0.75%) |
Real World Triples | 4468 (5.59%) | 35,972 (0.33%) | 134 (33.5%) |
Real Subject-Object Pairs | 7975 (10%) | 107,083 (1%) | 129 (32.2%) |
Real World Properties | 19,515 (24.45%) | 569,708 (5.38%) | 25 (6.25%) |
Real World Classes | 4326 (5.42%) | 53,225 (0.5%) | 107 (26.7%) |
Datasets of subset B | Common RW Triples |
---|---|
1: {DBpedia,Yago,Wikidata} | 2,683,880 |
2: {Freebase,Yago,Wikidata} | 2,653,641 |
3: {DBpedia,Freebase,Wikidata} | 2,509,702 |
4: {DBpedia,Yago,Freebase} | 2,191,471 |
5: {DBpedia,Yago,Freebase,Wikidata} | 2,113,755 |
6: {DBpedia,Wikidata,VIAF} | 396,979 |
7: {bl.uk,DBpedia,Wikidata} | 92,462 |
8: {BNF,Yago,VIAF} | 52,420 |
9: {bl.uk,DBpedia,VIAF} | 24,590 |
10: {DBpedia,Wikidata,JRC-names} | 18,140 |
Position | Dataset | RW Entities in Datasets | Dataset | RW Triples in Datasets | Dataset | Literals in Datasets |
---|---|---|---|---|---|---|
1 | Wikidata | 4,580,412 | Wikidata | 4,131,042 | Yago | 9,797,331 |
2 | DBpedia | 4,238,209 | DBpedia | 3,693,754 | Freebase | 8,653,152 |
3 | Yago | 3,643,026 | Yago | 3,380,427 | Wikidata | 8,237,376 |
4 | Freebase | 3,634,980 | Freebase | 3,143,086 | DBpedia | 7,085,587 |
5 | VIAF | 3,163,689 | VIAF | 485,356 | VIAF | 3,907,251 |
6 | id.loc.gov | 2,722,156 | bl.uk | 125,484 | bl.uk | 1,819,223 |
7 | d-nb | 1,777,553 | bnf | 55,237 | GeoNames | 1,501,854 |
8 | bnf | 1,056,643 | JRC-names | 28,687 | id.loc.gov | 1,272,365 |
9 | bl.uk | 1,051,576 | Opencyc | 26,310 | bnf | 968,119 |
10 | GeoNames | 554,268 | LMDB | 20,465 | radatana | 957,734 |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Mountantonakis, M.; Tzitzikas, Y. High Performance Methods for Linked Open Data Connectivity Analytics. Information 2018, 9, 134. https://doi.org/10.3390/info9060134