DBpedia (from “DB” for “database”) is a project that aims to extract structured content from the information created in the Wikipedia project. This structured information is made available on the World Wide Web. [2] DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets. [3] Tim Berners-Lee described DBpedia as one of the most famous parts of the decentralized Linked Data effort. [4]

Background

The project was initiated by the Free University of Berlin and Leipzig University, in collaboration with OpenLink Software, [5] and the first publicly available dataset was published in 2007. It is made available under free licenses (CC-BY-SA), allowing others to reuse the dataset; it does not, however, use an open data license to waive the sui generis database rights.

Wikipedia articles consist mostly of free text, but they also include structured information such as “infobox” tables (the pull-out panels that appear in the top right of many articles, or near the start in the mobile versions), categorization information, images, geo-coordinates and links to external Web pages. This structured information is extracted and put in a uniform dataset which can be queried.
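As a rough illustration of this extraction step, the sketch below pulls "key = value" parameters out of a simplified infobox and emits them as (subject, property, value) tuples. The sample wikitext, the parse_infobox helper, and the regular expression are assumptions made for illustration. The actual DBpedia extraction code has to deal with nested templates, links, and units, all of which this sketch ignores.

```python
import re

# Hypothetical, simplified infobox wikitext (not copied from a real article).
SAMPLE_WIKITEXT = """{{Infobox person
| name = Ada Lovelace
| birthplace = London
| occupation = Mathematician
}}"""

# One "| key = value" parameter per line; nested templates are not handled.
PARAM_RE = re.compile(r"^\|[ \t]*(\w+)[ \t]*=[ \t]*(.*?)[ \t]*$", re.MULTILINE)


def parse_infobox(subject, wikitext):
    """Return (subject, property, value) tuples for non-empty parameters."""
    return [
        (subject, key, value)
        for key, value in PARAM_RE.findall(wikitext)
        if value
    ]


triples = parse_infobox("Ada_Lovelace", SAMPLE_WIKITEXT)
for triple in triples:
    print(triple)
```

Each tuple has the same three-part shape as an RDF statement, which is what makes a uniform, queryable dataset possible.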

Dataset

In September 2014, version 2014 was released. [6] Compared to previous versions, one of the main changes was the way abstract texts were extracted. By running a local mirror of Wikipedia and retrieving the rendered abstracts from it, the extracted texts became considerably cleaner. Furthermore, a new data set containing content extracted from Wikimedia Commons was introduced.

The whole DBpedia data set describes 4.58 million entities, out of which 4.22 million are classified in a consistent ontology, including 1,445,000 persons, 735,000 places, 123,000 music albums, 87,000 movies, 19,000 video games, 241,000 organizations, 251,000 species and 6,000 diseases. [7] The data set features labels and abstracts for these entities in up to 125 different languages, 25.2 million links to images and 29.8 million links to external web pages. In addition, it contains around 50 million links into other RDF datasets, 80.9 million links to Wikipedia categories, and 41.2 million YAGO2 categories. [7] The DBpedia project uses the Resource Description Framework (RDF) to represent the extracted information and consists of 3 billion RDF triples, 580 million extracted from the English edition of Wikipedia and 2.46 billion from other language editions. [7]

From this data set, information spread across multiple pages can be extracted; for example, book authorship can be put together from pages about the work or about the author.
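To make this concrete, here is a minimal sketch of combining facts that originate from different pages. The triples, the "author" and "birthPlace" property names, and the match helper are illustrative assumptions, not actual DBpedia data or a DBpedia API.

```python
# Toy triples standing in for facts extracted from separate pages.
# All values here are made up for illustration.
TRIPLES = [
    ("Book_A", "author", "Author_X"),      # from the page about one work
    ("Book_B", "author", "Author_X"),      # from the page about another work
    ("Book_C", "author", "Author_Z"),      # a work by a different author
    ("Author_X", "birthPlace", "City_Y"),  # from the author's own page
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def works_by_birthplace(triples, place):
    """Join author facts with work facts: works whose author was born in place."""
    authors = {s for s, _, _ in match(triples, predicate="birthPlace", obj=place)}
    return sorted(
        s for s, _, o in match(triples, predicate="author") if o in authors
    )


print(works_by_birthplace(TRIPLES, "City_Y"))
```

The join works because both kinds of page refer to the author by the same identifier, which is the role resource URIs play in the real data set.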

One of the challenges in extracting information from Wikipedia is that the same concept can be expressed using different parameters in infoboxes and other templates, such as |birthplace= and |placeofbirth=. Because of this, queries about where people were born would have to search for both of these properties in order to get more complete results. To address this, the DBpedia Mapping Language was developed to help map these properties to an ontology while reducing the number of synonyms. Due to the broad diversity of infoboxes and properties in use on Wikipedia, the process of developing and improving these mappings has been opened to public contributions. [8]
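The effect of such a mapping can be sketched as follows. This is not the DBpedia Mapping Language itself; the dictionary, the "birthPlace" target name, and the normalize helper are assumptions chosen only to show how synonyms collapse into one property.

```python
# Hypothetical mapping from raw infobox parameter names to one ontology
# property. The names on both sides are illustrative only.
PROPERTY_MAPPING = {
    "birthplace": "birthPlace",
    "placeofbirth": "birthPlace",
}


def normalize(raw_facts, mapping):
    """Rewrite raw (subject, parameter, value) facts using the mapping.

    Parameters without a mapping are kept under their raw name.
    """
    return [(s, mapping.get(p.lower(), p), v) for s, p, v in raw_facts]


raw = [
    ("Person_A", "birthplace", "City_X"),
    ("Person_B", "placeofbirth", "City_Y"),
]
normalized = normalize(raw, PROPERTY_MAPPING)

# A single property now covers both spellings.
born = {s: v for s, p, v in normalized if p == "birthPlace"}
print(born)
```

After normalization, a query about birthplaces only needs to ask for one property instead of every spelling in use.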

Examples

DBpedia extracts factual information from Wikipedia pages, so users can answer questions whose information is spread across several articles. Data is accessed using an SQL-like query language for RDF called SPARQL. For example, suppose you were interested in the Japanese shōjo manga series Tokyo Mew Mew, and wanted to find the genres of other works written by its illustrator. DBpedia combines information from Wikipedia’s entries on Tokyo Mew Mew, Mia Ikumi and on works such as Super Doll Licca-chan and Koi Cupid.

PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>
SELECT ?who ?work ?genre WHERE {
 db:Tokyo_Mew_Mew dbprop:author ?who .
 ?work dbprop:author ?who .
 OPTIONAL { ?work dbprop:genre ?genre } .
}
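A query like the one above is typically sent to a SPARQL endpoint over HTTP. The sketch below only builds the request URL and does not contact any server. It assumes the public endpoint is at https://dbpedia.org/sparql and that it accepts a "format" parameter for choosing the result type; both are assumptions to check against the endpoint's current documentation. The "query" parameter comes from the SPARQL protocol.

```python
from urllib.parse import urlencode

# Assumption: the public endpoint lives at this URL.
ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>
SELECT ?who ?work ?genre WHERE {
  db:Tokyo_Mew_Mew dbprop:author ?who .
  ?work dbprop:author ?who .
  OPTIONAL { ?work dbprop:genre ?genre } .
}
"""


def build_request_url(endpoint, query):
    """Return a GET URL carrying the query; no network access happens here."""
    params = {
        "query": query.strip(),
        # Assumption: the server honours this parameter for JSON results.
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL could then be fetched with any HTTP client. Rows where a work has no genre would still be returned, because that part of the pattern is wrapped in OPTIONAL.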

Use cases

DBpedia has a broad scope of entities covering different areas of human knowledge. This makes it a natural hub for connecting datasets, where external datasets could link to its concepts. [9] The DBpedia dataset is interlinked on the RDF level with various other Open Data datasets on the Web. This enables applications to enrich DBpedia data with data from these datasets. As of September 2013, there are more than 45 million interlinks between DBpedia and external datasets including Freebase, OpenCyc, UMBEL, GeoNames, MusicBrainz, CIA World Fact Book, DBLP, Project Gutenberg, DBtune Jamendo, Eurostat, UniProt, Bio2RDF, and US Census data. [10] [11] The Thomson Reuters OpenCalais initiative, the Linked Open Data project of the New York Times, the Zemanta API and DBpedia Spotlight also include links to DBpedia. [12] [13] [14] The BBC uses DBpedia to help organize its content. [15] [16] Faviki uses DBpedia for semantic tagging. [17] Samsung also includes DBpedia in its ”

Such a rich source of structured cross-domain knowledge is fertile ground for artificial intelligence systems. DBpedia was used as one of the knowledge sources in IBM Watson’s Jeopardy!-winning system. [18]

Amazon provides a DBpedia Public Data Set that can be integrated into Amazon Web Services applications. [19]

DBpedia Spotlight

In June 2010, researchers from the Web Based Systems Group at the Free University of Berlin started a project named DBpedia Spotlight, a tool for annotating mentions of DBpedia resources in text. This provides a solution for linking unstructured information sources to the Linked Open Data cloud through DBpedia. DBpedia Spotlight performs named entity extraction, including entity detection and name resolution (in other words, disambiguation). It can also be used for named entity recognition, amongst other information extraction tasks. Rather than focusing on a few entity types, DBpedia Spotlight aims to be customizable for many use cases.

DBpedia Spotlight is publicly available as a web service for testing purposes, or as a Java/Scala API licensed under the Apache License. The DBpedia Spotlight distribution also includes a jQuery plugin that allows developers to annotate pages anywhere on the web by adding one line to their page. [20] Clients are also available in Java and PHP. [21] The tool handles various languages through its demo page [22] and web services. Internationalization is supported for any language that has a Wikipedia edition. [23]
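The idea of annotating mentions can be sketched with a toy dictionary lookup. This is not how DBpedia Spotlight works internally and it performs no real disambiguation; the surface-form dictionary, the resource URIs in it, and the annotate helper are hypothetical and serve only to show what an annotation looks like.

```python
import re

# Hypothetical surface-form dictionary: text mention -> resource URI.
SURFACE_FORMS = {
    "berlin": "http://dbpedia.org/resource/Berlin",
    "free university of berlin": (
        "http://dbpedia.org/resource/Free_University_of_Berlin"
    ),
}


def annotate(text, surface_forms):
    """Return (offset, mention, uri) for non-overlapping matches in text."""
    # Try longer surface forms first so "Free University of Berlin"
    # wins over the shorter "Berlin" contained in it.
    forms = sorted(surface_forms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.group(0), surface_forms[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


text = "Researchers at the Free University of Berlin live in Berlin."
annotations = annotate(text, SURFACE_FORMS)
for annotation in annotations:
    print(annotation)
```

A real annotator also has to decide which resource an ambiguous mention refers to, which is the name resolution step described above.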

See also

  • BabelNet
  • Semantic MediaWiki
  • Wikidata

References

  1. “Dbpedia.org on Alexa”. Alexa Internet. Amazon.com. Retrieved 7 September 2016.
  2. Bizer, Christian; Lehmann, Jens; Kobilarov, Georgi; Auer, Sören; Becker, Christian; Cyganiak, Richard; Hellmann, Sebastian (September 2009). “DBpedia – A crystallization point for the Web of Data” (PDF). Web Semantics: Science, Services and Agents on the World Wide Web. 7 (3): 154-165. ISSN 1570-8268. doi:10.1016/j.websem.2009.07.002.
  3. “Komplett verlinkt – Linked Data” (in German). 3sat. 2009-06-19. Retrieved 2009-11-10.
  4. “Sir Tim Berners-Lee Talks with Talis about the Semantic Web”. Talis. 7 February 2008.
  5. “Credits”. DBpedia. Archived from the original on 21 September 2014. Retrieved 2014-09-09.
  6. “Changelog”. DBpedia. September 2014. Retrieved 9 September 2014.
  7. “DBpedia Version 2014 released”. DBpedia. Retrieved 9 September 2014.
  8. “DBpedia Mappings”. mappings.dbpedia.org. Retrieved 2010-04-03.
  9. E. Curry, A. Freitas, and S. O’Riain, “The Role of Community-Driven Data Curation for Enterprises,” in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25-47.
  10. “Statistics on links between Data sets”, SWEO Community Project: Linking Open Data on the Semantic Web, W3C, retrieved 2009-11-24.
  11. “Statistics on Data sets”, SWEO Community Project: Linking Open Data on the Semantic Web, W3C, retrieved 2009-11-24.
  12. Sandhaus, Evan; Larson, Rob (2009-10-29). “First 5,000 Tags Released to the Linked Data Cloud”. NY Times Blogs. Retrieved 2009-11-10.
  13. “Life in the Linked Data Cloud”. www.opencalais.com. Retrieved 2009-11-10. Quote: “Wikipedia has a Linked Data twin called DBpedia. DBpedia has the same structured information as Wikipedia – but translated into a machine-readable format.”
  14. “Zemanta talks Linked Data with SDK and commercial API”. blogs.zdnet.com. Archived from the original on 28 February 2010. Retrieved 2009-11-10. Quote: “Zemanta fully supports the Linking Open Data initiative. It is the first API that returns disambiguated entities linked to dbPedia, Freebase, MusicBrainz, and Semantic Crunchbase.”
  15. “European Semantic Web Conference 2009 – Georgi Kobilarov, Tom Scott, Yves Raimond, Silver Oliver, Chris Sizemore, Michael Smethurst, Christian Bizer and Robert Lee”. www.eswc2009.org. Archived from the original on 8 June 2009. Retrieved 2009-11-10.
  16. “BBC Learning – Open Lab – Reference”. bbc.co.uk. Archived from the original on 25 August 2009. Retrieved 2009-11-10. Quote: “Dbpedia is a database version of Wikipedia. It is used in a lot of projects for a wide range of different reasons. At the BBC we are using it for tagging content.”
  17. “Semantic Tagging with Faviki”. www.readwriteweb.com.
  18. David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager, Nico Schlaefer, and Chris Welty. “Building Watson: An Overview of the DeepQA Project”. AI Magazine, Fall 2010. Association for the Advancement of Artificial Intelligence (AAAI).
  19. “Amazon Web Services Developer Community: DBpedia”. developer.amazonwebservices.com. Retrieved 2009-11-10.
  20. Mendes, Pablo. “DBpedia Spotlight jQuery Plugin”. jQuery Plugins. Retrieved 15 September 2011.
  21. DiCiuccio, Rob. “PHP Client for DBpedia Spotlight”. GitHub.
  22. “Demo of DBpedia Spotlight”. Retrieved 8 September 2013.
  23. “Internationalization of DBpedia Spotlight”. Retrieved 8 September 2013.
