Querying Wikidata with SPARQL

During a conversation with friends, the question came up of which cities in the US are sister cities with which cities in Japan. How would you know? How would you find out?

Since I'm a programmer, it's tempting to whip out the Python and scrape Wikipedia. However, there's a much better option: using Wikidata.

Wikidata holds much of the same information as Wikipedia, but as structured data in a triple store.

Unlike a relational database, which stores rows in tables, a triple store saves data as triples, each representing a relationship: subject - predicate - object. And like a relational database, a triple store has a query language, in this case SPARQL.
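To make the subject - predicate - object idea concrete, here is a minimal sketch of a toy triple store in Python. It is only an illustration: the plain-English names and the `match` helper are made up for this example, and a real triple store such as Wikidata's works on coded identifiers and is queried with SPARQL.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# Names are plain English for readability; Wikidata uses coded identifiers.
TRIPLES = [
    ("New York City", "country", "United States"),
    ("New York City", "twinned with", "Tokyo"),
    ("Tokyo", "country", "Japan"),
]


def match(subject=None, predicate=None, obj=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


print(match(predicate="twinned with"))
# prints: [('New York City', 'twinned with', 'Tokyo')]
```

A SPARQL query is roughly a set of such patterns in which the wildcards are named variables like ?place.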

Unfortunately, for the sake of namespacing and localization, predicates (and items) are assigned alphanumeric codes, such as P190. It helps to keep a Wikidata tab open to look up predicate names.
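As a reading aid, the codes that appear in this post's query can be kept in a small lookup table. The sketch below is a hypothetical helper, not part of any Wikidata tooling: `LABELS` and `readable` are names invented here, and the labels are the English ones shown on Wikidata.

```python
# Identifiers used in this post, paired with their English labels on Wikidata.
LABELS = {
    "P17": "country",
    "P190": "twinned administrative body",
    "P279": "subclass of",
    "P625": "coordinate location",
    "P1082": "population",
    "Q17": "Japan",
    "Q30": "United States",
}


def readable(pattern: str) -> str:
    """Replace each known wd:/wdt: identifier in a triple pattern with its label."""
    words = []
    for token in pattern.split():
        prefix, _, code = token.partition(":")
        if prefix in ("wd", "wdt"):
            # Unknown codes are left as written rather than guessed.
            words.append(LABELS.get(code, token))
        else:
            words.append(token)
    return " ".join(words)


print(readable("?place wdt:P17 wd:Q30"))
# prints: ?place country United States
```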

Wikidata lets you write and run queries at query.wikidata.org. The web interface also comes with helpful visualization options: for example, in this exercise, adding #defaultView:Map plots geographic data on a world map.
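The same service can also be called from a script. The following is a rough sketch using only the Python standard library, assuming the public endpoint at https://query.wikidata.org/sparql, a `format=json` parameter, and the standard SPARQL JSON results layout. The function names and the User-Agent string are placeholders invented for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_url(query: str) -> str:
    """Encode a SPARQL query as a GET URL that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


def rows_from_json(data: dict) -> list[dict]:
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in data["results"]["bindings"]
    ]


def run_query(query: str, user_agent: str) -> list[dict]:
    """Send the query over HTTP and return the flattened rows."""
    request = urllib.request.Request(
        build_url(query), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as response:
        return rows_from_json(json.load(response))


# Example call (needs network access; the User-Agent value is a placeholder):
# rows = run_query(
#     "SELECT ?twin WHERE { wd:Q60 wdt:P190 ?twin }",  # Q60 is New York City
#     user_agent="sister-cities-example/0.1",
# )
```

The map view from #defaultView:Map is a feature of the web interface, so a script like this only gets the raw rows back.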

When writing these queries, it becomes painfully obvious how messy the real world is. In this example, finding "US cities with Japanese sister cities", the only well-defined terms are "United States" and "Japan". There is no predicate for sister cities, only the more general wdt:P190, "twinned administrative body". The term "city" is vague too: New York City and Tokyo are sister cities, but under the Japanese definition Tokyo is a metropolis (都), similar to a prefecture, not a city (市). The mismatch comes from differing government definitions of "city" and from how the data was entered into Wikidata.

This is what I came up with, learning SPARQL as I went along:

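```SPARQL
#defaultView:Map
SELECT DISTINCT ?usaPlace ?usaPlaceLabel ?geo ?jpnPlace ?jpnPlaceLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  {
    # Places whose country (P17) is the United States (Q30) and that have
    # a twinned administrative body (P190). P279* follows zero or more
    # "subclass of" links.
    ?usaPlace (wdt:P17/(wdt:P279*)) wd:Q30;
              wdt:P190 ?jpnPlace.
    ?usaPlace wdt:P1082 ?population.  # population, used for ordering
    ?usaPlace wdt:P625 ?geo.          # coordinates, plotted by the map view
    # Keep only twins whose country is Japan (Q17).
    ?jpnPlace (wdt:P17/(wdt:P279*)) wd:Q17.
  }
}
ORDER BY DESC(?population)
```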

I ended up querying for places in general rather than strictly cities, because the broader query gave interesting results.

