Wikidata:Request a query

Request a query

This is a page where SPARQL queries [Q114898838] can be requested. Please provide feedback if a query is written for you.

You can also request help to rewrite queries that don't work anymore, due to the WDQS graph split.

For sample queries, see Examples and Help:Dataset sizing. Property talk pages include also summary queries for these.

For help writing your own queries, or other questions about queries, see Wikidata talk:SPARQL query service/queries and Wikidata:SPARQL query service/query optimization.

Help resources about Wikidata Query Service (Q20950365) and SPARQL: Wikidata:SPARQL query service/Wikidata Query Help and Category:SPARQL.

To report an issue about the Query Service (interface, results views, export...) please see Wikidata:Contact the development team/Query Service and search.

Translate this header box!

Start a new discussion

On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/11.

Data matching and superset of WikiBio data set

I would like to download the full article text and other metadata of the same pages as the WikiBio data set https://paperswithcode.com/dataset/wikibio both when it was created -- 2016 with 728,000 records and today with however many there are now. the crucial thing that seems to be missing from the Examples is how to specify that I need the full text of the relevant article.

Query about museum's collection

Hello everyone! I'm looking for a query that can identify all the elements of a wikimedia commons category or a museum collection (that can also capture data such as upload date, license, etc)!

Query a random set of popular entries

I am trying to get a random set of entries from Wikidata that are "popular".

I define popular as having more than 20 site links. And I am interested in all kinds of entries (humans, cats, ... all, there are "filters" to avoid timeouts, but I want to remove them).

So far, this is what I have:

 SELECT ?item ?itemLabel ?sitelinks ?random WHERE {
 
   #Filters tried to avoid Timeout
   #?item wdt:P31 wd:Q146. #Cat, it works
   #?item wdt:P31 wd:Q11762356 . #Valley Glacier, it works
   #?item wdt:P31 wd:Q5 . #Humans, it gives a timeout!
 
   #Minimum sitelinks
   ?item wikibase:sitelinks ?sitelinks .
   FILTER (?sitelinks > 20).
 
   #Random stuff
   BIND(SHA512(CONCAT(STR(RAND()), STR(?item))) AS ?random) .
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en" . }
 
 } ORDER BY ?random
 LIMIT 100
 # jsadhflkjsd, change this before each run to bypass WDQS cache

I've got the code for the randomization from here. And I do believe it works.

However, I have issues with timeouts. It looks like it is trying to fetch all the entries that follow the "where" and then limit them to a 100, which is not feasible, obviously.

How should I do it? I'm quite lost at this point.

Yes, the query tries to fetch all matching items before limiting to 100 results. That is necessary to do the sorting. You asked for the 100 items with the lowest sorted values of ?random. To find these, it has to look at all values. Besides that, you can speed up the query by using a rangesafe hint:

SELECT ?item ?itemLabel ?sitelinks ?random
WHERE
{
  #Minimum sitelinks
  ?item wikibase:sitelinks ?sitelinks . hint:Prior hint:rangeSafe true .
  FILTER (?sitelinks > 20 )

  #Random stuff
  BIND(SHA512(CONCAT(STR(RAND()), STR(?item))) AS ?random) .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en" . }
}
LIMIT 100

Try it!

--Dipsacus fullonum (talk) 11:33, 7 April 2022 (UTC)[reply]

That works like a charm! Thanks for that superfast response! Guplem (talk) 11:44, 7 April 2022 (UTC)[reply]

One question: if you remove the "order by", the "randomness" of the Wikidata service will have to be trusted. But I don't even know if it is random! Is it?

Where can I find information about it? It looks random, but I don't know if it will start to fetch always the same group of entries. Guplem (talk) 11:48, 7 April 2022 (UTC)[reply]

The unsorted order may not be very random, I don't know for sure but that's a reasonable assumption. I tweaked the query some more with that in mind, mainly by calling the label service only for the final 100 items. Also RAND() seems to be sufficiently random on its own so that also helps speed things up. The only time you'll see the same items pop up twice in a row is where you re-run an unmodified query, then it will just present you with the results from the previous run. Infrastruktur (talk) 12:31, 7 April 2022 (UTC)[reply]

SELECT ?item ?itemLabel ?sitelinks
WITH { SELECT ?item ?sitelinks WHERE {
  #Minimum sitelinks
  ?item wikibase:sitelinks ?sitelinks . hint:Prior hint:rangeSafe true .
  FILTER (?sitelinks > 20 )

  #Random stuff
  #BIND(RAND() AS ?random) . # Doesn't work
  BIND(SHA512(CONCAT(STR(RAND()), STR(?item))) AS ?random) .
} ORDER BY ?random LIMIT 100 } AS %i
WHERE {
  INCLUDE %i
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en" . }
}

Try it!

@Guplem, Infrastruktur: I think the original use of a hash function is necessary. The RAND() function is only called once, so all items get the same value in Infrastruktur's query as you can see if you add ?random to the two SELECT clauses. You will also see that all 100 results have very low Q-numbers, indicating the selected items are not random at all. There are about 439000 items with over 20 sitelinks, and it seems that the modified query with ordering and limiting in a subquery before finding labels, can run in 30-40 seconds with the original calculation of ?random for a more random result. --Dipsacus fullonum (talk) 12:57, 7 April 2022 (UTC)[reply]

Nice catch. As for finding out for sure what the randomness of unordered results might be, that's implementation specific. So if you can't find it in the Blazegraph wiki, try to search Google Scholar for open-access papers done with Blazegraph. One would expect such a question to pop up in the method section of papers. Infrastruktur (talk) 13:12, 7 April 2022 (UTC)[reply]

I am struggling to follow it all. Using the old random method should work? I am still working on the query and right now, I have this.

It looks random to me, but on my end it looks like the filters are adding quite some work and I can't make it not time out if I go under 30 site links (the original target was 20, but it was arbitrary). I am wondering if there is a better way of filtering out undesired entries. Guplem (talk) 13:35, 7 April 2022 (UTC)[reply]

@Guplem: Filters often make a query slow, and with so many filters I am not surprised for timeouts for lower values of ?sitelinks than 30. I suggest first finding, say, 1000 random items with over 20 sitelinks, and then select 100 of these that passes your filters. That way you don't need to apply the filter to 100 of thousands of items. The prize is there will be less than 100 results in some runs. You can adjust the initial number 1000 to weigh that risc against additional used time for higher numbers. The query will look like this:

SELECT ?item ?itemLabel ?itemDescription ?sitelinks
WITH
{
  SELECT ?item ?sitelinks
  WHERE
  {
    #Minimum sitelinks
    ?item wikibase:sitelinks ?sitelinks.
    hint:Prior hint:rangeSafe true.
    FILTER (?sitelinks > 20 )
  
    #Random stuff
    # BIND(RAND() AS ?random) . # Using this makes it not random
    BIND(SHA512(CONCAT(STR(RAND()), STR(?item))) AS ?random) 
  }
  ORDER BY ?random
  LIMIT 1000
} AS %subquery1
WITH
{
  SELECT ?item ?sitelinks
  WHERE
  {
    INCLUDE %subquery1

    #Filters to remove undesired entries (templates, categories, ...)
    FILTER NOT EXISTS {?item wdt:P31 wd:Q11266439}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q97950663}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q4167836}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q59541917}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q14204246}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q19842659}
    FILTER NOT EXISTS {?item wdt:P373 ?commonsCategory}
    FILTER NOT EXISTS {?item wdt:P301 ?categoryMainTopic}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q15184295}
    FILTER NOT EXISTS {?item wdt:P1423 ?templateHasTopic}
    FILTER NOT EXISTS {?item wdt:P910 ?topicMainCategory}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q20010800}
    FILTER NOT EXISTS {?item wdt:P360 ?isAListOf}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q108783631}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q11753321}
    FILTER NOT EXISTS {?item wdt:P4224 ?categoryContains}
    FILTER NOT EXISTS {?item wdt:P971 ?categoryCombinesTopics}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q97303168}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q59259626}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q110010043}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q1474116}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q15647814}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q19887878}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q107344376}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q36330215}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q14296}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q42032}
    FILTER NOT EXISTS {?item wdt:P2370 ?conversionToSIUnit}
    FILTER NOT EXISTS {?item wdt:P31 wd:Q4167410}
    #FILTER NOT EXISTS {?item wdt:P31 wd:aaa}
  }
  LIMIT 100
} AS %subquery2
WHERE 
{
  INCLUDE %subquery2
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}

Try it!

--Dipsacus fullonum (talk) 14:34, 7 April 2022 (UTC)[reply]

That's smart! That works like a charm and there is no issue on receiving less than 100 results. Thank you very much!!!! Guplem (talk) 15:43, 7 April 2022 (UTC)[reply]

Yes, there is an issue on receiving less than 100 results. If the selection truly is random, then there is even a risc (although with very low probability) that 1000 unwanted items are selected in the first subquery. --Dipsacus fullonum (talk) 15:57, 7 April 2022 (UTC)[reply]

Get all Kind of food of a given language

Hi there, since I am not a programmer, I have laboriously worked out the query. For other queries with similar problems it works, but here I run into a timeout. I try to get all kind of food for german.

 construct{
 ?subclass  ?rpredicate  ?entity.
 ?entity ?predicate ?subclass}
 WHERE {
 ?subclass (wdt:P279*) wd:Q2095.
 { ?entity ?predicate ?subclass. 
 }
 UNION
 { ?subclass ?rpredicate ?entity. 
 }}

The query above is trying to get all triplets where either subject or object is a subclass of food. What is probably many millions triplets, so the query will timeout. Please tell more precisely what you try to query about. We cannot help without knowing the intention. --Dipsacus fullonum (talk) 18:11, 10 April 2022 (UTC)[reply]

I wanted to create a list that includes all types of food (ID are sufficient). Obviously there are too many. But I have no idea how to reduce the amount or break them down into smaller parts. Maybe in the first step it is possible to determine only the all (sub)classes and in the second step for each of the classes the individual members. But I do not know how. Depending on how many classes there are you can't do that by hand anymore. 87.181.42.168 19:51, 11 April 2022 (UTC)[reply]

The query below will list all items that a subclass of food (Q2095) or an instance of a such subclass. First column is the item name, and second column is the item's German label if it exists. Is that fine, or should it be changed?

SELECT DISTINCT ?item ?itemLabel
WHERE
{
  ?item wdt:P31? / wdt:P279* wd:Q2095 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de" . }
}

Try it!

--Dipsacus fullonum (talk) 20:44, 11 April 2022 (UTC)[reply]

Thank you very much Dipsacus fullonum. That is exactly what I am looking for. And it is still extremely fast 87.181.41.160 06:55, 15 April 2022 (UTC)[reply]

List the names of the guests of honour at a convention

For Illumination (Q111529504) return guest of honor (P967) as labels, ie "Paul J. McAuley" and "Geoff Ryman" as separate array items in a json file. This in addition to single value items such as dates, urls, and locationLabel which I can get

I'm surprised extracting multiple values of a property isn't covered in the Examples page, but I couldn't see one. Vicarage (talk) 11:28, 11 April 2022 (UTC)[reply]

SELECT ?gohLabel ?start ?locLabel ?countryLabel ?website
WHERE {
  VALUES ?aconv { wd:Q111529504 }
  ?aconv wdt:P967 ?goh.
  OPTIONAL { ?aconv wdt:P580 ?start. }
  OPTIONAL { ?aconv wdt:P17 ?country. }
  OPTIONAL { ?aconv wdt:P276 ?loc. }
  ?conv ^wdt:P31 ?aconv;
    wdt:P856 ?website.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}

Try it!

There's a "download" button in the middle right of the screen which lets you grab the table in json format.

The data on the conventions is a bit lacking, but you can help by adding more precise information. There's a page at https://eastercon.fandom.com/wiki/History which lists more precise locations and you can use either preferably Fandom article ID (P6262) (consisting of wikiname + ":" + pagename) or official website (P856) to link to the convention pages themselves.

Extracting multiple values is no different from single values. If the triple you match against have no restrictions that would prevent multiple matches, it will do so. Using the "optional" keyword usually takes care of this. Infrastruktur (talk) 13:17, 11 April 2022 (UTC)[reply]

I know about that first page, I wrote it!

The json I get is with your query is a table with information repeated (sorry about the formatting, pre, nowiki etc don't seem to work here

{

"gohLabel": "Paul J. McAuley",

"start": "1992-01-01T00:00:00Z",

"countryLabel": "United Kingdom",

"website": "http://www.eastercon.org/"

},

{

"gohLabel": "Geoff Ryman",

"start": "1992-01-01T00:00:00Z",

"countryLabel": "United Kingdom",

"website": "http://www.eastercon.org/"

}

]

but what I wanted was the GoHLabels as a json array, separate from the date and country, more like

{

"start": "1992-01-01T00:00:00Z",

"countryLabel": "United Kingdom",

"website": "http://www.eastercon.org/",

"GoHs": [

{

"Label": "Paul J. McAuley"

},

{

"Label": "Geoff Ryman"

}

]

}

Vicarage (talk) 13:49, 11 April 2022 (UTC)[reply]

Doubt that can be done without programming. However you can use aggregation to combine the GOHs into a list separated by for instance semicolon.

SELECT
  (GROUP_CONCAT(?gohLabel; SEPARATOR="; ") AS ?guestofhonor)
  (SAMPLE(?start) AS ?start)
  (SAMPLE(?locLabel) AS ?locLabel)
  (SAMPLE(?countryLabel) AS ?countryLabel)
  (URI(CONCAT("https://community.fandom.com/index.php?title=w:c:", SAMPLE(?faid))) AS ?article)
WHERE {
  VALUES ?aconv { wd:Q111529504 }
  OPTIONAL { ?aconv wdt:P967 ?goh. }
  OPTIONAL { ?aconv wdt:P580 ?start. }
  OPTIONAL { ?aconv wdt:P17 ?country. }
  OPTIONAL { ?aconv wdt:P276 ?loc. }
  OPTIONAL { ?aconv wdt:P6262 ?faid. }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
    ?goh rdfs:label ?gohLabel.
    ?loc rdfs:label ?locLabel.
    ?country rdfs:label ?countryLabel.
  }
}
GROUP BY ?aconv

Try it!

Infrastruktur (talk) 14:22, 11 April 2022 (UTC)[reply]

I've managed to work with that one. Thanks for your help. Vicarage (talk) 12:17, 12 April 2022 (UTC)[reply]

Oh, and let's know if you need a similar list covering all the Eastercons. Infrastruktur (talk) 13:40, 11 April 2022 (UTC)[reply]

I also wrote https://fancyclopedia.org/Eastercon, I'm trying to formalise the information in wikidata, and expand to other cons worldwide. Vicarage (talk) 13:53, 11 April 2022 (UTC)[reply]

Trying to remove duplicate authors in scientific articles

There's a bot that has caused some issues so long ago on so many items, I presume the only fix is to repair not restore or undo. I have found numerous items where the bot has added author (P50), but not deleted the author name string (P2093) and later ORCIDator efforts have duplicated the author values. See Search for Dark Matter and Large Extra Dimensions in pp Collisions Yielding a Photon and Missing Transverse Energy (Q56742969) as an example. There are many author (P50) values that are entered twice with one listing both object named as (P1932) and series ordinal (P1545) and again with only series ordinal (P1545). I recall there was a way to reorganize values according to the series ordinal (P1545). If I could do that, I could see what was listed twice rather because they'd follow one another rather than having to search each author. If that's not so easy (I'm not that clever with programming), maybe I can run a query that lists both author (P50) and author name string (P2093) and I can manipulate the csv file which I am clever at doing.

So, for the query, can I get all the author (P50) and author name string (P2093) values for Q56742969? Thanks in advance. Trilotat (talk) 20:51, 11 April 2022 (UTC)[reply]

I tried doing the obvious thing making a query that looked for exact matches, however even with the limited amount of author names in that item it made it cry uncle. This is still the method I would recommend though, however you need to do things in several steps. You'll need to grab the list of authors first, then comment out that paragraph and uncomment the next paragraph in the query below to grab a list of author strings. Save both lists to CSV files. Then the third and final step would be to use a tool to compare the sets of strings to find exact matches. There are several tools you could use, but I suspect a spreadhsheet application or OpenRefine would be the easiest ones. Other tools include awk, arq (jena) or any scripting language. Infrastruktur (talk) 12:40, 12 April 2022 (UTC)[reply]

Thank you!! Trilotat (talk) 14:46, 12 April 2022 (UTC)[reply]

SELECT ?authLabel ?authstr ?ord
WHERE {
  VALUES ?item { wd:Q56742969 }

  # Run first, then comment out paragraph
  ?item p:P50 ?auth_st.
  ?auth_st ps:P50 ?auth.
  OPTIONAL { ?auth_st pq:P1545 ?ord. }

  # Run second, uncomment paragraph after first run is complete
  #?item p:P2093 ?authstr_st.
  #?authstr_st ps:P2093 ?authstr.
  #OPTIONAL { ?authstr_st pq:P1545 ?ord. }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY xsd:integer(?ord)

Try it!

@Trilotat, Infrastruktur: This query will find repeated use of the same series ordinal (P1545) used as qualifiers in either author (P50) or author name string (P2093).

SELECT (GROUP_CONCAT(?authLabel; separator="; ") AS ?author_labels)
       (GROUP_CONCAT(?authstr; separator="; ") AS ?author_strings)
       ?ord (COUNT(?ord) AS ?count)
WHERE
{
  {
    VALUES ?item { wd:Q56742969 }
    ?item p:P50 ?auth_st .
    ?auth_st ps:P50 ?auth .
    ?auth_st pq:P1545 ?ord .
  }
  UNION
  {
    VALUES ?item { wd:Q56742969 }
    ?item p:P2093 ?authstr_st .
    ?authstr_st ps:P2093 ?authstr .
    ?authstr_st pq:P1545 ?ord .
  }
  SERVICE wikibase:label
  {
    bd:serviceParam wikibase:language "en" .
    ?auth rdfs:label ?authLabel . 
  }
}
GROUP BY ?ord
HAVING (?count > 1)
ORDER BY xsd:integer(?ord)

Try it!

--Dipsacus fullonum (talk) 13:58, 12 April 2022 (UTC)[reply]

This was perfect! Thank you. I've bookmarked the query and will reuse as needed! Awesome. Now, on to the next one... Thanks again, Trilotat (talk) 14:48, 12 April 2022 (UTC)[reply]

Help get the dates if available

In my query

# Problem getting start and end year if they exist
SELECT DISTINCT ?mandal ?mandalLabel  ?startyear ?endyear WHERE {
  ?mandal (wdt:P31/(wdt:P279*)) wd:Q817477;
    p:P131 ?districtnode.
  ?districtnode ps:P131 wd:Q15341.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "te,en". }
  OPTIONAL {
     ?districtnode pq:P580 ?startdate. 
     ?districtnode pq:P582 ?enddate. 
  }
  BIND(IF(BOUND(?startdate),year(?startdate),0) as ?startyear)
  BIND(IF(BOUND(?enddate),year(?enddate),0) as ?endyear)     
}
ORDER BY (?mandalLabel)

Try it!

,

I am unable to print the start year and end year for a statement of P131 if available. Can you help fix the query? Thanks.

Arjunaraoc (talk) 02:08, 12 April 2022 (UTC)[reply]

Your optional clause will match only districts that have both a start and an end date. To fix it put "optional" around both the line with the start date and around the line with the end date.

SELECT DISTINCT
  ?mandal ?mandalLabel (YEAR(?startdate) AS ?startyear) (YEAR(?enddate) AS ?endyear)
WHERE {
  ?mandal wdt:P31/wdt:P279* wd:Q817477;
    p:P131 ?districtnode.
  ?districtnode ps:P131 wd:Q15341.
  OPTIONAL { ?districtnode pq:P580 ?startdate. } 
  OPTIONAL { ?districtnode pq:P582 ?enddate. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "te,en". }
}
ORDER BY (?mandalLabel)

Try it!

--Infrastruktur (talk) 11:22, 12 April 2022 (UTC)[reply]

@Infrastruktur, Thanks a lot.--Arjunaraoc (talk) 15:55, 12 April 2022 (UTC)[reply]

Requesting the value of a qualifier that may or may not exist

SELECT ?entity ?desc (year(?date) as ?year) ?work {
  wd:Q238702 p:P166 [ps:P166 ?entity ; pq:P585 ?date ; pq:P1686 ?work1]
  OPTIONAL { ?entity rdfs:label ?desc FILTER((LANG(?desc)) = "en") }
  OPTIONAL { ?work1 rdfs:label ?work FILTER((LANG(?work)) = "en") }
} ORDER BY ?year

Try it!

Returns the one award Brian Aldiss won where the work was named

SELECT ?entity ?desc (year(?date) as ?year){
  wd:Q238702 p:P166 [ps:P166 ?entity ; pq:P585 ?date]
  OPTIONAL { ?entity rdfs:label ?desc FILTER((LANG(?desc)) = "en") }
} ORDER BY ?year

Try it!

Returns a dozen works

How can I get both, so I get a table for the dozen awards, with one of them with the name of the book? My banal questions are because all the examples seem to go straight from trivial to showing-off, a vertical learning curve Vicarage (talk) 17:02, 14 April 2022 (UTC)[reply]

@Vicarage: You can use OPTIONAL sections so possible results are not discarded if some data, like here a qualifier for for work (P1686), isn't present. Your queries above use the so called abbreviated blank node syntax (in lines like wd:Q238702 p:P166 [ps:P166 ?entity ; pq:P585 ?date]). In order to use OPTIONAL sections, you need to first split these to use simple triplets, and explicit name a variable for the statement nodes which was only implicit in the abbreviated syntax. So my query below introduces the variable ?award_statement which is used in the OPTIONAL section to get values for the work. I also simplify how to get English labels by using the label service.

SELECT ?entity ?entityLabel (YEAR(?date) as ?year) ?workLabel
{
  wd:Q238702 p:P166 ?award_statement .
  ?award_statement ps:P166 ?entity .
  OPTIONAL { ?award_statement pq:P585 ?date . }
  OPTIONAL { ?award_statement pq:P1686 ?work . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
ORDER BY ?year

Try it!

--Dipsacus fullonum (talk) 18:02, 14 April 2022 (UTC)[reply]

Thanks. I managed to use IF(BOUND) to force nulls so the Mediawiki ExternalData keeps the work array in step with the other awards Vicarage (talk) 08:13, 15 April 2022 (UTC)[reply]

help

Creating query for lebanese politicians who have missing articles in english. Im struggling to list out the images of them Prodrummer619 (talk) 17:44, 14 April 2022 (UTC)[reply]

SELECT ?item (COUNT(distinct ?sitelink) as ?count)  WHERE {    ?item wdt:P106 wd:Q82955 .   ?item wdt:P27 wd:Q822 .   ?sitelink schema:about ?item .   FILTER NOT EXISTS { ?wen schema:about ?item . ?wen schema:inLanguage "en" } }GROUP BY ?item ORDER BY DESC(?count)

Try it!

@Prodrummer619: You can use an optional section for getting images to your query, e.g. OPTIONAL { ?item wdt:P18 ?image . } You will need to also add the image variable to the GROUP BY clause, or use an aggregation function (typical SAMPLE) in the SELECT clause.

BTW. The query does find items without an article in English, but it doesn't test what type of article. So any English article e.g. in Wikipedia, Wikiquote or Wikisource will remove an item from results. Is that really your intention? --Dipsacus fullonum (talk) 18:19, 14 April 2022 (UTC)[reply]

Works now! And yes, that's not my intention, however I'm not prioritizing that right now. Thank you for the response. Prodrummer619 (talk) 18:45, 14 April 2022 (UTC)[reply]

Query to produce the following output

Starting from usernames and using the username, usercontributions and the global account I would like to be able to produce the following table at any given point in time via a query:

User	UserContributions	DateFirstEdit	TotalEdits	n_enwikiEdits	n_CommonsEdits	n_WikidataEdits	GlobalAccount	DateLastEdit	DaysEditing
MargaretRDonald	https://en.wikipedia.org/wiki/Special:Contributions/MargaretRDonald	2017-12-07T00:00:00Z	535225	33802	18443	472570	https://en.wikipedia.org/wiki/Special:CentralAuth/MargaretRDonald	2022-04-13T00:00:00Z	1588
Oronsay	https://en.wikipedia.org/wiki/Special:Contributions/Oronsay	2014-10-27T00:00:00Z	238065	75561	6254	154229	https://en.wikipedia.org/wiki/Special:CentralAuth/Oronsay	2022-04-13T00:00:00Z	2725
Cuyahoga44	https://en.wikipedia.org/wiki/Special:Contributions/Cuyahoga44	2019-07-22T00:00:00Z	11	11	0	0	https://en.wikipedia.org/wiki/Special:CentralAuth/Cuyahoga44	2022-04-07T00:00:00Z	990
DrPlantGenomics	https://en.wikipedia.org/wiki/Special:Contributions/DrPlantGenomics	2019-07-25T00:00:00Z	1272	1247	0	0	https://en.wikipedia.org/wiki/Special:CentralAuth/DrPlantGenomics	2022-04-13T00:00:00Z	993

It would be good to be able to do it via a query and not by error-prone hand. MargaretRDonald (talk) 04:08, 16 April 2022 (UTC)[reply]

@MargaretRDonald: You can get editcounts for all WikiMedia projects and a total count from the meta=globaluserinfo API query (e.g. like this) which can be integrated into a SPARQL query. But is DateFirstEdit, DateLastEdit and DaysEditing for a certain wiki (like e.g. the English Wikipedia) or the first and last edit on any MediaWiki project? If it is the former, then that wiki can be queried, but if it the latter, I don't see an easy solution. --Dipsacus fullonum (talk) 07:02, 16 April 2022 (UTC)[reply]

@Dipsacus fullonum: I'd be happy if it was the first and last edit on enwiki and to forget other projects.. Although later I would love to do the same for edits on wikicommons. MargaretRDonald (talk) 07:15, 16 April 2022 (UTC)[reply]

@MargaretRDonald: Here you are:

SELECT
  ?User
  (URI(CONCAT("https://en.wikipedia.org/wiki/Special:Contributions/", ?User)) AS ?UserContributions)
  ?DateFirstEdit ?TotalEdits ?n_enwikiEdits ?n_CommonsEdits ?n_WikidataEdits
  (URI(CONCAT("https://en.wikipedia.org/wiki/Special:CentralAuth/", ?User)) AS ?GlobalAccount)
  ?DateLastEdit
  (xsd:dateTime(?DateLastEdit) - xsd:dateTime(?DateFirstEdit) AS ?DaysEditing)
WHERE
{
  VALUES ?User { "MargaretRDonald" "Oronsay" "Cuyahoga44" "DrPlantGenomics" }
  SERVICE wikibase:mwapi
  {
    # Get edit counts from globaluserinfo API call.    

    # Best supported service is Generator. We don't need or use the generator output,
    # but can construct the API call to also include the meta=globaluserinfo info
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "random" . # This generator doesn't make much output, and doesn't require additional parameters

    # What we really want here:
    bd:serviceParam mwapi:meta "globaluserinfo" . 
    bd:serviceParam mwapi:guiuser ?User .
    bd:serviceParam mwapi:guiprop "merged|editcount" .

    # XPath expressions to extract the results:
    ?TotalEdits wikibase:apiOutput "/api/query/globaluserinfo/@editcount" .
    ?n_enwikiEdits wikibase:apiOutput "/api/query/globaluserinfo/merged/account[@wiki='enwiki']/@editcount" .
    ?n_CommonsEdits wikibase:apiOutput "/api/query/globaluserinfo/merged/account[@wiki='commonswiki']/@editcount" .
    ?n_WikidataEdits wikibase:apiOutput "/api/query/globaluserinfo/merged/account[@wiki='wikidatawiki']/@editcount" .
    bd:serviceParam wikibase:limit "once" .
  }
  SERVICE wikibase:mwapi
  {
    # Get dates for first edit at enwiki using the usercontribs API call.

    # Best supported service is Generator. We don't need or use the generator output,
    # but can construct the API call to also include the list=usercontribs info
    bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "random" . # This generator doesn't make much output, and doesn't require additional parameters
    
    # What we really want here:
    bd:serviceParam mwapi:list "usercontribs" . 
    bd:serviceParam mwapi:ucuser ?User .    
    bd:serviceParam mwapi:ucdir "newer" . # Start with oldest edit
    bd:serviceParam mwapi:ucprop "timestamp" . # Get timestamp
    bd:serviceParam mwapi:uclimit "1" .
    
    # XPath expressions to extract the result:
    ?DateFirstEdit wikibase:apiOutput "/api/query/usercontribs/item/@timestamp" .
    bd:serviceParam wikibase:limit "once" .
  }
  SERVICE wikibase:mwapi
  {
    # Get dates for last at enwiki using the usercontribs API call.

    # Best supported service is Generator. We don't need or use the generator output,
    # but can construct the API call to also include the list=usercontribs info
    bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "random" . # This generator doesn't make much output, and doesn't require additional parameters
    
    # What we really want here:
    bd:serviceParam mwapi:list "usercontribs" . 
    bd:serviceParam mwapi:ucuser ?User .    
    bd:serviceParam mwapi:ucdir "older" . # Start with newest edit
    bd:serviceParam mwapi:ucprop "timestamp" . # Get timestamp
    bd:serviceParam mwapi:uclimit "1" .
    
    # XPath expressions to extract the result:
    ?DateLastEdit wikibase:apiOutput "/api/query/usercontribs/item/@timestamp" .
    bd:serviceParam wikibase:limit "once" .
  }
}

Try it!

--Dipsacus fullonum (talk) 13:15, 16 April 2022 (UTC)[reply]

That is beautiful and amazing @Dipsacus fullonum: I have now tried it with 33 names in the value list and it works perfectly. MargaretRDonald (talk) 21:21, 16 April 2022 (UTC)[reply]

Unable to filter the results

Hi guys, I am trying to query all diseases with a symptom of "headache", the following is my query:

SELECT ?item ?itemLabel ?symptomsLabel WHERE {

 ?item wdt:P31 wd:Q12136 .
 ?item wdt:P780 ?symptoms .
 FILTER (REGEX(?symptoms, "headache"))
 SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }

}

But the query returns me 0 record, can I know which part I did wrong?

Hi, your filter gives 0 results because ?symptoms will be bound to IRI values for items, and they doesn't contain any strings. For the appraoch with a REGEX filter to work, you would need an extra triple going from the symptoms item to either its label, description or aliases in some specific language. But that would be very inefficient. I propose to directly match the item for headache, headache (Q86):

SELECT ?item ?itemLabel
WHERE
{
 ?item wdt:P31 wd:Q12136 .
 ?item wdt:P780 wd:Q86 .
 SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Try it!

--Dipsacus fullonum (talk) 05:53, 18 April 2022 (UTC)[reply]

Thanks for the help!

By the way, if you know any query, where I can get the ID, for example: Q86, from using the "Heart disease" string?

I am learning to write query without knowing the item's ID, which I will query from HTTP request rather than from the wikidata query service where we can press CTRL + SPACE to get the ID.

Thanks in advanced. 161.139.102.162 08:10, 18 April 2022 (UTC)[reply]

I am not sure if I understand the request. But you can find any items which have the string "Heart disease" as its English label with this query:

SELECT ?item ?itemLabel ?instance_ofLabel
WHERE
{
  ?item rdfs:label "Heart disease"@en .
  ?item wdt:P31 ?instance_of .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Try it!

Note the @en suffix to the string giving the language code for English. Nothing is found if you omit a language code as all labels are language coded. I added the corresponding values for instance of (P31) in the query so you can see that the found items are all scholarly articles. The disease isn't found because the English label for the item heart disease (Q190805)) is in lowercase as recommended in Help:Label#Capitalization. --Dipsacus fullonum (talk) 08:58, 18 April 2022 (UTC)[reply]

Yes, you did understand the request and manage to solve it.

Really appreciate the help! 161.139.102.162 09:13, 18 April 2022 (UTC)[reply]

Last revision

How to get output item and ID of the item's last revision? Thanks. Игорь Темиров (talk) 11:49, 18 April 2022 (UTC)[reply]

@Игорь Темиров:

#title: Get revision ID for last revision of some items
SELECT ?item ?revid
WHERE
{
  VALUES ?item_strings { "Q2" "Q5" }
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Generator" .
    bd:serviceParam mwapi:generator "revisions" .
    bd:serviceParam mwapi:prop "revisions" . 
    bd:serviceParam mwapi:titles ?item_strings .
    ?item wikibase:apiOutputItem mwapi:title .
    ?revid wikibase:apiOutput "revisions/rev/@revid" .
  }
}

Get revision ID for last revision of some items

--Dipsacus fullonum (talk) 12:54, 18 April 2022 (UTC)[reply]

Bravo! Thank you very much! Игорь Темиров (talk) 17:03, 18 April 2022 (UTC)[reply]

two or more pictures in one item via one property

We want to have a list, containing all Upper Lusatian house (Q1362233) like f.e. Horní Podluží 56 (Q111523584) containing two images, because we want to restructure our data :) . Regards, thank you very much! Conny (talk) 06:48, 20 April 2022 (UTC)[reply]

@Conny: Counts:

SELECT ?item ?itemLabel (count(?stat) as ?count ) WHERE {
  ?item wdt:P31 wd:Q1362233.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,cs,en" . }
  
  ?item p:P18 ?stat.
  ?stat ps:P18 ?img.
} group by ?item ?itemLabel having (?count>1) order by desc(?count)

Try it!

image pairs - note some items will have multiple rows

SELECT ?item ?itemLabel ?img ?img2  WHERE {
  ?item wdt:P31 wd:Q1362233.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de,cs,en" . }
  
  ?item p:P18 ?stat.
  ?stat ps:P18 ?img.
  ?item p:P18 ?stat2.
  ?stat2 ps:P18 ?img2.
  filter(str(?stat) < str(?stat2))
}

Try it!

--Tagishsimon (talk) 08:25, 20 April 2022 (UTC)[reply]

Thank you so much Tagishsimon :) . Regards, Conny (talk) 14:25, 20 April 2022 (UTC).[reply]

Frwiki subways articles on a map

Hello, I'd like a map of (current) subways stations with frwiki article being dotted as green + their frwiki link and without frwiki articles being dotted as red with their (?en? wiki link). Is it feasible ? Bouzinac ^{💬●✒️●💛} 13:17, 20 April 2022 (UTC)[reply]

@Bouzinac:

#defaultView:Map{"hide": ["?rgb", "?coordinates"]}
SELECT DISTINCT ?item ?itemLabel ?coordinates ?wiki ?rgb
WHERE
{
  ?item wdt:P31 / wdt:P279* wd:Q928830 . # subway station
  ?item wdt:P17 wd:Q142 . # in France
  ?item wdt:P625 ?coordinates .
  OPTIONAL
  {
    ?frwiki schema:about ?item .
    ?frwiki schema:isPartOf <https://fr.wikipedia.org/> .
  }
  OPTIONAL
  {
    ?enwiki schema:about ?item .
    ?enwiki schema:isPartOf <https://en.wikipedia.org/> .
  }
  BIND (IF(BOUND(?frwiki), "00FF00", "FF0000") AS ?rgb)
  BIND (COALESCE(?frwiki, ?enwiki) AS ?wiki)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en" . }
}

Try it!

--Dipsacus fullonum (talk) 14:53, 20 April 2022 (UTC)[reply]

Very nice! I've amended here https://w.wiki/55LL with a yellow color (enwiki) and a red color (neither fr nor enwiki) ? Bouzinac ^{💬●✒️●💛} 15:17, 20 April 2022 (UTC)[reply]

Discovered today that you can plot interconnecting lines on the map as well, which is very cool. Here's the query adapted to show Paris metro just as an example. Edit: I'm aware the lines are doubled. Cookie to whoever fixes that. Infrastruktur (talk) 16:42, 20 April 2022 (UTC)[reply]

#defaultView:Map{"hide": ["?rgb", "?coord1","?coord2","?line"]}
SELECT DISTINCT ?station1 ?station1Label ?coord1 ?wiki ?rgb ?line
WHERE {
  ?station1 wdt:P31/wdt:P279* wd:Q928830;
    wdt:P361 wd:Q50716;
    wdt:P625 ?coord1.
  OPTIONAL {
    ?station1 wdt:P197 ?station2.
    ?station2 wdt:P625 ?coord2.
    ?station1 p:P625 [ ps:P625 []; psv:P625 [ wikibase:geoLongitude ?coord1lon; wikibase:geoLatitude ?coord1lat; ] ] .
    ?station2 p:P625 [ ps:P625 []; psv:P625 [ wikibase:geoLongitude ?coord2lon; wikibase:geoLatitude ?coord2lat; ] ] .
    BIND(CONCAT('LINESTRING (', STR(?coord1lon), ' ', STR(?coord1lat), ',', STR(?coord2lon), ' ', STR(?coord2lat), ')') AS ?str) .
    BIND(STRDT(?str, geo:wktLiteral) AS ?line) .
  }
  OPTIONAL {
    ?frwiki schema:about ?station1 .
    ?frwiki schema:isPartOf <https://fr.wikipedia.org/> .
  }
  OPTIONAL {
    ?enwiki schema:about ?station1 .
    ?enwiki schema:isPartOf <https://en.wikipedia.org/> .
  }
  BIND (IF(BOUND(?frwiki), "006400", IF(BOUND(?enwiki), "F7E35F", "FF0000")) AS ?rgb)
  BIND (COALESCE(?frwiki, ?enwiki) AS ?wiki)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en" . }
}

Try it!

@Infrastruktur:: Nice! We have seen maps using well-known text representation of geometry (Q4018860) before on this page, but a reminder of the possibilities is fine. You could add a simple filter: FILTER (STR(?station1) < STR(?station1)) but some lines might disappear where adjacent station (P197) are not used for both stations. The property has a suggestion constraint (Q62026391) of symmetric constraint (Q21510862), but that isn't a guarantee for completeness, so it is probably easiest to just leave the double lines. --Dipsacus fullonum (talk) 18:13, 20 April 2022 (UTC)[reply]

Yes, I would see Paris with this template {{Metro}} to see that , eg Map of Paris Métro (query). I would welcome any suggestion to improving the inside query :) Bouzinac ^{💬●✒️●💛} 05:08, 21 April 2022 (UTC)[reply]

That looks even better, and I guess the lines thing was old news. So yeah there are a couple of things I would do. First thing is put optional around the part that makes the line and requests P197, since if a terminal isn't connected it would be useful if it shows up in the map so that it can be fixed. Also there are places where you request both wdt and p for the same predicate, I would replace the wdt with p here to allow for reuse of data, but remember to add a wikibase:BestRank. Infrastruktur (talk) 13:02, 21 April 2022 (UTC)[reply]

Query timeout limit reached

I’m trying to find all subjects and objects with the property “shares border with (P47)” on Wikipedia in Arabic (I need to retrieve the Arabic wikipedia Title), but it times out when I added the : ^schema:about schema:isPartOf Lines..

Is there any possible way to get the results with timing out?

The Query: https://w.wiki/55Un

SELECT ?sbj_ar ?obj_ar WHERE {

 ?sbj wdt:P47 ?obj .
 
 ?sbj ^schema:about ?sbj_url_ar .
 ?sbj_url_ar schema:isPartOf <https://ar.wikipedia.org/>; schema:name ?sbj_ar .
 
 ?obj ^schema:about ?obj_url_ar .
 ?obj_url_ar schema:isPartOf <https://ar.wikipedia.org/>; schema:name ?obj_ar .

}

Thank you!

Is there any possible way to get the results without timing out? I doubt it because P47 is used over 600,000 times and more than 2 million items are linked to arwiki. I suggest trying to limit the results some way, e.g. by country, type of items or something else. --Dipsacus fullonum (talk) 08:27, 21 April 2022 (UTC)[reply]

How to optimise this query please ?

Hi all, So I have this query https://w.wiki/55Uu that works pretty well. Basically, it shows the number of Qitems with P269 broken down according to different person identifiers (Q36218176) somewhat related to France. I would like my query to be a little wider and to deal with all identifiers (Q19847637) and/or not related to France. The problem is that every time I try I reach a timeout. From the error messages I got I understand that you can't have nested INCLUDE, but I may be wrong. Does anyone have a solution ? Thanks !

Server Timeout Issue

Hello all! I'm running into an issue with the following query where the server times out:

SELECT
  ?person ?personLabel (GROUP_CONCAT(?citizenshipLabel;SEPARATOR=', ') AS ?citizenship)
  WHERE
  {
    ?person wdt:P3171 ?value .
    OPTIONAL { ?person wdt:P27 ?value2 . 
             ?value2 rdfs:label ?citizenshipLabel . FILTER (LANG(?citizenshipLabel)='en') }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  }
  GROUP BY ?person ?personLabel

Try it!

I'm attempting to write a query that returns the citizenship of Olympians, with the eventual hope of adding other parameters (like sex or gender), but even just the combination of these two properties returns an error (even after attempting to limit the number of rows). I know this is a known limitation of current server resources, but if anybody knows any workarounds that might help, I'd be immensely grateful!

This seems to work, but don't ask me why it's faster. :P Got it down to about 30s, but adding gender made it dangerously close to timing out again. You can see where I've commented it out. Probably the only way you'll be able to have more columns is if you limit the working set by some criteria, gender for instance. Infrastruktur (talk) 16:39, 21 April 2022 (UTC)[reply]

Thank you very much for your helpful response! Tjg6ph (talk) 23:02, 21 April 2022 (UTC)[reply]

SELECT
  ?person (SAMPLE(?personLabel) AS ?personLabel) (SAMPLE(?genderLabel) AS ?genderLabel) (GROUP_CONCAT(?citizenshipLabel;SEPARATOR=', ') AS ?citizenship)
WHERE {
  ?person wdt:P3171 [] .
  OPTIONAL { ?person wdt:P27 ?value2 . }
  #OPTIONAL { ?person wdt:P21 ?gender . }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
    ?person rdfs:label ?personLabel.
    #?gender rdfs:label ?genderLabel.
    ?value2 rdfs:label ?citizenshipLabel.
  }
}
GROUP BY ?person

Try it!

SELECT
  ?person (SAMPLE(?personLabel) AS ?personLabel) (GROUP_CONCAT(?citizenshipLabel;SEPARATOR=', ') AS ?citizenship)
WHERE {
  ?person wdt:P3171 [];
    wdt:P21 wd:Q6581072. # Only females
  OPTIONAL { ?person wdt:P27 ?value2 . }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
    ?person rdfs:label ?personLabel.
    ?value2 rdfs:label ?citizenshipLabel.
  }
}
GROUP BY ?person

Try it!

Federation question

Here's a query (which works) to find the most common classes of the values of depicts (P180) statements on Commons:

# Most common classes for "depicts" (P180) on Commons
SELECT (SUM(?count) AS ?sum) ?class ?class_label (SAMPLE(?value) AS ?instance) (SAMPLE(?value_label) AS ?instance_label) (SAMPLE(?example) AS ?sample)

WITH { 
  SELECT (COUNT(DISTINCT(?file)) AS ?count) ?value (SAMPLE(?file) AS ?example) WHERE {
     service bd:sample {
       ?file wdt:P180 ?value . 
        bd:serviceParam bd:sample.limit 1000 .
        bd:serviceParam bd:sample.sampleType "RANDOM" .
     }  
  } GROUP BY ?value
  ORDER BY DESC(?count)
#  LIMIT 2000           
} AS %values
 
WHERE {
    INCLUDE %values .
    service <https://query.wikidata.org/sparql> {
      ?value wdt:P31 ?class
      OPTIONAL {?value rdfs:label ?value_label FILTER (lang(?value_label) = 'en') } .
      OPTIONAL {?class rdfs:label ?class_label FILTER (lang(?class_label) = 'en') } .
    }
} GROUP BY ?class ?class_label 
ORDER BY DESC(?sum)

Try it!

However, the query seems to break (it times out) if I try to do the grouping before the label lookup:

# Most common values for "depicts" (P180)
SELECT ?sum ?class ?class_label ?instance ?instance_label ?sample

WITH { 
  SELECT (COUNT(DISTINCT(?file)) AS ?count) ?value (SAMPLE(?file) AS ?example) WHERE {
     service bd:sample {
       ?file wdt:P180 ?value . 
        bd:serviceParam bd:sample.limit 10 .
        bd:serviceParam bd:sample.sampleType "RANDOM" .
     }  
  } GROUP BY ?value
  ORDER BY DESC(?count)
  LIMIT 2000           
} AS %values
 
WHERE {
    INCLUDE %values .
    service <https://query.wikidata.org/sparql> {
      {
         SELECT (SUM(?count) AS ?sum) ?class (SAMPLE(?value) AS ?instance) (SAMPLE(?example) AS ?sample) WHERE {
            ?value wdt:P31 ?class
         } GROUP BY ?class 
      }
      OPTIONAL {?instance rdfs:label ?instance_label FILTER (lang(?instance_label) = 'en') } .
      OPTIONAL {?class rdfs:label ?class_label FILTER (lang(?class_label) = 'en') } .
    }
} 
ORDER BY DESC(?sum)

Try it!

Can anybody spot what I've got wrong here, or what I need to do to fix it? Possibly it may be as simple as having mis-spelt a variable name somewhere, but I can't see it. Alternatively, is there anything I need to do, to make sure that the federated query can see the values of ?count, ?value and ?example that it needs? It's been a while since I've done anything with federation, have I missed something out?

Thanks for any help you can give, Jheald (talk) 20:38, 21 April 2022 (UTC)[reply]

One way round it is this:

# Most common values for "depicts" (P180)
SELECT ?sum ?class ?class_label ?instance ?instance_label ?sample

WITH { 
  SELECT (COUNT(DISTINCT(?file)) AS ?count) ?value (SAMPLE(?file) AS ?example) WHERE {
     service bd:sample {
       ?file wdt:P180 ?value . 
        bd:serviceParam bd:sample.limit 1000 .
        bd:serviceParam bd:sample.sampleType "RANDOM" .
     }  
  } GROUP BY ?value
  ORDER BY DESC(?count)
#  LIMIT 2000           
} AS %values
 
WITH {
  SELECT (SUM(?count) AS ?sum) ?class (SAMPLE(?value) AS ?instance) (SAMPLE(?example) AS ?sample) WHERE {
    INCLUDE %values .
    service <https://query.wikidata.org/sparql> {
         ?value wdt:P31 ?class
    }
  } GROUP BY ?class 
} AS %classes

WHERE {
  INCLUDE %classes . 
  service <https://query.wikidata.org/sparql> {
      OPTIONAL {?instance rdfs:label ?instance_label FILTER (lang(?instance_label) = 'en') } .
      OPTIONAL {?class rdfs:label ?class_label FILTER (lang(?class_label) = 'en') } .
  }
} 
ORDER BY DESC(?sum)

Try it!

but I'm curious as to why the second query above didn't work, and what I would have needed to have done to make it work, if anyone has any thoughts. Jheald (talk) 20:47, 21 April 2022 (UTC)[reply]

That's a wicked hard problem. I wasn't able to spot any issues with query two but I made two observations. In query one and three you perform the aggregation step on the commons endpoint, where on query two you perform it on the wikidata endpoint which gets a lot more traffic. I also tried converting your third query from using named subqueries to using nested regular subqueries and that worked just fine too, it didn't even look scary. Ok, maybe a little... Infrastruktur (talk) 22:40, 21 April 2022 (UTC)[reply]

@Jheald: I think that the problem with query 2 is that you have ?value wdt:P31 ?class in a nested subquery where the variable ?value from the INCLUDE isn't in scope, so it tries to match all triples with wdt:P31. You cannot move the INCLUDE inside the federation, so I don't have better solutions than your query 1 or 3. --Dipsacus fullonum (talk) 03:11, 22 April 2022 (UTC)[reply]

@Dipsacus fullonum, Infrastruktur: Aaah. Yes, that's it. I've got the INCLUDE %values in the wrong place in #2. And, as you say, if I try to move it inside the federation, the results aren't on wdqs to use, while on the Commons side they never get re-used, so the query then throws the error "Named subquery results are not used by this query: %values". So: #3 it is, then. Thanks everybody. Jheald (talk) 09:40, 22 April 2022 (UTC)[reply]

Wikidata:Request a query

Contents

Data matching and superset of WikiBio data set

Query about museum's collection

Query a random set of popular entries

Get all Kind of food of a given language

List the names of the guests of honour at a convention

Trying to remove duplicate authors in scientific articles

Help get the dates if available

Requesting the value of a qualifier that may or may not exist

help

Query to produce the following output

Unable to filter the results

Last revision

two or more pictures in one item via one property

Frwiki subways articles on a map

Query timeout limit reached

How to optimise this query please ?

Server Timeout Issue

Federation question

Navigation menu