DISTINCT only on one value with SPARQL
Problem description
I want to use SPARQL to retrieve the list of Italian cities with a population of more than 100,000, and I'm using the following query:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?city ?name ?pop WHERE {
  ?city a dbo:Settlement .
  ?city foaf:name ?name .
  ?city dbo:populationTotal ?pop .
  ?city dbo:country ?country .
  ?city dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
In the results I get, for example, two different rows that represent the same entity but with different names:
http://dbpedia.org/resource/Bologna "Bologna"@en 384038
http://dbpedia.org/resource/Bologna "Comune di Bologna"@en 384038
How can I apply SELECT DISTINCT only to the ?city column while still returning the other columns?
You can use GROUP BY to group by a specific column and then use the SAMPLE() aggregate to select one of the values from the other columns, e.g.:
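# foaf: and dbpedia: are declared explicitly so the query does not
# depend on prefixes that the endpoint may predefine.
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?city (SAMPLE(?name) AS ?cityName) (SAMPLE(?pop) AS ?cityPop)
WHERE
{
  ?city a dbo:Settlement .
  ?city foaf:name ?name .
  ?city dbo:populationTotal ?pop .
  ?city dbo:country ?country .
  ?city dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
# One result row per distinct ?city
GROUP BY ?city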
By grouping on ?city you get only a single row per city. Because you have grouped by ?city, you can't directly select variables that aren't group variables. You must instead use the SAMPLE() aggregate to pick one of the values for each of the non-group variables you wish to have in the final results. This will select one of the values of ?name and ?pop to return as ?cityName and ?cityPop respectively. Which of the available values SAMPLE() returns is not specified, so it may differ between engines or runs.
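If an arbitrary pick from SAMPLE() is not acceptable, one possible variation is to swap in deterministic aggregates. The sketch below is not part of the original answer: it assumes that taking the alphabetically smallest name and the largest population figure is an acceptable rule for choosing a value, and it uses the standard SPARQL 1.1 aggregates MIN() and MAX() for that. STR() strips the language tag so the names compare as plain strings.

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Assumption: "smallest name, largest population" is a reasonable
# tie-breaking rule here; adjust the aggregates to your needs.
SELECT ?city (MIN(STR(?name)) AS ?cityName) (MAX(?pop) AS ?cityPop)
WHERE
{
  ?city a dbo:Settlement ;
        foaf:name ?name ;
        dbo:populationTotal ?pop ;
        dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
GROUP BY ?city
ORDER BY DESC(?cityPop)
```

For the two Bologna rows shown in the question, this should keep "Bologna", since it sorts before "Comune di Bologna". The result still contains one row per ?city, as with the SAMPLE() version.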