How to enrich places with GeoNames IDs

Views: 154  Published: 2020/6/16 18:45:15  Tags: sparql wikidata geonames


Question

I have a list of places that I would like to enrich with their IDs from GeoNames. Since GeoNames IDs are embedded in Wikidata by default, I chose to query the Wikidata endpoint directly via SPARQL.

My workflow:

- I imported the Excel file into OpenRefine and created a new project.
- In OpenRefine I built my graph and downloaded it as RDF/XML. Here is a snippet:

  <rdf:Description rdf:about="http://localhost:3333/0">
      <rdfs:label>Aïre</rdfs:label>
      <crm:P1_is_identified_by>5A1CE163-105F-4BAF 8BF9</crm:P1_is_identified_by>
  </rdf:Description>

I then imported the RDF file into my local GraphDB and ran this federated query:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT *
WHERE {
  ?place <http://purl.org/NET/cidoc-crm/core#P1_is_identified_by> ?value ;
         rdfs:label ?label_geo .
  SERVICE <https://query.wikidata.org/sparql> {
    ?value wdt:P31/wdt:P279* wd:Q515 ;
           rdfs:label ?label ;
           wdt:P1566 ?id_value .
  }
}
LIMIT 10

It returns no results.
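One likely reason for the empty result: in the sample RDF, `P1_is_identified_by` holds a plain identifier string, and the query reuses that same `?value` as the subject inside the `SERVICE` block, where only Wikidata entities can match. A lookup by label is one possible alternative. The following is a minimal sketch, not a tested solution: the helper name `build_geonames_lookup_query` and the `fr` language tag are assumptions, and whether a given place has a matching label in Wikidata is not guaranteed.

```python
def build_geonames_lookup_query(label: str, lang: str = "fr", limit: int = 10) -> str:
    """Build a SPARQL query that finds Wikidata items by label
    and returns their GeoNames ID (property P1566).

    Hypothetical helper; the language tag is an assumption about the data.
    """
    # Escape backslashes and double quotes so the label stays a valid
    # SPARQL string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?item ?geonamesId WHERE {\n"
        f'  ?item rdfs:label "{escaped}"@{lang} ;\n'
        "        wdt:P1566 ?geonamesId .\n"
        "}\n"
        f"LIMIT {limit}"
    )


print(build_geonames_lookup_query("Aïre"))
```

The resulting string can be pasted into the Wikidata query service or sent from any SPARQL client. Matching on the label rather than on the local identifier is the design choice here, since the identifier has no counterpart in Wikidata.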

The output should look like this:


|-----------------------|------------------|---------------|
|      Oggetto          |    Place         | GeonamesID    |
|-----------------------|------------------|---------------|
|5A1CE163-105F-4BAF 8BF9|      Aïre        |11048419       |
|-----------------------|------------------|---------------|

Any suggestions?

Thanks a lot.

Recommended answer

I solved the problem directly on the client side.

Here is my pipeline:

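- I created an Excel sheet with a list of place names.
- I built a Python script that uses the values from the Excel sheet as query parameters and saves the output to a .txt file, e.g. Aïre, https://www.geonames.org/11048419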

import csv

import pandas as pd
import requests

url = 'http://api.geonames.org/searchJSON'

# Change the read_excel parameters according to your Excel sheet.
# Column A must have the header "place_name".
df = pd.read_excel('grp.xlsx', sheet_name='Foglio14', usecols="A")

# Open the output file once; newline="" keeps csv from writing blank rows.
with open("data14.txt", "a", newline="", encoding="utf-8") as myfile:
    writer = csv.writer(myfile)
    for place in df.place_name:
        params = {
            'username': "XXXXXXXX",  # replace with your GeoNames API username
            'name_equals': place,
            'maxRows': "1",
        }
        response = requests.get(url, params=params)
        results = response.json()
        # "geonames" is empty or missing when nothing matched,
        # so such places produce no row in the output.
        for hit in results.get("geonames", []):
            writer.writerow([hit["name"], "https://www.geonames.org/{}".format(hit["geonameId"])])

- I copied the output of the .txt file into column B of the Excel sheet.
- I split the output values into two columns, e.g.

    |---------------------|-----------------------------------|
    |      ColA           |     ColB                          |
    |---------------------|-----------------------------------|
    |         Aïre        | https://www.geonames.org/11048419 |
    |---------------------|-----------------------------------|
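The split into two columns can also be done before the data reaches Excel. A minimal sketch, assuming the output file contains one comma-separated `name,url` pair per line as written by `csv.writer`; the function name `split_rows` is hypothetical.

```python
import csv
import io


def split_rows(text):
    """Parse CSV text into (name, url) pairs, skipping blank or malformed lines."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        # Blank lines come back as empty lists; keep only two-field rows.
        if len(row) == 2:
            rows.append((row[0].strip(), row[1].strip()))
    return rows


sample = "Aïre,https://www.geonames.org/11048419\n"
print(split_rows(sample))
```

Using `csv.reader` rather than splitting on commas keeps quoted place names that contain a comma in one piece.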

Since there is no 1:1 correspondence between the place names and the results obtained, I aligned the values:
- In the Excel sheet I created a new empty column B.
- In column B I entered the formula =IF(ISNA(MATCH(A1;C:C;0));"";INDEX(C:C;MATCH(A1;C:C;0))) and filled it down to the end of the list.
- Then I created a new empty column C.
- In column C I entered the formula =IFERROR(INDEX($E:$E;MATCH($B1;$D:$D;0));"") and filled it down to the end of the list.
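The same alignment could be done in Python instead of with spreadsheet formulas. This is a minimal sketch under the assumption that the results are available as (name, url) pairs. The function name `align_results` and the second sample place are hypothetical. Unlike Excel's MATCH, the comparison here is case-sensitive.

```python
def align_results(place_names, results):
    """Return one (place, url) row per input place; url is "" when nothing matched."""
    # Keep the first URL seen for each returned name.
    url_by_name = {}
    for name, url in results:
        url_by_name.setdefault(name, url)
    # Walk the original list so the output keeps its order and length.
    return [(place, url_by_name.get(place, "")) for place in place_names]


places = ["Aïre", "Placeholder Town"]  # second entry is a made-up name with no match
results = [("Aïre", "https://www.geonames.org/11048419")]
print(align_results(places, results))
# [('Aïre', 'https://www.geonames.org/11048419'), ('Placeholder Town', '')]
```

Places without a match keep an empty second column, which mirrors what the IF(ISNA(...)) and IFERROR formulas produce.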

This is the final result:
