Create CKAN dataset using CKAN API and Python Requests library

Question

I am using CKAN version 2.2 and am trying to automate dataset creation and resource upload. I cannot seem to create a dataset using the Python requests library; the request returns a 400 error code. Code:

import requests, json

dataset_dict = {
    'name': 'testdataset',
    'notes': 'A long description of my dataset',
}

d_url = 'https://mywebsite.ca/api/action/package_create'
auth = {'Authorization': 'myKeyHere'}
f = [('upload', file('PathToMyFile'))]

r = requests.post(d_url, data=dataset_dict, headers=auth)

Strangely, I am able to create a new resource and upload a file using the Python requests library. The code is based on the documentation. Code:

import requests, json

res_dict = {
    'package_id':'testpackage',
    'name': 'testresource',
    'description': 'A long description of my resource!',
    'format':'CSV'
}

res_url = 'https://mywebsite.ca/api/action/resource_create'
auth = {'Authorization': 'myKey'}
f = [('upload', file('pathToMyFile'))]

r = requests.post(res_url, data=res_dict, headers=auth, files=f)

I am also able to create a dataset using the method from the CKAN 2.2 documentation, which relies on Python's built-in libraries. Code:

#!/usr/bin/env python
import urllib2
import urllib
import json
import pprint

# Put the details of the dataset we're going to create into a dict.
dataset_dict = {
    'name': 'test1',
    'notes': 'A long description of my dataset',
}

# Use the json module to dump the dictionary to a string for posting.
data_string = urllib.quote(json.dumps(dataset_dict))

# We'll use the package_create function to create a new dataset.
request = urllib2.Request('https://myserver.ca/api/action/package_create')

# Creating a dataset requires an authorization header.
request.add_header('Authorization', 'myKey')

# Make the HTTP request.
response = urllib2.urlopen(request, data_string)
assert response.code == 200

# Use the json module to load CKAN's response into a dictionary.
response_dict = json.loads(response.read())
assert response_dict['success'] is True

# package_create returns the created package as its result.
created_package = response_dict['result']
pprint.pprint(created_package)

I am not really sure why my method of creating the dataset is not working. The documentation for the package_create and resource_create functions is very similar, so I would expect to be able to use the same technique for both. I would prefer to use the requests package for all my dealings with CKAN. Has anyone been able to create a dataset with the requests library successfully?

Any help is greatly appreciated.

Recommended answer

I finally came back to this and figured it out. Alice's suggestion to check the encoding was very close. While requests does do the encoding for you, it also decides on its own which type of encoding is appropriate depending on the inputs. If a file is passed in along with the parameter dictionary, requests automatically uses multipart/form-data encoding, which CKAN accepts, so the request succeeds.

However, if we pass only a dictionary, requests defaults to form encoding it as one key=value pair per field. CKAN needs requests without files to be URL encoded (application/x-www-form-urlencoded), and what works in that case is a body made of the whole dictionary serialized to JSON and then URL encoded. To prevent requests from doing any encoding of its own, we can pass the parameters in as a string; requests then sends that string as the POST body unchanged. This means we have to URL encode it ourselves.
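To make the difference between the two body shapes concrete, here is a minimal Python 3 sketch using only the standard library. It assumes that `urlencode` is a fair stand-in for what requests produces when `data` is a dictionary, and it reuses `dataset_dict` from the question; nothing is sent over the network.

```python
import json
from urllib.parse import quote, urlencode

dataset_dict = {
    'name': 'testdataset',
    'notes': 'A long description of my dataset',
}

# Form encoding of a dict: one key=value pair per field, joined with '&'.
form_body = urlencode(dataset_dict)

# The body the answer sends instead: the whole dict as one JSON string,
# URL-quoted so that it contains no raw '=' or '&' separators.
quoted_json_body = quote(json.dumps(dataset_dict))

print(form_body)
print(quoted_json_body)
```

The first string is a list of separate fields, while the second is a single opaque token that decodes back to the JSON document.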

Therefore, if I specify the content type, convert the parameters to a JSON string, URL encode that string with urllib, and pass the result to requests:

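# Assumes head holds the Authorization header and in_dict the parameter dict.
head['Content-Type'] = 'application/x-www-form-urlencoded'
# Serialize to JSON, then URL encode (Python 2; urllib.parse.quote in Python 3).
in_dict = urllib.quote(json.dumps(in_dict))
# A string body is sent as-is, without further encoding by requests.
r = requests.post(url, data=in_dict, headers=head)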

then the request succeeds.
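The same steps can be wrapped into a small helper. This is a Python 3 sketch rather than a tested client: `encode_ckan_payload` and `create_dataset` are hypothetical names introduced here, `urllib.parse.quote` stands in for Python 2's `urllib.quote`, and the URL and key in the usage comment are the placeholders from the question. It has not been run against a live CKAN instance.

```python
import json
from urllib.parse import quote  # Python 3 location of Python 2's urllib.quote


def encode_ckan_payload(params):
    """Serialize params to JSON, then URL-quote the resulting string."""
    return quote(json.dumps(params))


def create_dataset(url, api_key, dataset_dict):
    """POST a pre-encoded body to a CKAN action URL (hypothetical helper)."""
    # Third-party library; imported here so the encoder works without it.
    import requests

    headers = {
        'Authorization': api_key,
        'Content-Type': 'application/x-www-form-urlencoded',
    }
    # A string passed as data is sent as the body without further encoding.
    body = encode_ckan_payload(dataset_dict)
    return requests.post(url, data=body, headers=headers)


# Example call (placeholders from the question; performs a real HTTP request):
# r = create_dataset(
#     'https://mywebsite.ca/api/action/package_create',
#     'myKeyHere',
#     {'name': 'testdataset', 'notes': 'A long description of my dataset'},
# )
# print(r.status_code)
```

Keeping the encoding step in its own function makes it possible to check the request body without contacting a server.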
