How can I use the BigQuery REST API from the command line?


Problem description

Attempting to make a plain GET request to one of the BigQuery REST APIs gives an error that looks like this:

curl https://www.googleapis.com/bigquery/v2/projects/$PROJECT_ID/jobs/$JOBID

Output:

{
 "error": {
  "errors": [
   {
    "domain": "global",
    "reason": "required",
    "message": "Login Required",
    "locationType": "header",
    "location": "Authorization",
  ...

What is the correct way to invoke one of the REST APIs from the command line, such as the query or insert APIs? The API reference has a "Try this API" feature, but its examples don't translate directly into something you can run from the command line.

Solution

As a disclaimer, when working from the command line the bq tool is usually sufficient, and for more complex use cases the BigQuery client libraries let you program against BigQuery from multiple languages. It can still be useful, however, to make plain requests to the REST APIs to see how certain APIs work at a low level.

First, make sure that you have installed the Google Cloud SDK. This should include the gcloud and bq command-line tools. If you haven't already, authorize your account by running this command from your terminal:

gcloud auth login

This should prompt you to log in and then give you an access code that you can paste into your terminal. (The exact process may change over time.)
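Once you are logged in, gcloud auth print-access-token prints an access token for your account, and sending that token in an Authorization header is what the failing request in the question was missing. As a minimal sketch, the header can be wrapped in a small shell function so it does not have to be retyped for every call. The names bq_api and BQ_API_ROOT are made up for this illustration and are not part of any Google tool.

```shell
# Base URL of the BigQuery v2 REST API (the same one used in the question).
BQ_API_ROOT="https://www.googleapis.com/bigquery/v2"

# Hypothetical helper: bq_api METHOD PATH [JSON_BODY]
# Sends an authenticated request and prints the response body.
bq_api() {
  _bq_method=$1
  _bq_path=$2
  if [ "$#" -ge 3 ]; then
    # A request body was supplied, so send it as JSON.
    curl -s -X "$_bq_method" \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      -d "$3" \
      "$BQ_API_ROOT/$_bq_path"
  else
    curl -s -X "$_bq_method" \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      "$BQ_API_ROOT/$_bq_path"
  fi
}

# Example (needs a real project and job ID):
#   bq_api GET "projects/$PROJECT_ID/jobs/$JOBID"
```

Defining the function makes no network calls by itself; only invoking it does.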

Now let's try a query using the BigQuery REST API, calling the jobs.query method. Modify this script with your own project name, which you can find from the Google Cloud Console, then paste the script into your terminal:

PROJECT="YOUR_PROJECT_NAME"
# The query is wrapped in escaped double quotes so it forms a JSON string.
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
REQUEST="{\"kind\":\"bigquery#queryRequest\",\"useLegacySql\":false,\"query\":$QUERY}"
# Pipe the JSON body to curl, which reads it from stdin via -d @-.
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries"

If it worked, you should see output that looks like this:

{
 "kind": "bigquery#queryResponse",
 "schema": {
  "fields": [
   {
    "name": "x",
    "type": "INTEGER",
    "mode": "NULLABLE"
   },
   {
    "name": "y",
    "type": "STRING",
    "mode": "NULLABLE"
   }
  ]
 },
 "jobReference": {
  "projectId": "<your project ID>",
  "jobId": "<your job ID>"
 },
 "totalRows": "1",
 "rows": [
  {
   "f": [
    {
     "v": "1"
    },
    {
     "v": "foo"
    }
   ]
  }
 ],
 "totalBytesProcessed": "0",
 "jobComplete": true,
 "cacheHit": false
}
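The jobs.query script above escapes the request body by hand, which gets awkward as soon as the SQL itself contains double quotes. One possible way around that, sketched here with hypothetical helper names (json_escape and build_query_request are not part of any Google tool), is to let sed do the escaping. The sketch only handles backslashes and double quotes, so it assumes the query is on a single line.

```shell
# Hypothetical helper: escape backslashes and double quotes so that a SQL
# string can sit inside a JSON string literal. Control characters such as
# newlines are NOT handled, so keep the query on a single line.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build a jobs.query request body for standard SQL.
build_query_request() {
  printf '{"useLegacySql":false,"query":"%s"}\n' "$(json_escape "$1")"
}

# Prints: {"useLegacySql":false,"query":"SELECT 1 AS x, 'foo' AS y;"}
build_query_request "SELECT 1 AS x, 'foo' AS y;"
```

The printed body can be piped to the same curl command in place of echo $REQUEST.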

If you haven't set up the bq command-line tool, you can use bq init from your terminal to do so. Once you have, you can try running the same query using it:

bq query --use_legacy_sql=False "SELECT 1 AS x, 'foo' AS y;"

You can also see the REST API requests that the bq tool makes by passing the --apilog= option (an empty value sends the log to standard output):

bq --apilog= query --use_legacy_sql=False "SELECT [1, 2, 3] AS x;"

Now let's try an example using the jobs.insert method instead of the query API. Run this script, replacing YOUR_PROJECT_NAME with your project name:

PROJECT="YOUR_PROJECT_NAME"
QUERY="\"SELECT 1 AS x, 'foo' AS y;\""
# jobs.insert expects the query settings nested under configuration.query.
REQUEST="{\"configuration\":{\"query\":{\"useLegacySql\":false,\"query\":${QUERY}}}}"
echo "$REQUEST" | \
  curl -X POST -d @- -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs"

Unlike jobs.query, which returned the query results directly, jobs.insert returns a description of the job it just started. You will see a result that looks similar to this:

{
 "kind": "bigquery#job",
 "etag": "\"<etag string>\"",
 "id": "<project name>:<job ID>",
 "selfLink": "https://www.googleapis.com/bigquery/v2/projects/<project name>/jobs/<job ID>",
 "jobReference": {
  "projectId": "<project name>",
  "jobId": "<job ID>"
 },
 "configuration": {
  "query": {
   "query": "SELECT 1 AS x, 'foo' AS y;",
   "destinationTable": {
    "projectId": "<project name>",
    "datasetId": "<anonymous dataset>",
    "tableId": "<anonymous table>"
   },
   "createDisposition": "CREATE_IF_NEEDED",
   "writeDisposition": "WRITE_TRUNCATE",
   "useLegacySql": false
  }
 },
 "status": {
  "state": "RUNNING"
 },
 "statistics": {
  "creationTime": "<timestamp millis>",
  "startTime": "<timestamp millis>"
 },
 "user_email": "<your email address>"
}

Notice the status:

 "status": {
  "state": "RUNNING"
 },
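The same response carries the job ID under jobReference, which any follow-up request needs. Rather than copying it by hand, it can be pulled out of the output with a rough sed sketch like the one below. The function name extract_job_id is made up for this illustration, and the pattern assumes the pretty-printed layout shown above; a JSON-aware tool such as jq would be more robust if it is installed.

```shell
# Hypothetical helper: print the first "jobId" value found in a
# pretty-printed job resource read from stdin.
extract_job_id() {
  sed -n 's/.*"jobId": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Example: save the jobs.insert response to a file, then run
#   JOB_ID=$(extract_job_id < response.json)
```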

If you want to check on the job now, you can use the jobs.get method. Similar to before, run this from your terminal, using the job ID from the output in the previous step:

PROJECT="YOUR_PROJECT_NAME"
JOB_ID="YOUR_JOB_ID"
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs/$JOB_ID"

If the query is done, you'll get a response that indicates as much:

...
"status": {
 "state": "DONE"
},
...
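If the job is still RUNNING, the jobs.get request has to be repeated until the state changes. A minimal polling sketch follows; job_state, wait_for_job and fetch_job are hypothetical names, the attempt count and delay are arbitrary defaults, and the sed pattern assumes the pretty-printed layout shown above. Note that DONE only means the job has finished: a failed job also ends in DONE, with the failure reported under status.errorResult.

```shell
# Hypothetical helper: print the job state found in a jobs.get response on stdin.
job_state() {
  sed -n 's/.*"state": *"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: wait_for_job FETCH_COMMAND [MAX_ATTEMPTS] [DELAY_SECONDS]
# Runs FETCH_COMMAND until it reports state DONE; returns 1 if it never does.
wait_for_job() {
  _wj_fetch=$1
  _wj_max=${2:-30}
  _wj_delay=${3:-2}
  _wj_i=0
  while [ "$_wj_i" -lt "$_wj_max" ]; do
    _wj_state=$("$_wj_fetch" | job_state)
    if [ "$_wj_state" = "DONE" ]; then
      return 0
    fi
    _wj_i=$((_wj_i + 1))
    sleep "$_wj_delay"
  done
  return 1
}

# The same jobs.get request as above, wrapped in a function so it can be polled.
fetch_job() {
  curl -s -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://www.googleapis.com/bigquery/v2/projects/$PROJECT/jobs/$JOB_ID"
}

# Example: wait_for_job fetch_job 30 2 && echo "job finished"
```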

Finally, we can fetch the query results, again through the REST API, with the jobs.getQueryResults method. This reuses the PROJECT and JOB_ID variables set in the previous step:

curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  https://www.googleapis.com/bigquery/v2/projects/$PROJECT/queries/$JOB_ID

The output will look similar to when we used the jobs.query method above:

{
 "kind": "bigquery#getQueryResultsResponse",
 "etag": "\"<etag string>\"",
 "schema": {
  "fields": [
   {
    "name": "x",
    "type": "INTEGER",
    "mode": "NULLABLE"
   },
   {
    "name": "y",
    "type": "STRING",
    "mode": "NULLABLE"
   }
  ]
 },
 "jobReference": {
  "projectId": "<project ID>",
  "jobId": "<job ID>"
 },
 "totalRows": "1",
 "rows": [
  {
   "f": [
    {
     "v": "1"
    },
    {
     "v": "foo"
    }
   ]
  }
 ],
 "totalBytesProcessed": "0",
 "jobComplete": true,
 "cacheHit": true
}
