BigQuery load fails with Bad Character (ASCII 0) while importing Datastore backup


Problem description

This may look like a scenario that has already been discussed. I am trying to load a Google App Engine Datastore backup into BigQuery using the Talend tBigQueryBulkExec component, which does the same thing as the BQ shell CLI. It connects to BigQuery, reads the files from GCS, and loads them into the Dataset.Tablename defined in the component settings.

Error Message:

{"location":"File: 0 / Line:8 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: ","reason":"invalid"}

Entire message:

{"configuration":{"load":{"createDisposition":"CREATE_NEVER","destinationTable":{"datasetId":"sample_red","projectId":" test","tableId":"bqload1"},"schema":{"fields":[{"name":"file","type":"STRING"}]},"skipLeadingRows":1,"sourceUris":["gs:// test.appspot.com/bucket/ahFzfnZpcmdpbi1yZWQtdGVzdHJBCxIcX0FFX0RhdGFzdG9yZUFkbWluX09wZXJhdGlvbhiB64MBDAsSFl9BRV9CYWNrdXBfSW5mb3JtYXRpb24YAQw.Challenge.backup_info"],"writeDisposition":"WRITE_TRUNCATE"}},"etag":"\"AJDc2PKvhXhnNlIwTi02BO3aoe8/1ZnlNbMA0eEnHxZQC_gKepG8Mio\"","id":" test:job_yFJa_JVN0E05GZQZNvtlZR6Bgjo","jobReference":{"jobId":"job_yFJa_JVN0E05GZQZNvtlZR6Bgjo","projectId":"test"},"kind":"bigquery#job","selfLink":"https://www.googleapis.com/bigquery/v2/projects/buckett/jobs/job_yFJa_JVN0E05GZQZNvtlZR6Bgjo","statistics":{"endTime":"1427358416307","startTime":"1427358414687","creationTime":"1427358397621","load":{"inputFiles":"1","inputFileBytes":"565","outputRows":"0","outputBytes":"0"}},"status":{"errorResult":{"location":"File: 0 / Line:11 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\u000Bcontent>","reason":"invalid"},"errors":[{"location":"File: 0 / Line:5 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\u0006status\u0012>","reason":"invalid"},{"location":"File: 0 / Line:6 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\tstartDa>","reason":"invalid"},{"location":"File: 0 / Line:8 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: ","reason":"invalid"},{"location":"File: 0 / Line:10 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: ","reason":"invalid"},{"location":"File: 0 / Line:11 / Field:1","message":"Bad character (ASCII 0) encountered: field starts with: <\u000Bcontent>","reason":"invalid"}],"state":"DONE"},"user_email":"xx@gmail.com"}

I read in other posts that the Bad character (ASCII 0) error is a bug that would be fixed in the next release. Has it not been fixed yet?

Solution

It looks like you have a Unicode tab character there, and Talend is failing to parse it properly because it is expecting ASCII text.

If you go to the advanced settings of the tBigQueryBulkExec component, there should be an option for encoding. If you set this to "utf-8", the load should work.
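Before changing the encoding, it can help to confirm which lines of the source file actually contain the byte that BigQuery reports as "ASCII 0". The following is a minimal sketch, not part of Talend or BigQuery; the function name `find_nul_lines` and the sample bytes are hypothetical and only illustrate the check.

```python
def find_nul_lines(data: bytes) -> list:
    """Return the 1-based numbers of lines whose raw bytes contain NUL (0x00)."""
    return [
        number
        for number, line in enumerate(data.split(b"\n"), start=1)
        if b"\x00" in line
    ]


# Hypothetical sample: only the second line carries a NUL byte.
sample = b"file\n\x0bcontent\x00\nplain text\n"
print(find_nul_lines(sample))
```

To run the same check on a local copy of the file, read it in binary mode (for example `open(path, "rb").read()`, where `path` is wherever you downloaded it) and pass the bytes to the function. If many lines are flagged, the file is probably binary data rather than delimited text.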
