invalid byte sequence for encoding "UTF8"


Question

I'm trying to import some data into my database, so I've created a temporary table:

create temporary table tmp(pc varchar(10), lat decimal(18,12), lon decimal(18,12), city varchar(100), prov varchar(2));

Now I'm trying to import the data:

copy tmp from '/home/mark/Desktop/Canada.csv' delimiter ',' csv;

But then I get this error:

ERROR:  invalid byte sequence for encoding "UTF8": 0xc92c

How do I fix that? Do I need to change the encoding of my entire database (if so, how?), or can I change just the encoding of my tmp table? Or should I attempt to change the encoding of the file?

Recommended answer

If you need to store UTF8 data in your database, you need a database that accepts UTF8. You can check the encoding of your database in pgAdmin: right-click the database and select "Properties".

But that error seems to be telling you there's some invalid UTF8 data in your source file. COPY doesn't detect a file's encoding; unless you give it an ENCODING option, it assumes the file is in the current client encoding, which here is evidently UTF8.
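To see why the server complains, it can help to reproduce the two bytes from the error message. This is only a sketch, and it assumes the file is really Latin-1 (ISO-8859-1) or something close to it such as Windows-1252, which the question doesn't confirm. Under that assumption 0xc9 is "É" and 0x2c is a comma, so the pair is probably an accented letter at the end of a CSV field. In UTF8, 0xc9 opens a two-byte sequence and must be followed by a continuation byte, which a comma is not.

```shell
# Octal \311 is byte 0xc9 and "," is byte 0x2c: the pair from the error message.
# Assumption: the bytes are Latin-1; reinterpret them and emit UTF-8.
decoded=$(printf '\311,' | iconv -f ISO-8859-1 -t UTF-8)
echo "as Latin-1: $decoded"

# The same two bytes are rejected when treated as UTF-8 (iconv exits non-zero).
if printf '\311,' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    valid_utf8=yes
else
    valid_utf8=no
fi
echo "valid as UTF-8: $valid_utf8"
```

If the first line prints a sensible character for your data, that's a strong hint the file isn't UTF8 at all.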

If you're running under some variant of Unix, you can check the encoding (more or less) with the file utility.

$ file yourfilename
yourfilename: UTF-8 Unicode English text

(I think that will work in the terminal on Macs, too.) I'm not sure how to do that under Windows.

If you use that same utility on a file that came from a Windows system (that is, a file that's not encoded in UTF8), it will probably show something like this:

$ file yourfilename
yourfilename: ASCII text, with CRLF line terminators
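Keep in mind that file only guesses, and that plain ASCII is also valid UTF8, so its answer isn't conclusive either way. A stricter check, sketched here, is to ask iconv to decode the whole file as UTF8 and see whether it succeeds. The helper name is_utf8 and the two sample files are made up for this illustration.

```shell
# Hypothetical helper: succeeds only if the whole file decodes cleanly as UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Two throwaway sample files (invented data, not from the question).
tmpdir=$(mktemp -d)
printf 'H0H,Montr\303\251al,QC\n' > "$tmpdir/good.csv"   # "é" stored as UTF-8 (c3 a9)
printf 'H0H,Montr\351al,QC\n'     > "$tmpdir/bad.csv"    # "é" stored as Latin-1 (e9)

for f in "$tmpdir/good.csv" "$tmpdir/bad.csv"; do
    if is_utf8 "$f"; then
        echo "$f: valid UTF-8"
    else
        echo "$f: NOT valid UTF-8"
    fi
done
```

Running is_utf8 against the real CSV before the import tells you whether the file or the database settings deserve the blame.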

If things stay weird, you might try converting your input data to a known encoding, changing your client's encoding, or both. (We're really stretching the limits of my knowledge about encodings.)

You can use the iconv utility to change the encoding of the input data.

iconv -f original_charset -t utf-8 originalfile > newfile
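For a concrete picture, here is that command applied to a small stand-in file. The source charset is an assumption (ISO-8859-1 here; WINDOWS-1252 is another common candidate for files from Windows), and the sample row is invented. Substitute whatever your file actually uses.

```shell
# Assumption: the source file is ISO-8859-1; replace with the real charset.
src_charset=ISO-8859-1

workdir=$(mktemp -d)
# Stand-in for the real CSV: "É" stored as the single byte 0xc9.
printf 'G0A,L\311VIS,QC\n' > "$workdir/original.csv"

# Convert to UTF-8; iconv exits non-zero if it hits a byte it cannot convert.
if iconv -f "$src_charset" -t UTF-8 "$workdir/original.csv" > "$workdir/converted.csv"; then
    echo "converted OK"
else
    echo "conversion failed" >&2
fi

# Dump the bytes before and after: the single byte c9 becomes the pair c3 89.
od -An -tx1 "$workdir/original.csv"
od -An -tx1 "$workdir/converted.csv"
```

Point copy at the converted file and leave the original untouched, so you can retry with a different source charset if the accents come out wrong.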

You can change the psql (client) encoding by following the instructions in the PostgreSQL documentation on Character Set Support. On that page, search for the phrase "To enable automatic character set conversion".
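As a rough sketch of the client-encoding route: the script below writes a SQL file that sets client_encoding before running the copy from the question, so the server converts the data to the database encoding on the way in. It assumes the file really is LATIN1, and the database name mydb in the final comment is a placeholder.

```shell
# Assumption: the CSV is really LATIN1. "mydb" below is a placeholder name.
sql_file=$(mktemp)
cat > "$sql_file" <<'SQL'
-- Declare the incoming data as LATIN1; the server converts it to the
-- database encoding while loading.
SET client_encoding TO 'LATIN1';
create temporary table tmp(pc varchar(10), lat decimal(18,12), lon decimal(18,12),
                           city varchar(100), prov varchar(2));
-- Without an ENCODING option, copy uses the current client encoding.
copy tmp from '/home/mark/Desktop/Canada.csv' delimiter ',' csv;
RESET client_encoding;
SQL

echo "wrote $sql_file"
# Run everything in one session so the temporary table still exists for copy:
#   psql -d mydb -f "$sql_file"
```

The same setting can be made interactively with \encoding LATIN1 in psql, or through the PGCLIENTENCODING environment variable.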
