Insert rows with Unicode characters using BCP

Question

I'm using BCP to bulk upload data from a CSV file to SQL Azure (because BULK INSERT is not supported). This command runs and uploads the rows:

bcp [resource].dbo.TableName in C:\data.csv -t "," -r "0x0a" -c -U bcpuser@resource -S tcp:resource.database.windows.net

But data.csv is UTF8 encoded and contains non-ASCII strings. These get corrupted. I've tried changing the -c option to -w:

bcp [resource].dbo.TableName in C:\data.csv -t "," -r "0x0a" -w -U bcpuser@resource -S tcp:resource.database.windows.net

But then I get '0 rows copied'.

What am I doing wrong and how do I bulk insert Unicode characters using BCP?

Recommended answer

But data.csv is UTF8 encoded

The UTF-8 encoding is the primary issue. Using -w won't help because in Microsoft-land, the term "Unicode" nearly always refers to UTF-16 Little Endian.
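To make the mismatch concrete, here is a small Python sketch comparing the bytes of the same row in both encodings. The sample row is made up for illustration, and the idea that this mismatch is what produces '0 rows copied' is a plausible reading rather than something confirmed above: with -w the field terminator is looked for as a UTF-16 LE code unit, and that byte pair does not occur in the UTF-8 data.

```python
# Illustrative only: the sample row is hypothetical, not taken from the real data.csv.
row = "café,1\n"

utf8_bytes = row.encode("utf-8")       # how the file is actually stored
utf16_bytes = row.encode("utf-16-le")  # what a UTF-16 LE reader expects

# A UTF-16 LE reader sees the comma terminator as a two-byte code unit.
comma_utf16 = ",".encode("utf-16-le")

print("UTF-8    :", utf8_bytes.hex(" "))
print("UTF-16 LE:", utf16_bytes.hex(" "))
print("UTF-16 comma found in UTF-8 data:    ", comma_utf16 in utf8_bytes)
print("UTF-16 comma found in UTF-16 LE data:", comma_utf16 in utf16_bytes)
```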

The solution depends on which version of BCP you are using, because an option for UTF-8 was added in the newest version (13.0 / SQL Server 2016):

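  • If you are using BCP that came with a version of SQL Server prior to SQL Server 2016 (version 13.0), then you need to convert the csv file to UTF-16 Little Endian (LE), as that is what Windows / SQL Server / .NET use for all strings, and use the -w switch.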

I got this to work encoding a file as "UCS-2 LE BOM" in Notepad++, whereas that same import file failed using the -c switch.
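If opening the file in an editor is not practical, the same conversion can be scripted. The following is a minimal Python sketch, assuming the input really is UTF-8; the function name and any file paths passed to it are hypothetical. It writes a byte order mark first so that the result corresponds to what Notepad++ calls "UCS-2 LE BOM".

```python
def utf8_to_utf16le(src: str, dst: str) -> None:
    """Re-encode a UTF-8 text file as UTF-16 LE with a leading BOM."""
    # "utf-8-sig" drops a UTF-8 BOM if one is present;
    # newline="" leaves the existing line endings untouched.
    with open(src, "r", encoding="utf-8-sig", newline="") as fin, \
         open(dst, "w", encoding="utf-16-le", newline="") as fout:
        fout.write("\ufeff")  # explicit BOM (bytes FF FE in UTF-16 LE)
        for line in fin:
            fout.write(line)
```

Calling it with, say, C:\data.csv as the source and a new output path of your choosing gives a file that can then be passed to bcp with -w instead of -c.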

  • If you are using BCP that came with SQL Server 2016 (version 13.0) or newer, then you can simply add -c -C 65001 to the command line. -C is for "code page", and 65001 is the code page for UTF-8.

The MSDN page for bcp Utility states (in the explanation of the -C switch):

Versions prior to version 13 (SQL Server 2016) do not support code page 65001 (UTF-8 encoding). Versions beginning with 13 can import UTF-8 encoding to earlier versions of SQL Server.
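For reference, here is a sketch of how the command from the question might look with the UTF-8 code page added, assembled in Python so it could be handed to subprocess.run. The helper name is hypothetical, the table, file, user and server values are simply the ones from the question, and the sketch only prints the command rather than running it.

```python
def build_bcp_utf8_args(table: str, data_file: str, user: str, server: str) -> list:
    """Build a bcp 'in' argument list that reads the data file as UTF-8."""
    return [
        "bcp", table, "in", data_file,
        "-t", ",",      # field terminator, as in the question
        "-r", "0x0a",   # row terminator, as in the question
        "-c",           # character mode
        "-C", "65001",  # code page 65001 = UTF-8 (needs bcp 13.0 or newer)
        "-U", user,
        "-S", server,
    ]

# Values copied from the question; replace them with your own.
args = build_bcp_utf8_args(
    "[resource].dbo.TableName",
    r"C:\data.csv",
    "bcpuser@resource",
    "tcp:resource.database.windows.net",
)
print(" ".join(args))
```

As in the original command, no password switch is included here.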

Update

Support for UTF-8 / code page 65001 was added to SQL Server 2014 via SP2, as noted in this Microsoft KB article:
