Encoding in R: How to convert this string to UTF-8?


Problem description

I am using R to read data from an old FAME database. This works fine in general, but I get an unexpected encoding back when reading descriptions. For example:

a <- "\U3e34653c"
# is supposed to be 
"ä"

I tried to iconv my way around this problem, but despite trying numerous possibilities I was not able to get it to display properly. My locale is en_US.UTF-8. Is there a way to avoid replacing such strings by hand (with sub)?

Solution

I had an identical problem when extracting data from SQL Server (via ODBC and the RODBC package). I solved it by changing the settings on the ODBC driver to treat all strings as Unicode.

More specifically, I'm using the Actual Technologies ODBC driver for SQL Server, and under 'Advanced Language Settings' I can enable 'Treat text types as Unicode' and set the 'Multi-byte text encoding' option to UTF-8.
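
If the driver settings cannot be changed, it may help to check what the conversion looks like on the R side. The following is only a minimal sketch under an assumption the question does not confirm: that the text reaches R as Latin-1 bytes. The variable names are illustrative. It also shows a side observation: the four bytes of 0x3e34653c, read in reverse order, spell "<e4>", which is the "<xx>" form that iconv(sub = "byte") uses for bytes it cannot convert. That may indicate a Latin-1 "ä" (byte 0xE4) was escaped somewhere upstream, in which case iconv alone would not be able to repair the value.

```r
# Assumption: the description arrives as a single Latin-1 byte (0xE4 = "ä").
latin1_bytes <- as.raw(0xe4)
x <- rawToChar(latin1_bytes)

# Re-encode from Latin-1 to UTF-8; iconv() marks the result as UTF-8.
y <- iconv(x, from = "latin1", to = "UTF-8")

Encoding(y)    # "UTF-8"
charToRaw(y)   # c3 a4, the two-byte UTF-8 form of "ä"

# Side observation: the bytes of 0x3e34653c in reverse order are ASCII "<e4>".
decoded <- rawToChar(rev(as.raw(c(0x3e, 0x34, 0x65, 0x3c))))
decoded        # "<e4>"
```

If the bytes are already mangled before they reach R, fixing the encoding at the driver level, as described above, is likely the more reliable route.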
