R RMySQL query deforms Japanese characters

Question

I am using RMySQL to connect to an AWS MySQL server. It works, except that character values come back deformed. This question has been asked before, but the suggested fixes don't seem to work for me. Here's what I'm doing:

Make sure there are no open connections:

dbListConnections(MySQL())
list()

Make sure my connection is set to use UTF-8:

dbGetQuery(credentials, "show variables like 'character_set%'")
             Variable_name                                     Value
1     character_set_client                                      utf8
2 character_set_connection                                      utf8
3   character_set_database                                      utf8
4 character_set_filesystem                                      utf8
5    character_set_results                                      utf8
6     character_set_server                                      utf8
7     character_set_system                                      utf8
8       character_sets_dir /rdsdbbin/mysql-5.5.40.R1/share/charsets/

Get the data:

data <- dbGetQuery(credentials, Query)
head(data)
  keyword_ja
1 \036
2 \036蜀ャ
3 \036螟\x8f
4 \036譌・譛ャ莠コ
5 \037繧、繝ゥ繧ケ繝\x88
6 \037蜿守ゥォ

When I write this data to disk, Excel shows the same deformed characters, but Notepad++ somehow displays the Japanese as intended:

"keyword_ja"
"冬"
"夏"
"日本人"
"イラスト"
"収穫"

I've been trying to use functions like Encoding() and enc2utf8() in R to get it to display the characters correctly, as Notepad++ does, with no success.


Encoding(head(data$keyword_ja))
[1] "unknown" "unknown" "unknown" "unknown" "unknown" "unknown"

enc2utf8(head(data$keyword_ja))
[1] "\036" "\036蜀ャ" "\036螟<8f>" "\036譌・譛ャ莠コ" "\037繧、繝ゥ繧ケ繝<88>" "\037蜿守ゥォ"

I can normally type Japanese characters, and R has no problem displaying them:


Sys.getlocale()
[1] "LC_COLLATE=Japanese_Japan.932;LC_CTYPE=Japanese_Japan.932;LC_MONETARY=Japanese_Japan.932;LC_NUMERIC=C;LC_TIME=Japanese_Japan.932"
mystring <- "日本語入力できる"
mystring
[1] "日本語入力できる"
Encoding(mystring)
[1] "unknown"

I'm pretty desperate to figure this out, so any help is very much appreciated. Please let me know if I can provide additional information.

Recommended answer

Based on this SO article, you might have to write your data to disk with UTF-8 encoding. Try this:

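data <- dbGetQuery(credentials, Query)
# Open a file connection that encodes its output as UTF-8
# ("UTF-8" is the documented spelling of the encoding name)
con <- file('output.csv', encoding="UTF-8")
write.csv(data, file=con)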

Then try opening output.csv in both Excel and Notepad++ and let us know the results. When you read this file back into R, it should hopefully behave as expected:

library(data.table)  # provides fread()
fread("output.csv", encoding="UTF-8")
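
If the bytes returned by MySQL are already UTF-8 and R merely labels them "unknown" (which the Encoding() output in the question suggests, though that is an assumption), another thing that may be worth trying is to declare the encoding rather than convert it. The following is only a minimal sketch under that assumption: the vector x is a hypothetical stand-in for data$keyword_ja, built from raw bytes so that it runs without a database connection.

```r
# Hypothetical stand-in for data$keyword_ja: the UTF-8 bytes of "冬" (U+51AC)
# and "日本人", built from raw bytes so the script's own encoding does not matter.
x <- c(
  rawToChar(as.raw(c(0xE5, 0x86, 0xAC))),
  rawToChar(as.raw(c(0xE6, 0x97, 0xA5, 0xE6, 0x9C, 0xAC, 0xE4, 0xBA, 0xBA)))
)

Encoding(x)     # "unknown" "unknown": R will assume the native (locale) encoding
validUTF8(x)    # TRUE TRUE: the bytes are nevertheless well-formed UTF-8

# Declare the encoding; this relabels the strings without changing any bytes.
Encoding(x) <- "UTF-8"
Encoding(x)     # "UTF-8" "UTF-8"

# The first string now decodes to the intended code point.
utf8ToInt(x[1]) # 20908, i.e. 0x51AC
```

With the real data the equivalent call would be Encoding(data$keyword_ja) <- "UTF-8". Whether it helps depends on the column really containing UTF-8 bytes, which validUTF8() can help confirm first.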
