readr::read_csv issue: Chinese characters become garbled

Views: 749 Published: 2020/7/5 18:40:36 Tags: r dplyr tidyverse readr

Problem description

I'm trying to import a dataset into RStudio, but the Chinese characters come out garbled. Here is the code:

library(tidyverse)
df <- read_csv("中文,英文\n英文,德文")
df
# A tibble: 1 x 2
  `\xd6\xd0\xce\xc4`            `Ӣ\xce\xc4`
               <chr>                  <chr>
1 "<U+04E2>\xce\xc4" "<U+00B5>\xc2\xce\xc4"

When I use the base function read.csv, it works well. I suspect I am doing something wrong with the encoding, but read_csv has no encoding argument. How can I fix this?

Recommended answer

This happens because the characters are marked as UTF-8, whereas their actual encoding is the system default (which you can get with stringi::stri_enc_get()).
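
The mismatch described above can be reproduced with base R alone. The sketch below is illustrative only: it assumes the system default encoding in the question was GBK (the bytes \xd6\xd0\xce\xc4 in the garbled output are consistent with that) and that iconv on your platform supports GBK. The variable names are hypothetical.

```r
# "中文" written with \u escapes so the script does not depend on the file's own encoding
x <- "\u4e2d\u6587"

# Re-encode to GBK (assumption: this is the system default on the asker's machine)
gbk <- iconv(x, from = "UTF-8", to = "GBK")
gbk_bytes <- charToRaw(gbk)   # the same bytes that appear in the garbled column name

# Mislabel the GBK bytes as UTF-8: the bytes are unchanged, only the mark is wrong
bad <- gbk
Encoding(bad) <- "UTF-8"
is_valid <- validUTF8(bad)    # the GBK bytes are not a valid UTF-8 sequence

# Decoding with the correct source encoding recovers the original text
good <- iconv(gbk, from = "GBK", to = "UTF-8")
recovered <- identical(charToRaw(good), charToRaw(x))
```

Here gbk_bytes holds d6 d0 ce c4, is_valid is FALSE, and recovered is TRUE, which is the same situation as in the question: correct bytes carrying the wrong encoding label.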

So you can do either of the following:

1) Read the data with the correct encoding:

df <- read_csv("中文,英文\n英文,德文", locale = locale(encoding = stringi::stri_enc_get()))

2) Read the data with the incorrect encoding and mark it with the correct encoding afterwards (note that this does not always work):

df <- read_csv("中文,英文\n英文,德文")
df <- dplyr::mutate_all(df, `Encoding<-`, value = "unknown")
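
When the data lives in a file whose encoding you already know, option 1 can also name that encoding explicitly instead of relying on the system default. The following is a minimal sketch, not part of the original answer: it assumes the file is GBK-encoded and that iconv supports GBK on your platform, and the temporary file and variable names are hypothetical.

```r
library(readr)

# Build the sample CSV from \u escapes so the script is independent of its own encoding
csv_utf8 <- "\u4e2d\u6587,\u82f1\u6587\n\u82f1\u6587,\u5fb7\u6587\n"

# Write the text to a temporary file as raw GBK bytes (assumed file encoding)
path <- tempfile(fileext = ".csv")
writeBin(charToRaw(iconv(csv_utf8, from = "UTF-8", to = "GBK")), path)

# Tell readr which encoding the file uses; it converts the contents to UTF-8
df <- read_csv(path, locale = locale(encoding = "GBK"))
col_names <- names(df)
```

If the encoding is not known in advance, readr::guess_encoding(path) returns candidate encodings with confidence scores, although its guesses can be unreliable for very short files.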
