R's read.csv prepending 1st column name with junk text

Views: 21. Published: 2021/12/28 16:42:07. Tags: r, utf-8, byte-order-mark

Problem description

I have exported data from a result grid in SQL Server Management Studio to a csv file. The csv file looks correct.

But when I read the data into an R data frame using read.csv, the first column name is prefixed with "ï..". How do I get rid of this junk text?

Example:

str(trainData)

'data.frame':   64169 obs. of  20 variables:    
 $ ï..Column1             : int  3232...   
 $ Column2                : int  4242...

The data looks something like this (nothing special):

Column1,Column2
100116577,100116577
100116698,100116702

Recommended answer

You've got a Unicode UTF-8 BOM (byte order mark, the bytes EF BB BF) at the start of the file:

http://en.wikipedia.org/wiki/Byte_order_mark

A text editor or web browser interpreting the text as ISO-8859-1 or CP1252 will display the BOM as the characters ï»¿.

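R is giving you the ï and then converting the other two characters into dots, as they are non-alphanumeric and so not valid in a column name. By default read.csv uses check.names = TRUE, which passes the header through make.names, and make.names replaces invalid characters with dots.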
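
To confirm this diagnosis before changing how the file is read, the first three bytes of the file can be compared with the BOM. The sketch below is only an illustration: it writes its own small sample file to a temporary path (`csv_path` is a hypothetical name) so that it runs on its own. With a real export, point `csv_path` at that file instead.

```r
# Write a small sample file that starts with the UTF-8 BOM bytes EF BB BF.
# (Illustrative data only; csv_path is a hypothetical temporary file.)
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
csv_path <- tempfile(fileext = ".csv")
writeBin(c(bom, charToRaw("Column1,Column2\n100116577,100116577\n")), csv_path)

# Read the first three bytes back as raw values and compare them with the BOM.
first_bytes <- readBin(csv_path, what = "raw", n = 3)
has_bom <- identical(first_bytes, bom)
print(has_bom)
```

If `has_bom` is TRUE for the exported file, the junk prefix comes from the BOM and not from the data itself.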

Here:

http://r.789695.n4.nabble.com/Writing-Unicode-Text-into-Text-File-from-R-in-Windows-td4684693.html

Duncan Murdoch suggests:

You can declare a file to be in encoding "UTF-8-BOM" if you want to ignore a BOM on input

So try your read.csv with fileEncoding="UTF-8-BOM", or persuade your SQL wotsit to not output a BOM.
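
A minimal sketch of the fileEncoding approach follows. It assumes a file shaped like the sample in the question; so that the example is self-contained, it first writes such a file, BOM included, to a temporary path (`csv_path` is a hypothetical name, and the real export would be used in its place).

```r
# Create a sample CSV that begins with a UTF-8 BOM (illustrative data only).
csv_path <- tempfile(fileext = ".csv")
writeBin(
  c(as.raw(c(0xEF, 0xBB, 0xBF)),
    charToRaw("Column1,Column2\n100116577,100116577\n100116698,100116702\n")),
  csv_path
)

# Declaring the encoding as "UTF-8-BOM" makes R skip the BOM on input,
# so the first column name comes through without the junk prefix.
trainData <- read.csv(csv_path, fileEncoding = "UTF-8-BOM")
print(names(trainData))
str(trainData)
```

The only change to the original call is the added fileEncoding argument; the rest of the read.csv call stays as it was.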

Otherwise you may as well test if the first name starts with ï.. and strip it with substr (as long as you know you'll never have a column that does start like that genuinely...)
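
The substr fallback could look roughly like the sketch below. The helper name `strip_bom_name` and its `prefix` argument are made up for illustration. The prefix is written as "\u00ef.." so the code does not depend on how the script file itself is encoded. The mangled prefix may look different on other platforms or locales, so check names() on your own data first.

```r
# Hypothetical helper: drop a mangled BOM prefix from the first column name.
# Only the first name is inspected, since the BOM sits at the start of the file.
strip_bom_name <- function(nms, prefix = "\u00ef..") {
  if (length(nms) > 0 && startsWith(nms[1], prefix)) {
    # Keep everything after the prefix characters.
    nms[1] <- substr(nms[1], nchar(prefix) + 1, nchar(nms[1]))
  }
  nms
}

# Example with names shaped like those in the question.
cleaned <- strip_bom_name(c("\u00ef..Column1", "Column2"))
print(cleaned)
```

On a data frame this would be applied as `names(trainData) <- strip_bom_name(names(trainData))`. Names that do not start with the prefix are returned unchanged.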
