data.table::fread and Unbalanced "


Problem description

When I tried to read a csv file using data.table::fread(fn, sep='\t', header=T), it gave an "Unbalanced " observed on this line" error. The data has 3 integer variables and 1 string variable. The strings in the csv file are not enclosed in ", and yes, some lines contain " within the string variable where the " characters are not paired.

I am wondering whether it is possible to have fread just ignore the unpaired " in the variable and continue reading the data. Thanks.

Here is the sample data (just one record):

N_ID    VISIT_DATE  REQ_URL REQType
175931  2013-3-8 23:40:30   http://aaa.com/rest/api2.do?api=getSetMobileSession&data={"imei":"60893ZTE-CN13cd","appkey":"android_client","content":"Z0JiRA0qPFtWM3BYVltmcx5MWF9ZS0YLdW1ydXoqPycuJS8idXdlY3R0TGBtU   1

Recommended answer

Update: now implemented in v1.8.11

From NEWS:

fread now accepts quotes (both ' and ") in the middle of fields, whether the field starts with " or not, rather than the 'unbalanced quotes' error, #2694. Thanks to baidao for reporting. It was known and documented at the top of ?fread (text now removed). If a field starts with " it must end with " (necessary if the field separator itself is in the field contents). Embedded quotes can be in column names too. Newlines (\n) still can't be in quoted fields or quoted column names, yet.
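As a rough illustration of the behaviour described in the NEWS item, the sketch below writes a single tab-separated record to a temporary file and reads it back. The file contents are a shortened, hypothetical stand-in for the sample record above. The `quote = ""` argument is an assumption about the installed version: it exists in later data.table releases than v1.8.11 and tells fread to treat the quote character as ordinary data.

```r
library(data.table)

# Hypothetical temp file: one tab-separated record whose third field
# contains an odd number of double quotes and is not itself quoted.
fn <- tempfile(fileext = ".tsv")
writeLines(c(
  "N_ID\tVISIT_DATE\tREQ_URL\tREQType",
  "175931\t2013-3-8 23:40:30\thttp://aaa.com/rest/api2.do?data={\"imei\":\"60893ZTE\",\"content\":\"Z0JiRA\t1"
), fn)

# quote = "" (assumed to be available in the installed data.table)
# disables quote handling, so unpaired quotes are read as plain characters.
dt <- fread(fn, sep = "\t", header = TRUE, quote = "")

print(dim(dt))
print(dt$REQ_URL)
```

On a version that includes the fix, the default `quote` setting may well read this file too; passing `quote = ""` just makes the intent explicit.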


Yes, as @agstudy said, embedded quotes are a known, documented limitation that has not been implemented yet, since fread is new. Strictly speaking, though, I suppose these aren't embedded quotes, because the string in your example doesn't start with a quote.

Anyway, I've filed this as a bug report so it doesn't get forgotten. It is to be done in the next release. Thanks for highlighting it.

#2694: Strings that contain quotes but don't start with a quote in fread
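For a data.table version that predates the fix, one possible fallback (not part of the answer above, only a sketch) is base R's `read.delim` with quoting switched off. The temporary file and its contents are hypothetical and mirror a shortened form of the sample record.

```r
# Hypothetical temp file with one tab-separated record containing
# unpaired double quotes in an unquoted field.
fn <- tempfile(fileext = ".tsv")
writeLines(c(
  "N_ID\tVISIT_DATE\tREQ_URL\tREQType",
  "175931\t2013-3-8 23:40:30\thttp://aaa.com/rest/api2.do?data={\"imei\":\"60893ZTE\",\"content\":\"Z0JiRA\t1"
), fn)

# quote = "" makes read.delim treat quote characters as plain text.
df <- read.delim(fn, quote = "", stringsAsFactors = FALSE)

str(df)

# Convert to a data.table afterwards if one is needed:
# dt <- data.table::as.data.table(df)
```

This trades fread's speed for base R's slower parser, so it is mainly useful for small or moderately sized files.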
