trimws bug? Leading whitespace not removed
Question
Edit: Thanks to R Yoda, I was finally able to create a reproducible example of the issue I am facing:
x = rawToChar(as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32)))
trimws(x)
=> Question: How can I trim x?
Old text of the question:
See the attached screenshot. Unfortunately, I can't create a reproducible example because dput is affecting the result...
Does anyone have an idea how to investigate what's going wrong with x? The leading whitespace doesn't seem to be a standard one!
charToRaw(x)
gives a0 31 31 2e 31 33 32 35 39 32
dput(charToRaw(x))
gives as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32))
Encoding(x)
gives "unknown" (same as Encoding(" 11.132592"))
Recommended answer
0xa0 encodes a different kind of space, the non-breaking space (in Latin-1), while 0x20 is the ordinary space. trimws searches for spaces, tabs, line feeds, and carriage returns (represented by [ \t\r\n]+) but not for non-breaking spaces, hence it does not work here.
You can use sub (to remove either leading or trailing whitespace) or gsub (to remove both leading and trailing whitespace) to strip any kind of leading or trailing whitespace, including the one represented by 0xa0:
sub("^\\s+", "", x)
[1] "11.132592"
And for removing leading and trailing spaces:
gsub("(^\\s+)|(\\s+$)", "", x)
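As an alternative sketch, not part of the original answer: how `\\s` treats a raw 0xa0 byte may depend on the locale, so one option is to declare the encoding explicitly and then let trimws do the work. This assumes the 0xa0 byte really is a Latin-1 non-breaking space and that R is version 3.6.0 or later, where trimws accepts a `whitespace` argument. The names `y` and `trimmed` are only illustrative.

```r
# Rebuild the string from the question: one 0xa0 byte followed by "11.132592".
x <- rawToChar(as.raw(c(0xa0, 0x31, 0x31, 0x2e, 0x31, 0x33, 0x32, 0x35, 0x39, 0x32)))

# Assumption: the 0xa0 byte is a Latin-1 non-breaking space.
# Declare that encoding and convert to UTF-8 so the string is valid in any locale.
y <- iconv(x, from = "latin1", to = "UTF-8")

# "[\\h\\v]" matches horizontal and vertical whitespace, including the
# non-breaking space; the `whitespace` argument needs R >= 3.6.0.
trimmed <- trimws(y, whitespace = "[\\h\\v]")

# The trimmed string can now be parsed as a number.
value <- as.numeric(trimmed)
print(trimmed)
```

If the data come from a file, it may be simpler to pass the correct encoding when reading it, so that the conversion step is not needed afterwards.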