Split an uneven character string in R with space
Problem description
I have read many posts on splitting strings in R, but I am running into an error that I think comes from the way the variables were read into R: in some rows there is a space after the date because the ID is shorter. I am trying to split the character variable "VESSELID" into two new variables, "vesselID" and "DATE". Below is a subset of my dataset.
> dput(df)
structure(list(SETID = c(24153L, 24187L, 24215L, 31990L, 31990L,
31995L, 31995L, 31995L, 31996L, 31996L, 31996L, 31997L, 31997L,
32002L, 32002L, 32002L, 32002L, 32003L, 32003L, 32003L), VESSELID = c("6830 2002/08/13 ",
"6830 2002/08/12 ", "6830 2002/08/15 ", "105372 2002/08/23",
"105372 2002/08/23", "104234 2002/07/20", "104234 2002/07/20",
"104234 2002/07/20", "104234 2002/07/21", "104234 2002/07/21",
"104234 2002/07/21", "104234 2002/07/22", "104234 2002/07/22",
"5744 2002/08/14 ", "5744 2002/08/14 ", "5744 2002/08/14 ",
"5744 2002/08/14 ", "5744 2002/08/13 ", "5744 2002/08/13 ",
"5744 2002/08/13 ")), .Names = c("SETID", "VESSELID"), row.names = c(1L,
2L, 3L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L,
21L, 22L, 23L, 24L, 25L, 26L), class = "data.frame")
I did try the following:
library(reshape2)
test <- data.frame(df, colsplit(df$VESSELID, split= " ",names=c("vesselID","DATE")))
However, I get this error message:
Error in colsplit(log21$VESSELID, split = " ", names = c("vesselID", "DATE")) :
unused argument(s) (split = " ")
The split argument doesn't seem to work properly. I don't know how to fix my character string.
Solution
The argument name is not split; it is pattern:
test <- data.frame(df, colsplit(df$VESSELID, pattern = " ",names=c("vesselID","DATE")))
This gives:
   SETID          VESSELID vesselID       DATE
1  24153   6830 2002/08/13     6830 2002/08/13
2  24187   6830 2002/08/12     6830 2002/08/12
3  24215   6830 2002/08/15     6830 2002/08/15
10 31990 105372 2002/08/23   105372 2002/08/23
11 31990 105372 2002/08/23   105372 2002/08/23
12 31995 104234 2002/07/20   104234 2002/07/20
13 31995 104234 2002/07/20   104234 2002/07/20
14 31995 104234 2002/07/20   104234 2002/07/20
15 31996 104234 2002/07/21   104234 2002/07/21
16 31996 104234 2002/07/21   104234 2002/07/21
17 31996 104234 2002/07/21   104234 2002/07/21
18 31997 104234 2002/07/22   104234 2002/07/22
19 31997 104234 2002/07/22   104234 2002/07/22
20 32002   5744 2002/08/14     5744 2002/08/14
21 32002   5744 2002/08/14     5744 2002/08/14
22 32002   5744 2002/08/14     5744 2002/08/14
23 32002   5744 2002/08/14     5744 2002/08/14
24 32003   5744 2002/08/13     5744 2002/08/13
25 32003   5744 2002/08/13     5744 2002/08/13
26 32003   5744 2002/08/13     5744 2002/08/13
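Since some VESSELID values in the question end with a space, it may also be worth trimming the strings before splitting them. The following is only a minimal base R sketch of that idea, not part of the original answer: it uses three sample values copied from the question, and the names vessel, trimmed, parts, mat and result are made up for illustration. It assumes every value contains exactly one ID and one date.

```r
# Sample values copied from the question's VESSELID column
# (two of them carry a trailing space).
vessel <- c("6830 2002/08/13 ", "105372 2002/08/23", "5744 2002/08/14 ")

# Strip leading and trailing whitespace so the date keeps no stray space.
trimmed <- trimws(vessel)

# Split on runs of whitespace; each element yields two pieces (ID, date).
parts <- strsplit(trimmed, "[[:space:]]+")

# Stack the pieces row-wise into a two-column character matrix.
mat <- do.call(rbind, parts)

# Convert each column to a more useful type (an assumption about what is wanted).
result <- data.frame(
  vesselID = as.integer(mat[, 1]),
  DATE     = as.Date(mat[, 2], format = "%Y/%m/%d"),
  stringsAsFactors = FALSE
)

print(result)
```

If the columns should stay as plain text, the as.integer() and as.Date() conversions can simply be dropped.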