Convert comma separated string to numeric columns


Problem description

I have a dataset with several columns, one of which holds reaction times. Within each cell the reaction times are comma separated, one value per trial for the same participant.

For example, row 1 (the data from participant 1) has the following in the "reaction times" column:

reaction_times
2000,1450,1800,2200

These are the reaction times of participant 1 for trials 1, 2, 3 and 4.

I now want to create a new dataset in which the reaction time for each trial forms its own column. That way I can calculate the mean reaction time for each trial.

              trial 1  trial 2  trial 3  trial 4 
participant 1:   2000     1450     1800     2200

I tried colsplit from the reshape2 package, but it does not seem to split my data into new columns (perhaps because all of my data is in one cell).

Any suggestions?

Recommended answer

I think you are looking for the strsplit() function:

a = "2000,1450,1800,2200"
strsplit(a, ",")
[[1]]
[1] "2000" "1450" "1800" "2200"
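
Note that the pieces returned by strsplit() are still character strings. A minimal sketch of turning them into numbers, assuming every piece is a well-formed number (as.numeric() gives NA with a warning for anything else); the variable names pieces and rt are only illustrative:

```r
a <- "2000,1450,1800,2200"

# strsplit() returns a list; [[1]] extracts the character vector for the first string
pieces <- strsplit(a, ",", fixed = TRUE)[[1]]

# Convert the character pieces to numeric values
rt <- as.numeric(pieces)

# Mean reaction time across the four trials
mean(rt)
# [1] 1862.5
```

fixed = TRUE treats the comma as a literal separator rather than a regular expression, which makes no difference for a plain comma but avoids surprises with other separators.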

Notice that strsplit returns a list, in this case with only one element. This is because strsplit takes a character vector as input and returns one list element per input string. You can therefore pass the whole column of single-cell strings to the function and get back a list with one split vector per row. In a more relevant example this looks like:

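# Create some example data: 25 participants with 4 trials each,
# stored as one comma separated string per participant.
# runif() is random, so your values will differ from the output below.
dat = data.frame(reaction_time =
                   apply(matrix(round(runif(100, 1, 2000)), 25, 4),
                         1, paste, collapse = ","),
                 stringsAsFactors = FALSE)
# Split each string on commas and stack the pieces into a character matrix
splitdat = do.call("rbind", strsplit(dat$reaction_time, ","))
# Convert every column to numeric, then name the columns by trial
splitdat = data.frame(apply(splitdat, 2, as.numeric))
names(splitdat) = paste("trial", 1:4, sep = "")
head(splitdat)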
  trial1 trial2 trial3 trial4
1    597   1071   1430    997
2    614    322   1242   1140
3   1522   1679     51   1120
4    225   1988   1938   1068
5    621    623   1174     55
6   1918   1828    136   1816

Finally, to calculate the mean per participant (one value per row):

apply(splitdat, 1, mean)
 [1] 1187.50  361.25  963.75 1017.00  916.25 1409.50  730.00 1310.75 1133.75
[10]  851.25  914.75  881.25  889.00 1014.75  676.75  850.50  805.00 1460.00
[19]  901.00 1443.50  507.25  691.50 1090.00  833.25  669.25
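
The question asked for the mean per trial rather than per participant. That is the column-wise counterpart of the call above. A minimal sketch, using a small hand-made data frame (hypothetical values, not the random data from the answer) so that the result can be checked by hand:

```r
# Hypothetical example data: three participants, four trials each
dat <- data.frame(
  reaction_time = c("2000,1450,1800,2200",
                    "1900,1500,1700,2100",
                    "2100,1550,1900,2300"),
  stringsAsFactors = FALSE
)

# Same splitting steps as in the answer
splitdat <- do.call(rbind, strsplit(dat$reaction_time, ",", fixed = TRUE))
splitdat <- as.data.frame(apply(splitdat, 2, as.numeric))
names(splitdat) <- paste0("trial", 1:4)

# Mean per trial (one value per column)
trial_means <- colMeans(splitdat)
trial_means
# trial1 trial2 trial3 trial4
#   2000   1500   1800   2200

# Mean per participant (one value per row), equivalent to apply(splitdat, 1, mean)
participant_means <- rowMeans(splitdat)
```

colMeans() and rowMeans() are base R functions; apply(splitdat, 2, mean) would give the same per-trial result.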
