Adding a column to a data.frame

Problem description

I have the data.frame below. I want to add a column that classifies my data according to column 1 (h_no), so that the first run of h_no values (1, 2, 3, 4) is class 1, the second run (1 to 7) is class 2, and so on, as shown in the last column.

h_no  h_freq   h_freqsq     class
1     0.09091  0.008264628  1
2     0.00000  0.000000000  1
3     0.04545  0.002065702  1
4     0.00000  0.000000000  1
1     0.13636  0.018594050  2
2     0.00000  0.000000000  2
3     0.00000  0.000000000  2
4     0.04545  0.002065702  2
5     0.31818  0.101238512  2
6     0.00000  0.000000000  2
7     0.50000  0.250000000  2
1     0.13636  0.018594050  3
2     0.09091  0.008264628  3
3     0.40909  0.167354628  3
4     0.04545  0.002065702  3

Solution

You can add a column to your data using various techniques. The quotes below come from the "Details" section of the relevant help text, [[.data.frame.

"Data frames can be indexed in several modes. When [ and [[ are used with a single vector index (x[i] or x[[i]]), they index the data frame as if it were a list."

my.dataframe["new.col"] <- a.vector
my.dataframe[["new.col"]] <- a.vector

"The data.frame method for $, treats x as a list"

my.dataframe$new.col <- a.vector

"When [ and [[ are used with two indices (x[i, j] and x[[i, j]]) they act like indexing a matrix"

my.dataframe[ , "new.col"] <- a.vector

With a single index, the data.frame method treats the index as selecting columns, so if you don't specify whether you're working with columns or rows, it assumes you mean columns.
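To see the four forms side by side, here is a minimal sketch on a toy data frame. The names df, a.vector and col_a to col_d are made up for illustration and do not come from the question.

```r
# Hypothetical toy data; df and a.vector are illustrative names only
df <- data.frame(id = 1:3)
a.vector <- c(10, 20, 30)

df["col_a"] <- a.vector      # single index with [  : list-style
df[["col_b"]] <- a.vector    # single index with [[ : list-style
df$col_c <- a.vector         # $                    : list-style
df[, "col_d"] <- a.vector    # two indices          : matrix-style

names(df)
```

In this sketch all four assignments append a new column holding the same values, so the choice between them is mostly a matter of style and of whether the column name is stored in a variable (in which case `$` is not an option).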


For your example, this should work:

# make some fake data
your.df <- data.frame(no = c(1:4, 1:7, 1:5), h_freq = runif(16), h_freqsq = runif(16))

# find the rows where a new sequence starts (no == 1) ...
from <- which(your.df$no == 1)
# ... and the rows where each sequence ends
to <- c((from - 1)[-1], nrow(your.df))

# for each sequence, repeat its consecutive number once per row of the sequence;
# SIMPLIFY = FALSE keeps the result a list even when all sequences have equal length
get.seq <- mapply(from, to, seq_along(from), SIMPLIFY = FALSE, FUN = function(x, y, z) {
            len <- length(seq(from = x[1], to = y[1]))
            return(rep(z, times = len))
         })

# unlisting gives a single vector, which we append to the original data.frame
your.df$group <- unlist(get.seq)
# since this designates a group, it makes sense to make it a factor
your.df$group <- as.factor(your.df$group)


   no     h_freq   h_freqsq group
1   1 0.40998238 0.06463876     1
2   2 0.98086928 0.33093795     1
3   3 0.28908651 0.74077119     1
4   4 0.10476768 0.56784786     1
5   1 0.75478995 0.60479945     2
6   2 0.26974011 0.95231761     2
7   3 0.53676266 0.74370154     2
8   4 0.99784066 0.37499294     2
9   5 0.89771767 0.83467805     2
10  6 0.05363139 0.32066178     2
11  7 0.71741529 0.84572717     2
12  1 0.10654430 0.32917711     3
13  2 0.41971959 0.87155514     3
14  3 0.32432646 0.65789294     3
15  4 0.77896780 0.27599187     3
16  5 0.06100008 0.55399326     3
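As a shorter alternative (not part of the answer above), the same grouping can be sketched with a running count of the rows where the sequence restarts. This assumes that every run starts at 1 and that the first row has no == 1; the column name group mirrors the example.

```r
# Same fake sequence numbers as in the example above
your.df <- data.frame(no = c(1:4, 1:7, 1:5))

# no == 1 marks the start of each run; cumsum() turns those marks
# into a run counter that is carried forward to the rest of the run
your.df$group <- factor(cumsum(your.df$no == 1))
```

If the data does not start with no == 1, the rows before the first restart would get group 0, so this sketch only applies under the stated assumption.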
