Selecting column sequences and creating variables
Problem description
I was wondering if there was a way to select specific columns via a sequence and create new variables from this.
So for example, if I had 8 columns with n observations, how could I create 4 variables that each select 2 columns sequentially? My dataset is much larger than this: I have 1416 variables with 62 observations each (I have pasted a link to the spreadsheet below; the first column and row contain names). I would like to create new data frames from this, named sites 1-12. So site 1 = df[,1:117]; site 2 = df[,119:237] etc.
I am planning to use this code for future datasets with even more variables, so some form of loop or sequence function would be very useful. Could anyone shed any light on how to achieve this?
https://www.dropbox.com/s/p1a5cu567lxntmw/MyData.csv?dl=0
Thank you in advance.
James
P.S. @nrussell I have copied and pasted the output of the code you mentioned below; it continues as a series of numbers like those displayed.
dput(z[ , 1:10])
structure(list(`1` = c(0, 0, 0, 0, 0, 0, 0, 0, 0.0311410340342049, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0207444023791158, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0312971643732546, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0376287494579976, 0, 0, 0, 0, 0, 0, 0),
.........
`10` = c(0, 0, 0, 0, 0.119280313679916, 0, 0, 0.301029995663981, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.715681882079494, 0.136831816210901, 0, 0, 0, 0.0273663632421801, 0, 0, 0, 0.0547327264843602, 0, 0, 0, 0, 0.0231561535126139, 0, 0, 0.0903089986991944, 0, 0, 0.0752574989159953, 0.159368821233872, 0.0272640716982664, 0.0177076468037636, 0, 0, 0.120411998265592, 0, 0, 0, 0, 0.0322532138211408, 0.0250858329719984, 0, 0, 0, 0.119280313679916, 0, 0.172922500085254, 0.225772496747986, 0, 0, 0, 0.0954242509439325, 0)), .Names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame", row.names = c(NA, -62L))
Solution
We could split the dataset ('df') with 1416 columns into 12 equal-sized blocks of 118 columns each by creating a grouping index with gl
lst <- setNames(lapply(split(1:ncol(df), as.numeric(gl(ncol(df), 118, ncol(df)))), function(i) df[,i]), paste0('site', 1:12))
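To see what the grouping index looks like, here is a small sketch using 8 column positions in blocks of 2 instead of 1416 in blocks of 118. The names `idx` and `idx_short` are only illustrative. When the number of columns is an exact multiple of the block size, asking gl for just the number of blocks should give the same index, which may be easier to read.

```r
# Grouping index for 8 column positions in blocks of 2:
# gl(n, k, length) builds a factor with n levels, each repeated k times,
# cut to `length` elements, so only the first 4 levels are actually used.
idx <- as.numeric(gl(8, 2, 8))
idx
# [1] 1 1 2 2 3 3 4 4

# Same index, requesting only the 4 blocks that are needed
idx_short <- as.numeric(gl(4, 2))

# split() then groups the column positions by that index
groups <- split(1:8, idx)
```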
Or you can create 'lst' without using split
lst <- setNames(lapply(seq(1, ncol(df), by = 118), function(i) df[i:(i+117)]), paste0('site', 1:12))
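Since the question mentions reusing this on future datasets with more variables, one possible way to package the idea is a small helper. This is only a sketch: `split_columns`, `block_size` and `prefix` are hypothetical names that do not appear in the original answer, and it assumes the blocks are consecutive runs of columns. Unlike the hard-coded `1:12`, it derives the number of sites from the data and tolerates a final block that is shorter than the others.

```r
# Hypothetical helper (not part of the original answer): split a data frame
# into consecutive blocks of `block_size` columns, returned as a named list.
split_columns <- function(df, block_size, prefix = "site") {
  # Group index: 1 for the first block_size columns, 2 for the next, and so on
  grp <- ceiling(seq_len(ncol(df)) / block_size)
  # Split the column positions by group and subset the data frame;
  # drop = FALSE keeps a single-column block as a data frame
  blocks <- lapply(split(seq_len(ncol(df)), grp),
                   function(i) df[, i, drop = FALSE])
  setNames(blocks, paste0(prefix, seq_along(blocks)))
}

# Small example with 8 columns, as in the answer's test data
set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8 * 10, replace = TRUE), ncol = 8))
lst1 <- split_columns(df1, 2)
names(lst1)
# [1] "site1" "site2" "site3" "site4"

# A block size that does not divide the column count: blocks of 3, 3 and 2
lst_uneven <- split_columns(df1, 3)
```

For the dataset in the question, the call would be `split_columns(df, 118)`.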
If we need to create 12 dataset objects in the global environment, list2env is an option (I would prefer to work within the 'lst' itself)
list2env(lst, envir=.GlobalEnv)
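As a rough illustration of what working within the list can look like, the sketch below rebuilds the small 8-column example and applies the same summary to every site without creating any global objects. The summary itself (the mean of all values in a site) and the name `site_means` are arbitrary choices made for the example.

```r
# Rebuild the small example: 8 columns split into 4 sites of 2 columns each
set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8 * 10, replace = TRUE), ncol = 8))
lst1 <- setNames(lapply(seq(1, ncol(df1), by = 2),
                        function(i) df1[i:(i + 1)]),
                 paste0("site", 1:4))

# Access a single site by name, with no global object needed
first_site <- lst1[["site1"]]

# Apply one summary to every site; vapply returns a named numeric vector
site_means <- vapply(lst1, function(d) mean(as.matrix(d)), numeric(1))
```

Keeping the sites in one list means the same code works unchanged if a future dataset has more or fewer sites.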
Using a small dataset ('df1') with '8' columns
lst1 <- setNames(lapply(split(1:ncol(df1), as.numeric(gl(ncol(df1), 2, ncol(df1)))), function(i) df1[,i]), paste0('site', 1:4))
list2env(lst1, envir=.GlobalEnv)
head(site1,3)
#  V1 V2
#1  6 12
#2  4  7
#3 14 14
head(site4,3)
#  V7 V8
#1 10  2
#2  5  4
#3  5  0
data
set.seed(24)
df1 <- as.data.frame(matrix(sample(0:20, 8*10, replace=TRUE), ncol=8))