Divide long list into shorter lists in R
Problem description
I have a long list of objects that I need to divide into smaller lists, each with 20 entries. The catch is that each object may appear only once within any single list.
# Create some example data...
# Make a list of objects.
LIST <- c('Oranges', 'Toast', 'Truck', 'Dog', 'Hippo', 'Bottle', 'Hope', 'Mint', 'Red', 'Trees', 'Watch', 'Cup', 'Pencil', 'Lunch', 'Paper', 'Peanuts', 'Cloud', 'Forever', 'Ocean', 'Train', 'Fork', 'Moon', 'Horse', 'Parrot', 'Leaves', 'Book', 'Cheese', 'Tin', 'Bag', 'Socks', 'Lemons', 'Blue', 'Plane', 'Hammock', 'Roof', 'Wind', 'Green', 'Chocolate', 'Car', 'Distance')
# Generate a longer list, with a random sequence and number of repetitions for each entry
LONG.LIST <- data.frame(Name = (sample(LIST, size = 200, replace = TRUE)))
print(LONG.LIST)
Name
1 Cup
2 Distance
3 Roof
4 Pencil
5 Lunch
6 Toast
7 Watch
8 Bottle
9 Car
10 Roof
11 Lunch
12 Forever
13 Cheese
14 Oranges
15 Ocean
16 Chocolate
17 Socks
18 Leaves
19 Oranges
20 Distance
21 Green
22 Paper
23 Red
24 Paper
25 Trees
26 Chocolate
27 Bottle
28 Dog
29 Wind
30 Parrot
etc....
Using the example generated above, 'Distance' appears at both position 2 and position 20, 'Lunch' at both 5 and 11, and 'Oranges' at 14 and 19, so the first list without duplicates would need to extend to include 'Green', 'Paper' and 'Red'. The second list would then begin with 'Paper' at position 24.
The last list is likely to be incomplete, so it would be good to pad it with NA values.
It would be simplest if the output were columns in a single data frame.
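The padding and single-data-frame layout asked for here could look roughly like the following. This is only a minimal base R sketch: the helper `pad_to` and the two short vectors are made up for illustration and are not part of the original data.

```r
# Hypothetical helper: pad a vector with NA up to length n.
pad_to <- function(x, n) {
  length(x) <- n  # extending a vector's length fills the new slots with NA
  x
}

# Two made-up short lists of unequal length (illustration only).
short_lists <- list(List1 = c("Cup", "Roof", "Pencil"),
                    List2 = c("Roof", "Lunch"))

# Pad every list to the length of the longest one, then bind as columns.
n <- max(lengths(short_lists))
padded <- as.data.frame(lapply(short_lists, pad_to, n = n),
                        stringsAsFactors = FALSE)

print(padded)
```

Assigning to `length(x)` is used here because it pads with NA of the vector's own type, so character columns stay character.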
I've no idea where to even start with this, so any suggestions are really appreciated. Thanks!
Recommended answer
We can do this with the tidyverse. Grouping by 'Name', create a column 'grp' that numbers each occurrence of a name. Because 'grp' is the occurrence number, no name can appear twice within the same 'grp'. Then, grouping by 'grp', create a row-index column 'ind', convert to 'wide' format with spread, and use order to sort the values in each column alphabetically.
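library(tidyverse)

LONG.LIST %>%
  group_by(Name) %>%
  mutate(grp = row_number()) %>%  # grp = k for the k-th occurrence of a name
  group_by(grp) %>%
  mutate(ind = row_number()) %>%  # ind = row position within each grp
  spread(grp, Name) %>%           # one column per grp, one row per ind
  # Sort each column alphabetically; NA values end up last.
  # Note: funs() is deprecated as of dplyr 0.8.0; newer code would pass a
  # formula instead, e.g. ~ .[order(as.character(.))]
  mutate_at(vars(-one_of("ind")), funs(.[order(as.character(.))]))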
# A tibble: 40 x 12
# ind `1` `2` `3` `4` `5` `6` `7` `8` `9` `10` `11`
# <int> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr>
# 1 1 Bag Bag Bag Bag Bag Bag Bag Bag Cup Distance Distance
# 2 2 Blue Blue Book Book Book Cloud Cup Cup Distance Train NA
# 3 3 Book Book Bottle Cloud Cloud Cup Distance Distance Train NA NA
# 4 4 Bottle Bottle Cheese Cup Cup Distance Dog Hammock NA NA NA
# 5 5 Car Car Cloud Distance Distance Dog Hammock Moon NA NA NA
# 6 6 Cheese Cheese Cup Dog Dog Hammock Moon Parrot NA NA NA
# 7 7 Chocolate Chocolate Distance Fork Hammock Horse Paper Train NA NA NA
# 8 8 Cloud Cloud Dog Hammock Horse Moon Parrot NA NA NA NA
# 9 9 Cup Cup Fork Hippo Mint Paper Train NA NA NA NA
#10 10 Distance Distance Green Horse Moon Parrot NA NA NA NA NA
# ... with 30 more rows
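For readers without the tidyverse, the same occurrence-numbering idea can be sketched in base R. This is an illustrative sketch and not the answer's own code: the small `items` vector, the seed, and the `List` column prefix are assumptions chosen to keep the example short and reproducible. Like the pipeline above, it does not cap each list at 20 entries.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Small made-up data set (illustration only).
items <- c("Cup", "Roof", "Pencil", "Lunch", "Toast")
long_list <- data.frame(Name = sample(items, size = 12, replace = TRUE),
                        stringsAsFactors = FALSE)

# grp = k for the k-th occurrence of each name.
long_list$grp <- ave(seq_along(long_list$Name), long_list$Name,
                     FUN = seq_along)

# Split names by occurrence number; no name repeats within a piece.
pieces <- split(long_list$Name, long_list$grp)
names(pieces) <- paste0("List", names(pieces))

# Pad every piece with NA to a common length and bind as columns.
n <- max(lengths(pieces))
wide <- as.data.frame(lapply(pieces, function(x) { length(x) <- n; x }),
                      stringsAsFactors = FALSE)

print(wide)
```

Unlike the tidyverse version, this sketch keeps each column in order of appearance instead of sorting it alphabetically.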