I want to generate 8 combinations of names from a column in an R data frame based on conditions from other columns in the same data frame
I have a data frame with 20 players from 4 different teams (5 players per team), each assigned a salary from a fantasy draft. I would like to create all combinations of 8 players whose total salary is 10000 or less and whose total points are greater than x, excluding any combination that contains 4 or more players from the same team.
Here is what my data frame looks like:
Team Player K D A LH Points Salary PPS
4 ATN ExoticDeer 6.1 3.3 6.4 306.9 22.209 1622 1.3692
2 ATN Supreme 6.8 5.3 7.1 229.4 21.954 1578 1.3913
1 ATN sasu 3.6 6.4 11.0 95.7 19.357 1244 1.5560
3 ATN eL lisasH 2 2.6 6.1 7.9 29.7 12.037 998 1.2061
5 ATN Nisha 2.7 5.6 7.5 48.2 12.282 955 1.2861
11 CL Swiftending 6.0 5.8 7.8 360.5 22.285 1606 1.3876
13 CL Pajkatt 13.3 7.5 9.3 326.8 37.248 1489 2.5015
15 CL SexyBamboe 6.3 8.5 9.3 168.0 20.660 1256 1.6449
14 CL EGM 2.8 6.0 13.5 78.8 21.988 989 2.2233
12 CL Saksa 2.5 6.5 10.5 59.8 15.898 967 1.6441
51 DBEARS Ace 7.0 3.4 6.9 195.6 23.596 1578 1.4953
31 DBEARS HesteJoe 5.4 5.4 6.1 176.7 16.927 1512 1.1195
61 DBEARS Miggel 2.8 6.8 11.0 141.8 17.818 1212 1.4701
21 DBEARS Noia 3.0 6.0 8.0 36.1 13.161 970 1.3568
41 DBEARS Ryze 2.7 4.7 6.7 74.6 12.166 937 1.2984
8 GB Keyser Soze 6.0 5.0 5.6 316.0 19.120 1602 1.1935
9 GB Madara 5.4 5.3 6.6 334.5 19.405 1577 1.2305
10 GB SkyLark 1.8 5.3 7.0 71.8 10.218 1266 0.8071
7 GB MNT 2.3 5.9 6.1 85.6 9.316 1007 0.9251
6 GB SKANKS224 1.4 7.6 7.4 52.5 7.565 954 0.7930
I am following the general concept described in this post, "I want to generate combinations of 5 names from a column in an R data frame, whose values in a different column add up to a certain number or less", and tweaking the code to suit my needs. This is what I have so far:
## make a list of all combinations of 8 of Player, Points and Salary
xx <- with(FantasyPlayers, lapply(list(as.character(Player), Points, Salary), combn, 8))
## convert the names to a string,
## find the column sums of the others,
## set the names
yy <- setNames(
lapply(xx, function(x) {
if(typeof(x) == "character") apply(x, 2, toString) else colSums(x)
}),
names(FantasyPlayers)[c(2, 7, 8)]
)
## coerce to data.frame
newdf <- as.data.frame(yy)
Using the above code I am able to generate all possible lineups of 8 players and then subset them by various criteria (total salary and number of points), but I am struggling when it comes to excluding the lineups where there are more than 3 players from the same team.
I imagine the lineups would need to be excluded from newdf but I don't really know where to begin in doing that.
Here are the dput results:
structure(list(Team = c("ATN", "ATN", "ATN", "ATN", "ATN", "CL",
"CL", "CL", "CL", "CL", "DBEARS", "DBEARS", "DBEARS", "DBEARS",
"DBEARS", "GB", "GB", "GB", "GB", "GB"), Player = structure(c(2L,
5L, 4L, 1L, 3L, 15L, 12L, 14L, 11L, 13L, 16L, 18L, 19L, 20L,
21L, 6L, 7L, 10L, 8L, 9L), .Label = c("eL lisasH 2", "ExoticDeer",
"Nisha", "sasu", "Supreme", "Keyser Soze", "Madara", "MNT", "SKANKS224",
"SkyLark", "EGM", "Pajkatt", "Saksa", "SexyBamboe", "Swiftending",
"Ace", "DruidzOzoneShoc", "HesteJoe", "Miggel", "Noia", "Ryze"
), class = "factor"), K = c(6.1, 6.8, 3.6, 2.6, 2.7, 6, 13.3,
6.3, 2.8, 2.5, 7, 5.4, 2.8, 3, 2.7, 6, 5.4, 1.8, 2.3, 1.4), D = c(3.3,
5.3, 6.4, 6.1, 5.6, 5.8, 7.5, 8.5, 6, 6.5, 3.4, 5.4, 6.8, 6,
4.7, 5, 5.3, 5.3, 5.9, 7.6), A = c(6.4, 7.1, 11, 7.9, 7.5, 7.8,
9.3, 9.3, 13.5, 10.5, 6.9, 6.1, 11, 8, 6.7, 5.6, 6.6, 7, 6.1,
7.4), LH = c(306.9, 229.4, 95.7, 29.7, 48.2, 360.5, 326.8, 168,
78.8, 59.8, 195.6, 176.7, 141.8, 36.1, 74.6, 316, 334.5, 71.8,
85.6, 52.5), Points = c(22.209, 21.954, 19.357, 12.037, 12.282,
22.285, 37.248, 20.66, 21.988, 15.898, 23.596, 16.927, 17.818,
13.161, 12.166, 19.12, 19.405, 10.218, 9.316, 7.565), Salary = c(1622,
1578, 1244, 998, 955, 1606, 1489, 1256, 989, 967, 1578, 1512,
1212, 970, 937, 1602, 1577, 1266, 1007, 954), PPS = c(1.3692,
1.3913, 1.556, 1.2061, 1.2861, 1.3876, 2.5015, 1.6449, 2.2233,
1.6441, 1.4953, 1.1195, 1.4701, 1.3568, 1.2984, 1.1935, 1.2305,
0.8071, 0.9251, 0.793)), .Names = c("Team", "Player", "K", "D",
"A", "LH", "Points", "Salary", "PPS"), class = "data.frame", row.names = c("4",
"2", "1", "3", "5", "11", "13", "15", "14", "12", "51", "31",
"61", "21", "41", "8", "9", "10", "7", "6"))
Here's one way:
splt.names <- strsplit(as.character(newdf$Player), ", ")
indices <- lapply(splt.names, function(x) match(x, FantasyPlayers$Player))
exclude <- lapply(indices, function(x) any(table(FantasyPlayers$Team[x]) > 3))
newdf2 <- newdf[!unlist(exclude), ]
First split the Player column on the ", " separator that toString inserted. Then match the player names to the FantasyPlayers$Player column. With those indices, the main work is done by any(table(FantasyPlayers$Team[x]) > 3). This checks whether any team's count exceeds three, which indicates four or more players from the same team.
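To see the team-count check in isolation, here is a minimal sketch on a small made-up data set. The data frame toy, the lineup size k, the cap max_per_team, and the salary and points thresholds are all hypothetical and chosen only to keep the number of combinations small. It is a variation on the approach above, not the same code: it runs combn() on row indices so the team check can reuse the indices directly instead of splitting the joined name strings again.

```r
# Toy data (hypothetical): 8 players, 5 on team "A" and 3 on team "B"
toy <- data.frame(
  Team   = c(rep("A", 5), rep("B", 3)),
  Player = paste0("p", 1:8),
  Points = c(10, 12, 8, 15, 9, 11, 7, 14),
  Salary = c(1500, 1600, 1000, 1700, 1100, 1400, 900, 1650),
  stringsAsFactors = FALSE
)

k <- 5             # lineup size (8 in the question)
max_per_team <- 3  # at most this many players from one team

# Each column of idx holds the row numbers of one lineup
idx <- combn(nrow(toy), k)

# One row per lineup: joined names plus summed points and salary
lineups <- data.frame(
  Player = apply(idx, 2, function(i) toString(toy$Player[i])),
  Points = apply(idx, 2, function(i) sum(toy$Points[i])),
  Salary = apply(idx, 2, function(i) sum(toy$Salary[i])),
  stringsAsFactors = FALSE
)

# TRUE where some team contributes more than max_per_team players
too_many <- apply(idx, 2, function(i) any(table(toy$Team[i]) > max_per_team))

kept <- lineups[!too_many, ]
nrow(lineups)  # 56 lineups in total
nrow(kept)     # 40 lineups with at most 3 players per team

# The salary and points criteria can then be applied as an ordinary subset
# (the thresholds here are arbitrary)
final <- subset(kept, Salary <= 7000 & Points > 50)
```

Of the 56 lineups, the 16 dropped are those with 4 or 5 players from team "A" (15 + 1). With the real data, the same pattern applies with k set to 8; filtering on salary before building the name strings may save memory, since choose(20, 8) is 125970 lineups.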