Selecting specific rows etc. using ddply
Question
I have a three-part question based on a data frame (df shows example rows) of goals scored by soccer players in a season:
Player            Season  Goals
Teddy Sheringham  1992/3  22
Les Ferdinand     1992/3  20
Dean Holdsworth   1992/3  19
Andy Cole         1993/4  34
Alan Shearer      1993/4  31
Chris Sutton      1993/4  25
If I want to obtain the top scorer each year, I can use
ddply(df, "Season", summarise, maxGoals = max(Goals),
Player=Player[which.max(Goals)])
Questions:
1) It does not apply in this case, but does this suffice if there are joint top scorers?
2) I am also interested in extracting the runner-up for each season. I have played around with sorting on Goals in descending order and taking index 2, but have not found a solution.
3) Also, how would I obtain a count for each year based on the number of goals scored? E.g. Goals > 20 should give 1 for 1992/3 and 3 for 1993/4 on the above data.
Recommended answer
If there are several joint top scorers in a season, that expression reports only one of them: which.max returns the index of the first maximum, so you get whichever tied player appears first in that season's rows of the data frame.
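If you want every joint top scorer rather than just the first, one possible approach (a sketch, not part of the original answer) is to pass ddply a function that keeps all rows matching the season maximum. The "Player X" row below is hypothetical, added only to create a tie; the rest are the sample rows from the question.

```r
library(plyr)

# Sample rows from the question, plus one HYPOTHETICAL tied row ("Player X")
# added purely to demonstrate the tie case; it is not real data.
df <- data.frame(
  Player = c("Teddy Sheringham", "Les Ferdinand", "Dean Holdsworth",
             "Andy Cole", "Alan Shearer", "Chris Sutton", "Player X"),
  Season = c("1992/3", "1992/3", "1992/3",
             "1993/4", "1993/4", "1993/4", "1992/3"),
  Goals  = c(22, 20, 19, 34, 31, 25, 22),
  stringsAsFactors = FALSE
)

# For each season, keep every row whose Goals equals that season's maximum,
# so tied players all appear in the result.
top <- ddply(df, "Season", function(d) d[d$Goals == max(d$Goals), ])
print(top)
```

With the hypothetical tie, 1992/3 yields two rows and 1993/4 yields one.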
For question 2:
d = ddply(df, "Season", summarise, SecondPlayer=Player[order(Goals)[length(Goals)-1]])
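Note that picking the second-to-last position of order(Goals) assumes each season has at least two rows, and with a tie at the top it would return one of the joint top scorers. If "runner-up" should instead mean the second-highest distinct goal total, a sketch along these lines might work (the name runner_up is just illustrative):

```r
library(plyr)

# Sample rows from the question
df <- data.frame(
  Player = c("Teddy Sheringham", "Les Ferdinand", "Dean Holdsworth",
             "Andy Cole", "Alan Shearer", "Chris Sutton"),
  Season = c("1992/3", "1992/3", "1992/3", "1993/4", "1993/4", "1993/4"),
  Goals  = c(22, 20, 19, 34, 31, 25),
  stringsAsFactors = FALSE
)

runner_up <- ddply(df, "Season", function(d) {
  # Second-highest distinct goal total; NA if the season has only one value
  second <- sort(unique(d$Goals), decreasing = TRUE)[2]
  # which() drops NA comparisons, so such a season contributes no rows
  d[which(d$Goals == second), ]
})
print(runner_up)
```

On the sample data this gives Les Ferdinand (20) for 1992/3 and Alan Shearer (31) for 1993/4.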
For question 3:
d = ddply(df, "Season", summarise, Count=sum(Goals > 20))
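The count works because Goals > 20 produces a logical vector and sum() treats TRUE as 1. As a quick check, here is a self-contained run on the sample rows from the question (the variable name counts is illustrative):

```r
library(plyr)

# Sample rows from the question
df <- data.frame(
  Player = c("Teddy Sheringham", "Les Ferdinand", "Dean Holdsworth",
             "Andy Cole", "Alan Shearer", "Chris Sutton"),
  Season = c("1992/3", "1992/3", "1992/3", "1993/4", "1993/4", "1993/4"),
  Goals  = c(22, 20, 19, 34, 31, 25),
  stringsAsFactors = FALSE
)

# sum() over a logical vector counts the TRUE values per season
counts <- ddply(df, "Season", summarise, Count = sum(Goals > 20))
print(counts)
```

This should give a Count of 1 for 1992/3 and 3 for 1993/4, matching the values expected in the question.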