Finding the index or unique values from a dataframe column
I have a dataframe:
TableName  Function  Argument
A          func1     3
B          func1     4
A          func2     6
B          func2     2
C          func1     5
I first want to find the unique TableName values in the dataframe, which is simple using the unique function. However, I also need the row indexes at which each unique value occurs, mapped to that value. Something like:
TableName  Index
A          1 3
B          2 4
C          5
Later I want to read this output to get each unique TableName value (e.g. A) and then use the indexes corresponding to it one by one (1 and then 3) to perform some operations.
Please suggest an approach.
Here is a dplyr solution where we create a variable with row_number() and use that as our index:
library(dplyr)

df %>%
  mutate(new = row_number()) %>%        # record each row's position
  group_by(TableName) %>%
  summarise(Index = toString(new))      # collapse positions into one string per group
which gives:
# A tibble: 3 x 2
  TableName Index
  <fct>     <chr>
1 A         1, 3
2 B         2, 4
3 C         5
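Because toString() collapses the indexes into a single string, they have to be parsed back into numbers before they can be used one by one. A minimal sketch of that step, assuming the ", " separator that toString() uses by default; the variable names index_string and index_values are only illustrative:

```r
# Index string as produced by toString() for one group, e.g. TableName A
index_string <- "1, 3"

# Split on the literal ", " separator and convert the pieces back to integers
index_values <- as.integer(strsplit(index_string, ", ", fixed = TRUE)[[1]])

print(index_values)
```

This extra parsing step is the main cost of the string form.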
You can also save them as lists rather than strings, which will make later operations easier:
df %>%
  mutate(new = row_number()) %>%
  group_by(TableName) %>%
  summarise(Index = list(new))
which gives:
# A tibble: 3 x 2
  TableName Index
  <fct>     <list>
1 A         <int [2]>
2 B         <int [2]>
3 C         <int [1]>
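For the later step described in the question (take each unique TableName, then walk through its indexes one at a time), the following is a sketch only. It swaps in base R's split() in place of the dplyr pipeline so that no package is needed. The data frame is rebuilt from the question's example, index_by_table is an illustrative name, and the cat() call is a placeholder for whatever operation is actually required:

```r
# Rebuild the example data frame from the question
df <- data.frame(
  TableName = c("A", "B", "A", "B", "C"),
  Function  = c("func1", "func1", "func2", "func2", "func1"),
  Argument  = c(3, 4, 6, 2, 5),
  stringsAsFactors = TRUE
)

# Named list: one integer vector of row positions per unique TableName
index_by_table <- split(seq_len(nrow(df)), df$TableName)

# Visit each unique TableName, then each of its row indexes in turn
for (table_name in names(index_by_table)) {
  for (i in index_by_table[[table_name]]) {
    # Placeholder operation: print the Function and Argument of row i
    cat(table_name, i, as.character(df$Function[i]), df$Argument[i], "\n")
  }
}
```

A list column produced by summarise(Index = list(new)) can be consumed in the same way, since each element is likewise an integer vector of row positions.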