Dataframe create new column based on other columns
I have a dataframe:
df <- data.frame('a'=c(1,2,3,4,5), 'b'=c(1,20,3,4,50))
df
a b
1 1 1
2 2 20
3 3 3
4 4 4
5 5 50
and I want to create a new column based on existing columns. Something like this:
if (df[['a']] == df[['b']]) {
df[['c']] <- df[['a']] + df[['b']]
} else {
df[['c']] <- df[['b']] - df[['a']]
}
The problem is that the if condition is checked only for the first row... If I create a function from the above if statement and then use apply() (or mapply()...), it is the same.
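To make the symptom concrete: the comparison df[['a']] == df[['b']] produces one logical value per row, while if expects a single TRUE or FALSE. The sketch below only inspects that condition, using the same df as above. Note that in R 4.2.0 and later a condition of length greater than one is an error in if; older versions gave a warning and used only the first element.

```r
df <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, 20, 3, 4, 50))

# The comparison is element-wise, so it returns one value per row
cond <- df[['a']] == df[['b']]
cond
# [1]  TRUE FALSE  TRUE  TRUE FALSE

length(cond)
# [1] 5

# if() works on a single value; "checked only for the first row"
# corresponds to this element
cond[1]
# [1] TRUE
```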
In Python/pandas I can use this:
df['c'] = df[['a', 'b']].apply(lambda x: x['a'] + x['b'] if (x['a'] == x['b']) \
else x['b'] - x['a'], axis=1)
I want something similar in R. So the result should look like this:
a b c
1 1 1 2
2 2 20 18
3 3 3 6
4 4 4 8
5 5 50 45
One option is ifelse, which is a vectorized version of if/else. The row-by-row if/else shown in the OP's pandas code could also be done in R with a for loop or lapply/sapply, but that would be inefficient in R.
df <- transform(df, c= ifelse(a==b, a+b, b-a))
df
# a b c
#1 1 1 2
#2 2 20 18
#3 3 3 6
#4 4 4 8
#5 5 50 45
This can also be written as
df$c <- with(df, ifelse(a==b, a+b, b-a))
to create the 'c' column in the original dataset
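Another vectorized route, shown here only as a sketch and not part of the original answer, is plain logical indexing: fill 'c' with the "else" result for every row, then overwrite the rows where the condition holds. The variable name eq is just an illustrative choice.

```r
df <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, 20, 3, 4, 50))

# Start with the "else" branch for every row
df$c <- df$b - df$a

# Overwrite only the rows where a equals b
eq <- df$a == df$b
df$c[eq] <- df$a[eq] + df$b[eq]

df$c
# [1]  2 18  6  8 45
```

This avoids any per-row function call, at the cost of being a little more verbose than ifelse.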
Since the OP wants a similar option in R using if/else, apply() can run the condition on each row:
df$c <- apply(df, 1, FUN = function(x) if(x[1]==x[2]) x[1]+x[2] else x[2]-x[1])
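Since the question also mentions mapply(), here is a sketch of that route. The helper name calc_c is hypothetical. apply() first converts the data frame to a matrix, which can coerce values if the columns have mixed types. mapply() instead passes the two columns as separate arguments, so the scalar if/else sees one pair of values at a time.

```r
df <- data.frame(a = c(1, 2, 3, 4, 5), b = c(1, 20, 3, 4, 50))

# Hypothetical scalar helper: takes one value of a and one value of b
calc_c <- function(a, b) {
  if (a == b) a + b else b - a
}

# mapply() calls calc_c once per pair (a[i], b[i])
df$c <- mapply(calc_c, df$a, df$b)
df
#   a  b  c
# 1 1  1  2
# 2 2 20 18
# 3 3  3  6
# 4 4  4  8
# 5 5 50 45
```

Like the apply() version, this still calls the function once per row, so the vectorized ifelse remains the better choice for large data frames.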