Add missing values based on the value of a column in R

Views: 286 | Published: 2018/1/27 23:27:11 | Tags: r, for-loop, dataframe, rbind


Problem description

This is my sample dataset:

    vector1 <- data.frame(
      "name" = "a",
      "age" = 10,
      "fruit" = c("orange", "cherry", "apple"),
      "count" = c(1, 1, 1),
      "tag" = c(1, 1, 2)
    )
    vector2 <- data.frame(
      "name" = "b",
      "age" = 33,
      "fruit" = c("apple", "mango"),
      "count" = c(1, 1),
      "tag" = c(2, 2)
    )
    vector3 <- data.frame(
      "name" = "c",
      "age" = 58,
      "fruit" = c("cherry", "apple"),
      "count" = c(1, 1),
      "tag" = c(1, 1)
    )
    list <- list(vector1, vector2, vector3)
    print(list)
This is my test:

    default <- c("cherry", "orange", "apple", "mango")
    for (num in 1:length(list)) {
      #print(list[[num]])
      list[[num]] <- rbind(
        list[[num]],
        data.frame(
          "name" = list[[num]]$name,
          "age" = list[[num]]$age,
          "fruit" = setdiff(default, list[[num]]$fruit), #add missed value
          "count" = 0,
          "tag" = 1 #not found solutions
        )
      )
      print(paste0("--------------", num, "--------"))
      print(list)
    }
    #print(list)
I'm trying to find which fruits are missing from each data frame, where the check depends on the value of tag. For example, the first data frame has tags 1 and 2. If tag 1 lacks some of the default fruits, such as apple and mango, those fruits should be added to the data frame with a count of 0. The expected format looks like this:

    [[1]]
      name age  fruit count tag
    1    a  10 orange     1   1
    2    a  10 cherry     1   1
    3    a  10  apple     1   2
    4    a  10  mango     0   1
    5    a  10  apple     0   1
    6    a  10  mango     0   2
    7    a  10 orange     0   2
    8    a  10 cherry     0   2

When I stepped through the loop, I also found that the first iteration adds mango three times, and I can't work out why it doesn't add each missing value just once. The overall output looks like this:

    [[1]]
      name age  fruit count tag
    1    a  10 orange     1   1
    2    a  10 cherry     1   1
    3    a  10  apple     1   2
    4    a  10  mango     0   1
    5    a  10  mango     0   1
    6    a  10  mango     0   1

    [[2]]
      name age  fruit count tag
    1    b  33  apple     1   2
    2    b  33  mango     1   2
    3    b  33 cherry     0   1
    4    b  33 orange     0   1

    [[3]]
      name age  fruit count tag
    1    c  58 cherry     1   1
    2    c  58  apple     1   1
    3    c  58 orange     0   1
    4    c  58  mango     0   1

Can anyone help with a simple method or another approach? Should I use the sqldf function to add the 0 values, and would that be a simple way to solve my problem?

Solution

A solution using dplyr and tidyr. We can use complete to expand each data frame to every combination of fruit and tag, and use its fill argument to set count to 0 in the rows it adds.

Notice that I changed your list name from list to fruit_list, because list is the name of a base R function and reusing it for an object is bad practice. Also notice that when I created the example data frames I set stringsAsFactors = FALSE, because I don't want to create factor columns. Finally, I used lapply instead of a for loop to iterate over the list elements.

    library(dplyr)
    library(tidyr)

    fruit_list2 <- lapply(fruit_list, function(x){
      x2 <- x %>%
        complete(name, age, fruit = default, tag = c(1, 2), fill = list(count = 0)) %>%
        select(name, age, fruit, count, tag) %>%
        arrange(tag, fruit) %>%
        as.data.frame()
      return(x2)
    })

    fruit_list2
    # [[1]]
    #   name age  fruit count tag
    # 1    a  10  apple     0   1
    # 2    a  10 cherry     1   1
    # 3    a  10  mango     0   1
    # 4    a  10 orange     1   1
    # 5    a  10  apple     1   2
    # 6    a  10 cherry     0   2
    # 7    a  10  mango     0   2
    # 8    a  10 orange     0   2
    #
    # [[2]]
    #   name age  fruit count tag
    # 1    b  33  apple     0   1
    # 2    b  33 cherry     0   1
    # 3    b  33  mango     0   1
    # 4    b  33 orange     0   1
    # 5    b  33  apple     1   2
    # 6    b  33 cherry     0   2
    # 7    b  33  mango     1   2
    # 8    b  33 orange     0   2
    #
    # [[3]]
    #   name age  fruit count tag
    # 1    c  58  apple     1   1
    # 2    c  58 cherry     1   1
    # 3    c  58  mango     0   1
    # 4    c  58 orange     0   1
    # 5    c  58  apple     0   2
    # 6    c  58 cherry     0   2
    # 7    c  58  mango     0   2
    # 8    c  58 orange     0   2

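For readers who would rather not depend on tidyr, the same expand-and-fill idea can be sketched in base R with expand.grid and merge. This is an illustrative alternative, not part of the original answer; the helper name fill_missing and its tags argument are made up for this sketch, and it assumes name and age are constant within each data frame, as they are in the sample data.

```r
# Sample data with the same shape as vector1 in the question.
vector1 <- data.frame(
  name = "a",
  age = 10,
  fruit = c("orange", "cherry", "apple"),
  count = c(1, 1, 1),
  tag = c(1, 1, 2),
  stringsAsFactors = FALSE
)
default <- c("cherry", "orange", "apple", "mango")

# Hypothetical helper: one row per fruit/tag pair, count = 0 where absent.
fill_missing <- function(x, fruits, tags = c(1, 2)) {
  # Every fruit/tag combination that should exist in the result.
  grid <- expand.grid(fruit = fruits, tag = tags, stringsAsFactors = FALSE)
  # Left join: combinations absent from x get NA in count.
  out <- merge(grid, x[, c("fruit", "tag", "count")],
               by = c("fruit", "tag"), all.x = TRUE)
  out$count[is.na(out$count)] <- 0
  # Assumption: name and age are constant within one data frame.
  out$name <- x$name[1]
  out$age <- x$age[1]
  # Same column order and sort order as the dplyr/tidyr version.
  out <- out[order(out$tag, out$fruit),
             c("name", "age", "fruit", "count", "tag")]
  rownames(out) <- NULL
  out
}

filled <- fill_missing(vector1, default)
print(filled)
```

Applied over the whole list, this would be lapply(fruit_list, fill_missing, fruits = default). The merge-based version is more verbose than complete, so it is mainly useful when adding package dependencies is not an option.
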
DATA

    vector1 <- data.frame(
      "name" = "a",
      "age" = 10,
      "fruit" = c("orange", "cherry", "apple"),
      "count" = c(1, 1, 1),
      "tag" = c(1, 1, 2),
      stringsAsFactors = FALSE
    )
    vector2 <- data.frame(
      "name" = "b",
      "age" = 33,
      "fruit" = c("apple", "mango"),
      "count" = c(1, 1),
      "tag" = c(2, 2),
      stringsAsFactors = FALSE
    )
    vector3 <- data.frame(
      "name" = "c",
      "age" = 58,
      "fruit" = c("cherry", "apple"),
      "count" = c(1, 1),
      "tag" = c(1, 1),
      stringsAsFactors = FALSE
    )
    fruit_list <- list(vector1, vector2, vector3)
    default <- c("cherry", "orange", "apple", "mango")
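
On the side question of why the loop added mango three times: a likely explanation is argument recycling in data.frame. In the question's loop, list[[num]]$name is a vector with one element per existing row (three for the first data frame), so the single missing fruit is recycled to match that length. The sketch below reproduces the effect in isolation; the variable names existing, recycled, and fixed are made up for illustration.

```r
# A cut-down version of the first data frame from the question.
existing <- data.frame(
  name = "a",
  age = 10,
  fruit = c("orange", "cherry", "apple"),
  stringsAsFactors = FALSE
)
default <- c("cherry", "orange", "apple", "mango")

# Only one default fruit is absent here.
missing_fruit <- setdiff(default, existing$fruit)

# existing$name has length 3, so data.frame() recycles the single
# missing fruit to 3 rows, which is the triple mango from the question.
recycled <- data.frame(
  name = existing$name,
  fruit = missing_fruit,
  stringsAsFactors = FALSE
)

# Taking only the first name gives one row per missing fruit.
fixed <- data.frame(
  name = existing$name[1],
  fruit = missing_fruit,
  stringsAsFactors = FALSE
)

print(nrow(recycled))
print(nrow(fixed))
```

In the second and third data frames the number of existing rows happens to equal the number of missing fruits (two each), which would explain why the duplication only showed up in the first one.
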
