Vlookup and Count String Occurrences in Separate Table R to new Column


Question

I have two data frames. The samples below should be easy to reproduce for illustration.

df1 <- data.frame(School = c("Omaha South", "Omaha Central", "Grand Island"), 
                  Enrollment = c(2166, 2051, 1982))

# The original repeated 'Away Score' and 'Away Team'; the second pair is
# assumed here to be the home side.
df2 <- data.frame('Away Score' = c(25, 57, 76),
                  'Away Team' = c("Omaha South", "Omaha Central", "Grand Island"),
                  'Home Score' = c(52, 88, 69),
                  'Home Team' = c("Omaha Central", "Grand Island", "Omaha South"),
                  Date = c("1/11/2020", "1/12/2020", "1/13/2020"),
                  Winner = c("Omaha Central", "Grand Island", "Grand Island"),
                  Loser = c("Omaha South", "Omaha Central", "Omaha South"))

My goal is to create a new column in df1 called "Wins" that looks up each school in df1 and counts the number of times that school is listed in the "Winner" column of df2.

So I want df1 to look like this:

df1 <- data.frame(School = c("Omaha South", "Omaha Central", "Grand Island"), 
                  Enrollment = c(2166, 2051, 1982),
                  Wins = c(0, 1, 2))

I've tried a number of solutions to no avail, including sqldf. My latest attempt is below, but it gives me an error: no applicable method for 'group_by_' applied to an object of class "NULL"

df$Wins %>%
     group_by(df2$Winner) %>%
     mutate(Count=n())

Recommended answer

One way using dplyr and joins:

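library(dplyr)

df1 %>%
  # Attach every game each school won; schools with no wins get NA columns
  left_join(df2, by = c('School' = 'Winner')) %>%
  # Drop the NA rows, i.e. the schools that never won
  na.omit() %>%
  # One row per winning school with its number of wins
  count(School, name = "wins") %>%
  # Bring back all schools from df1, then turn missing counts into 0
  right_join(df1, by = "School") %>%
  mutate(wins = replace(wins, is.na(wins), 0))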
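
As a variation on the same idea, the counting can be done on df2 first and the result joined onto df1, which keeps df1's row and column order. This is only a sketch, not part of the original answer: df2 is trimmed to its Winner column for brevity, and `win_counts` and `result` are names made up for the illustration.

```r
library(dplyr)

df1 <- data.frame(School = c("Omaha South", "Omaha Central", "Grand Island"),
                  Enrollment = c(2166, 2051, 1982))
# Only the Winner column is needed for counting (trimmed copy of df2)
df2 <- data.frame(Winner = c("Omaha Central", "Grand Island", "Grand Island"))

# One row per school that appears in Winner, with its number of wins
win_counts <- count(df2, Winner, name = "Wins")

result <- df1 %>%
  # Keep every school in df1; schools without a win get NA in Wins
  left_join(win_counts, by = c("School" = "Winner")) %>%
  # Replace the NA counts with 0
  mutate(Wins = coalesce(Wins, 0L))

result
```

Because the join starts from df1, a school that never appears in Winner still gets a row, and `coalesce()` fills its count with 0.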

Using base R, we calculate the frequency of wins using table, convert it into a data frame using stack, and then merge it with df1.

merge(df1, stack(table(factor(df2$Winner, levels = df1$School))), 
           by.x = 'School', by.y = "ind")
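
If the goal is just to add the column to df1 in place, the same table() idea can be used without stack() and merge(). The following is a minimal sketch under that assumption, not the answerer's code: df2 is trimmed to its Winner column, and `win_counts` is a name chosen for the example.

```r
df1 <- data.frame(School = c("Omaha South", "Omaha Central", "Grand Island"),
                  Enrollment = c(2166, 2051, 1982))
# Only the Winner column is needed for counting (trimmed copy of df2)
df2 <- data.frame(Winner = c("Omaha Central", "Grand Island", "Grand Island"))

# Setting the factor levels to df1$School makes table() report every school,
# including those with zero wins, in the same order as the rows of df1
win_counts <- table(factor(df2$Winner, levels = df1$School))

# as.vector() drops the table attributes, leaving plain integer counts
df1$Wins <- as.vector(win_counts)

df1
```

This relies on the counts coming back in the order of the factor levels, which is why the levels are taken directly from df1$School.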
