Looping and adding to a counter in R

Views: 111 · Published: 2020/5/4 5:21:09 · Tags: r, loops, counter

Problem description

I have a dataframe df that contains several columns, but only the relevant ones are shown below.

node    |   precedingWord
-------------------------
A-bom       de
A-bom       die
A-bom       de
A-bom       een
A-bom       n
A-bom       de
acroniem    het
acroniem    t
acroniem    het
acroniem    n
acroniem    een
act         de
act         het
act         die
act         dat
act         t
act         n

I'd like to use these values to count the precedingWord values per node, but split into subcategories. For instance: one column titled neuter, another titled non-neuter, and a last one titled rest. neuter would count all rows for which precedingWord is one of these values: t, het, dat. non-neuter would count de and die, and rest would count everything that belongs to neither neuter nor non-neuter. (It would be nice if this could be dynamic, in other words if rest were derived from the same variables that define neuter and non-neuter, or were simply computed by subtracting the neuter and non-neuter counts from the number of rows with that node.)

Example output (in a new dataframe, let's say freqDf) would look like this:

node    |   neuter   | nonNeuter   | rest
-----------------------------------------
A-bom       0          4             2
acroniem    3          0             2
act         3          2             1

To create freqDf$node, I can do this:

freqDf<- data.frame(node = unique(df$node), stringsAsFactors = FALSE)

But that is as far as I got; I don't know how to continue. I figured I could do something like this, but unfortunately the ++ operator doesn't work as I had hoped.

freqDf$neuter[grep("dat|het|t", df$precedingWord, perl=TRUE)] <- ++
freqDf$nonNeuter[grep("de|die", df$precedingWord, perl=TRUE)] <- ++

e <- table(df$Node)
freqDf$rest <- as.numeric(e - freqDf$neuter - freqDf$nonNeuter)

Also, this won't work for each node individually. I need some sort of loop that automatically runs for each distinct value in freqDf$node.

Recommended answer

One way is to replace the values with their categories and then use the table function to generate the frequencies.

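neuter <- c("t", "het", "dat")
non.neuter <- c("de", "die")

# Label "rest" first, while the column still holds the original words;
# doing it last would also relabel the new "neuter"/"non.neuter" values.
# Assumes precedingWord is a character column, not a factor.
df$precedingWord[!df$precedingWord %in% c(neuter, non.neuter)] <- "rest"
df$precedingWord[df$precedingWord %in% neuter] <- "neuter"
df$precedingWord[df$precedingWord %in% non.neuter] <- "non.neuter"

# Tabulate only the two relevant columns, since df may contain others
table(df[c("node", "precedingWord")])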

      precedingWord
  node       neuter non.neuter rest
  A-bom         0          4    2
  acroniem      3          0    2
  act           3          2    1
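If rest should follow automatically from the category definitions, one option is a lookup vector that maps each word to its category, with anything unmatched falling back to rest. The following is a minimal sketch, not part of the original answer; the names categories, lookup, gender, and freq, as well as the sample data frame, are assumptions made for illustration.

```r
# Sample data mirroring the question (assumed for illustration)
df <- data.frame(
  node = rep(c("A-bom", "acroniem", "act"), times = c(6, 5, 6)),
  precedingWord = c("de", "die", "de", "een", "n", "de",
                    "het", "t", "het", "n", "een",
                    "de", "het", "die", "dat", "t", "n"),
  stringsAsFactors = FALSE
)

# Category definitions; "rest" is deliberately not listed here
categories <- list(neuter = c("t", "het", "dat"),
                   nonNeuter = c("de", "die"))

# Named vector mapping each word to its category name
lookup <- setNames(rep(names(categories), lengths(categories)),
                   unlist(categories, use.names = FALSE))

# Words missing from the lookup come back as NA and are labelled "rest"
gender <- unname(lookup[df$precedingWord])
gender[is.na(gender)] <- "rest"

# Fixed factor levels keep a column even when a category has no hits
gender <- factor(gender, levels = c(names(categories), "rest"))

freq <- table(node = df$node, gender = gender)
print(freq)
```

With this layout, adding a category only means adding an entry to categories; rest needs no separate definition.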

But I'm sure there is a better solution, for example with the dplyr package.

EDIT: Maybe something like this (it doesn't overwrite your precedingWord column but adds a new gender column instead):

library(dplyr)
df %>%
  mutate(gender = ifelse(!precedingWord %in% c(neuter, non.neuter), "rest", 
                         ifelse(precedingWord %in% neuter, "neuter", "non.neuter"))) %>%
  count(node, gender)


Source: local data frame [7 x 3]
Groups: node

      node     gender n
1    A-bom non.neuter 4
2    A-bom       rest 2
3 acroniem     neuter 3
4 acroniem       rest 2
5      act     neuter 3
6      act non.neuter 2
7      act       rest 1

# And if you want the same output you put in your question, you can use table
df2 <- mutate(df, gender = ifelse(!precedingWord %in% c(neuter, non.neuter), "rest", 
                       ifelse(precedingWord %in% neuter, "neuter", "non.neuter")))

table(df2$node, df2$gender)

           neuter non.neuter rest
  A-bom         0          4    2
  acroniem      3          0    2
  act           3          2    1
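The question also suggests computing rest by subtracting the neuter and non-neuter counts from the number of rows per node. A minimal base R sketch of that idea follows; it is not from the original answer, and the sample data and the names neuterCount, nonNeuterCount, and total are assumptions for illustration.

```r
# Sample data mirroring the question (assumed for illustration)
df <- data.frame(
  node = rep(c("A-bom", "acroniem", "act"), times = c(6, 5, 6)),
  precedingWord = c("de", "die", "de", "een", "n", "de",
                    "het", "t", "het", "n", "een",
                    "de", "het", "die", "dat", "t", "n"),
  stringsAsFactors = FALSE
)

neuter <- c("t", "het", "dat")
nonNeuter <- c("de", "die")

# Summing a logical vector counts its TRUE values, here once per node
neuterCount <- tapply(df$precedingWord %in% neuter, df$node, sum)
nonNeuterCount <- tapply(df$precedingWord %in% nonNeuter, df$node, sum)

# Total number of rows per node
total <- tapply(df$precedingWord, df$node, length)

# All three results share the same node ordering, so they line up
freqDf <- data.frame(node = names(total),
                     neuter = as.integer(neuterCount),
                     nonNeuter = as.integer(nonNeuterCount),
                     stringsAsFactors = FALSE)

# rest is whatever is left once both categories are subtracted
freqDf$rest <- as.integer(total) - freqDf$neuter - freqDf$nonNeuter
print(freqDf)
```

Because rest is derived by subtraction, it never needs its own word list and stays correct if the two category vectors change.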

EDIT: Convert the table to a data frame you can manipulate

myTable <- table(df2$node, df2$gender) %>% 
  as.data.frame.matrix %>%
  mutate(node = row.names(.))

> myTable
  neuter non.neuter rest     node
1      0          4    2    A-bom
2      3          0    2 acroniem
3      3          2    1      act
> str(myTable)
'data.frame':   3 obs. of  4 variables:
 $ neuter    : int  0 3 3
 $ non.neuter: int  4 0 2
 $ rest      : int  2 2 1
 $ node      : chr  "A-bom" "acroniem" "act"

# And here is a more understandable way if you are not familiar with piping
# To learn more about forward piping : https://github.com/smbache/magrittr 
myTable <- table(df2$node, df2$gender)
myTable2 <- as.data.frame.matrix(myTable)
myTable3 <- mutate(myTable2, node = row.names(myTable2))
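For comparison, the explicit loop that the question was reaching for can be sketched as follows. R has no increment operator, so each counter is updated with an ordinary assignment. This is an illustration under assumptions (the sample data and the variable names row, word, and col are made up here), not the approach recommended above; the table-based versions are shorter and usually faster.

```r
# Sample data mirroring the question (assumed for illustration)
df <- data.frame(
  node = rep(c("A-bom", "acroniem", "act"), times = c(6, 5, 6)),
  precedingWord = c("de", "die", "de", "een", "n", "de",
                    "het", "t", "het", "n", "een",
                    "de", "het", "die", "dat", "t", "n"),
  stringsAsFactors = FALSE
)

neuter <- c("t", "het", "dat")
nonNeuter <- c("de", "die")

# One row per distinct node, with all counters starting at zero
freqDf <- data.frame(node = unique(df$node),
                     neuter = 0L, nonNeuter = 0L, rest = 0L,
                     stringsAsFactors = FALSE)

for (i in seq_len(nrow(df))) {
  # Row of freqDf that belongs to the current node
  row <- match(df$node[i], freqDf$node)
  word <- df$precedingWord[i]

  # Pick the counter column for this word
  col <- if (word %in% neuter) {
    "neuter"
  } else if (word %in% nonNeuter) {
    "nonNeuter"
  } else {
    "rest"
  }

  # Increment by plain assignment
  freqDf[row, col] <- freqDf[row, col] + 1L
}

print(freqDf)
```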
