Count the number of times (frequency) a string occurs


Problem description

I have a column in my data frame that looks like this:

   Col1
   ----------------------------------------------------------------------------
   Center for Animal Control, Division of Hypertension, Department of Medicine
   Department of Surgery, Division of Primary Care, Center for Animal Control
   Department of Internal Medicine, Division of Hypertension, Center for Animal Control

How do I count how many times each comma-separated string occurs? In other words, I am trying to produce something like this:

    Affiliation                         Freq
    ------------------------------------------
    Center for Animal Control           3
    Division of Hypertension            2
    Department of Medicine              1
    Department of Surgery               1
    Division of Primary Care            1
    Department of Internal Medicine     1  

Could someone help me to figure this out?

Solution

I use scan and trimws for this kind of text processing. scan splits the text on commas (and line breaks), trimws strips the surrounding whitespace, and table counts each distinct value.

    inp <- "    Center for Animal Control, Division of Hypertension, Department of Medicine
        Department of Surgery, Division of Primary Care, Center for Animal Control
        Department of Internal Medicine, Division of Hypertension, Center for Animal Control"

    > table(trimws(scan(text = inp, what = "", sep = ",")))
    Read 9 items

          Center for Animal Control Department of Internal Medicine 
                                  3                               1 
             Department of Medicine           Department of Surgery 
                                  1                               1 
           Division of Hypertension        Division of Primary Care 
                                  2                               1 
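
The question's data sits in a data frame column rather than in a single string. The following is a minimal sketch of the same counting idea applied to such a column, assuming a data frame named df with a character column Col1 (the data frame here is rebuilt from the question's example rows, and strsplit is swapped in for scan; neither appears in the original answer):

```r
# Hypothetical data frame rebuilt from the example rows in the question
df <- data.frame(
  Col1 = c(
    "Center for Animal Control, Division of Hypertension, Department of Medicine",
    "Department of Surgery, Division of Primary Care, Center for Animal Control",
    "Department of Internal Medicine, Division of Hypertension, Center for Animal Control"
  ),
  stringsAsFactors = FALSE
)

# Split every row on commas, flatten to one vector, trim whitespace, then count
parts <- trimws(unlist(strsplit(df$Col1, ",", fixed = TRUE)))
counts <- table(parts)
print(counts)
```

If Col1 is stored as a factor, it would likely need as.character(df$Col1) before the split.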

You can also wrap as.data.frame around that result to get a data frame:

    > as.data.frame(table(trimws(scan(text = inp, what = "", sep = ","))))
    Read 9 items
                                 Var1 Freq
    1       Center for Animal Control    3
    2 Department of Internal Medicine    1
    3          Department of Medicine    1
    4           Department of Surgery    1
    5        Division of Hypertension    2
    6        Division of Primary Care    1
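
The desired output in the question uses the column names Affiliation and Freq and lists the most frequent value first. One way to get closer to that layout is sketched below; the renaming and sorting steps are an addition for illustration, not part of the original answer, and the variable names parts and res are arbitrary:

```r
# Same example text as in the answer, built line by line
inp <- paste(
  "Center for Animal Control, Division of Hypertension, Department of Medicine",
  "Department of Surgery, Division of Primary Care, Center for Animal Control",
  "Department of Internal Medicine, Division of Hypertension, Center for Animal Control",
  sep = "\n"
)

# quiet = TRUE suppresses the "Read 9 items" message
parts <- trimws(scan(text = inp, what = "", sep = ",", quiet = TRUE))

# Convert the counts to a data frame and rename the default Var1 column
res <- as.data.frame(table(parts), stringsAsFactors = FALSE)
names(res) <- c("Affiliation", "Freq")

# Sort by descending frequency, breaking ties alphabetically
res <- res[order(-res$Freq, res$Affiliation), ]
rownames(res) <- NULL
print(res)
```

Ties are ordered alphabetically here, so rows with a count of 1 may appear in a different order than in the question's hand-written table.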
