Determine the frequency of a string using grep

Problem description

If I have a vector:

x <- c("ajjss","acdjfkj","auyjyjjksjj")

and do:

y <- x[grep("jj",x)]
table(y)

I get:

y
      ajjss auyjyjjksjj 
          1           1 

However, the second string "auyjyjjksjj" contains the substring "jj" twice and should be counted twice. How can I change this from a true/false test per string to an actual count of how often "jj" occurs?

Also, it would be great if, for each string, the frequency of the substring divided by the string's length could be calculated.

Thanks in advance.

Solution

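I solved this using gregexpr(), which returns the start positions of all matches in each string:

x <- c("ajjss","acdjfkj","auyjyjjksjj")
# gregexpr() returns -1 for a string with no match, so map that case to 0
freq <- sapply(gregexpr("jj", x), function(m) if (m[[1]] != -1) length(m) else 0)
df <- data.frame(x, freq)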

df
#            x freq
#1       ajjss    1
#2     acdjfkj    0
#3 auyjyjjksjj    2
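
The same counting logic can be wrapped in a small reusable function. This is only a sketch: the name count_substring is hypothetical (it is not part of base R or of the answer above), and it assumes the pattern should be treated as a literal string rather than a regular expression, hence fixed = TRUE. Note that gregexpr() reports non-overlapping matches, so "jjj" counts as one occurrence of "jj".

```r
# Hypothetical helper: counts non-overlapping occurrences of a literal
# substring in each element of a character vector.
count_substring <- function(strings, pattern) {
  matches <- gregexpr(pattern, strings, fixed = TRUE)
  # The first element of each result is -1 when the string has no match
  vapply(matches, function(m) if (m[1] != -1) length(m) else 0L, integer(1))
}

x <- c("ajjss", "acdjfkj", "auyjyjjksjj")
counts <- count_substring(x, "jj")
print(counts)
```

Using vapply() with integer(1) instead of sapply() guarantees an integer vector even when the input is empty.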

For the last part of the question, dividing the frequency by the string length:

df$rate <- df$freq / nchar(as.character(df$x))

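The as.character() call is needed because, in R versions before 4.0.0, data.frame(x, freq) converts strings to factors by default unless you specify stringsAsFactors = FALSE. From R 4.0.0 onward the default is stringsAsFactors = FALSE, so the conversion is harmless but no longer required.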

df
#            x freq      rate
#1       ajjss    1 0.2000000
#2     acdjfkj    0 0.0000000
#3 auyjyjjksjj    2 0.1818182
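
As an alternative to converting the column back with as.character(), the strings can be kept as character data when the data frame is built. The following is a minimal sketch that assumes the counts from the answer above (1, 0, 2) and sets stringsAsFactors explicitly so it behaves the same regardless of the R version's default.

```r
x <- c("ajjss", "acdjfkj", "auyjyjjksjj")
freq <- c(1, 0, 2)  # counts of "jj" taken from the answer above

# Keep x as a character column so nchar() can be applied directly
df <- data.frame(x, freq, stringsAsFactors = FALSE)
df$rate <- df$freq / nchar(df$x)
print(df)
```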
