How can I use grep with parameters in R?


Question


Obviously I don't get the way grep works in R. If I use grep in my OS X terminal, I can pass the parameter -o, which makes grep return only the matching part. In R, I can't find how to do the corresponding thing. Reading the manual, I thought value = TRUE was the right approach. It is better inasmuch as it returns characters rather than indices, but it still returns the whole string.

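# some string: fasdjlk465öfsdj123
# R
test <- "fasdjlk465öfsdj123"
grep("[0-9]", test, value = TRUE) # returns "fasdjlk465öfsdj123"

# shell (the string is piped in; a bare argument would be read as a file name)
echo 'fasdjlk465öfsdj123' | grep -o '[0-9]'
# returns 4 6 5 1 2 3, one digit per line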

Which parameter am I missing in R?

EDIT: Joris Meys' suggestion comes really close to what I am trying to do. I get a vector as the result of readLines, and I'd like to check every element of that vector for numbers and return these numbers. I am really surprised there's no standard solution for that. I thought of using some regexp function that works on a string and returns the match like grep -o, and then using lapply on that vector. grep.custom comes closest; I'll try to make that work for me.

Solution

Spacedman said it already. If you really want to simulate the shell's grep, you have to work on the characters themselves, using strsplit():

> chartest <- unlist(strsplit(test,""))
> chartest
 [1] "f" "a" "s" "d" "j" "l" "k" "4" "6" "5" "ö" "f" "s" "d" "j" "1" "2" "3"
> grep("[0-9]",chartest,value=TRUE)
[1] "4" "6" "5" "1" "2" "3"

EDIT :

As Nico said, if you want to do this for complete regular expressions, you need to use gregexpr() and substr(). I'd make a custom function like this one:

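grep.custom <- function(x,pattern){
    # start positions of every match of pattern in the single string x
    strt <- gregexpr(pattern,x)[[1]]
    # gregexpr() reports -1 when nothing matches; return an empty result then
    if (isTRUE(strt[1] == -1)) return(character(0))
    lngth <- attr(strt,"match.length")
    stp <- strt + lngth - 1
    # cut out each matched substring by its start and stop position
    apply(cbind(strt,stp),1,function(i){substr(x,i[1],i[2])})
}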

Then :

> grep.custom(test,"sd")
[1] "sd" "sd"
> grep.custom(test,"[0-9]")
[1] "4" "6" "5" "1" "2" "3"
> grep.custom(test,"[a-z]s[a-z]")
[1] "asd" "fsd"
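As a side note, base R also ships regmatches(), which pairs with gregexpr() to do the same extraction without a custom helper. The following is only a minimal sketch, not part of the original answer; the vector `lines` is a hypothetical stand-in for what readLines() might return.

```r
# Test string from the question
test <- "fasdjlk465öfsdj123"

# gregexpr() locates every match; regmatches() extracts the matched text.
# The result is a list with one element per input string, hence the [[1]].
digits <- regmatches(test, gregexpr("[0-9]", test))[[1]]
print(digits)

# Both functions accept character vectors, so no explicit lapply() is needed.
# `lines` is a hypothetical example standing in for the output of readLines().
lines <- c("sq25dfgj", "sqd265jfm", "qs55d26fjm", "nodigits")
numbers <- regmatches(lines, gregexpr("[0-9]+", lines))
names(numbers) <- lines   # label each result with its input string
print(numbers)
```

An element without any match comes back as an empty character vector, so the result always has one entry per input line.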

EDIT2 :

For vectors, use the function Vectorize(), e.g.:

> X <- c("sq25dfgj","sqd265jfm","qs55d26fjm" )
> v.grep.custom <- Vectorize(grep.custom)
> v.grep.custom(X,"[0-9]+")
$sq25dfgj
[1] "25"

$sqd265jfm
[1] "265"

$qs55d26fjm
[1] "55" "26"

And if you want to call grep from the shell, see ?system.
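For illustration, calling the shell's grep from R might look like the sketch below. It uses system2(), a close relative of system(), and assumes a Unix-like system with grep available on the PATH; it is not taken from the original answer.

```r
# Test string from the question
test <- "fasdjlk465öfsdj123"

# Assumption: a Unix-like system where `grep` is on the PATH.
# input  = feeds the string to grep on standard input
# stdout = TRUE captures grep's output as a character vector, one line each
# shQuote() protects the pattern from being expanded by the shell
matches <- system2("grep", args = c("-o", shQuote("[0-9]")),
                   input = test, stdout = TRUE)
print(matches)
```

Spawning an external process for every string is slow compared with staying inside R, so this mostly makes sense when a specific grep feature is needed.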
