How can I use grep with parameters in R?
Problem description
Obviously I don't get the way grep works in R. If I use grep in my OS X terminal, I can use the parameter -o, which makes grep return only the matching part. In R, I can't find how to do the corresponding thing. Reading the manual, I thought value was the right approach. It is better insofar as it returns characters rather than indexes, but it still returns the whole string.
# some string: fasdjlk465öfsdj123

# R
test <- "fasdjlk465öfsdj123"
grep("[0-9]", test, value = TRUE)
# returns "fasdjlk465öfsdj123"

# shell
echo 'fasdjlk465öfsdj123' | grep -o '[0-9]'
# returns 4 6 5 1 2 3
What's the parameter I am missing in R?
EDIT: Joris Meys' suggestion comes really close to what I am trying to do. I get a vector as the result of readLines, and I'd like to check every element of that vector for numbers and return those numbers. I am really surprised there's no standard solution for that. I thought of using some regexp function that works on a string and returns the match like grep -o, and then using lapply on the vector. grep.custom comes closest; I'll try to make that work for me.
Solution
Spacedman said it already. If you really want to simulate grep in the shell, you have to work on the individual characters, using strsplit():
> chartest <- unlist(strsplit(test, ""))
> chartest
[1] "f" "a" "s" "d" "j" "l" "k" "4" "6" "5" "ö" "f" "s" "d" "j" "1" "2" "3"
> grep("[0-9]", chartest, value = TRUE)
[1] "4" "6" "5" "1" "2" "3"
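EDIT:
As Nico said, if you want to do this for complete regular expressions, you need to use gregexpr() and substr(). I'd make a custom function like this one:
grep.custom <- function(x, pattern) {
  # start positions of every match of pattern in the single string x
  strt <- gregexpr(pattern, x)[[1]]
  # length of each match, stored as an attribute of the gregexpr() result
  lngth <- attributes(strt)$match.length
  # end position of each match
  stp <- strt + lngth - 1
  # cut out one substring per (start, stop) pair
  apply(cbind(strt, stp), 1, function(i) { substr(x, i[1], i[2]) })
}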
Then:
> grep.custom(test, "sd")
[1] "sd" "sd"
> grep.custom(test, "[0-9]")
[1] "4" "6" "5" "1" "2" "3"
> grep.custom(test, "[a-z]s[a-z]")
[1] "asd" "fsd"
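One case the function above does not cover is a string with no match at all: gregexpr() then reports a start position of -1, and substr() yields an empty string instead of nothing. The following is only a sketch of how that could be guarded; the name grep.custom.safe is made up for illustration and is not part of the original answer.

```r
# Hypothetical variant of grep.custom (name chosen for illustration only).
grep.custom.safe <- function(x, pattern) {
  # match positions for a single string x
  m <- gregexpr(pattern, x)[[1]]
  # gregexpr() signals "no match" with a start position of -1
  if (m[1] == -1) return(character(0))
  # substring() is vectorised over the start and stop positions
  substring(x, m, m + attr(m, "match.length") - 1)
}

with_digits <- grep.custom.safe("ab12cd3", "[0-9]")
without_digits <- grep.custom.safe("abcdef", "[0-9]")
print(with_digits)
print(without_digits)
```

Returning character(0) for the no-match case keeps the result easy to filter later, for example with lengths().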
EDIT2:
For vectors, use the function Vectorize(), e.g.:
> X <- c("sq25dfgj", "sqd265jfm", "qs55d26fjm")
> v.grep.custom <- Vectorize(grep.custom)
> v.grep.custom(X, "[0-9]+")
$sq25dfgj
[1] "25"

$sqd265jfm
[1] "265"

$qs55d26fjm
[1] "55" "26"
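For a whole character vector, such as the result of readLines mentioned in the question, base R's regmatches() combined with gregexpr() may be a shorter route than a custom function. This is a sketch of that alternative, not the approach of the original answer; it reuses the sample vector X from above.

```r
# Sample vector from the answer above
X <- c("sq25dfgj", "sqd265jfm", "qs55d26fjm")

# gregexpr() locates every match in every element;
# regmatches() extracts the matched substrings, one list entry per element.
numbers <- regmatches(X, gregexpr("[0-9]+", X))

# Label each entry with its source string for readability
names(numbers) <- X
print(numbers)
```

Elements without any match come back as character(0), so no special handling of the -1 "no match" marker is needed.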
And if you want to call grep from the shell, see ?system.
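As a rough sketch of what such a shell call could look like, the example below uses system2(), the sibling of system() that takes the arguments as a vector. It assumes a grep executable is on the PATH (it does nothing otherwise) and uses an ASCII-only variant of the sample string, which is my own simplification to sidestep locale differences.

```r
# ASCII-only variant of the sample string (an assumption for this sketch)
input_line <- "fasdjlk465fsdj123"

digits <- character(0)
# Only attempt the call when a grep executable can be found
if (nzchar(Sys.which("grep"))) {
  digits <- system2(
    "grep",
    args = c("-o", shQuote("[0-9]")),  # quote the pattern so the shell does not expand it
    input = input_line,                # fed to grep on standard input
    stdout = TRUE                      # capture the output as a character vector
  )
}
print(digits)
```

Each line that grep prints becomes one element of the returned vector, so the result has the same shape as the pure R solutions.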