Extracting characters from entries in a vector in R


Question

Excel has functions called left, right, and mid that extract part of the entry in a cell. For example, =left(A1, 3) returns the 3 leftmost characters in cell A1, and =mid(A1, 3, 4) starts at the third character in cell A1 and returns 4 characters, that is, characters 3 through 6. Are there similar functions in R, or similarly straightforward ways to do this?

As a simplified sample problem, I would like to take a vector:

sample <- c("TRIBAL", "TRISTO", "RHOSTO", "EUGFRI", "BYRRAT")

and create 3 new vectors that contain the first 3 characters of each entry, the middle 2 characters of each entry, and the last 4 characters of each entry.

A slightly more complicated question, for which Excel has no function that I know of, is how to create a new vector with the 1st, 3rd, and 5th characters from each entry.

Recommended answer

You are looking for the function substr or its close relative substring:

The leading characters are straightforward:

substr(sample, 1, 3)
[1] "TRI" "TRI" "RHO" "EUG" "BYR"

So is extracting some characters at a defined position:

substr(sample, 2, 3)
[1] "RI" "RI" "HO" "UG" "YR"
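
The call above takes characters 2 and 3. If "the middle 2 characters" is meant literally, the start position can be computed from nchar. The following is a minimal sketch, not part of the original answer; the helper name middle and its tie-breaking rule are assumptions made for illustration.

```r
sample <- c("TRIBAL", "TRISTO", "RHOSTO", "EUGFRI", "BYRRAT")

# Hypothetical helper: extract the n characters centred in each string.
# If the characters left over cannot be split evenly, the window leans left.
middle <- function(x, n = 2) {
  start <- (nchar(x) - n) %/% 2 + 1   # first position of the centred window
  substr(x, start, start + n - 1)
}

middle(sample)
# [1] "IB" "IS" "OS" "GF" "RR"
```

For the six-character entries in this example, that is the same as substr(sample, 3, 4).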

To get the trailing characters, you have two options:

substr(sample, nchar(sample)-3, nchar(sample))
[1] "IBAL" "ISTO" "OSTO" "GFRI" "RRAT"

substring(sample, nchar(sample)-3)
[1] "IBAL" "ISTO" "OSTO" "GFRI" "RRAT"
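
If you prefer Excel-style names, the calls above can be wrapped in small helpers. This is only a sketch: left, right, and mid are not base R functions, and the definitions below are assumptions about how one might mirror the Excel behaviour using substr and substring.

```r
sample <- c("TRIBAL", "TRISTO", "RHOSTO", "EUGFRI", "BYRRAT")

# Hypothetical Excel-like wrappers around substr()/substring()
left  <- function(x, n) substr(x, 1, n)                         # first n characters
right <- function(x, n) substring(x, nchar(x) - n + 1)          # last n characters
mid   <- function(x, start, n) substr(x, start, start + n - 1)  # n characters from start

left(sample, 3)
# [1] "TRI" "TRI" "RHO" "EUG" "BYR"
right(sample, 4)
# [1] "IBAL" "ISTO" "OSTO" "GFRI" "RRAT"
mid(sample, 3, 2)
# [1] "IB" "IS" "OS" "GF" "RR"
```

Because substr, substring, and nchar are all vectorized, the helpers work on a whole character vector without an explicit loop.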


And your final "complicated" question:

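# For each element of x, collect the single characters at positions `pos`
# and paste them into one string.
characters <- function(x, pos){
  sapply(x, function(s)
    paste(sapply(pos, function(i) substr(s, i, i)), collapse=""))
}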
characters(sample, c(1,3,5))
TRIBAL TRISTO RHOSTO EUGFRI BYRRAT 
 "TIA"  "TIT"  "ROT"  "EGR"  "BRA" 
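
An alternative, shown here only as a sketch and not taken from the original answer, is to split each string into single characters with strsplit and index the result. The function name pick_chars is hypothetical.

```r
sample <- c("TRIBAL", "TRISTO", "RHOSTO", "EUGFRI", "BYRRAT")

# Hypothetical helper: split each string into characters, keep the requested
# positions, and glue them back together. vapply guarantees a character vector.
pick_chars <- function(x, pos) {
  vapply(strsplit(x, ""),
         function(chars) paste(chars[pos], collapse = ""),
         character(1))
}

pick_chars(sample, c(1, 3, 5))
# [1] "TIA" "TIT" "ROT" "EGR" "BRA"
```

Unlike the sapply version, this returns an unnamed vector. It also assumes every position in pos exists in every string; a position past the end of a string yields NA, which paste turns into the text "NA".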
