Efficient way to split a vector of full names into 2 separate vectors


Question

I have a vector of full names, with the first and last name separated by a comma. The first few elements look like this:

> head(val.vec)
[1] "Aabye, Edgar"        "Aaltonen, Arvo"      "Aaltonen, Paavo"
[4] "Aalvik Grimsb, Kari" "Aamodt, Kjetil Andr" "Aamodt, Ragnhild"

I am looking for a way to split them into 2 separate columns, one for the first name and one for the last name. My final intention is to have both of them as part of a bigger data frame.

I tried using the strsplit function like this:

names<-unlist(strsplit(val.vec,','))

but it gave me one long vector instead of 2 separate sets. I know it is possible to loop over all the elements and place the first and last names in 2 separate vectors, but that is somewhat time-consuming given that there are about 25,000 records.

I saw a few similar questions, but they discussed how to do this in C++ and Java.

Recommended answer

We can use read.csv to convert the vector into a data.frame with 2 columns:

read.csv(text=val.vec, header=FALSE, stringsAsFactors=FALSE)
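As a minimal sketch of how this call behaves on the sample data, the version below also names the two columns and trims the space after the comma. The column names `last` and `first` are assumptions chosen for illustration, and `strip.white = TRUE` is an optional addition that is not part of the original answer. The sample values assume a plain space follows each comma.

```r
# Sample data, assuming a plain space follows each comma
val.vec <- c("Aabye, Edgar", "Aaltonen, Arvo", "Aalvik Grimsb, Kari")

# text= lets read.csv parse the character vector as if each element were a line of a file
df <- read.csv(text = val.vec, header = FALSE, stringsAsFactors = FALSE,
               col.names = c("last", "first"),  # hypothetical column names
               strip.white = TRUE)              # drop whitespace around each field

df$first
```

Because only the comma acts as a separator, a part that contains an internal space (such as "Aalvik Grimsb") stays in a single field.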


Or, if we are using strsplit, instead of unlisting (which converts the whole list to a single vector), we can extract the first and second elements of each list entry separately to create two vectors ('v1' and 'v2').

lst <- strsplit(val.vec, ',')
# sapply() simplifies the result to a character vector; lapply() would return a list
v1 <- sapply(lst, `[`, 1)
v2 <- sapply(lst, `[`, 2)
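Since the stated goal is to end up with both parts as columns of a larger data frame, here is a hedged sketch of one way to get there from the strsplit result. The data frame `big.df` and its `id` column are hypothetical stand-ins for the real data, `vapply` is swapped in as a type-checked alternative for the extraction, and the split pattern is widened to swallow whitespace after the comma. None of these details come from the original answer.

```r
# Sample data, assuming a plain space follows each comma
val.vec <- c("Aabye, Edgar", "Aaltonen, Arvo", "Aalvik Grimsb, Kari")
big.df <- data.frame(id = seq_along(val.vec))   # hypothetical existing data frame

# Split on the comma plus any whitespace that follows it
lst <- strsplit(val.vec, ",[[:space:]]*")

# vapply() guarantees a character vector result; an element with no comma
# would yield NA for the second part
big.df$last  <- vapply(lst, `[`, character(1), 1)
big.df$first <- vapply(lst, `[`, character(1), 2)
```

All of these calls are vectorised over the input, so no explicit loop over the records is needed.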


Another option is sub:

v1 <- sub(",.*", "", val.vec)
v2 <- sub("[^,]+,", "", val.vec)
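Note that the second pattern removes everything up to and including the first comma, so on the sample data the result keeps the space that follows the comma. A small sketch of one way to handle that, assuming plain spaces after the comma, is to wrap the result in `trimws`, which is an addition here and not part of the original answer:

```r
# Sample data, assuming a plain space follows each comma
val.vec <- c("Aabye, Edgar", "Aaltonen, Arvo", "Aalvik Grimsb, Kari")

v1 <- sub(",.*", "", val.vec)              # keep everything before the first comma
v2 <- trimws(sub("[^,]+,", "", val.vec))   # drop through the first comma, then trim
```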

Data

val.vec <- c("Aabye, Edgar", "Aaltonen, Arvo", "Aaltonen, Paavo",
        "Aalvik Grimsb, Kari", "Aamodt, Kjetil Andr", "Aamodt, Ragnhild")
