How to do vlookup and fill down (like in Excel) in R?
Question
I have a dataset with about 105,000 rows and 30 columns. It contains a categorical variable that I would like to map to a number. In Excel, I would probably do this with VLOOKUP and fill down.

How would I go about doing the same thing in R?

Essentially, I have a HouseType variable, and I need to calculate HouseTypeNo. Here are some sample data:

    HouseType   HouseTypeNo
    Semi        1
    Single      2
    Row         3
    Single      2
    Apartment   4
    Apartment   4
    Row         3

Solution

If I understand your question correctly, here are four methods to do the equivalent of Excel's VLOOKUP and fill down using R:

    # load sample data from Q
    hous <- read.table(header = TRUE,
                       stringsAsFactors = FALSE,
                       text = "HouseType HouseTypeNo
    Semi 1
    Single 2
    Row 3
    Single 2
    Apartment 4
    Apartment 4
    Row 3")

    # create a toy large table with a 'HouseType' column
    # but no 'HouseTypeNo' column (yet)
    largetable <- data.frame(HouseType = as.character(sample(unique(hous$HouseType), 1000, replace = TRUE)),
                             stringsAsFactors = FALSE)

    # create a lookup table to get the numbers to fill
    # the large table
    lookup <- unique(hous)

      HouseType HouseTypeNo
    1      Semi           1
    2    Single           2
    3       Row           3
    5 Apartment           4

Here are four methods to fill the HouseTypeNo in the largetable using the values in the lookup table:

First, with merge in base:

    # 1. using base
    base1 <- merge(lookup, largetable, by = 'HouseType')

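One caveat with merge: by default it sorts the result on the key column, so the rows of largetable do not keep their original order. If order matters, a sketch using match() in base R may be closer to how VLOOKUP behaves. The seed and the name base_match are assumptions added for illustration; they are not part of the original answer.

```r
# Rebuild the sample data from the question
hous <- read.table(header = TRUE,
                   stringsAsFactors = FALSE,
                   text = "HouseType HouseTypeNo
Semi 1
Single 2
Row 3
Single 2
Apartment 4
Apartment 4
Row 3")
lookup <- unique(hous)

set.seed(1)  # assumption: seed used only to make the toy data reproducible
largetable <- data.frame(
  HouseType = sample(unique(hous$HouseType), 1000, replace = TRUE),
  stringsAsFactors = FALSE
)

# For each HouseType in largetable, match() gives the row position of the
# first equal HouseType in lookup (NA when there is no match)
idx <- match(largetable$HouseType, lookup$HouseType)

# Copy the table and fill the new column; row order is unchanged
base_match <- largetable
base_match$HouseTypeNo <- lookup$HouseTypeNo[idx]
```

Because match() only computes positions, nothing is reordered and unmatched house types simply come back as NA.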
A second method with named vectors in base:
    # 2. using base and a named vector
    housenames <- as.numeric(1:length(unique(hous$HouseType)))
    names(housenames) <- unique(hous$HouseType)

    base2 <- data.frame(HouseType = largetable$HouseType,
                        HouseTypeNo = (housenames[largetable$HouseType]))

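The named vector above numbers the house types 1 to 4 in order of first appearance, which happens to agree with the sample data. When the codes are not a simple running count, one option is to build the vector directly from the lookup table. The following is a minimal sketch with small hand-made tables; the name base2b is hypothetical.

```r
# Small stand-in tables (assumed for illustration)
lookup <- data.frame(
  HouseType   = c("Semi", "Single", "Row", "Apartment"),
  HouseTypeNo = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
largetable <- data.frame(
  HouseType = c("Row", "Semi", "Apartment", "Row", "Single"),
  stringsAsFactors = FALSE
)

# Values come from lookup$HouseTypeNo, names from lookup$HouseType
housenames <- setNames(lookup$HouseTypeNo, lookup$HouseType)

# Index by name; HouseType must be character here, because a factor
# would index by its integer codes instead of by name
base2b <- data.frame(
  HouseType   = largetable$HouseType,
  HouseTypeNo = unname(housenames[largetable$HouseType]),
  stringsAsFactors = FALSE
)
```

This keeps the row order of largetable and works for any set of codes stored in lookup.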

Third, using the plyr package:

    # 3. using the plyr package
    library(plyr)
    plyr1 <- join(largetable, lookup, by = "HouseType")

Fourth, using the sqldf package:

    # 4. using the sqldf package
    library(sqldf)
    sqldf1 <- sqldf("SELECT largetable.HouseType, lookup.HouseTypeNo
                     FROM largetable
                     INNER JOIN lookup
                     ON largetable.HouseType = lookup.HouseType")

If it's possible that some house types in largetable do not exist in lookup, then a left join would be used:

    sqldf("select * from largetable left join lookup using (HouseType)")

Corresponding changes to the other solutions would be needed too.
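
As a rough sketch of what those changes could look like in base R: merge keeps unmatched rows of its first argument when all.x = TRUE, and indexing a named vector with an unknown name yields NA. The house type "Bungalow" and the object names below are made up for illustration. As far as I know, plyr's join already defaults to type = "left", so method 3 may not need any change.

```r
# Stand-in lookup table (assumed for illustration)
lookup <- data.frame(
  HouseType   = c("Semi", "Single", "Row", "Apartment"),
  HouseTypeNo = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# "Bungalow" is a hypothetical house type that is missing from lookup
largetable <- data.frame(
  HouseType = c("Row", "Bungalow", "Semi"),
  stringsAsFactors = FALSE
)

# Method 1 as a left join: all.x = TRUE keeps every row of largetable
# and fills HouseTypeNo with NA where lookup has no match
left_merge <- merge(largetable, lookup, by = "HouseType", all.x = TRUE)

# Method 2 with a named vector: unknown names return NA on their own
housenames <- setNames(lookup$HouseTypeNo, lookup$HouseType)
left_named <- data.frame(
  HouseType   = largetable$HouseType,
  HouseTypeNo = unname(housenames[largetable$HouseType]),
  stringsAsFactors = FALSE
)
```

In both results the "Bungalow" row is kept with HouseTypeNo set to NA, which mirrors the left join in the sqldf version.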
Is that what you wanted to do? Let me know which method you like and I'll add commentary.