Left join in R by character type
Problem description
I am trying to scrape movie data from 2 different websites.
I want to combine these two sources of information, tying them together by movie title. Here is what the first dataset looks like:
structure(list(event_name = c("maze runner: the death cure", "star wars: the last jedi",
"spider-man: homecoming"), event_start_time = structure(c(100,
200, 300), class = "Date"), movie_sold_all = c(100L, 200L,
300L)), .Names = c("event_name", "event_start_time", "movie_sold_all"
), row.names = c(NA, 3L), class = "data.frame")
And this is the second dataset that I've scraped.
I can only upload an image of it, since it has more than 10 columns.
What I expect is to join on movie_title so that the result incorporates both sets of information, basically like a left join in SQL.
I tried merge(df_bq_movies, movies, by.y = "movie_title", all.x = TRUE), but it fails with an error:
Error in merge.data.frame(df_bq_movies, movies, by.y = "movie_title", :'by.x' and 'by.y' specify different numbers of columns
For more information, these are the dimensions of the datasets:
data 1: 605 rows, 3 columns
data 2: 509 rows, 21 columns
Recommended answer
With merge, when the key columns have different names in the two datasets, you have to define both by.x and by.y. If the column name is the same in both datasets, you can just use by instead.
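For example (all.x = TRUE keeps every row of the first dataset, which gives the left-join behavior asked for):
merge(df_bq_movies, movies, by.x = "event_name", by.y = "movie_title", all.x = TRUE)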
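To see the left join end to end, here is a minimal sketch. The first data frame is taken from the dput() output in the question. The second data frame is a hypothetical stand-in, since the real one was only shown as an image: its `score` column and its values are invented purely for illustration.

```r
# First dataset, rebuilt from the dput() output in the question.
df_bq_movies <- data.frame(
  event_name = c("maze runner: the death cure", "star wars: the last jedi",
                 "spider-man: homecoming"),
  event_start_time = structure(c(100, 200, 300), class = "Date"),
  movie_sold_all = c(100L, 200L, 300L),
  stringsAsFactors = FALSE
)

# ASSUMPTION: hypothetical stand-in for the second scraped dataset.
# Only the key column name `movie_title` comes from the question;
# `score` and its values are placeholders.
movies <- data.frame(
  movie_title = c("star wars: the last jedi", "spider-man: homecoming", "coco"),
  score = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Left join: keep every row of df_bq_movies and match event_name
# against movie_title. Titles with no match get NA in `score`;
# rows that exist only in `movies` ("coco") are dropped.
joined <- merge(df_bq_movies, movies,
                by.x = "event_name", by.y = "movie_title",
                all.x = TRUE)

print(joined)
```

Note that merge compares character keys exactly, so titles that differ between the two sites in case or surrounding whitespace will not match and will show up as NA rows. If dplyr is available, left_join(df_bq_movies, movies, by = c("event_name" = "movie_title")) is an equivalent way to express the same join.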