Using SQL join and subquery to query two tables in R


Question

I am a beginner.

I have two .txt files, and I'm using R with the sqldf package to query them.

The first table (venues.txt) looks like this:

userID,venueID,year,month,date,hour
1302,47,2012,2,24,11
45,132,2012,2,24,11
24844,86,2012,2,24,11
896,248,2012,2,24,11
5020,29,2012,2,24,11

The second table (friends.txt) looks like this:

userID,friendID
1,5
1,9
1,50
1,102
1,300

I want to query the venues (venueID) that a user (say userID=1) visited WITH one or more of his friends (friendID).

Note: both userID and friendID in the friends table can be linked to userID in the venues table.

The query result should look like this:

venueID  friendID
47       5
47       9
29       102
86       102

I can do this using many separate queries and then join the results into one table, but my dataset is very large. Is there an easier way to do this?

I was able to query all venues that have been visited by a user or his friends:

sqldf("select userID, venueID from data
       where userID=1 OR userID IN (select friendID from freind where userID=1)")

Thanks a lot.

Recommended answer

I am a Java PL/SQL developer, so here is my attempt at answering "a list of venues that were visited by at least two friends" using only joins. I assume the data from venues.txt is a table called venues and the data from friends.txt is a table called friends in the FROM clause. Basically, I am treating those files as tables.

SELECT v1.venueID, f.friendID

FROM venues v1 
INNER JOIN friends f ON v1.userID = f.userID 
INNER JOIN venues v2 ON v2.userID = f.friendID

WHERE
   v1.venueID = v2.venueID
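To try the join from R, the query can be passed to sqldf once both files are loaded as data frames named venues and friends. The following is only a minimal sketch: the rows are hypothetical sample data made up for illustration (not the asker's files), and the extra filter v1.userID = 1 is an assumption added here to match the single user in the question; it is not part of the query above.

```r
library(sqldf)

# Hypothetical sample rows for illustration only. With the real files,
# read.csv("venues.txt") and read.csv("friends.txt") would be used instead.
venues <- read.csv(text = "userID,venueID,year,month,date,hour
1,47,2012,2,24,11
5,47,2012,2,24,11
9,47,2012,2,24,12
1,29,2012,2,24,11
102,29,2012,2,24,11
1,86,2012,2,24,10
102,86,2012,2,24,15
50,12,2012,2,24,11")

friends <- read.csv(text = "userID,friendID
1,5
1,9
1,50
1,102
1,300")

# Self-join on venues through the friends table: v1 holds the user's
# visits, v2 holds the friend's visits, and the WHERE clause keeps only
# venues that both of them visited.
# Assumption: the userID = 1 filter is added for this sketch.
result <- sqldf("
  SELECT v1.venueID, f.friendID
  FROM venues v1
  INNER JOIN friends f ON v1.userID = f.userID
  INNER JOIN venues v2 ON v2.userID = f.friendID
  WHERE v1.venueID = v2.venueID
    AND v1.userID = 1
")
print(result)
```

sqldf looks up the table names in the FROM clause among the data frames in the calling environment, so the data frame names have to match the names used in the SQL.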

If you want to add more conditions, e.g. "at least two friends visited together, so having the same year, month, date, hour", just add them to the filter (the WHERE clause). The query would then look like this:

SELECT v1.venueID, f.friendID

FROM venues v1 
INNER JOIN friends f ON v1.userID = f.userID 
INNER JOIN venues v2 ON v2.userID = f.friendID

WHERE
   v1.venueID = v2.venueID
   AND v1.year = v2.year
   AND v1.month = v2.month
   AND v1.date = v2.date
   AND v1.hour = v2.hour
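As a rough check of the time-matched version, the sketch below runs it through sqldf on a few hypothetical rows (invented for illustration, not taken from the question's files). The conditions are joined with AND, and the v1.userID = 1 filter is again an assumption added to match the user in the question.

```r
library(sqldf)

# Hypothetical sample rows: friend 5 is at venue 47 in the same hour as
# user 1, while friend 9 is at the same venue one hour later.
venues <- read.csv(text = "userID,venueID,year,month,date,hour
1,47,2012,2,24,11
5,47,2012,2,24,11
9,47,2012,2,24,12
1,29,2012,2,24,11
102,29,2012,2,24,11")

friends <- read.csv(text = "userID,friendID
1,5
1,9
1,102")

# Same self-join as before, but a row is kept only when the user and the
# friend were at the venue in the same year, month, date and hour.
together <- sqldf("
  SELECT v1.venueID, f.friendID
  FROM venues v1
  INNER JOIN friends f ON v1.userID = f.userID
  INNER JOIN venues v2 ON v2.userID = f.friendID
  WHERE v1.venueID = v2.venueID
    AND v1.year = v2.year
    AND v1.month = v2.month
    AND v1.date = v2.date
    AND v1.hour = v2.hour
    AND v1.userID = 1
")
print(together)
```

With this sample data, friend 9 drops out because the hour differs, even though the venue is the same.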

You might need to use DISTINCT in the SELECT statement if more than two friends were at the venue (or, optionally, there at the same time).
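One situation where duplicate rows can appear is when the same user visits a venue more than once: each visit then pairs with the friend's visit separately. The sketch below illustrates this on hypothetical rows (made up for this example) and shows DISTINCT collapsing the repeats; it is an illustration under those assumptions, not a statement about the asker's data.

```r
library(sqldf)

# Hypothetical sample rows: user 1 visits venue 47 twice, friend 5 once.
venues <- read.csv(text = "userID,venueID,year,month,date,hour
1,47,2012,2,24,11
1,47,2012,2,24,14
5,47,2012,2,24,11")

friends <- read.csv(text = "userID,friendID
1,5")

join_sql <- "
  FROM venues v1
  INNER JOIN friends f ON v1.userID = f.userID
  INNER JOIN venues v2 ON v2.userID = f.friendID
  WHERE v1.venueID = v2.venueID
"

# Without DISTINCT, each of user 1's visits produces its own row.
with_dups <- sqldf(paste("SELECT v1.venueID, f.friendID", join_sql))

# With DISTINCT, identical (venueID, friendID) pairs are reported once.
no_dups <- sqldf(paste("SELECT DISTINCT v1.venueID, f.friendID", join_sql))

print(nrow(with_dups))
print(nrow(no_dups))
```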
