How to implement Twitter retweet action in my database


Question


I am implementing a web application similar to Twitter. I need to implement a 'retweet' action, and one tweet can be retweeted by one person multiple times.

I have a basic 'tweets' table with the following columns:

Tweets: tweet_id | tweet_text | tweet_date_created | tweet_user_id

(where tweet_id is the primary key of tweets, tweet_text contains the tweet text, tweet_date_created is the DateTime when the tweet was created, and tweet_user_id is a foreign key to the users table and identifies the user who created the tweet)

Now I am wondering how I should implement the retweet action in my database.

Option 1

Should I create a new join table, which would look like this:

Retweets: tweet_id | user_id | retweet_date_retweeted

(Where tweet_id is a foreign key to the tweets table, user_id is a foreign key to the users table and identifies the user who retweeted the tweet, and retweet_date_retweeted is a DateTime which specifies when the retweet was done.)

pros: There will be no empty columns. When a user retweets, a new row is created in the retweets table.

cons: The querying process will be more difficult. It will need to join two tables and somehow sort the tweets by two dates (when a tweet is not a retweet, sort it by tweet_date_created; when it is a retweet, sort it by retweet_date_retweeted).

Option 2

Or should I implement it in the tweets table as a parent_id column, so that the table looks like this:

Tweets: tweet_id | tweet_text | tweet_date_created | tweet_user_id | parent_id

(Where all the columns remain the same and parent_id is a foreign key to the same tweets table. When a tweet is created, parent_id remains empty. When a tweet is retweeted, parent_id contains the original tweet's id, tweet_user_id contains the user who performed the retweet, tweet_date_created contains the DateTime when the retweet was done, and tweet_text remains empty, because we will not let users change the original tweet when retweeting.)

pros: Querying process is much more elegant, as I do not have to join two tables.

cons: There will be empty cells every time a tweet is retweeted. So if I have 1 000 tweets in my database and every one of them is retweeted 5 times, there will be 5 000 extra rows in my tweets table (6 000 in total), each with an empty tweet_text.
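To make Option 2 concrete, here is a minimal sketch of the self-referencing table, using Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for whatever database the application actually uses (an assumption). The sample rows, the `timeline` helper, and the `shown_text` and `is_retweet` aliases are hypothetical. One thing the sketch brings out: because a retweet row keeps tweet_text empty, displaying it still needs a self-join back to the parent row.

```python
import sqlite3

# In-memory SQLite database, used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweets (
    tweet_id           INTEGER PRIMARY KEY,
    tweet_text         TEXT,              -- NULL for a retweet row
    tweet_date_created TEXT NOT NULL,
    tweet_user_id      INTEGER NOT NULL,
    parent_id          INTEGER REFERENCES tweets(tweet_id)  -- NULL for an original
);
INSERT INTO tweets VALUES
    (1, 'hello world',  '2012-07-27 00:04:30', 1, NULL),
    (2, 'second tweet', '2012-07-27 00:04:47', 2, NULL),
    (3, NULL,           '2012-07-27 00:07:11', 1, 2);  -- user 1 retweets tweet 2
""")


def timeline(user_id):
    """Return one user's rows, newest first; retweets borrow the parent's text."""
    return conn.execute(
        """
        SELECT t.tweet_id,
               COALESCE(p.tweet_text, t.tweet_text) AS shown_text,
               t.tweet_date_created,
               t.parent_id IS NOT NULL AS is_retweet
        FROM tweets AS t
        LEFT JOIN tweets AS p ON p.tweet_id = t.parent_id
        WHERE t.tweet_user_id = ?
        ORDER BY t.tweet_date_created DESC
        """,
        (user_id,),
    ).fetchall()


for row in timeline(1):
    print(row)
```

Sorting is simple here because every row carries its own date, but the join does not disappear entirely; it just becomes a join of the table with itself.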


Which is the most efficient way? Is it better to have empty cells, or to have a cleaner querying process?

Solution

IMO option #1 would be better. The query to join the tweet and retweet tables would not be at all complex and could be done via a left or inner join, depending on whether you want to show all tweets or only tweets which were retweeted. The join should also perform well: the retweet table is narrow, the joined columns are ints, and with InnoDB each foreign key column is indexed automatically (with other engines or databases you may need to add those indices yourself).

Another recommendation is not to prefix all your columns with tweet or retweet. That can be inferred from the table in which the data is stored, for example:

tweet
    id
    user_id
    text
    created_at

retweet
    tweet_id
    user_id
    created_at

And sample joins:

# Return all tweets which have been retweeted, with their retweet counts
SELECT
    count(*),
    t.id
FROM
    tweet AS t
INNER JOIN retweet AS rt ON rt.tweet_id = t.id
GROUP BY
    t.id;

# Return a specific tweet and its retweet data, if any
SELECT
    t.id,
    rt.user_id,
    rt.created_at
FROM
    tweet AS t
LEFT OUTER JOIN retweet AS rt ON rt.tweet_id = t.id
WHERE
    t.id = :tweetId;
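To see how the choice between an inner and a left join changes the result, the following sketch counts retweets per tweet both ways. It runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL (an assumption made so the example is self-contained), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweet (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL,
                    text TEXT NOT NULL, created_at TEXT NOT NULL);
CREATE TABLE retweet (tweet_id INTEGER NOT NULL, user_id INTEGER NOT NULL,
                      created_at TEXT NOT NULL);
INSERT INTO tweet VALUES (1, 1, 'a', '2012-07-27 00:04:30'),
                         (2, 2, 'b', '2012-07-27 00:04:47'),
                         (3, 3, 'c', '2012-07-27 00:04:58');
INSERT INTO retweet VALUES (2, 1, '2012-07-27 00:07:11'),
                           (2, 3, '2012-07-27 00:08:00'),
                           (3, 1, '2012-07-27 00:06:37');
""")

# INNER JOIN: only tweets with at least one retweet appear.
inner_counts = conn.execute("""
    SELECT t.id, COUNT(*)
    FROM tweet AS t
    INNER JOIN retweet AS rt ON rt.tweet_id = t.id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

# LEFT JOIN: every tweet appears. COUNT(rt.tweet_id) skips the NULLs produced
# for tweets without retweets, so those tweets get a count of 0.
left_counts = conn.execute("""
    SELECT t.id, COUNT(rt.tweet_id)
    FROM tweet AS t
    LEFT OUTER JOIN retweet AS rt ON rt.tweet_id = t.id
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(inner_counts)
print(left_counts)
```

Note that with the left join, COUNT(*) would report 1 for a tweet that has never been retweeted, which is why the sketch counts a column from the retweet side instead.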

-- Update per request --

The following is for demonstration only, to show why I would opt for option #1. There are no foreign keys or indices in it; you will have to add these yourself. But the results should demonstrate that the joins won't be too painful.

CREATE TABLE `tweet` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `user_id` int(10) unsigned NOT NULL,
    `value` varchar(255) NOT NULL,
    `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;

CREATE TABLE `retweet` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `tweet_id` int(10) unsigned NOT NULL,
    `user_id` int(10) unsigned NOT NULL,
    `created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
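Since the demonstration tables leave out foreign keys and indices, here is one possible way to add them. This is a sketch under stated assumptions, not the answer's own schema: it uses SQLite through Python's sqlite3 module instead of MySQL, and the `users` table, the index names, the choice of indexed columns, and the `retweet` helper function are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE tweet (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    value      TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE retweet (
    id         INTEGER PRIMARY KEY,   -- surrogate key, so repeats are allowed
    tweet_id   INTEGER NOT NULL REFERENCES tweet(id),
    user_id    INTEGER NOT NULL REFERENCES users(id),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Indices for the per-user timeline lookups and the tweet/retweet join.
CREATE INDEX idx_tweet_user    ON tweet (user_id, created_at);
CREATE INDEX idx_retweet_user  ON retweet (user_id, created_at);
CREATE INDEX idx_retweet_tweet ON retweet (tweet_id);
INSERT INTO users (id) VALUES (1);
INSERT INTO tweet (id, user_id, value) VALUES (1, 1, 'User1 | Tweet1');
""")


def retweet(tweet_id, user_id):
    """Insert a retweet row; return False if a foreign key is violated."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO retweet (tweet_id, user_id) VALUES (?, ?)",
                (tweet_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(retweet(1, 1))    # existing tweet and user
print(retweet(1, 1))    # same user retweets the same tweet again
print(retweet(999, 1))  # tweet 999 does not exist
```

Because retweet has its own surrogate id and no unique constraint on (tweet_id, user_id), the same person can retweet the same tweet several times, which matches the requirement in the question.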

# Sample Rows

mysql> select * from tweet;
+----+---------+----------------+---------------------+
| id | user_id | value          | created_at          |
+----+---------+----------------+---------------------+
|  1 |       1 | User1 | Tweet1 | 2012-07-27 00:04:30 |
|  2 |       1 | User1 | Tweet2 | 2012-07-27 00:04:35 |
|  3 |       2 | User2 | Tweet1 | 2012-07-27 00:04:47 |
|  4 |       3 | User3 | Tweet1 | 2012-07-27 00:04:58 |
|  5 |       1 | User1 | Tweet3 | 2012-07-27 00:06:47 |
|  6 |       1 | User1 | Tweet4 | 2012-07-27 00:06:50 |
|  7 |       1 | User1 | Tweet5 | 2012-07-27 00:06:54 |
+----+---------+----------------+---------------------+

mysql> select * from retweet;
+----+----------+---------+---------------------+
| id | tweet_id | user_id | created_at          |
+----+----------+---------+---------------------+
|  1 |        4 |       1 | 2012-07-27 00:06:37 |
|  2 |        3 |       1 | 2012-07-27 00:07:11 |
+----+----------+---------+---------------------+

# Query to pull all tweets for user_id = 1, including retweets, ordered from newest to oldest

select * from (
    select t.* from tweet as t where user_id = 1
    union
    select t.* from tweet as t where t.id in (select tweet_id from retweet where user_id = 1))
a order by created_at desc;

mysql> select * from (select t.* from tweet as t where user_id = 1 union select t.* from tweet as t where t.id in (select tweet_id from retweet where user_id = 1)) a order by created_at desc;
+----+---------+----------------+---------------------+
| id | user_id | value          | created_at          |
+----+---------+----------------+---------------------+
|  7 |       1 | User1 | Tweet5 | 2012-07-27 00:06:54 |
|  6 |       1 | User1 | Tweet4 | 2012-07-27 00:06:50 |
|  5 |       1 | User1 | Tweet3 | 2012-07-27 00:06:47 |
|  4 |       3 | User3 | Tweet1 | 2012-07-27 00:04:58 |
|  3 |       2 | User2 | Tweet1 | 2012-07-27 00:04:47 |
|  2 |       1 | User1 | Tweet2 | 2012-07-27 00:04:35 |
|  1 |       1 | User1 | Tweet1 | 2012-07-27 00:04:30 |
+----+---------+----------------+---------------------+

Notice in the last set of results that we were able to include the retweets as well, with the retweet of #4 displayed before the retweet of #3. The rows are ordered by the original tweet's created_at, not by the time of the retweet.

-- Update --

You can accomplish what you are asking for by changing the query a bit, so that retweeted rows take their created_at from the retweet table:

select * from (
    select t.id, t.value, t.created_at from tweet as t where user_id = 1
    union
    select t.id, t.value, rt.created_at from tweet as t inner join retweet as rt on rt.tweet_id = t.id where rt.user_id = 1)
a order by created_at desc;

mysql> select * from (select t.id, t.value, t.created_at from tweet as t where user_id = 1 union select t.id, t.value, rt.created_at from tweet as t inner join retweet as rt on rt.tweet_id = t.id where rt.user_id = 1) a order by created_at desc;
+----+----------------+---------------------+
| id | value          | created_at          |
+----+----------------+---------------------+
|  3 | User2 | Tweet1 | 2012-07-27 00:07:11 |
|  7 | User1 | Tweet5 | 2012-07-27 00:06:54 |
|  6 | User1 | Tweet4 | 2012-07-27 00:06:50 |
|  5 | User1 | Tweet3 | 2012-07-27 00:06:47 |
|  4 | User3 | Tweet1 | 2012-07-27 00:06:37 |
|  2 | User1 | Tweet2 | 2012-07-27 00:04:35 |
|  1 | User1 | Tweet1 | 2012-07-27 00:04:30 |
+----+----------------+---------------------+
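One detail to watch with the query above: UNION removes duplicate rows, so two retweets of the same tweet by the same user within the same second would collapse into a single row. The sketch below is a hedged variant, not part of the original answer: it uses UNION ALL and adds an `is_retweet` flag. It runs on SQLite via Python's sqlite3 module (an assumption), the `shown_at` and `is_retweet` aliases are hypothetical, and the third retweet row is made-up data added to trigger the collision.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweet (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL,
                    value TEXT NOT NULL, created_at TEXT NOT NULL);
CREATE TABLE retweet (id INTEGER PRIMARY KEY, tweet_id INTEGER NOT NULL,
                      user_id INTEGER NOT NULL, created_at TEXT NOT NULL);
INSERT INTO tweet VALUES
    (1, 1, 'User1 | Tweet1', '2012-07-27 00:04:30'),
    (3, 2, 'User2 | Tweet1', '2012-07-27 00:04:47'),
    (4, 3, 'User3 | Tweet1', '2012-07-27 00:04:58');
INSERT INTO retweet VALUES
    (1, 4, 1, '2012-07-27 00:06:37'),
    (2, 3, 1, '2012-07-27 00:07:11'),
    (3, 3, 1, '2012-07-27 00:07:11');  -- hypothetical: same tweet, same second
""")

TIMELINE_SQL = """
    SELECT t.id AS id, t.value AS value,
           t.created_at AS shown_at, 0 AS is_retweet
    FROM tweet AS t
    WHERE t.user_id = :uid
    UNION ALL
    SELECT t.id AS id, t.value AS value,
           rt.created_at AS shown_at, 1 AS is_retweet
    FROM tweet AS t
    INNER JOIN retweet AS rt ON rt.tweet_id = t.id
    WHERE rt.user_id = :uid
    ORDER BY shown_at DESC, id DESC
"""

# UNION ALL keeps both retweets of tweet 3.
rows_all = conn.execute(TIMELINE_SQL, {"uid": 1}).fetchall()

# Plain UNION deduplicates them, because the two rows are identical.
rows_dedup = conn.execute(
    TIMELINE_SQL.replace("UNION ALL", "UNION"), {"uid": 1}
).fetchall()

print(len(rows_all), len(rows_dedup))
```

Whether the collapsed or the repeated form is wanted depends on how the timeline should display repeated retweets; selecting rt.id as well would be another way to keep the rows distinct.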
