Social web application database design: how can I improve this schema?

Question

I am developing a social web app for poets and writers, allowing them to share their poetry, gather feedback, and communicate with other poets. I have very little formal training in database design, but I have been reading books, SO, and online DB design resources in an attempt to ensure performance and scalability without over-engineering.

The database is MySQL, and the application is written in PHP. I'm not sure yet whether we will be using an ORM library or writing SQL queries from scratch in the app. Other than the web application, Solr search server and maybe some messaging client will interact with the database.

The schema I have thrown together below represents the primary components of the first version of the website. Initially, users can register for the site and do any of the following:

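  • Create and modify profile details and account settings
  • Post, tag, and categorize their writing
  • Read, comment on, and "favorite" other users' posts
  • "Follow" other users to get notifications of their activity
  • Search and browse content and get suggested posts/users (though we will be using the Solr search server to index the DB data and run these types of queries)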

Here is what I came up with on MySQL Workbench for the initial site. I'm still a little fuzzy on some relational databasey things, so go easy.

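  1. In general, is there anything I'm doing wrong or can improve upon?
  2. Is there any reason why I shouldn't combine the ExternalAccounts table into the UserProfiles table?
  3. Is there any reason why I shouldn't combine the PostStats table into the Posts table?
  4. Should I expand the design to include the features we are doing in the second version just to ensure that the initial schema can support it?
  5. Is there anything I can do to optimize the DB design for Solr indexing/performance/whatever?
  6. Should I be using more natural primary keys, like Username instead of UserID, or zip/area code instead of a surrogate LocationID in the Locations table?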

Thanks for your help!

Recommended answer

In general, is there anything I'm doing wrong or can improve upon?

Overall, I don't see any big flaws in your current setup or schema.

What I'm wondering about is your split into three User* tables. I get what your intention was (keeping different user-related things separate), but I don't know if I would go with exactly the same split. If you plan on displaying only data from the User table on the site, this is fine, since the other info is not needed multiple times on the same page. But if users need to use and display their real name (like John Doe instead of doe55), then this will slow things down as the data gets bigger, since you may require joins. Keeping the Preferences separate seems like a personal choice. I have no argument for or against it.

Your many-to-many tables do not need an additional PK (e.g. PostFavoriteID). A composite primary key on PostID and UserID would be enough, since PostFavoriteID is never used anywhere else. This goes for all join tables.
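
To make the composite-key point concrete, here is a minimal sketch. It uses Python's built-in sqlite3 with an in-memory database purely as a runnable stand-in for MySQL. The table name PostFavorites, the Username and Title columns, and the add_favorite helper are assumptions for illustration; only PostID, UserID, and PostFavoriteID come from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (
    UserID   INTEGER PRIMARY KEY,
    Username TEXT NOT NULL UNIQUE
);
CREATE TABLE Posts (
    PostID INTEGER PRIMARY KEY,
    UserID INTEGER NOT NULL REFERENCES Users(UserID),
    Title  TEXT NOT NULL
);
-- Join table: the (PostID, UserID) pair is the primary key,
-- so no separate PostFavoriteID column is needed.
CREATE TABLE PostFavorites (
    PostID INTEGER NOT NULL REFERENCES Posts(PostID),
    UserID INTEGER NOT NULL REFERENCES Users(UserID),
    PRIMARY KEY (PostID, UserID)
);
INSERT INTO Users (UserID, Username) VALUES (1, 'doe55');
INSERT INTO Users (UserID, Username) VALUES (2, 'poet42');
INSERT INTO Posts (PostID, UserID, Title) VALUES (10, 1, 'First poem');
""")


def add_favorite(conn, post_id, user_id):
    """Hypothetical helper: True if the favorite was stored, False if it already existed."""
    try:
        conn.execute(
            "INSERT INTO PostFavorites (PostID, UserID) VALUES (?, ?)",
            (post_id, user_id),
        )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejects a second identical pair.
        return False


first = add_favorite(conn, 10, 2)
second = add_favorite(conn, 10, 2)
favorite_count = conn.execute("SELECT COUNT(*) FROM PostFavorites").fetchone()[0]
```

A side benefit of the composite key is that it also acts as the uniqueness constraint: the same user cannot favorite the same post twice, which a lone surrogate PostFavoriteID would not prevent on its own.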

Is there any reason why I shouldn't combine the ExternalAccounts table into the UserProfiles table?

As with the previous answer, I don't see an advantage or disadvantage. I might put both in the same table, since the NULL (or maybe better, -1) values would not bother me.

Is there any reason why I shouldn't combine the PostStats table into the Posts table?

I would put them into the same table, using a trigger to handle incrementing ViewCount.
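
One way this could look is sketched below, again with Python's sqlite3 standing in for MySQL. Triggers fire on writes, not reads, so the sketch assumes a hypothetical PostViews log table that receives one row per page view; that table, the trigger name, and the Title and ViewedAt columns are not part of the original schema. In MySQL the trigger body would be much the same, though the client usually needs a DELIMITER change to define it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts (
    PostID    INTEGER PRIMARY KEY,
    Title     TEXT NOT NULL,
    ViewCount INTEGER NOT NULL DEFAULT 0  -- merged in from PostStats
);
-- Hypothetical log table: the application inserts one row per view.
CREATE TABLE PostViews (
    PostID   INTEGER NOT NULL REFERENCES Posts(PostID),
    ViewedAt TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Each logged view bumps the counter on the matching post.
CREATE TRIGGER trg_postviews_increment
AFTER INSERT ON PostViews
FOR EACH ROW
BEGIN
    UPDATE Posts SET ViewCount = ViewCount + 1 WHERE PostID = NEW.PostID;
END;
INSERT INTO Posts (PostID, Title) VALUES (10, 'First poem');
""")

# Simulate three page views of post 10.
for _ in range(3):
    conn.execute("INSERT INTO PostViews (PostID) VALUES (10)")

view_count = conn.execute(
    "SELECT ViewCount FROM Posts WHERE PostID = 10"
).fetchone()[0]
```

If no view log is wanted, the application can simply run the UPDATE itself on each view; the trigger only pays off when something else is already being written per view.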

Should I expand the design to include the features we are doing in the second version just to ensure that the initial schema can support it?

You are using a normalized schema, so additions can be made at any time.

Is there anything I can do to optimize the DB design for Solr indexing/performance/whatever?

I can't tell you, since I haven't done it yet, but I know that Solr is very powerful and flexible, so I think you should be fine.

Should I be using more natural primary keys, like Username instead of UserID, or zip/area code instead of a surrogate LocationID in the Locations table?

There are many threads here on SO discussing this. Personally, I prefer a surrogate key (or another unique numeric key if available), since it makes queries easier and faster because an int is cheaper to look up. If you allow a change of username/email/whatever-your-PK-is, then massive updates are required. With a surrogate key, you don't need to bother.
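
The rename argument can be illustrated with a small sketch, using Python's sqlite3 as a stand-in for MySQL. The Title column and the sample rows are made up for illustration. Because Posts references the numeric UserID, changing a username touches exactly one row in Users and nothing in Posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (
    UserID   INTEGER PRIMARY KEY,   -- surrogate key
    Username TEXT NOT NULL UNIQUE   -- natural candidate key, still enforced
);
CREATE TABLE Posts (
    PostID INTEGER PRIMARY KEY,
    UserID INTEGER NOT NULL REFERENCES Users(UserID),
    Title  TEXT NOT NULL
);
INSERT INTO Users (UserID, Username) VALUES (1, 'doe55');
INSERT INTO Posts (PostID, UserID, Title) VALUES (10, 1, 'First poem');
INSERT INTO Posts (PostID, UserID, Title) VALUES (11, 1, 'Second poem');
""")

# Renaming the user is a single-row update; no child rows are rewritten.
cursor = conn.execute("UPDATE Users SET Username = 'john_doe' WHERE UserID = 1")
rows_changed = cursor.rowcount

# The posts still resolve to the same author through the unchanged UserID.
authors = conn.execute(
    "SELECT u.Username, p.Title FROM Posts p "
    "JOIN Users u ON u.UserID = p.UserID ORDER BY p.PostID"
).fetchall()
```

Had Username been the primary key, every row in Posts (and in every other table referencing the user) would have needed the same update, either by hand or through cascading foreign keys.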

What I would also do is add columns like created_at and last_accessed (best maintained via triggers or procedures, IMO) so that some stats are already available. This can really give you valuable stats.

Further strategies to increase performance would be things like memcache, a counter cache, partitioned tables, and so on. Such things can be discussed once you are really overrun by users, because there may be technologies and techniques that are very specific to your problem.
