First-time database design: am I overengineering?

Question

I'm a first-year CS student and I work part-time for my dad's small business. I don't have any experience in real-world application development. I have written scripts in Python and done some coursework in C, but nothing like this.

My dad has a small training business and currently all classes are scheduled, recorded and followed up via an external web application. There is an export/"reports" feature but it is very generic and we need specific reports. We don't have access to the actual database to run the queries. I've been asked to set up a custom reporting system.

My idea is to generate the generic CSV exports every night and import them (probably with Python) into a MySQL database hosted in the office, where I can then run the specific queries that are needed. I don't have experience with databases but understand the very basics. I've read a little about database creation and normal forms.
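A minimal sketch of what the nightly import step could look like, under stated assumptions: the export's column names (`client_name`, `session_date`, `status_id`) and the `sessions_raw` staging table are hypothetical, since the real export format is not shown, and Python's built-in `sqlite3` stands in for MySQL so the example runs without a server.

```python
import csv
import io
import sqlite3

# Hypothetical export; the real web application's columns are not known.
SAMPLE_EXPORT = """client_name,session_date,status_id
Alice,2010-03-01,0
Bob,2010-03-01,1
Alice,2010-03-08,0
"""

def import_export(conn, csv_file):
    """Load one CSV export into a staging table, replacing the previous night's rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions_raw ("
        "client_name TEXT, session_date TEXT, status_id INTEGER)"
    )
    rows = [
        (r["client_name"], r["session_date"], int(r["status_id"]))
        for r in csv.DictReader(csv_file)
    ]
    with conn:  # one transaction: either the whole export loads or none of it
        conn.execute("DELETE FROM sessions_raw")
        conn.executemany("INSERT INTO sessions_raw VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")  # stand-in for the office MySQL server
loaded = import_export(conn, io.StringIO(SAMPLE_EXPORT))
print(loaded, "rows loaded")
```

Replacing the staging rows on every run keeps the import safe to repeat if a night's job has to be rerun. Against MySQL the same shape should carry over with a MySQL driver's connection in place of `sqlite3`.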

We may start having international clients soon, so I want the database to not explode if/when that happens. We also currently have a couple of big corporations as clients, with different divisions (e.g. ACME parent company, ACME healthcare division, ACME bodycare division).

My thoughts are the following:


  1. From the client perspective:
    • Clients is the main table
    • Clients are linked to the department they work for
      • Departments can be scattered around a country: HR in London, Marketing in Swansea, etc.
      • Departments are linked to the division of a company

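  2. From the session perspective:
    • Sessions is the main table
    • A teacher is linked to each session
    • Each session has a statusid, e.g. 0 - Completed, 1 - Cancelled
    • Sessions are grouped into packs of arbitrary size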

    I "designed" (more like scribbled) the schema on a piece of paper, trying to keep it normalised to the 3rd form. I then plugged it into MySQL Workbench and it made it all pretty for me:
    (Schema diagram: http://maian.org/img/schema.png)
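The diagram itself is not reproduced here, so the following is only a guess at what a schema matching the description above might look like. Every table and column name is hypothetical, the link from packs to clients is an assumption, and `sqlite3` stands in for MySQL so the sketch runs as-is.

```python
import sqlite3

# Hypothetical tables inferred from the bullet lists, not the author's real schema.
SCHEMA = """
CREATE TABLE divisions   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT,
                          division_id INTEGER NOT NULL REFERENCES divisions(id));
CREATE TABLE clients     (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                          department_id INTEGER NOT NULL REFERENCES departments(id));
CREATE TABLE teachers    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE statuses    (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE packs       (id INTEGER PRIMARY KEY, size INTEGER NOT NULL,
                          client_id INTEGER NOT NULL REFERENCES clients(id));
CREATE TABLE sessions    (id INTEGER PRIMARY KEY, session_date TEXT NOT NULL,
                          pack_id    INTEGER NOT NULL REFERENCES packs(id),
                          teacher_id INTEGER NOT NULL REFERENCES teachers(id),
                          status_id  INTEGER NOT NULL REFERENCES statuses(id));
INSERT INTO statuses VALUES (0, 'Completed'), (1, 'Cancelled');
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES
conn.executescript(SCHEMA)

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Each fact lives in one place here (a department's city on the department, a status label on the status), which is the property third normal form is after.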


    Example queries I'll be running:
    • Which clients with credit still left are inactive (those without a class scheduled in the future)
    • What is the attendance rate per client/department/division (measured by the status id in each session)
    • How many classes has a teacher had in a month
    • Flag clients who have low attendance rate
    • Custom reports for HR departments with attendance rates of people in their division
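Two of these reports can be sketched as plain SQL, here run through `sqlite3` as a stand-in for MySQL. The simplified `clients` and `sessions` tables, the `credit` column, and the extra status `2` for a scheduled session are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients  (id INTEGER PRIMARY KEY, name TEXT, credit INTEGER);
CREATE TABLE sessions (id INTEGER PRIMARY KEY, client_id INTEGER,
                       session_date TEXT, status_id INTEGER);
INSERT INTO clients VALUES (1, 'Alice', 3), (2, 'Bob', 2), (3, 'Carol', 0);
-- status_id: 0 = completed, 1 = cancelled, 2 = scheduled (2 is an assumed extra status)
INSERT INTO sessions (client_id, session_date, status_id) VALUES
    (1, '2010-01-05', 0), (1, '2010-01-12', 1), (1, '2999-01-01', 2),
    (2, '2010-01-05', 0), (2, '2010-01-12', 0),
    (3, '2010-01-05', 1);
""")

TODAY = "2010-02-01"  # fixed date so the example is reproducible

# Clients with credit left and no session scheduled after TODAY.
inactive = conn.execute("""
    SELECT c.name FROM clients c
    WHERE c.credit > 0
      AND NOT EXISTS (SELECT 1 FROM sessions s
                      WHERE s.client_id = c.id AND s.session_date > ?)
    ORDER BY c.name
""", (TODAY,)).fetchall()

# Attendance rate per client over past sessions: share with status_id = 0.
attendance = conn.execute("""
    SELECT c.name, AVG(CASE WHEN s.status_id = 0 THEN 1.0 ELSE 0.0 END) AS rate
    FROM clients c JOIN sessions s ON s.client_id = c.id
    WHERE s.session_date <= ?
    GROUP BY c.id, c.name
    ORDER BY c.name
""", (TODAY,)).fetchall()

print(inactive)
print(attendance)
```

Grouping by department or division instead of client would mean joining through those tables and changing the `GROUP BY` columns; flagging low attendance could be a `HAVING` clause on the same rate.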

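    • Is this the right way to go about it?
    • I have added a 'lastsession' column to clients, since it will probably be a common query. Is this a good idea, or should I keep the database strictly normalised?

    Thanks for your time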

    Recommended answer

    Some more answers to your questions:

    1) You're pretty much on target for someone who is approaching a problem like this for the first time. I think the pointers from others on this question thus far pretty much cover it. Good job!

    2 & 3) The performance hit you take will depend largely on having the right indexes for your particular queries and procedures and, more importantly, on the volume of records. Unless you are talking about well over a million records in your main tables, you seem to be on track for a sufficiently mainstream design that performance will not be an issue on reasonable hardware.
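As a rough illustration of the indexing point, the sketch below asks the database how it plans to run a query before and after an index is added. It uses `sqlite3` and its `EXPLAIN QUERY PLAN` as a stand-in (MySQL has its own `EXPLAIN` statement with different output), and the `sessions` table and `idx_sessions_client` index are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, client_id INTEGER, "
             "session_date TEXT, status_id INTEGER)")
conn.executemany(
    "INSERT INTO sessions (client_id, session_date, status_id) VALUES (?, ?, ?)",
    [(i % 500, "2010-01-01", i % 2) for i in range(10_000)],
)

QUERY = "SELECT COUNT(*) FROM sessions WHERE client_id = ?"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,))
    return " ".join(row[-1] for row in rows)

before = plan(QUERY)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_sessions_client ON sessions (client_id)")
after = plan(QUERY)   # the planner can now search via the index

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the second plan should name the index while the first cannot.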

    That said, and this relates to your question 3, with the start you have you probably shouldn't worry too much about performance, or be hypersensitive about normalization orthodoxy, here. This is a reporting server you are building, not a transaction-based application backend, which would have a much different profile with respect to the importance of performance or normalization. A database backing a live signup and scheduling application has to be mindful of queries that take seconds to return data. Not only does a report server have more tolerance for complex and lengthy queries, but the strategies to improve performance are much different.

    For example, in a transaction-based application environment your performance improvement options might include refactoring your stored procedures and table structures to the nth degree, or developing a caching strategy for small amounts of commonly requested data. In a reporting environment you can certainly do this, but you can have an even greater impact on performance by introducing a snapshot mechanism, where a scheduled process runs and stores pre-configured reports and your users access the snapshot data, with no stress on your db tier on a per-request basis.
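One possible shape for such a snapshot mechanism is sketched below, with `sqlite3` standing in for MySQL. The `teacher_monthly_snapshot` table, the simplified `sessions` table, and both function names are hypothetical; the scheduler that would call `refresh_snapshot` (cron, for instance) is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (id INTEGER PRIMARY KEY, teacher TEXT,
                       session_date TEXT, status_id INTEGER);
INSERT INTO sessions (teacher, session_date, status_id) VALUES
    ('Smith', '2010-01-05', 0), ('Smith', '2010-01-19', 0),
    ('Smith', '2010-02-02', 0), ('Jones', '2010-01-07', 1);
CREATE TABLE teacher_monthly_snapshot (teacher TEXT, month TEXT, classes INTEGER);
""")

def refresh_snapshot(conn):
    """Recompute the report once; meant to be called by a scheduled job."""
    with conn:  # swap old rows for new ones in a single transaction
        conn.execute("DELETE FROM teacher_monthly_snapshot")
        conn.execute("""
            INSERT INTO teacher_monthly_snapshot
            SELECT teacher, substr(session_date, 1, 7), COUNT(*)
            FROM sessions
            WHERE status_id = 0            -- completed sessions only
            GROUP BY teacher, substr(session_date, 1, 7)
        """)

def classes_for(conn, teacher, month):
    """What a report request runs: a cheap lookup against the snapshot only."""
    row = conn.execute(
        "SELECT classes FROM teacher_monthly_snapshot WHERE teacher = ? AND month = ?",
        (teacher, month)).fetchone()
    return row[0] if row else 0

refresh_snapshot(conn)
print(classes_for(conn, "Smith", "2010-01"))
```

The expensive aggregation runs once per refresh, and each report request only reads the precomputed rows, at the cost of the data being as stale as the last refresh.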

    All of this is a long-winded rant to illustrate that what design principles and tricks you employ may differ given the role of the db you're creating. I hope that's helpful.
