Is this a good DB schema for locations?


Question


I'm working on an application that is location specific -- think of it as a store locator where store owners enter their address information and other users can only see nearby stores within a certain range. However, it's a little different in the sense that an exact location is not required, only the city/state is required (weird, I know). I have thought about the schema for storing locations, and have decided on this one.

LOCATIONS

id                      -- int
formatted_address       -- varchar(200)
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city                    -- varchar(40)
state                   -- varchar(40)
state_code              -- varchar(3)
postal_code             -- varchar(10)
country                 -- varchar(40)
country_code            -- varchar(3)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp


Here are some notes about the application:



  • I want to keep the door open for international locations
  • I plan to use a geocoding service to search for and validate the locations specified by the store owner
  • I truly only need the lat/lon, but the other data is necessary for displaying store information
  • The formatted_address field will contain the fully formatted address -- e.g., Giants Stadium, 50 NJ-120, East Rutherford, NJ 07073, USA -- to allow for easier searching of stored locations
  • There will possibly be a lot of duplicate fields, because each row may have a different level of granularity -- for instance, 123 Main Street, City, State 12345 is different from Main Street, City, State 12345 because one has a specified street number and the other doesn't
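Since only the lat/lon is truly needed for the "nearby stores within a certain range" lookup, that part can be sketched independently of the rest of the schema. The following is a minimal sketch, assuming a haversine (great-circle) distance and rows held in memory as dictionaries; the function names, the `SAMPLE` rows, and their approximate coordinates are illustrative assumptions, not part of the original question.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(locations, lat, lon, radius_km):
    """Return the rows within radius_km of (lat, lon), nearest first."""
    hits = [(haversine_km(lat, lon, row["latitude"], row["longitude"]), row)
            for row in locations]
    hits.sort(key=lambda h: h[0])  # sort on distance only
    return [row for dist, row in hits if dist <= radius_km]


# Hypothetical rows with approximate coordinates, for illustration only.
SAMPLE = [
    {"city": "Philadelphia", "latitude": 39.9526, "longitude": -75.1652},
    {"city": "Newark", "latitude": 40.7357, "longitude": -74.1724},
    {"city": "East Rutherford", "latitude": 40.8135, "longitude": -74.0745},
]

if __name__ == "__main__":
    for row in nearby(SAMPLE, 40.8135, -74.0745, radius_km=25):
        print(row["city"])
```

For a table that grows large, scanning every row like this would not scale; a bounding-box pre-filter on indexed latitude/longitude columns, or a database with spatial indexing, is the usual next step.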


I understand that the schema is not very normalized, but I also do not see the need to normalize it any more because locations are very complex, which is why I'm relying on a stable geocoding service (Google). Also, I plan to allow freeform text input/search, so there's no need for any dropdown lists.


Does anybody see anything wrong or have any improvements, taking into consideration what I've mentioned? I can see this table growing rather large.

Recommended answer


I do not think so. Here is my two-minute synopsis:


This is very badly normalized. At least city->country should be moved out to a different table (and normalized from there). I believe postal codes can cross city boundaries though (or I am very badly misremembering); I am not aware of a city that crosses a state boundary.


formatted_address is an "optimization" and should likely be a computed field: that is, all the data to re-create it should exist elsewhere. (This means that it doesn't need to be worried about now.)
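To illustrate what "computed field" could mean here, the sketch below rebuilds the formatted address from the component columns instead of storing it. It assumes a US-style ordering (name, street, city, state code plus postal code, country code); the function name `format_address` is hypothetical, and international addresses would need different rules.

```python
def format_address(loc):
    """Build a display address from component columns, skipping NULL/empty parts."""
    # "50" + "NJ-120" -> "50 NJ-120"; either part may be missing
    street_line = " ".join(
        p for p in (loc.get("street_number"), loc.get("street")) if p
    )
    # "NJ" + "07073" -> "NJ 07073"
    region_line = " ".join(
        p for p in (loc.get("state_code"), loc.get("postal_code")) if p
    )
    parts = [
        loc.get("name"),
        street_line,
        loc.get("city"),
        region_line,
        loc.get("country_code"),
    ]
    return ", ".join(p for p in parts if p)


if __name__ == "__main__":
    stadium = {
        "name": "Giants Stadium",
        "street_number": "50",
        "street": "NJ-120",
        "city": "East Rutherford",
        "state_code": "NJ",
        "postal_code": "07073",
        "country_code": "USA",
    }
    print(format_address(stadium))
```

If searching the formatted string turns out to be a hot path, the computed value can still be cached in a column later; the point is that it is derivable rather than a source of truth.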

Happy designing.


A simple "more-normalized" form, applying just the changes proposed above:

LOCATIONS
location_id             -- int
is_point_of_interest    -- bool
name                    -- varchar(100) -- NULL
street_number           -- varchar(10)  -- NULL
street                  -- varchar(40)  -- NULL
city_id                 -- int
postal_code             -- varchar(10)
latitude                -- float(10,6)
longitude               -- float(10,6)
last_updated_at         -- timestamp

CITIES
city_id                 -- int
name                    -- varchar
-- similarly, the following should be normalized to STATES and COUNTRIES
state                   -- varchar(40)
state_code              -- varchar(3)
country                 -- varchar(40)
country_code            -- varchar(3)
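
The outline above, taken one step further so that STATES and COUNTRIES are their own tables, could look roughly like the following. This is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the surrogate keys `state_id` and `country_id`, the SQLite column types, and the helper names are assumptions made for illustration, not something the answer specifies.

```python
import sqlite3

SCHEMA = """
CREATE TABLE countries (
    country_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    country_code TEXT NOT NULL UNIQUE
);
CREATE TABLE states (
    state_id   INTEGER PRIMARY KEY,
    country_id INTEGER NOT NULL REFERENCES countries(country_id),
    name       TEXT NOT NULL,
    state_code TEXT NOT NULL
);
CREATE TABLE cities (
    city_id  INTEGER PRIMARY KEY,
    state_id INTEGER NOT NULL REFERENCES states(state_id),
    name     TEXT NOT NULL
);
CREATE TABLE locations (
    location_id          INTEGER PRIMARY KEY,
    is_point_of_interest INTEGER NOT NULL DEFAULT 0,
    name                 TEXT,
    street_number        TEXT,
    street               TEXT,
    city_id              INTEGER NOT NULL REFERENCES cities(city_id),
    postal_code          TEXT NOT NULL,
    latitude             REAL NOT NULL,
    longitude            REAL NOT NULL,
    last_updated_at      TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db():
    """Create an in-memory database with foreign-key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def add_sample(conn):
    """Insert one illustrative row per table (coordinates are approximate)."""
    conn.execute("INSERT INTO countries VALUES (1, 'United States', 'USA')")
    conn.execute("INSERT INTO states VALUES (1, 1, 'New Jersey', 'NJ')")
    conn.execute("INSERT INTO cities VALUES (1, 1, 'East Rutherford')")
    conn.execute(
        "INSERT INTO locations (is_point_of_interest, name, street_number,"
        " street, city_id, postal_code, latitude, longitude)"
        " VALUES (1, 'Giants Stadium', '50', 'NJ-120', 1, '07073',"
        " 40.8135, -74.0745)"
    )


def city_state_country(conn, location_id):
    """Resolve a location's city name, state code, and country code via joins."""
    return conn.execute(
        "SELECT ci.name, st.state_code, co.country_code"
        " FROM locations lo"
        " JOIN cities ci ON ci.city_id = lo.city_id"
        " JOIN states st ON st.state_id = ci.state_id"
        " JOIN countries co ON co.country_id = st.country_id"
        " WHERE lo.location_id = ?",
        (location_id,),
    ).fetchone()
```

With this layout a city, state, or country name is stored once, and a location whose `city_id` points at no existing city is rejected by the database instead of being stored silently.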


Of course, CITIES can be further normalized, and so could a POSTALS table: I don't know enough about postal codes, or the application domain though. postal_code acts as part of an implicit compound-surrogate-FK so it's not super terrible as it is there. However, moving it into a separate table could easily allow verification and integrity constraints.


Edit: Normalizing a POSTALS table would be best, as only a very small number of postal codes are valid for a given city. I am not sure of the relation between a postal code and a city, though, so I can't recommend how to do this. Perhaps look at existing schemas in use?
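
One possible shape for that integrity constraint, offered only as a sketch: it assumes a many-to-many relation between cities and postal codes (a postal code may span several cities, and a city may have several codes), which the answer itself leaves open. The `city_postals` table and the helper names are hypothetical, and the tables are trimmed to the columns that matter for the constraint.

```python
import sqlite3

POSTAL_SCHEMA = """
CREATE TABLE cities (
    city_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
-- Assumed many-to-many link: which postal codes are valid for which city.
CREATE TABLE city_postals (
    city_id     INTEGER NOT NULL REFERENCES cities(city_id),
    postal_code TEXT NOT NULL,
    PRIMARY KEY (city_id, postal_code)
);
CREATE TABLE locations (
    location_id INTEGER PRIMARY KEY,
    city_id     INTEGER NOT NULL,
    postal_code TEXT NOT NULL,
    -- Composite FK: the (city, postal code) pair must be a known valid pair.
    FOREIGN KEY (city_id, postal_code)
        REFERENCES city_postals(city_id, postal_code)
);
"""


def open_postal_db():
    """In-memory database seeded with one city and one valid postal code."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(POSTAL_SCHEMA)
    conn.execute("INSERT INTO cities VALUES (1, 'East Rutherford')")
    conn.execute("INSERT INTO city_postals VALUES (1, '07073')")
    return conn


def try_insert_location(conn, city_id, postal_code):
    """Return True if the row is accepted, False if the FK constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO locations (city_id, postal_code) VALUES (?, ?)",
            (city_id, postal_code),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Here `try_insert_location(conn, 1, '07073')` is accepted while a code not listed for that city is refused. Whether maintaining such a table is worthwhile depends on how much the geocoding service is already trusted to validate addresses.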
