Which is faster/best? SELECT * or SELECT column1, column2, column3, etc.

Question

    I've heard that SELECT * is generally bad practice to use when writing SQL commands because it is more efficient to SELECT columns you specifically need.

    If I need to SELECT every column in a table, should I use

    SELECT * FROM TABLE
    

    or

    SELECT column1, column2, column3, etc. FROM TABLE
    

    Does the efficiency really matter in this case? I'd think SELECT * would be more optimal internally if you really need all of the data, but I'm saying this with no real understanding of databases.

    I'm curious to know what the best practice is in this case.

    UPDATE: I probably should specify that the only situation where I would really want to do a SELECT * is when I'm selecting data from one table where I know all columns will always need to be retrieved, even when new columns are added.

    Given the responses I've seen, however, this still seems like a bad idea, and SELECT * should never be used, for a lot more technical reasons than I ever thought about.

    Solution

    Given your specification that you are selecting all columns, there is little difference at this time. Realize, however, that database schemas do change. If you use SELECT * you are going to get any new columns added to the table, even though in all likelihood, your code is not prepared to use or present that new data. This means that you are exposing your system to unexpected performance and functionality changes.
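The schema-drift point above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the `customer` table, its columns, and the `audit_note` column are hypothetical, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3

# Hypothetical table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO customer (name, email) VALUES ('Ada', 'ada@example.com')")


def fetch_star(connection):
    """Fetch one row using SELECT *."""
    return connection.execute("SELECT * FROM customer").fetchone()


def fetch_explicit(connection):
    """Fetch one row using an explicit column list."""
    return connection.execute("SELECT id, name, email FROM customer").fetchone()


before_star = fetch_star(conn)
before_explicit = fetch_explicit(conn)

# A later schema change made without touching the application code.
conn.execute("ALTER TABLE customer ADD COLUMN audit_note TEXT DEFAULT 'n/a'")

after_star = fetch_star(conn)
after_explicit = fetch_explicit(conn)

# The SELECT * row grew by one column; the explicit row is unchanged.
print("SELECT *      :", len(before_star), "->", len(after_star), "columns")
print("explicit list :", len(before_explicit), "->", len(after_explicit), "columns")
```

Any code that unpacks the SELECT * row by position, or binds it to a fixed-width structure, would now see a shape it was not written for, while the explicit query keeps returning the same three values.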

    You may be willing to dismiss this as a minor cost, but realize that columns that you don't need still must be:

    1. Read from database
    2. Sent across the network
    3. Marshalled into your process
    4. (for ADO-type technologies) Saved in a data-table in-memory
    5. Ignored and discarded / garbage-collected
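As a rough illustration of the costs listed above, the following sketch compares how much data comes back when a wide column is fetched but never used. The `document` table and the `payload_bytes` helper are hypothetical, and an in-memory SQLite database has no network hop, so the figure is only a proxy for what would be read, sent, and marshalled.

```python
import sqlite3

# Hypothetical table with one wide column the caller does not need.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT, body BLOB)")
rows = [(f"doc {i}", b"x" * 10_000) for i in range(100)]
conn.executemany("INSERT INTO document (title, body) VALUES (?, ?)", rows)


def payload_bytes(sql):
    """Rough payload size: total length of the text and blob values fetched."""
    total = 0
    for row in conn.execute(sql):
        for value in row:
            if isinstance(value, (str, bytes)):
                total += len(value)
    return total


star_bytes = payload_bytes("SELECT * FROM document")
needed_bytes = payload_bytes("SELECT id, title FROM document")

print("SELECT *        :", star_bytes, "bytes of text/blob data")
print("SELECT id, title:", needed_bytes, "bytes of text/blob data")
```

The gap here comes entirely from the unused `body` column; how much it matters in practice depends on column widths, row counts, and the driver in use.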

    Item #1 has many hidden costs, including ruling out a potential covering index, causing extra data-page loads (and server cache thrashing), and incurring row, page, or table locks that might otherwise be avoided.
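The covering-index cost can be observed with a small experiment. The answer is written with SQL Server in mind; the sketch below uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in, so the exact plan wording is SQLite-specific, and the `users` table and `idx_users_name` index are hypothetical.

```python
import sqlite3

# Hypothetical table with a single-column index on name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, bio TEXT)")
conn.execute("CREATE INDEX idx_users_name ON users (name)")


def plan(sql):
    """Return SQLite's query-plan detail text for a one-parameter query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("Ada",)).fetchall()
    # The human-readable description is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


plan_star = plan("SELECT * FROM users WHERE name = ?")
plan_narrow = plan("SELECT id, name FROM users WHERE name = ?")

print("SELECT *       :", plan_star)
print("SELECT id, name:", plan_narrow)
```

In SQLite the narrow query can be answered from the index alone (the plan mentions a covering index), while SELECT * uses the index only to locate rows and must then read the table for `email` and `bio`. Other engines report this differently, but the underlying trade-off is the one described above.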

    Balance this against the potential savings of using an * instead of specifying the columns. The only potential savings are:

    1. Programmer doesn't need to revisit the SQL to add columns
    2. The network-transport of the SQL is smaller / faster
    3. SQL Server query parse / validation time
    4. SQL Server query plan cache

    For item 1, the reality is that you're going to add / change code to use any new column you might add anyway, so it is a wash.

    For item 2, the difference is rarely enough to push you into a different packet-size or number of network packets. If you get to the point where SQL statement transmission time is the predominant issue, you probably need to reduce the rate of statements first.

    For item 3, there are NO savings, as the expansion of the * has to happen anyway, which means consulting the schema of the table(s). Realistically, listing the columns will incur the same cost because they have to be validated against the schema. In other words, this is a complete wash.

    For item 4, when you specify specific columns, your query plan cache could get larger but only if you are dealing with different sets of columns (which is not what you've specified). In this case, you do want different cache entries because you want different plans as needed.

    So, because of the way you specified the question, this all comes down to the issue of resiliency in the face of eventual schema modifications. If you're burning this schema into ROM (it happens), then an * is perfectly acceptable.

    However, my general guideline is that you should only select the columns you need, which means that sometimes it will look like you are asking for all of them, but DBAs and schema evolution mean that some new columns might appear that could greatly affect the query.

    My advice is that you should ALWAYS SELECT specific columns. Remember that you get good at what you do over and over, so just get in the habit of doing it right.

    If you are wondering why a schema might change without the code changing, think in terms of audit logging, effective/expiration dates, and other similar things that DBAs add systemically for compliance reasons. Another source of behind-the-scenes changes is denormalization for performance elsewhere in the system, or user-defined fields.
