How do I make use of multiple cores in large SQL Server queries?


Problem description


I have two SQL Servers, one for production and one as an archive. Every night, a SQL job runs and copies the day's production data over to the archive. As we've grown, this process has taken longer and longer. When I watch the utilization on the archive server while the archival process runs, I see that it only ever makes use of a single core. Since this box has eight cores, that is a huge waste of resources. The job runs at 3 AM, so it's free to take any and all resources it can find.

So what I need to do is figure out how to structure SQL Server jobs so they can take advantage of multiple cores, but I can't find any literature on tackling this problem. We're running SQL Server 2005, but I could certainly push for an upgrade if 2008 takes care of this problem.

Solution

Do you have an automated maintenance plan to update statistics, rebuild indexes, etc.? If not, SQL Server may still be building its query plans on your older statistics of smaller tables.
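
As a rough sketch of what that maintenance could look like for a single table, something along these lines should work on SQL Server 2005 or later. The table name dbo.ArchiveOrders is hypothetical and not from the original question; substitute the table the nightly job loads.

```sql
-- Hypothetical table name; replace with the archive table the job loads.
-- Refresh all statistics on the table by scanning every row.
UPDATE STATISTICS dbo.ArchiveOrders WITH FULLSCAN;

-- Rebuild every index on the table (this also refreshes the statistics
-- that belong to those indexes).
ALTER INDEX ALL ON dbo.ArchiveOrders REBUILD;

-- Alternatively, refresh out-of-date statistics across the current database.
EXEC sp_updatestats;
```

Running this shortly before the archive job gives the optimizer row counts that reflect the current table sizes when it estimates plan cost.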

SQL Server generates parallel query plans automatically, if certain conditions are met. From an article on MSDN:

1. Is SQL Server running on a computer with more than one microprocessor or CPU, such as a symmetric multiprocessing computer (SMP)? Only computers with more than one CPU can use parallel queries.

2. What is the number of concurrent users active on the SQL Server installation at this moment? SQL Server monitors CPU usage and adjusts the degree of parallelism at the query startup time. Lower degrees of parallelism are chosen if CPU usage is high.

3. Is there sufficient memory available for parallel query execution? Each query requires a certain amount of memory to execute. Executing a parallel query requires more memory than a nonparallel query. The amount of memory required for executing a parallel query increases with the degree of parallelism. If the memory requirement of the parallel plan for a given degree of parallelism cannot be satisfied, SQL Server decreases the degree of parallelism automatically or completely abandons the parallel plan for the query in the given workload context and executes the serial plan.

4. What is the type of query executed? Queries heavily consuming CPU cycles are the best candidates for a parallel query. For example, joins of large tables, substantial aggregations, and sorting of large result sets are good candidates. Simple queries, often found in transaction processing applications, find the additional coordination required to execute a query in parallel outweigh the potential performance boost. To distinguish between queries that benefit from parallelism and those that do not benefit, SQL Server compares the estimated cost of executing the query with the cost threshold for parallelism value. Although not recommended, users can change the default value of 5 using sp_configure.

5. Is there a sufficient amount of rows processed in the given stream? If the query optimizer determines the number of rows in a stream is too low, it does not introduce exchange operators to distribute the stream. Consequently, the operators in this stream are executed serially. Executing the operators in a serial plan avoids scenarios when the startup, distribution, and coordination cost exceeds the gains achieved by parallel operator execution.
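
To see which cost threshold the server is actually using (point 4 above), a minimal check with sp_configure might look like the following. It only reads the setting; nothing here changes the threshold.

```sql
-- 'cost threshold for parallelism' is an advanced option, so advanced
-- options have to be visible before sp_configure will report it.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Called with only the option name, sp_configure reports the configured
-- and running values without changing anything.
EXEC sp_configure 'cost threshold for parallelism';
```

If the run_value is still the default of 5, the threshold is unlikely to be what keeps a long-running archive query serial, and the other conditions in the list are better suspects.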

Other factors:

Is SQL Server configured to have affinity to a single processor?

Is the max degree of parallelism option set to 1?
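
Both settings can be inspected from the sys.configurations catalog view, which is available from SQL Server 2005 onward. The sketch below assumes you have permission to read server configuration. The second statement uses a hypothetical table, dbo.ArchiveOrders, purely to illustrate the MAXDOP query hint.

```sql
-- max degree of parallelism: 0 lets SQL Server use all available
-- processors; 1 suppresses parallel plans.
-- affinity mask: 0 is the default (no processor affinity configured).
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name IN ('max degree of parallelism', 'affinity mask');

-- Hypothetical query: the MAXDOP hint overrides the server-wide
-- max degree of parallelism setting for this one statement.
SELECT OrderDate, COUNT(*) AS OrderCount
FROM dbo.ArchiveOrders
GROUP BY OrderDate
OPTION (MAXDOP 8);
```

The hint only raises the ceiling. The optimizer still chooses a serial plan if the estimated cost is below the cost threshold for parallelism.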

-- EDIT --

Have you tried profiling this process? It would be interesting to see the query plan SQL Server generates.
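
One way to capture the plan, sketched here against a hypothetical table since the actual archive query was not posted, is SET STATISTICS XML, which returns the actual execution plan alongside the results.

```sql
-- Return the actual execution plan as XML for each statement that follows.
SET STATISTICS XML ON;
GO

-- Hypothetical stand-in for the SELECT that feeds the archive job.
SELECT OrderDate, COUNT(*) AS OrderCount
FROM dbo.ArchiveOrders
GROUP BY OrderDate;
GO

SET STATISTICS XML OFF;
GO
```

In the resulting plan, Parallelism operators (Distribute Streams, Repartition Streams, Gather Streams) indicate that at least part of the query ran in parallel. If none appear, the plan was serial.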

Do you have sample code you can post?

If you have an automated nightly backup job, can you simply restore the backup to the archive?
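
If restoring is an option, the restore step might look roughly like this. Every name and path below (the Archive database, the backup file, the logical file names, and the target paths) is a hypothetical placeholder; RESTORE FILELISTONLY reports the real logical file names stored in your backup.

```sql
-- List the logical file names contained in the backup (hypothetical path).
RESTORE FILELISTONLY
FROM DISK = N'D:\Backups\Production_nightly.bak';

-- Restore over the archive copy, relocating the data and log files.
-- Logical names and target paths are placeholders.
RESTORE DATABASE Archive
FROM DISK = N'D:\Backups\Production_nightly.bak'
WITH MOVE N'Production_Data' TO N'E:\Data\Archive.mdf',
     MOVE N'Production_Log'  TO N'F:\Logs\Archive.ldf',
     REPLACE;
```

Note that this replaces the whole archive database with the contents of the backup, so it only fits if the archive is meant to be a full copy of production rather than an accumulation of rows that production later purges.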
