SQL查询-长时间运行/占用CPU资源 [英] SQL Query - long running / taking up CPU resource

查看:243
本文介绍了SQL查询-长时间运行/占用CPU资源的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

您好,我有下面的SQL查询,该查询平均需要40分钟才能运行,它引用的表中有700万条记录。

Hello I have the below SQL query that is taking on average 40 minutes to run, one of the tables that it references has over 7 million records in it.

我已经通过数据库调整顾问运行了此操作并应用了所有建议,而且我已经在sql的活动监视器中对其进行了评估,因此未建议进一步的索引等。

I have ran this through the database tuning advisor and applied all recommendations, also I have assesed it within the activity monitor in sql and no further indexes etc have been recommended.

任何建议会很棒,谢谢您

Any suggestions would be great, thanks in advance

WITH CTE AS 
(
    SELECT r.Id AS ResultId,
    r.JobId,
    r.CandidateId,
    r.Email,
    CAST(0 AS BIT) AS EmailSent,
    NULL AS EmailSentDate,
    'PICKUP' AS EmailStatus,
    GETDATE() AS CreateDate,
    C.Id AS UserId,
    C.Email AS UserEmail,
    NULL AS Subject
    FROM Result R
    INNER JOIN Job J ON R.JobId = J.Id
    INNER JOIN User C ON J.UserId = C.Id
    WHERE 
    ISNULL(J.Approved, CAST(0 AS BIT)) = CAST(1 AS BIT)
    AND ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT)
    AND ISNULL(R.Email,'') <> '' -- has an email address
    AND ISNULL(R.EmailSent, CAST(0 AS BIT)) = CAST(0 AS BIT) -- email has not been sent
    AND R.EmailSentDate IS NULL -- email has not been sent
    AND ISNULL(R.EmailStatus,'') = '' -- email has not been sent
    AND ISNULL(R.IsEmailSubscribe, 'True') <> 'False' -- not unsubscribed
    -- not already been emailed for this job
    AND NOT EXISTS (
        SELECT SMTP.Email
        FROM SMTP_Production SMTP
        WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
    )
    -- not unsubscribed
    AND NOT EXISTS (

        SELECT u.Id FROM Unsubscribe u
        WHERE  ISNULL(u.EmailAddress, '') = ISNULL(R.Email, '')

    )
    AND NOT EXISTS (
        SELECT SMTP.Id FROM SMTP_Production SMTP
        WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
    )   
    AND C.Id NOT IN (
        -- list of ids
    )
    AND J.Id NOT IN (
        -- list of ids
    )
    AND J.ClientId NOT IN 
    (
        -- list of ids
    )
)
INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, ConsultantId, ConsultantEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT 
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
  INNER JOIN 
    (
        SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
        FROM CTE

    ) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1

请在下面查看我的更新查询:

Please see my updated query below:

WITH CTE AS 
(
    SELECT R.Id AS ResultId,
    r.JobId,
    r.CandidateId,
    R.Email,
    CAST(0 AS BIT) AS EmailSent,
    NULL AS EmailSentDate,
    'PICKUP' AS EmailStatus,
    GETDATE() AS CreateDate,
    C.Id AS UserId,
    C.Email AS UserEmail,
    NULL AS Subject
    FROM RESULTS R
    INNER JOIN JOB J ON R.JobId = J.Id
    INNER JOIN Consultant C ON J.UserId = C.Id
    WHERE 
    J.DCApproved = 1
    AND (J.Closed = 0 OR J.Closed IS NULL)
    AND (R.Email <> '' OR R.Email IS NOT NULL)
    AND (R.EmailSent = 0 OR R.EmailSent IS NULL)
    AND R.EmailSentDate IS NULL -- email has not been sent
    AND (R.EmailStatus = '' OR R.EmailStatus IS NULL)
    AND (R.IsEmailSubscribe = 'True' OR R.IsEmailSubscribe IS NULL)
    -- not already been emailed for this job
    AND NOT EXISTS (
        SELECT SMTP.Email
        FROM SMTP_Production SMTP
        WHERE SMTP.JobId = R.JobId AND SMTP.CandidateId = R.CandidateId
    )
    -- not unsubscribed
    AND NOT EXISTS (

        SELECT u.Id FROM Unsubscribe u
        WHERE (u.EmailAddress = R.Email OR (u.EmailAddress IS NULL AND R.Email IS NULL))

    )
    AND NOT EXISTS (
        SELECT SMTP.Id FROM SMTP_Production SMTP
        WHERE SMTP.EmailStatus = 'PICKUP' AND SMTP.CandidateId = R.CandidateId
    )   
    AND C.Id NOT IN (
        -- LIST OF IDS
    )
    AND J.Id NOT IN (
        -- LIST OF IDS
    )
    AND J.ClientId NOT IN 
    (
        -- LIST OF IDS
    )
)

INSERT INTO smtp_production (ResultId, JobId, CandidateId, Email, EmailSent, EmailSentDate, EmailStatus, CreateDate, UserId, UserEmail, Subject)
OUTPUT INSERTED.ResultId,GETDATE() INTO ResultstoUpdate
SELECT 
    CTE.ResultId,
    CTE.JobId,
    CTE.CandidateId,
    CTE.Email,
    CTE.EmailSent,
    CTE.EmailSentDate,
    CTE.EmailStatus,
    CTE.CreateDate,
    CTE.UserId,
    CTE.UserEmail,
    NULL
FROM CTE
  INNER JOIN 
    (
        SELECT *, row_number() over(partition by CTE.Email, CTE.CandidateId order by CTE.EmailSentDate desc) as rn
        FROM CTE

    ) DCTE ON CTE.ResultId = DCTE.ResultId AND DCTE.rn = 1


GO


推荐答案

WHERE 中使用 ISNULL JOIN 子句可能是这里的主要原因。对查询中的列使用函数会使查询变得无法SARG(这意味着它无法使用表上的任何索引,因此可以扫描整个对象)。 注意;使用针对变量的函数,在 WHERE 通常可以。例如, WHERE SomeColumn = DATEADD(DAY,@n,@SomeDate)。像 WHERE SomeColumn = ISNULL(@ Variable,0)之类的东西有包罗万象的查询的味道,因此对性能影响很大。取决于您的设置。

Using ISNULL in your WHERE and JOIN clauses is probably the main cause here. Using functions against columns in your query causes the query to become non-SARGable (meaning that it can't use any of the indexes on your table(s) and so it has the scan the whole thing). Note; using functions against variables, in there WHERE is normally fine. For example WHERE SomeColumn = DATEADD(DAY, @n, @SomeDate). Things like WHERE SomeColumn = ISNULL(@Variable,0) have the smell of a "catch-all query", so can be performance hitters; depending on your set up. This isn't the discussion at hand though.

对于诸如 ISNULL(J.Closed,CAST(0 AS BIT))的子句)= CAST(0 AS BIT)因此,这对于查询优化器来说是一个头疼的问题,而且您的查询中充斥着它们。您需要用以下子句替换这些子句:

For clauses like ISNULL(J.Closed, CAST(0 AS BIT)) = CAST(0 AS BIT) this is therefore a big headache for the query optimiser and your query is riddled with them. You'll need to replace these with clauses like:

WHERE (J.Closed = 0 OR J.Closed IS NULL)

尽管没什么区别,但没有必要其中的 0 。 SQL Server可以看到您正在与 bit 作比较,因此也会将 0 解释为一个。

Although it makes no difference, there's no need to CAST the 0 there either. SQL Server can see you're making a comparison to a bit and will therefore interpret the 0 as one as well.

您还具有 EXISTS WHERE 子句 ISNULL(u.EmailAddress,'')= ISNULL(R.Email,'')。这将需要变为:

You also have a EXISTS with the WHERE clause ISNULL(u.EmailAddress, '') = ISNULL(R.Email, ''). This will need to become:

WHERE (u.EmailAddress = R.Email
  OR   (u.EmailAddress IS NULL AND R.Email IS NULL))

您需要更改全部 ISNULL 在 WHERE 子句(CTE和子查询)中的使用情况>,您应该会看到不错的性能提升

You'll need to change all of your ISNULL usage in your WHERE clauses (the CTE and the subqueries) and you should see a decent performance increase.

这篇关于SQL查询-长时间运行/占用CPU资源的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆