sql server如果不存在则插入并将插入的id放入另一个表中 [英] sql server Insert if not exists and get the inserted id into another table
问题描述
我是 SQL Server 的新手,正在从事一个记录日志的项目.
I am new to SQL Server and working on a project to record logs.
该表有一个具有重复值的 URL 列 varchar(max).我创建了另一个表,它只存储不同的 URL,ID 存储在主表中
The table has a URL column varchar(max) which has repeating value. I created another table which stores only the distinct URLs and the IDs are stored in the main table
这是我的存储过程:
CREATE TABLE #TestData (
logdate DATETIME,
id CHAR(15),
value VARCHAR(max)
)
DECLARE @sql VARCHAR(max)
SET @sql = 'BULK INSERT [dbo].[#TestData] FROM ''' + @pfile + ''' WITH (
firstrow = 2,
fieldterminator = ''\t'',
rowterminator = ''\n''
)'
EXEC (@sql)
create table #testurl(fld varchar(max))
INSERT INTO #testurl(fld) (
SELECT distinct (
CASE
WHEN (PATINDEX('%url="%', value) > 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%url="%', value) + 5), (CHARINDEX('"', value, (PATINDEX('%url="%', value) + 5)) - (PATINDEX('%url="%', value) + 5))), ''))
END
) FROM #TestData )
INSERT INTO url (urlvalue) (
SELECT tu.fld FROM #testurl tu WHERE NOT EXISTS (
SELECT urlid
FROM url u
WHERE u.urlvalue = tu.fld))
INSERT INTO [Cyberoam].[dbo].[logmst] (
DATETIME,
c1c2,
c3c4,
c5c6,
c7,
c8to12,
STATUS,
username,
usergrp,
application,
category,
categorytype,
urlid,
recvbytes,
sentbytes,
fw_rule_id,
srcip,
dstip,
contenttype
)
SELECT logdate,
SUBSTRING(value, (PATINDEX('%log_id=%', value) + 7), 2),
SUBSTRING(value, (PATINDEX('%log_id=%', value) + 9), 2),
SUBSTRING(value, (PATINDEX('%log_id=%', value) + 11), 2),
SUBSTRING(value, (PATINDEX('%log_id=%', value) + 13), 1),
SUBSTRING(value, (PATINDEX('%log_id=%', value) + 14), 5),
CASE
WHEN (SUBSTRING(value, (PATINDEX('%status="%', value) + 8), (CHARINDEX('"', value, (PATINDEX('%status="%', value) + 8)) - (PATINDEX('%status="%', value) + 8)))) = 'Allow'
THEN '1'
WHEN (SUBSTRING(value, (PATINDEX('%status="%', value) + 8), (CHARINDEX('"', value, (PATINDEX('%status="%', value) + 8)) - (PATINDEX('%status="%', value) + 8)))) = 'Deny'
THEN '0'
ELSE NULL
END,
CASE
WHEN (ISNULL(PATINDEX('%user_name="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%user_name="%', value) + 11),(CHARINDEX('"', value, (PATINDEX('%user_name="%', value) + 11)) - (PATINDEX('%user_name="%', value) + 11))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%user_gp="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%user_gp="%', value) + 9), (CHARINDEX('"', value, (PATINDEX('%user_gp="%', value) + 9)) - (PATINDEX('%user_gp="%', value) + 9))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%application="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%application="%', value) + 13), (CHARINDEX('"', value, (PATINDEX('%application="%', value) + 13)) - (PATINDEX('%application="%', value) + 13))), ''))
WHEN (isnull(PATINDEX('%application_name="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%application_name="%', value) + 18), (CHARINDEX('"', value, (PATINDEX('%application_name="%', value) + 18)) - (PATINDEX('%application_name="%', value) + 18))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%category="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%category="%', value) + 10), (CHARINDEX('"', value, (PATINDEX('%category="%', value) + 10)) - (PATINDEX('%category="%', value) + 10))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%category_type="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%category_type="%', value) + 15), (CHARINDEX('"', value, (PATINDEX('%category_type="%', value) + 15)) - (PATINDEX('%category_type="%', value) + 15))), ''))
ELSE NULL
END,
(
SELECT urlid
FROM url
WHERE urlvalue = (
CASE
WHEN (isnull(PATINDEX('%url="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%url="%', value) + 5), (CHARINDEX('"', value, (PATINDEX('%url="%', value) + 5)) - (PATINDEX('%url="%', value) + 5))), ''))
ELSE NULL
END
)
),
CASE
WHEN (isnull(PATINDEX('%recv_bytes=%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%recv_bytes=%', value) + 11), (PATINDEX('%[^0-9]%', (nullif(SUBSTRING(value, (PATINDEX('%recv_bytes=%', value) + 11), 20), ''))))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%sent_bytes=%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%sent_bytes=%', value) + 11), (PATINDEX('%[^0-9]%', (nullif(SUBSTRING(value, (PATINDEX('%sent_bytes=%', value) + 11), 20), ''))))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%fw_rule_id=%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%fw_rule_id=%', value) + 11), (CHARINDEX(' ', value, (PATINDEX('%fw_rule_id=%', value) + 11)) - (PATINDEX('%fw_rule_id=%', value) + 11))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%src_ip=%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%src_ip=%', value) + 7), (CHARINDEX(' ', value, (PATINDEX('%src_ip=%', value) + 7)) - (PATINDEX('%src_ip=%', value) + 7))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%dst_ip=%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%dst_ip=%', value) + 7), (CHARINDEX(' ', value, (PATINDEX('%dst_ip=%', value) + 7)) - (PATINDEX('%dst_ip=%', value) + 7))), ''))
ELSE NULL
END,
CASE
WHEN (isnull(PATINDEX('%contenttype="%', value), 0) <> 0)
THEN (nullif(SUBSTRING(value, (PATINDEX('%contenttype="%', value) + 13), (CHARINDEX('"', value, (PATINDEX('%contenttype="%', value) + 13)) - (PATINDEX('%contenttype="%', value) + 13))), ''))
ELSE NULL
END
FROM #TestData
此代码工作正常,但问题是运行包含大约 5k 条记录的文件的批量插入所花费的时间逐渐增加(随着 URL 表增加达到 20 分钟,其中约有 5k 条记录).许多这样的文件将被插入.
This code works fine, but the problem is that the time taken to run bulk insert with a file of around 5 k records increases gradually( as the URL table increases reaching 20 mins with ~ 5k records in it ).Many such file will be inserted.
需要您的建议,以了解我如何提高性能或者我做错了什么
Need your suggestions as how I can improve the performance or if am I doing something wrong
非常感谢您的帮助.谢谢!
Appreciate your help very much. Thank you!
注意:如果 URL 列在同一个表中,大约需要 4-7 秒.如果在同一个表中或分开,它会产生任何性能差异吗?
Note : It take around 4-7 secs if the URL column is in the same table. will it make any performance difference if its in the same table or separated?
推荐答案
有一个解决方案.如果有人需要,这里是解决方案.由于 URL 是 varchar(max),所以花费的时间很长.我删除了聚集 PK 索引.添加了一个用于 url 校验和的新列,并在其上创建了一个聚集索引.并改变了:
Got a solution for this.Here is the solution if someone needs it. The time taken was long because of the URL being varchar(max). I deleted the clustered PK index.Added a new column for checksum of url and created a clustered index on it. And changed :
SELECT urlid
FROM url
WHERE urlvalue = (@value)
到:
SELECT urlid
FROM url
WHERE checksum_urlvalue = checksum(@value) nad urlvalue = (@value)
这将执行时间缩短到 7-8 秒.谢谢大家的回复.快乐编码:)
This reduced the execution time to 7-8 secs. Thank you all for your responses. Happy coding :)
这篇关于sql server如果不存在则插入并将插入的id放入另一个表中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!