SQL Server传输数据时生成重复行 [英] SQL Server generating duplicate rows when transfering data

查看:98
本文介绍了SQL Server传输数据时生成重复行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用AdventureWorks数据库为另一个数据库生成一些测试数据.当我执行以下查询时,它将产生24个重复的主键.因此,除非关闭目标数据库表上的PK,否则查询将不会运行.源数据库不包含这些重复项.

I''m trying to use the AdventureWorks database to generate some test data for another database. When I execute the following query it produces 24 duplicate primary keys. So the query will not run unless I turn off the PK on the target db table. The source db does not contain these duplicates.

INSERT INTO [TestDb] (TestId, Name, Location)
SELECT
  p.[BusinessEntityID],
  p.[FirstName]+ISNULL(p.[MiddleName], '')+p.[LastName],
  a.[City]
FROM [AdventureWorks2008R2].[Person].[Person] AS p
INNER JOIN [AdventureWorks2008R2].[Person].[BusinessEntityAddress] AS b
ON p.BusinessEntityID = b.BusinessEntityID
INNER JOIN [AdventureWorks2008R2].[Person].[Address] AS a
ON a.AddressID = b.AddressID;



关于为什么要创建重复项的任何建议?有没有更好的方法来生成测试数据?如果我不在乎丢失信息,那么删除重复行的最佳方法是什么?问候,Dave



Any suggestion as to why the duplicates are being created? Is there a better way to generate test data? What is the best way to delete duplicate rows, given that I don''t care if I lose information? Regards, Dave

推荐答案

如果您想要唯一的键(例如主键),则不要设置该表.不要插入主键
If you want unique keys, as in a primary, for the table then don''t set it. Don''t insert the primary key


您将得到重复,因为在此查询中,BusinessEntityID不是唯一的.

每个人的BusinessEntityID都是唯一的.但是每个人可能有多个地址.
通过将Person与BusinessEntityAdress结合在一起,您将获得该人拥有的每个地址的一行.因此,如果一个人有两个地址,则会得到两行.

You are getting duplicates because in this query BusinessEntityID is not unique.

BusinessEntityID is unique for every Person. But every person may have more than one Address.
By joining together Person with BusinessEntityAdress you get a row for every Address the person has. So if a Person has two Addresses you get two rows.

BusinessEntityID  (No column name)  City
2996                  AmandaSCook         Everett
2996                  AmandaSCook         Everett
2997                  AmandaLMorgan         San Gabriel



在此示例中,AmandaSCook在BusinessEntityAddress中具有两个条目.
但是您仍然将Person.BusinessEntityID作为要插入TestDB的键.那就是您重复和PK违规的来源.

如果您只需要testdata而不关心这些信息,请尝试以下操作:



In this example AmandaSCook has two entries in BusinessEntityAddress.
But you still take Person.BusinessEntityID as key to insert into TestDB. That''s where your duplicates and PK violations come from.

Given you just want testdata and don''t care for the information, try this:

select
  p.[BusinessEntityID],
  p.[FirstName]+ISNULL(p.[MiddleName], '')+p.[LastName],
  a.[City]
FROM [AdventureWorks2008R2].[Person].[Person] AS p
INNER JOIN [AdventureWorks2008R2].[Person].[BusinessEntityAddress] AS b
ON p.BusinessEntityID = b.BusinessEntityID
INNER JOIN [AdventureWorks2008R2].[Person].[Address] AS a
ON a.AddressID = b.AddressID
group by  p.[BusinessEntityID],
  p.[FirstName]+ISNULL(p.[MiddleName], '')+p.[LastName],
  a.[City]
having count(*) = 1


这篇关于SQL Server传输数据时生成重复行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆