Bulk load XML referring to auto-increment parent-id


Question

In short: I want to do an XML Bulk Load into a SQL Server (2008) database and generate auto-increment IDs for a parent that can be used in the child. This seems to be limited by scope: the parent node is not finished, so it has not been inserted yet. Does anybody know a way around this?

The longer description (sorry, it's really long, but I am trying to be complete):

From a customer I got a lot of XML documents with a similar structure, from which to generate a test DB. They are exported for use by another tool, and my customer has neither the authority nor the contacts to influence their structure or contents. (The tools were written by another party for the parent company.) Nor does he have a formal description of the XML or of the database they are exported from.

It turns out that the 'top' XML nodes <Registration> do have IDs, but these are not unique across documents. ('Top' is relative: there is a root node and a list node above them, but they are the highest elements in the XML that will make it to the database.) The IDs may recur in other XML documents, because they refer to another object, <Case>, that is not in the export. So I need to generate auto-increment IDs to keep all <Registration> elements unique, even across files.

My <Registration> node has many daughters, e.g. the <Activity> node. These nodes need to refer to their parent, so they should use the generated auto-increment ID. However, since they are part of an unfinished parent node, the parent node is still in scope and has not been inserted into the table yet, as explained in "Record Subset and the Key Ordering Rule" on MSDN and TechNet. The examples on those sites, however, have an explicit unique CustomerId, not an auto-generated Id.

Although the documentation about the "Key Ordering Rule" makes it look like this cannot be done, I cannot believe there is no way around it for XML files lacking (unique) IDs. Even stranger: it does insert a parent ID in the child, but the number is one too low. So I assume this is the auto-increment ID from the previous scope (where 0 is the default when nothing has been inserted yet; I expected NULL). So I do see one workaround: increment the parent key in my child table afterwards (UPDATE Activity SET RegistrationId = RegistrationId + 1). However, this requires limiting the update to the newly loaded rows (WHERE TimeStamp > ...) and relies on there being no other (manual or scripted) interventions.

I have tried a lot of different relationships and VB scripts (e.g. I would prefer auto-generation of my tables), but I'll just post my latest attempt. It also illustrates the insertion of the auto-increment ID from the previous scope.

My main question is:

  • Is it possible to get the correct auto-incremented parent ID?

But other tips are very welcome, like:

  • What setting should I use to auto-generate the auto-increment identity without an explicit CREATE TABLE statement in SQL?

Creating the tables:

CREATE TABLE [dbo].[Registration](
  [Id] INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Registration PRIMARY KEY,
  [XmlId] [nvarchar](40) NULL,
)
CREATE TABLE [dbo].[Activity](
  [Id] INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Activity PRIMARY KEY,
  [RegistrationId] INT CONSTRAINT FK_Activity_Registration FOREIGN KEY (RegistrationId) REFERENCES Registration (Id),
  [XmlId] [nvarchar](1000) NULL,
)

The XML file to import:

<Updates>
  <Registrations>
    <Registration ID="NonUniqCaseId-123">
      <Activities>
        <Activity ID="UniqActId-1234" />
        <Activity ID="UniqActId-1235" />
      </Activities>
    </Registration>
    <Registration ID="NonUniqCaseId-124">
      <Activities>
        <Activity ID="UniqActId-1241" />
        <Activity ID="UniqActId-1242" />
      </Activities>
    </Registration>
  </Registrations>
</Updates>

The VB script to test the upload (I want to include a loop in a program later, to handle multiple files):

Dim objBL
Set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkload.4.0")
objBL.ConnectionString = "provider=SQLOLEDB;data source=localhost;database=Test;integrated security=SSPI"
objBL.ErrorLogFile = "error.log"

objBL.CheckConstraints = False
objBL.XMLFragment = False
objBL.SchemaGen = True
objBL.SGDropTables = False
objBL.KeepIdentity = False

objBL.Execute "BulkTestMapping.xsd", "BulkTestContents.xml"
Set objBL = Nothing

XSD:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
    attributeFormDefault="qualified"
    elementFormDefault="qualified"
    xmlns:sql="urn:schemas-microsoft-com:mapping-schema">

  <xs:annotation>
    <xs:appinfo>
      <sql:relationship name="Registration_Activity"
            parent="Registration"
            parent-key="Id"
            child="Activity"
            child-key="RegistrationId"
            inverse="true"
            />
    </xs:appinfo>
  </xs:annotation>

  <xs:element name="Registration"
              sql:relation="Registration"
              sql:key-fields="Id" 
            >
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Activities" minOccurs="0" maxOccurs="unbounded" sql:is-constant="true">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Activity" minOccurs="0" maxOccurs="unbounded"
                     sql:relation="Activity" 
                     sql:key-fields="RegistrationId"
                     sql:relationship="Registration_Activity"
              >
                <xs:complexType>
                  <xs:attribute name="ID" sql:field="XmlId" form="unqualified" type="xs:string" />
                  <xs:attribute name="DbId" sql:identity="ignore" sql:field="Id" msdata:AutoIncrement="true" msdata:ReadOnly="true" type="xs:int" /> 
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="ID" form="unqualified" sql:field="XmlId" />
      <xs:attribute name="DbId" sql:identity="ignore" sql:field="Id" msdata:AutoIncrement="true" type="xs:int" /> 
    </xs:complexType>
  </xs:element>
</xs:schema>

The resulting tables (note that RegistrationId is off by one):

[Registration]
Id  XmlId
1   NonUniqCaseId-123
2   NonUniqCaseId-124

[Activity]
Id  RegistrationId  XmlId
1   0   UniqActId-1234
2   0   UniqActId-1235
3   1   UniqActId-1241
4   1   UniqActId-1242

Edit: It is even worse than I thought. If I load the records again, the foreign key (child key) starts at 0 again! So it is going to be hard, if not impossible, to determine what the correction (per table) should be:

[Registration]
Id  XmlId
1   NonUniqCaseId-123
2   NonUniqCaseId-124
3   NonUniqCaseId-123
4   NonUniqCaseId-124

[Activity]
Id  RegistrationId  XmlId
1   0   UniqActId-1234
2   0   UniqActId-1235
3   1   UniqActId-1241
4   1   UniqActId-1242
5   0   UniqActId-1234
6   0   UniqActId-1235
7   1   UniqActId-1241
8   1   UniqActId-1242


Recommended answer

Well, the answer turns out to be very simple: just leave out inverse in the XSD, i.e. remove this line:

inverse="true"

I introduced this because I have many many-to-many relationships. (My example is a very short extract to reproduce the problem.) But it seems that I have introduced it in too many places.
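
For reference, a minimal sketch of the relationship annotation from the question's XSD with that attribute removed. It is the original block with only inverse dropped; everything else (names, keys) is carried over unchanged from the question and has not been re-tested here.

```xml
<xs:annotation>
  <xs:appinfo>
    <!-- Same relationship as in the question, minus inverse="true" -->
    <sql:relationship name="Registration_Activity"
          parent="Registration"
          parent-key="Id"
          child="Activity"
          child-key="RegistrationId"
          />
  </xs:appinfo>
</xs:annotation>
```

The rest of the mapping schema, including the sql:relationship="Registration_Activity" reference on the <Activity> element, stays as it was.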

Speculation: (Unfortunately I have no time to investigate/confirm this next hypothesis.)

I now assume that inverse should only be used for the side that is the daughter in the relationship, not the side that is the mother. E.g. when A and B have a many-to-many relationship A_B, and the XML looks something like this:

<ListOfA>
  <A ID="Uniq_A123">
    <A_B>
      <B ID="NonUniq_B234" />
    </A_B>
    <A_B>
      <B ID="NonUniq_B235" />
    </A_B>
  </A>
  <A ID="Uniq_A124">
    <A_B>
      <B ID="NonUniq_B234" />
    </A_B>
  </A>
</ListOfA>

A is implicitly the 'parent' of the relationship in A_B by being the XML mother; B should then be explicitly inverted from child to parent by specifying inverse. However, since I am generating my own IDs for A and B, I doubt whether this would work for me, and I will just run repair queries afterwards.
