Obtaining the primary key from an inserted Dataset to chain into other insertions
Problem description
Suppose I have the following tables in an Oracle DB:
Foo:
+--------+---------+---------+
| id_foo | string1 | string2 |
+--------+---------+---------+
| 1      | foo     | bar     |
| 2      | baz     | bat     |
+--------+---------+---------+
Bar:
+--------+-----------+--------+
| id_bar | id_foo_fk | string |
+--------+-----------+--------+
| 1      | 1         | boo    |
| 2      | 1         | bum    |
+--------+-----------+--------+
When I insert into Foo using a Dataset and JDBC, such as
Dataset<Row> fooDataset = ...; // Dataset is initialized
fooDataset.write().mode(SaveMode.Append).jdbc(url, table, properties);
an ID is auto-generated by the database. Now, when I need to save Bar using the same strategy, I want to be able to link it to Foo via id_foo_fk.
I looked into some possibilities, such as using monotonically_increasing_id() as suggested in this question, but it won't solve the issue, as I need the ID generated by the database. I tried what was suggested in this question, but it leads to the same issue: the IDs are unique, but they are not the ones generated by the database.
It's also not possible to select the rows back over JDBC, as string1 and string2 may not be unique. Nor is it possible to change the database: for instance, I can't change the key to a UUID, and I can't add a trigger for it. It's a legacy database that we can only use, not modify.
How can I achieve this? Is this possible with Apache Spark?
I'm not a Java specialist, so you will have to look into the database layer to see exactly how to proceed, but there are three ways you can do this:
- Create a stored procedure, if your database server supports them (most do), and call it from your code.
- Create a trigger that returns the ID on the first insertion, and use it in your next DB insertion.
- Use a UUID as the key instead of the database's auto-generated key.
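To make the third option concrete, here is a minimal sketch in plain Java, with no Spark or JDBC involved. It only shows the idea of assigning the key on the client so that child rows can reference it before anything is written. The class ClientSideKeys, the row types Foo and Bar, and the helpers newFoo and barsFor are hypothetical names chosen for illustration, and the sketch assumes the key column can store a client-generated string, which the legacy schema described in the question does not allow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

final class ClientSideKeys {

    // Hypothetical row type mirroring the Foo table; idFoo holds a UUID string
    // instead of a database-generated number (an assumption of this sketch).
    static final class Foo {
        final String idFoo;
        final String string1;
        final String string2;

        Foo(String idFoo, String string1, String string2) {
            this.idFoo = idFoo;
            this.string1 = string1;
            this.string2 = string2;
        }
    }

    // Hypothetical row type mirroring the Bar table.
    static final class Bar {
        final String idFooFk;
        final String string;

        Bar(String idFooFk, String string) {
            this.idFooFk = idFooFk;
            this.string = string;
        }
    }

    // Assign the key on the client, before the row is written anywhere.
    static Foo newFoo(String string1, String string2) {
        return new Foo(UUID.randomUUID().toString(), string1, string2);
    }

    // Build child rows for a parent; the foreign key is already known,
    // so no round trip to the database is needed to discover it.
    static List<Bar> barsFor(Foo parent, List<String> values) {
        List<Bar> bars = new ArrayList<>();
        for (String value : values) {
            bars.add(new Bar(parent.idFoo, value));
        }
        return bars;
    }

    public static void main(String[] args) {
        Foo foo = newFoo("foo", "bar");
        List<Bar> bars = barsFor(foo, List.of("boo", "bum"));
        // Every Bar row carries the key of its parent Foo row.
        for (Bar bar : bars) {
            System.out.println(bar.string + " -> " + bar.idFooFk.equals(foo.idFoo));
        }
    }
}
```

Because the key is known before either write, the Foo rows and the Bar rows could then be saved in two separate appends without reading anything back. Where the schema cannot change, as in the question, this option does not apply and one of the database-side options would be needed instead.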