How to prepare a large txt file for batch insert using Hibernate?


Question

I'm trying to insert over 200k rows into a SQL database. Each row represents one card's info (70+ string fields), and all rows come from a large TXT file. I'm a new Java developer and I'm having quite a hard time with this. My approach:

  1. Read the file

    File file = ReadFile.loadCardFile(pathName);

  2. Convert the file to a stream

    Stream<String> cardsStream = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8")).lines();

  3. Get each line as a string array (card fields are separated by "|", and a field may or may not be padded with spaces)

    cardsStream.forEach(s -> {
        String[] card = Arrays.stream(s.split("\\|")).map(String::trim).toArray(String[]::new);

  4. Insert each line (the card data)

        numberOfRows = insertCardService.setCard(card, numberOfRows);
    });

  5. setCard maps the row data to its columns, and then I save each card

    CardService.save(Card);

With this approach it takes up to 2 hours, which is a really long time.

Is there any advice on a better approach, or could you point me to links that would help me code this better?

By the way, I want to use batch inserts to shorten the time significantly, but I think the way I'm reading the file is wrong. Thanks in advance!

Recommended answer

JPA is the wrong tool for this kind of operation. While it is probably possible to make it fast with JPA, it is unnecessarily difficult to do so. JPA works best in a workflow where you load some entities, edit some attributes, and let JPA figure out exactly which updates are necessary. For this, JPA does a lot of caching, which can cost considerable resources.

But here it seems you just want to pump a substantial amount of data into the database. You don't need JPA to figure out what to do; it's all inserts. You don't need JPA's cache.

I recommend Spring's JdbcTemplate or NamedParameterJdbcTemplate. This alone will probably speed things up considerably.
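As a rough illustration of the preparation step, the sketch below reads the file lazily and turns each line into an `Object[]` of trimmed fields, which is the row shape that `JdbcTemplate.batchUpdate(String, List<Object[]>)` accepts. The class name `CardLineParser` and its methods are hypothetical, and the `-1` split limit (which keeps trailing empty fields so every row has the same column count) is an assumption about the file format, not something stated in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: turns a '|'-separated card file into rows of trimmed fields.
class CardLineParser {

    // Split one line on '|' and trim each field.
    // The -1 limit keeps trailing empty fields (assumed desirable here).
    static Object[] parseLine(String line) {
        String[] parts = line.split("\\|", -1);
        Object[] fields = new Object[parts.length];
        for (int i = 0; i < parts.length; i++) {
            fields[i] = parts[i].trim();
        }
        return fields;
    }

    // Read the file as UTF-8, skip blank lines, and parse the rest.
    // try-with-resources closes the underlying file handle.
    static List<Object[]> readRows(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines
                    .filter(line -> !line.trim().isEmpty())
                    .map(CardLineParser::parseLine)
                    .collect(Collectors.toList());
        }
    }
}
```

With 200k rows of 70 short strings, holding all rows in memory is usually acceptable; for much larger files the same `parseLine` could be applied chunk by chunk instead of collecting everything first.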

Once that works, consider the following:

  • Batch inserts, i.e. sending many rows to the database as a single statement. See https://mkyong.com/spring/spring-jdbctemplate-batchupdate-example/ Note that some databases need special driver arguments to handle batch updates properly.
  • Doing intermittent commits. In general, commits cost performance because they force the database to actually write the data. But overly long transactions can cause trouble as well, especially when the database is also doing other work, and in the case of errors/rollbacks.
  • If you need more control over your batches, take a look at Spring Batch.
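The first two points can be sketched together. The example below uses plain JDBC (`java.sql`) rather than Spring so that it has no dependencies beyond the JDK; `JdbcTemplate.batchUpdate` is built on the same `PreparedStatement.addBatch`/`executeBatch` mechanism. The class `CardBatchInserter`, the table name passed in, and the assumption that each row's fields are in table column order are all hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: batched inserts with a commit after every batch.
class CardBatchInserter {

    // Builds "INSERT INTO <table> VALUES (?, ?, ...)" with one placeholder per column.
    // The table name is concatenated, so it must come from trusted code, not user input.
    static String buildInsertSql(String table, int columnCount) {
        return "INSERT INTO " + table + " VALUES ("
                + String.join(", ", Collections.nCopies(columnCount, "?")) + ")";
    }

    // Sends rows in batches of batchSize and commits after each batch.
    static void insertAll(Connection conn, String table, List<Object[]> rows, int batchSize)
            throws SQLException {
        if (rows.isEmpty()) {
            return;
        }
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // commit manually, once per batch
        String sql = buildInsertSql(table, rows.get(0).length);
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch();
                    conn.commit();
                    pending = 0;
                }
            }
            if (pending > 0) { // flush the final, partially filled batch
                ps.executeBatch();
                conn.commit();
            }
        } catch (SQLException e) {
            conn.rollback(); // discard the uncommitted batch; earlier batches stay committed
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A batch size somewhere in the hundreds to low thousands is a common starting point, but the best value depends on the database and row size and is worth measuring. As one example of the driver arguments mentioned above, MySQL Connector/J only rewrites batched inserts into multi-row statements when `rewriteBatchedStatements=true` is set on the JDBC URL.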
