批处理文件并与数据库进行比较 [英] Batch processing files and diff with database

查看：119 发布时间：2020/9/20 3:34:36 mysql spring-boot batch-processing

本文介绍了批处理文件并与数据库进行比较的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

当前，我正在开发一个Spring-Boot应用程序，该应用程序正在定期尝试处理包含用户数据的文件，其中每行包含以|分隔的userId和departamentId(例如，123534|13).该文件将包含几百万条记录.

Currently I am developing a Spring-Boot application that is periodically trying to process a file containing user data where every line contains userId and departamentId separated by | for example 123534|13. That file will be containing few milions of records.

我的要求是以下列方式将该数据加载到mysql数据库中:

My requirement is to load this data into mysql database in such manner that:

如果存在具有已处理ID的用户，则不执行任何操作
如果该用户不存在，请创建新用户
如果用户不在列表中，但是在数据库中，请删除
如果数据库中没有当前部门，请创建

If the user with processed ID exists, do not do anything
If the user is not existing create new user
If the user is not on the list but is present in the database, remove it
If department with current is not present on the database, create it

我做了一些优化，例如

缓存部门以填充实体
批收集要保存的用户并通过JpaRepository saveAll方法对其进行保存

Caching Departaments to fill entity
Batch collecting Users to Save and saving it throuch JpaRepository saveAll method

但是我仍然对数据库进行了太多的数据库调用，我在检查用户是否存在，以便为每个记录创建要保存的实体...

But Still I am making too many database calls to the database, I am checking if user is present in order to create entity for save for every record...

我的实体非常简单:

@Entity
@Table(name = "departaments")
public class Departament{

    @Id
    @Column(name = "id")
    private Long id;

    @Column(name = "name")
    private String name;

和:

@Entity
@Table(name = "users")
public class User {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id")
    private Long id;

    @ManyToOne
    @JoinColumn(name = "departament_id")
    private Departament departament;

有人遇到这样的问题吗?

是否可以对其进行更多优化?

有什么好的处理模式吗?

推荐答案

如果这是替换"，请执行以下操作以避免停机:

If it is a "replacement", do this to avoid any downtime:

CREATE TABLE new LIKE old;
LOAD DATA INFILE ... (and any other massaging)
RENAME TABLE real TO old, new TO real;
DROP TABLE old;

如果它是增量"，则将LOAD放入单独的表中，然后执行适当的SQL语句以执行更新.在您的课题中，每个项目符号大约需要一条SQL语句.没有循环.

If it is a "delta", the LOAD it into a separate table, then execute suitable SQL statements to perform the updates. It would be approximately one SQL statement per bullet item in your Question. No loops.

这篇关于批处理文件并与数据库进行比较的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

批处理文件并与数据库进行比较 [英] Batch processing files and diff with database

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录关闭

批处理文件并与数据库进行比较 [英] Batch processing files and diff with database

问题描述

推荐答案

相关文章

数据库最新文章

热门教程

热门工具

登录 关闭

登录关闭