Spring RESTful web services - High volume data processing


Question

I'm trying to build a Spring/Spring Boot RESTful web service with the following requirements:

  1. It accepts a CSV file with 1 million rows and 40 columns per row as input (from an Angular-based front end), and the call is synchronous. The user has to be notified of the upload status before proceeding to other screens, so the wait time can't be more than a few minutes (say 5 minutes).
  2. Each of these rows has to be validated against what is in the DB and, if found to be valid, inserted into the DB. In short, each row can be a separate, independent transaction.

Can you please suggest the best approach to implement this?

The current legacy system implements the same functionality in stored procedures, which couples the solution tightly to the DB. That will be an issue if the RDBMS needs to be changed down the line.

  1. Is there an approach for processing these 1 million rows in chunks (say 20k) through asynchronous web service calls?
  2. Spring Batch?
  3. Could a stored procedure by any chance be more suitable and perform better than the above two options (I'm guessing no)?

Can you please suggest an approach that is at least as good as the stored procedure, and explain how to scale it horizontally?

Recommended answer

You are on the right track with your three suggested options. Unfortunately, the answer to your question is: it depends.

Any of the above approaches could work for you. I personally prefer Spring Batch, as I find its programming model simple and intuitive.

Spring Batch guide
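
To make the chunk-oriented idea concrete, here is a minimal plain-Java sketch of the read, validate, write loop. It deliberately does not use the Spring Batch API. The class `ChunkedRowProcessor`, the `Summary` type, the naive comma split, and the in-memory `sink` list (standing in for a batched DB insert) are all assumptions made for illustration. The chunk size of 20k mentioned in the question would simply be passed as `chunkSize`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java illustration of chunk-oriented processing (NOT the Spring Batch API).
final class ChunkedRowProcessor {

    /** Hypothetical result holder: counts of accepted and rejected rows, plus chunk count. */
    static final class Summary {
        final int accepted;
        final int rejected;
        final int chunks;

        Summary(int accepted, int rejected, int chunks) {
            this.accepted = accepted;
            this.rejected = rejected;
            this.chunks = chunks;
        }

        @Override
        public String toString() {
            return "accepted=" + accepted + ", rejected=" + rejected + ", chunks=" + chunks;
        }
    }

    /** Splits rows into consecutive chunks of at most chunkSize elements. */
    static <T> List<List<T>> partition(List<T> rows, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < rows.size(); from += chunkSize) {
            chunks.add(rows.subList(from, Math.min(from + chunkSize, rows.size())));
        }
        return chunks;
    }

    /**
     * Validates each row independently, so one bad row never blocks the others,
     * and writes the valid rows of each chunk to the sink in one step.
     */
    static Summary process(List<String> lines, int chunkSize,
                           Predicate<String[]> isValid, List<String[]> sink) {
        int accepted = 0;
        int rejected = 0;
        List<List<String>> chunks = partition(lines, chunkSize);
        for (List<String> chunk : chunks) {
            List<String[]> toWrite = new ArrayList<>();
            for (String line : chunk) {
                // Naive split: assumes no quoted fields or embedded commas.
                String[] cols = line.split(",", -1);
                if (isValid.test(cols)) {
                    toWrite.add(cols);
                } else {
                    rejected++;
                }
            }
            sink.addAll(toWrite); // stand-in for one batched insert per chunk
            accepted += toWrite.size();
        }
        return new Summary(accepted, rejected, chunks.size());
    }

    public static void main(String[] args) {
        List<String> lines = List.of("1,alice", "2,", "3,carol", "4,dave", ",eve");
        List<String[]> fakeDb = new ArrayList<>();
        Summary summary = process(lines, 2,
                cols -> cols.length == 2 && !cols[0].isEmpty() && !cols[1].isEmpty(),
                fakeDb);
        System.out.println(summary);
    }
}
```

In a real job the validation predicate would query the database and the sink would be a batched insert. Writing per chunk rather than per row trades strict one-transaction-per-row isolation for fewer round trips, which may or may not suit your requirements.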

Another approach would be to use messaging to parallelize the processing of rows:

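  1. The controller receives the CSV file containing the large data set.
  2. The data is split into smaller parts and sent to a temporary message queue.
  3. Multiple worker nodes receive the messages and process them.
  4. The size of the temporary queue is monitored and the user is updated accordingly (percentage complete).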
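
The four steps above can be sketched in a single JVM, with a `BlockingQueue` standing in for the message broker and a thread pool standing in for the worker nodes. This is only an illustration under those assumptions: the class `QueueFanOutSketch` and its methods are hypothetical names, and a real deployment would use an actual broker with workers on separate machines. The sketch also reports progress from a count of completed rows rather than from queue depth, because a chunk leaves the queue before its rows are finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// In-process stand-in for "controller -> temporary queue -> workers -> progress".
final class QueueFanOutSketch {
    private final BlockingQueue<List<String>> queue = new LinkedBlockingQueue<>();
    private final AtomicInteger doneRows = new AtomicInteger();
    private final int totalRows;
    private final ExecutorService pool;

    private QueueFanOutSketch(int totalRows, int workers) {
        this.totalRows = totalRows;
        this.pool = Executors.newFixedThreadPool(workers);
    }

    /** Steps 1-3: take the rows, enqueue them in chunks, start the workers. */
    static QueueFanOutSketch start(List<String> rows, int chunkSize, int workers,
                                   Consumer<String> rowHandler) {
        if (chunkSize <= 0 || workers <= 0) {
            throw new IllegalArgumentException("chunkSize and workers must be positive");
        }
        QueueFanOutSketch job = new QueueFanOutSketch(rows.size(), workers);
        for (int from = 0; from < rows.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, rows.size());
            job.queue.add(new ArrayList<>(rows.subList(from, to)));
        }
        for (int i = 0; i < workers; i++) {
            job.pool.submit(() -> job.drain(rowHandler));
        }
        job.pool.shutdown(); // accept no new tasks; running workers finish the queue
        return job;
    }

    // The queue is fully loaded before workers start, so an empty poll means "done".
    // With a real broker the workers would block and wait for messages instead.
    private void drain(Consumer<String> rowHandler) {
        List<String> chunk;
        while ((chunk = queue.poll()) != null) {
            for (String row : chunk) {
                try {
                    rowHandler.accept(row);
                } catch (RuntimeException e) {
                    // A failing row is skipped; the remaining rows still run.
                } finally {
                    doneRows.incrementAndGet();
                }
            }
        }
    }

    /** Step 4: progress as a percentage of rows handled so far. */
    int percentDone() {
        return totalRows == 0 ? 100 : (int) (100L * doneRows.get() / totalRows);
    }

    /** Waits for all workers; returns false on timeout or interruption. */
    boolean awaitCompletion(long timeout, TimeUnit unit) {
        try {
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            rows.add("row-" + i);
        }
        QueueFanOutSketch job = start(rows, 20_000, 4, row -> { /* validate + insert */ });
        boolean finished = job.awaitCompletion(1, TimeUnit.MINUTES);
        System.out.println("finished=" + finished + ", percent=" + job.percentDone());
    }
}
```

A front end could poll an endpoint that returns `percentDone()` for the upload while the workers run, which keeps the user informed without holding one long synchronous request open.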


In short, your knowledge of your own domain will ultimately guide you towards the best solution for your business.
