What are the Benefits of Spring Cloud Dataflow?


Question

Based on what I've seen, creating a stream in Spring Cloud Dataflow (SCDF) will deploy the underlying applications, bind the communication service (like RabbitMQ), set the Spring Cloud Stream environment variables, and start the applications. This could all easily be done manually using a cf push command.
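
For reference, the manual route alluded to above might look roughly like this; the app name, artifact path, service name, and destination below are hypothetical:

    # Push the pre-built Spring Cloud Stream app without starting it
    cf push my-http-source -p target/http-source.jar --no-start
    # Bind the messaging middleware (e.g. a RabbitMQ service instance)
    cf bind-service my-http-source my-rabbit
    # Hand-set the Spring Cloud Stream binding via an environment variable
    cf set-env my-http-source SPRING_CLOUD_STREAM_BINDINGS_OUTPUT_DESTINATION mystream.http
    cf start my-http-source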

Meanwhile, I've been running into some drawbacks with Spring Cloud Dataflow:


  • SCDF Server is a memory hog on PCF (I have a stream with only 6 applications, and yet I'm needing about 10GB for the server)
  • No flexibility on application naming, memory, instances, etc. (All the things that you would typically set in the manifest.yml)
  • Integration with build tools (like Bamboo) is going to require extra work because we have to use the SCDF CLI rather than just the PCF CLI
  • Existing streams cannot be modified. To do a blue-green deployment, you have to deploy the application manually (binding the services and setting the environment variables manually). And then once a blue-green deployment is done, SCDF shows the stream as Failed, because it doesn't know that one of the underlying applications has changed.
  • Various errors I've run into, like MySQL Primary Key Constraint errors when trying to redeploy a failed stream

So what am I missing? Why would using Spring Cloud Dataflow be beneficial to just manually deploying the applications?

Answer


Based on what I've seen, creating a stream in Spring Cloud Dataflow (SCDF) will deploy the underlying applications, bind the communication service (like RabbitMQ), set the Spring Cloud Stream environment variables, and start the applications. This could all easily be done manually using a cf push command.

Yes - you can individually orchestrate stream applications, and there are benefits to that. However, when you try to hand-wire each of the stream applications with the channelName, destination, and binding-specific properties, you'd have to deal with more bookkeeping. This all becomes a behind-the-scenes chore in Spring Cloud Data Flow's (SCDF) orchestration layer.
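
For comparison, a minimal sketch of the same wiring done through the SCDF shell; the stream name is illustrative, and it assumes the out-of-the-box http and log apps are already registered. SCDF derives the destinations, consumer groups, and binding properties from the DSL:

    # One DSL definition replaces the per-app destination/group bookkeeping
    dataflow:> stream create --name mystream --definition "http | log" --deploy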

Especially when you have "scaling" or "partitions" involved in your streaming pipeline, you'd have to pay attention to instanceCount, instanceIndex, and the related properties. These, too, are automated in SCDF through the DSL semantics.
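
As a rough illustration (the property names follow Spring Cloud Stream and SCDF 1.x conventions; the apps and the key expression are hypothetical), the per-instance wiring you'd otherwise set by hand versus the one-time deployment properties in SCDF:

    # Hand-wired, per app instance:
    #   spring.cloud.stream.bindings.output.producer.partitionKeyExpression=payload.id
    #   spring.cloud.stream.bindings.output.producer.partitionCount=2
    #   spring.cloud.stream.bindings.input.consumer.partitioned=true
    #   spring.cloud.stream.instanceCount=2
    #   spring.cloud.stream.instanceIndex=0   # ...and 1 on the second instance
    # Expressed once at deployment time in SCDF:
    dataflow:> stream deploy --name mystream --properties "app.http.producer.partitionKeyExpression=payload.id,deployer.log.count=2"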


SCDF Server is a memory hog on PCF (I have a stream with only 6 applications, and yet I'm needing about 10GB for the server)

Based on our experiments, this is typically observed when you're in "development" and repeatedly creating > deploying > destroying streams several times in a day. Generally speaking, the server should only require 1G.
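
If you want to pin the server's footprint yourself when pushing it, something along these lines works (the app name and jar path are placeholders):

    # Cap the Data Flow server at the ~1G it generally needs
    cf push dataflow-server -p spring-cloud-dataflow-server-cloudfoundry.jar -m 1G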

There's a general consensus that JVMs in PCF report memory that they aren't really using; this has something to do with Java's rt.jar. There are some new kernel changes around the 'memory usage reporting' functionality in PCF, so that after the JVM boots up (which uses a good deal of resources) it no longer continues to report bad data. We are closely tracking this.

That said, we are also profiling the server to make sure there aren't any memory leaks. As-is, the server doesn't hold any in-memory state - the minimal metadata state (e.g. stream definitions) the server requires is persisted in an RDBMS. Please keep an eye on #107 for developments.


No flexibility on application naming, memory, instances, etc. (All the things that you would typically set in the manifest.yml)

It is not clear what you mean by "application naming". If this refers to the server name, you can change it easily through your manifest.yml or by other means. If it refers to the stream-app names, they are automatically deployed with the "stream name" as a prefix, so they are easy to identify when you review the apps from the CF CLI or Apps-Mgr.

As for the memory and disk usage, you can control these at the application level through the SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_MEMORY and SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_DISK tokens. More details here.
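
As a sketch, these tokens can be set as environment variables on the Data Flow server so that every stream app it deploys picks up the given defaults (the server app name is a placeholder, and the values are assumed to be in MB):

    cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_MEMORY 1024
    cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_STREAM_DISK 2048
    # Restage so the server picks up the new settings
    cf restage dataflow-server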


Integration with build tools (like Bamboo) is going to require extra work because we have to use the SCDF CLI rather than just the PCF CLI

You'd be running the CI builds on the stream/task applications, as they're part of your development workflow. SCDF simply provides the orchestration mechanics to manage these applications. We are also working on native integration with Netflix's Spinnaker tooling to provide an out-of-the-box experience in the near future.
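
In that setup, the CI job builds and publishes the app artifact as usual; the only SCDF-specific step is (re)registering the new coordinates from the SCDF shell. The name and Maven coordinates below are hypothetical:

    # Point SCDF at the artifact your CI pipeline just published
    dataflow:> app register --name transformer --type processor --uri maven://com.example:transformer:1.0.1 --force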


Existing streams cannot be modified. To do a blue-green deployment, you have to deploy the application manually (binding the services and setting the environment variables manually). And then once a blue-green deployment is done, SCDF shows the stream as Failed, because it doesn't know that one of the underlying applications has changed.

You can perform blue-green-like rolling upgrades on the apps individually. There's also active work in progress to adapt to changing stream/task application state in SCDF. As an aside, the Spinnaker integration would further simplify rolling upgrades of custom application bits, and SCDF would adapt to the dynamic changes - that is the end goal as far as this requirement goes.
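
A bare-bones sketch of such a rolling upgrade on a single stream app using the cf CLI alone (the app names, jar, route, and domain are hypothetical; the route steps only apply to apps that actually serve HTTP):

    # Push the new version alongside the old one, bound to the same services
    cf push mystream-http-v2 -p target/http-source-1.1.jar --no-start
    cf bind-service mystream-http-v2 my-rabbit
    cf start mystream-http-v2
    # Cut traffic over, then retire the old version
    cf map-route mystream-http-v2 apps.example.com --hostname mystream-http
    cf unmap-route mystream-http apps.example.com --hostname mystream-http
    cf delete mystream-http -f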

Various errors I've run into, like MySQL Primary Key Constraint errors when trying to redeploy a failed stream

We would love to hear your feedback; specifically, please consider reporting these problems in the backlog. Any help in this regard is highly appreciated.


So what am I missing? Why would using Spring Cloud Dataflow be beneficial to just manually deploying the applications?

The architecture section covers the general capabilities. If you have numerous stream or task applications (as in any other microservice setup), you'd need central orchestration tooling to manage them in a cloud setting. SCDF provides the DSL, REST API, Dashboard, Flo, and of course the security layer that comes out of the box. Interoperability between streams and tasks is another important requirement for use-cases involving closed-loop analytics - there's DSL tooling around this as well. When the Spinnaker integration becomes a first-class citizen, we foresee having end-to-end continuous delivery over data pipelines. Lastly, the SCDF tile for Cloud Foundry would interoperate with Spring Cloud Services to further automate the provisioning aspect along with comprehensive security coverage.
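
For instance, the REST API that backs the shell and the Dashboard can be scripted directly; assuming a server running at dataflow-server.example.com:

    # List the registered apps and the current stream definitions
    curl http://dataflow-server.example.com/apps
    curl http://dataflow-server.example.com/streams/definitions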

Hope this helps.
