How to do this type of testing in Dataflow (called feature testing at twitter)?

Question

We do something called feature testing like so -> https://blog.twitter.com/engineering/en_us/topics/insights/2017/the-testing-renaissance.html

TLDR of that article: we send a request to the microservice (REST POST with body), mock GCP Storage, and mock the downstream api calls so the entire microservice can be refactored. Also, we can swap out our platforms/libs with no changes to our tests, which makes us extremely agile.

My first question is: can DataFlow (apache beam) receive a REST request to trigger the job? I see much of the api is around 'create job', but I don't see 'execute job' in the docs, while I do see that get status returns the status of a job execution. I just don't see a way to trigger a job to

  • read from my storage api (which is mockable and sits in front of GCP)
  • hopefully process the file across many nodes
  • call the apis downstream (which are also mockable)

Then, in my test, I simply want to simulate the http call; when the file is read, return a real customer file; and after it's done, my test will verify that all the correct requests were sent to the downstream apis.
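The verification step described here can be shown runner-agnostically: the downstream client is an injected dependency, so the test swaps in a recording fake and asserts on the requests it captured. A minimal sketch of the pattern (the client interface, path, and payloads below are hypothetical, not any real api):

```python
class RecordingClient:
    # Test double standing in for the real downstream REST client.
    def __init__(self):
        self.requests = []

    def post(self, path, body):
        self.requests.append((path, body))


def process_customer_file(lines, client):
    # Stand-in for the real per-record logic: one downstream POST per record.
    for line in lines:
        client.post("/v1/records", {"customer": line.strip()})


fake = RecordingClient()
process_customer_file(["alice\n", "bob\n"], fake)

# The feature test asserts on the captured outbound requests, not on internals.
assert fake.requests == [
    ("/v1/records", {"customer": "alice"}),
    ("/v1/records", {"customer": "bob"}),
]
```

Because the assertion is only on what was sent downstream, the internals can be refactored (or the platform swapped out) without touching the test, which is the point of the article linked above.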

We are using apache beam in our feature tests, though I'm not sure if it's the same version as google's dataflow :( as that would be the most ideal!!! -> hmmm, is there a reported apache beam version of google's dataflow we can get?

Thanks, Dean

Answer

Apache Beam's DirectRunner should be very close to Dataflow's environment, and it's what we recommend for this type of single-process pipeline test.

My advice would be the same: use the DirectRunner for your feature tests.

You can also use the Dataflow runner, but that sounds like it would be a full integration test. Depending on the data source / data sink, you may be able to pass it mocking utilities.

BigQueryIO is a good example. It has a withTestServices method that you can use to pass objects that mock the behavior of external services.
