Data Factory v2 - Generate a json file per row
Question
I'm using Data Factory v2. I have a copy activity that has an Azure SQL dataset as input and an Azure Storage Blob as output. I want to write each row in my SQL dataset as a separate blob, but I don't see how I can do this.
I see a copyBehavior setting on the copy activity, but that only works for file-based sources.
Another possible setting is the filePattern on my dataset:
Indicate the pattern of data stored in each JSON file. Allowed values are: setOfObjects and arrayOfObjects.
setOfObjects - Each file contains a single object, or line-delimited/concatenated multiple objects. When this option is chosen in an output dataset, the copy activity produces a single JSON file with one object per line (line-delimited).
arrayOfObjects - Each file contains an array of objects.
The description talks about "each file", so initially I thought it would be possible, but now that I've tested them it seems that setOfObjects creates a line-delimited file, where each row is written to a new line, while arrayOfObjects creates a file containing a json array and adds each row as a new element of the array.
I'm wondering if I'm missing a configuration somewhere, or is it just not possible?
Answer
What I did for now is to load the rows into a SQL table and run a foreach over each record in the table. I use a Lookup activity to get an array to loop over in a Foreach activity; the foreach activity then writes each row to blob storage.
For Olga's documentDb question, it would look like this:
In the lookup, you get a list of the document ids you want to copy:
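A Lookup activity for this step might be defined roughly as follows (the dataset, table, and column names here are illustrative, not from the original screenshot):

```json
{
    "name": "LookupDocumentIds",
    "type": "Lookup",
    "typeProperties": {
        "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "SELECT id FROM DocumentsToCopy"
        },
        "dataset": {
            "referenceName": "AzureSqlInputDataset",
            "type": "DatasetReference"
        },
        "firstRowOnly": false
    }
}
```

Note that `firstRowOnly` must be `false` so the lookup returns the full result set as an array rather than just the first row.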
You use that set in your foreach activity:
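The ForEach activity can then iterate over the lookup's output; a minimal sketch (activity names are assumptions carried over from the lookup example above):

```json
{
    "name": "ForEachDocument",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@activity('LookupDocumentIds').output.value",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "CopyDocument",
                "type": "Copy"
            }
        ]
    }
}
```

Inside the loop, each row from the lookup is available as `item()`.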
Then you copy the files using a copy activity within the foreach activity. You query a single document in your source:
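For a documentDb (Cosmos DB) source, the per-iteration query could look like this, using string interpolation to inject the current id (assuming the lookup returned a column named `id`):

```json
"source": {
    "type": "DocumentDbCollectionSource",
    "query": "SELECT * FROM c WHERE c.id = '@{item().id}'"
}
```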
And you can use the id to dynamically name your file in the sink (you'll have to define the parameter on your dataset too):
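A sketch of what the parameterized blob output dataset might look like; the parameter name `documentId` and folder path are assumptions:

```json
{
    "name": "BlobOutputDataset",
    "properties": {
        "type": "AzureBlob",
        "parameters": {
            "documentId": { "type": "String" }
        },
        "typeProperties": {
            "folderPath": "output",
            "fileName": {
                "value": "@concat(dataset().documentId, '.json')",
                "type": "Expression"
            },
            "format": { "type": "JsonFormat" }
        }
    }
}
```

The copy activity would then pass the value through in its output reference, e.g. `"parameters": { "documentId": "@item().id" }`.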