Converting XML files to JSON or CSV?


Question

I have complex XML files with nested elements. I built a process to handle them using SSIS and T-SQL. We use Azure Data Factory (ADF), and I'd like to explore converting the XML files to JSON or CSV, since ADF supports those formats and does not support XML.

It appears Logic Apps is one option. Has anyone had luck with other ways of taking XML and converting it within a pipeline?

Current workflow: pick up XML files from a folder, drop them onto network drives, bulk insert the XML into a staging row, then parse the XML into various SQL tables for analysis.

Example:

<HEADER>
  <SURVEYID> 1234 </SURVEYID>
  <RESPONSES>
      <VAR>Question1</VAR>
      <VALUE>Answer1</VALUE>
  </RESPONSES>
  <RESPONSES>
      <VAR>Question2</VAR>
      <VALUE>Answer2</VALUE>
  </RESPONSES>
  <SURVEYID> 1234 </SURVEYID>
  <RESPONSES>
      <VAR>Question1</VAR>
      <VALUE>DifferentAnswer</VALUE>
  </RESPONSES>
</HEADER>

Note: I don't need to know how to parse the XML; that part is done. I also know that you can execute SSIS within ADF. I am looking for alternatives to the overall process.

Recommended answer

I'm not sure why this question got downvoted. I had a similar need a few months ago. The situation was made worse by the fact that the XML we receive is poorly formatted and wouldn't even parse correctly. To solve this, I wrote a .NET console app and deployed it to Azure Batch. It reads the XML from Blob Storage, corrects the formatting errors, then parses the XML and writes it to a JSON file back in Blob Storage. ADF supports Azure Batch through the "Custom" activity, so this plugs right into our pipeline. Depending on your data structure, you could output CSV instead if that is more appropriate.
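
The parse-and-convert step could look roughly like the sketch below. It is not the answerer's code: it assumes well-formed input shaped like the sample in the question (with the SURVEYID tags in matching case), and the type names `Survey`, `Response`, and `SurveyXmlConverter` are hypothetical. It uses System.Text.Json rather than Json.NET so that it needs no extra packages, and it leaves out Blob Storage I/O and any repair of malformed XML.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Xml.Linq;

// Hypothetical shapes for the converted data (not from the original post).
public sealed record Response(string Var, string Value);
public sealed record Survey(string SurveyId, List<Response> Responses);

public static class SurveyXmlConverter
{
    // Compares element names ignoring case, since the sample mixes "SurveyID" and "SURVEYID".
    private static bool Named(XElement e, string name) =>
        string.Equals(e.Name.LocalName, name, StringComparison.OrdinalIgnoreCase);

    // Walks the root's children in document order: each SURVEYID element starts
    // a new survey, and every following RESPONSES element is attached to it.
    public static List<Survey> Parse(string xml)
    {
        var surveys = new List<Survey>();
        var root = XDocument.Parse(xml).Root;
        if (root == null) return surveys;

        foreach (var element in root.Elements())
        {
            if (Named(element, "SURVEYID"))
                surveys.Add(new Survey(element.Value.Trim(), new List<Response>()));
            else if (Named(element, "RESPONSES") && surveys.Count > 0)
                surveys[^1].Responses.Add(new Response(
                    element.Element("VAR")?.Value ?? "",
                    element.Element("VALUE")?.Value ?? ""));
        }
        return surveys;
    }

    // Nested output: one JSON object per survey with an array of responses.
    public static string ToJson(IEnumerable<Survey> surveys) =>
        JsonSerializer.Serialize(surveys, new JsonSerializerOptions { WriteIndented = true });

    // Flat output: one CSV row per response. Values are written as-is, so this
    // only holds for values without commas, quotes, or line breaks.
    public static string ToCsv(IEnumerable<Survey> surveys)
    {
        var rows = surveys.SelectMany(s =>
            s.Responses.Select(r => $"{s.SurveyId},{r.Var},{r.Value}"));
        return string.Join("\n", new[] { "SurveyId,Var,Value" }.Concat(rows));
    }
}
```

Calling `SurveyXmlConverter.ToJson(SurveyXmlConverter.Parse(xml))` keeps the nesting, while `ToCsv` flattens it to one row per response. Which one fits better depends on how the downstream ADF activities consume the data.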

The tricky part of using Azure Batch from ADF is passing and processing parameter data. In the ADF configuration, the parameters are listed under "Extended properties".

At runtime, these properties are available to the Batch job in a JSON file named "activity.json". In the console app, you need to read that file to get the extended properties:

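using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

// Load the activity definition that is available to the Batch job at runtime.
var activity_json = File.ReadAllText("activity.json");
dynamic activity = JsonConvert.DeserializeObject(activity_json);

// Copy each extended property into the dictionary (names are case-sensitive).
var parameters = new Dictionary<string, string>();
parameters.Add("alertId", activity.typeProperties.extendedProperties.AlertId.ToString());
parameters.Add("hashKey", activity.typeProperties.extendedProperties.HashKey.ToString());
parameters.Add("startTime", activity.typeProperties.extendedProperties.StartTime.ToString());
parameters.Add("endTime", activity.typeProperties.extendedProperties.EndTime.ToString());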

The property names are case-sensitive. (Note that in this example I write them to a "parameters" dictionary; I do that so I can run the console app either locally or in Azure Batch.) There are a few other "interesting" aspects to using Azure Batch, but this is the biggest hurdle in my opinion.
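
One way to support both local runs and Azure Batch runs is sketched below. The answer does not say how the local values were supplied, so the `Name=Value` command-line convention and the `ActivityParameters` class are assumptions made for illustration. The sketch also uses System.Text.Json instead of Json.NET and keeps the property names exactly as they appear in the JSON.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

// Hypothetical helper (not from the original post).
public static class ActivityParameters
{
    // Reads every entry under typeProperties.extendedProperties.
    // GetProperty is case-sensitive and throws if a property is missing.
    public static Dictionary<string, string> FromActivityJson(string json)
    {
        var parameters = new Dictionary<string, string>();
        using var doc = JsonDocument.Parse(json);
        var extended = doc.RootElement
            .GetProperty("typeProperties")
            .GetProperty("extendedProperties");
        foreach (var property in extended.EnumerateObject())
            parameters[property.Name] = property.Value.ToString();
        return parameters;
    }

    // Assumed local convention: arguments of the form Name=Value.
    // Arguments without '=' are ignored.
    public static Dictionary<string, string> FromArgs(string[] args)
    {
        var parameters = new Dictionary<string, string>();
        foreach (var arg in args)
        {
            var parts = arg.Split('=', 2);
            if (parts.Length == 2) parameters[parts[0]] = parts[1];
        }
        return parameters;
    }

    // Uses activity.json when the file exists (Batch run),
    // otherwise falls back to the command-line arguments (local run).
    public static Dictionary<string, string> Load(string[] args, string path = "activity.json") =>
        File.Exists(path) ? FromActivityJson(File.ReadAllText(path)) : FromArgs(args);
}
```

With this shape, the rest of the app only sees the dictionary returned by `ActivityParameters.Load(args)` and does not need to know where it is running.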
