How do I partition a large file into files/directories using only U-SQL and certain fields in the file?

Posted: 2020/9/16 23:56:56 · Tags: azure-data-lake, u-sql


Problem description

I have an extremely large CSV in which each row contains customer and store IDs along with transaction information. The current test file is around 40 GB (about 2 days' worth), so partitioning is an absolute must for any reasonable return time on select queries.

My question is this: when we receive a file, it contains data for multiple stores. I would like to use the "virtual column" functionality to separate this file into the respective directory structure. That structure is "/Data/{CustomerId}/{StoreID}/file.csv".

I haven't yet gotten it to work with the OUTPUT statement. The statement I used was:

// Output to file
OUTPUT @dt
TO @"/Data/{CustomerNumber}/{StoreNumber}/PosData.csv"
USING Outputters.Csv();

It gave the following error:

Bad request. Invalid pathname. Cosmos Path: adl://<obfuscated>.azuredatalakestore.net/Data/{0}/{1}/68cde242-60e3-4034-b3a2-1e14a5f7343d

Has anyone attempted the same kind of thing? I tried to concatenate the output path from the fields, but that was a no-go. I thought about doing it as a function (UDF) that takes the two IDs and filters the whole dataset, but that seems terribly inefficient.
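
For reference, the per-pair filtering idea mentioned above might look roughly like the following. This is only a minimal sketch under stated assumptions, not a confirmed solution: the input path, the TransactionInfo column, and the ID values "1001" and "17" are hypothetical, and the EXTRACT schema would have to list every column actually present in the file. Only CustomerNumber, StoreNumber, and the /Data/.../PosData.csv layout come from the question.

```usql
// Hypothetical IDs for one customer/store pair (assumed values).
DECLARE @customer string = "1001";
DECLARE @store string = "17";

// Output path built from the two IDs, following the layout in the question.
DECLARE @out string = "/Data/" + @customer + "/" + @store + "/PosData.csv";

// Read the large input file. The path and the TransactionInfo column are
// assumptions; the real schema must match the columns in the CSV.
@rows =
    EXTRACT CustomerNumber string,
            StoreNumber string,
            TransactionInfo string
    FROM "/Input/PosData.csv"
    USING Extractors.Csv();

// Keep only the rows that belong to this customer/store pair.
@one =
    SELECT CustomerNumber,
           StoreNumber,
           TransactionInfo
    FROM @rows
    WHERE CustomerNumber == @customer AND StoreNumber == @store;

// Write the filtered rows to the pair's directory.
OUTPUT @one
TO @out
USING Outputters.Csv();
```

The sketch handles a single pair, so covering every store would mean repeating the SELECT and OUTPUT statements (or generating the script) once per pair, which is the inefficiency described above.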

Thanks in advance for reading/responding!
