Partitioning Data in SQL On-Demand with Blob Storage as Data Source
Problem description
In Amazon Redshift, there is a way to create a partition key when using an S3 bucket as a data source.
I am attempting to do something similar in Azure Synapse using the SQL On-Demand service.
Currently I have a storage account that is partitioned such that it follows this scheme:
- Sales (folder)
    - 2020-10-01 (folder)
        - File 1
        - File 2
    - 2020-10-02 (folder)
        - File 3
        - File 4
To create a view and pull in all 4 files I ran the command:
CREATE VIEW testview3 AS
SELECT *
FROM OPENROWSET (
    BULK 'Sales/*/*.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    DATA_SOURCE = 'AzureBlob',
    FIELDTERMINATOR = ',',
    FIRSTROW = 2
) AS tv1;
If I run the query SELECT * FROM [myview], I receive data from all 4 files.
How can I go about creating a partition key so that I could run a query such as
SELECT * FROM [myview] WHERE folderdate > '2020-10-01'
so that I analyze only the data from Files 3 and 4?
I know I can edit my OPENROWSET BULK statement but I want to be able to get all the data from my container at first and then constrain searches as needed.
Recommended answer
Serverless SQL can parse a partitioned folder structure using the filename function (when you wish to load a specific file or files) and the filepath function (when you wish to load all files under a given path). More information on syntax and usage is available in the online documentation.
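In your case, you can read only the files in folders after '2020-10-01' by filtering with the filepath syntax, such as filepath(1) > '2020-10-01'. Here filepath(1) returns the part of the path matched by the first wildcard in the BULK pattern, which is the date folder name.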
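As a minimal sketch of how this could look for the view in the question: the view below exposes filepath(1) as a column so that callers can filter on it later. The view name testview_partitioned is hypothetical, the column name folderdate is taken from the query in the question, and the data source and CSV options are copied unchanged from the original view.

```sql
-- Hypothetical view name; all OPENROWSET options are the ones from the question.
CREATE VIEW testview_partitioned AS
SELECT
    -- filepath(1) is the text matched by the first * in the BULK path,
    -- i.e. the date folder name such as 2020-10-02.
    tv1.filepath(1) AS folderdate,
    tv1.*
FROM OPENROWSET (
    BULK 'Sales/*/*.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    DATA_SOURCE = 'AzureBlob',
    FIELDTERMINATOR = ',',
    FIRSTROW = 2
) AS tv1;

-- The view still covers every folder; the filter narrows it per query.
-- folderdate is a string, so this is a text comparison, which orders
-- correctly here because the folder names use the yyyy-MM-dd format.
SELECT *
FROM testview_partitioned
WHERE folderdate > '2020-10-01';
```

With the folder layout described in the question, this filter should return only the rows from Files 3 and 4. Whether the service actually skips reading the other folders is worth confirming against the documentation and the amount of data processed reported for the query.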