How to read a line-separated JSON file from Azure Data Lake and query it using U-SQL


Question

I have IoT data in Azure Data Lake, organized as {date}/{month}/{day}/abbs.json. Each file contains multiple records separated by new lines. How can I read this data using U-SQL, load it into a table, and query it?

When I load it into a U-SQL table using ////.json, will data from new files be loaded into the same table as files are added?

I have followed the Azure docs but did not find an answer for line-separated JSON files.

Recommended answer

In this example, we create a table to store events:

CREATE TABLE dbo.Events  
(
     Event string
    ,INDEX cIX_EVENT CLUSTERED(Event ASC) DISTRIBUTED BY HASH(Event)
);

To extract the JSON and insert it into the table, first read each line as a raw string using a simple text extractor, then parse each line as JSON. For example, given a file with JSON objects separated by new lines:

{ "Event": "One" }
{ "Event": "Two" }
{ "Event": "Three" }

This script then extracts the events:

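// Both assemblies must already be registered in the database.
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];

USING Microsoft.Analytics.Samples.Formats.Json;

// @input must be declared as the path to the source file(s) before this point.

// Read each line as a single string. '\b' is used as the column delimiter
// because it should not occur in the data, so a line is never split into columns.
@RawExtract = EXTRACT [RawString] string
    FROM @input
    USING Extractors.Text(delimiter:'\b', quoting : false);

// Parse each line as a JSON object into a key/value map.
@ParsedJSONLines = SELECT JsonFunctions.JsonTuple([RawString]) AS JSONLine
    FROM @RawExtract;

// Append the extracted Event values to the table.
INSERT INTO Events
SELECT JSONLine["Event"] AS Event
FROM @ParsedJSONLines;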
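
The extract-then-parse idea can also be checked locally before a file is uploaded. The following is a minimal Python sketch of the same two steps (read each line as a raw string, then parse each line as JSON); it is not U-SQL and is not part of the original answer. The function name `extract_events` and the inline sample are hypothetical, chosen only for illustration.

```python
import json


def extract_events(text, field="Event"):
    """Parse line-delimited JSON and return the value of `field` from each record."""
    values = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        try:
            record = json.loads(line)  # each line is parsed on its own
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc
        values.append(record.get(field))
    return values


# Hypothetical three-record sample in the line-delimited layout discussed above.
sample = '{ "Event": "One" }\n{ "Event": "Two" }\n{ "Event":  "Three"}\n'
print(extract_events(sample))
```

A line that fails to parse here raises an error naming that line, which makes it easier to find a bad record before running the job.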

Later on, you can read from the table like this:

@result =
    SELECT Event
    FROM Events;

OUTPUT @result
TO @output
USING Outputters.Csv(outputHeader : true, quoting : true);

Because the script uses INSERT INTO, each run appends its rows to the table.

Resources:
GitHub examples

More GitHub examples
