Import CSV into Google Cloud Datastore


Question

I have a CSV file with 2 columns and 20,000 rows that I would like to import into Google Cloud Datastore. I'm new to Google Cloud and NoSQL databases. I have tried using Dataflow, but it requires a JavaScript UDF function name. Does anyone have an example of this? I will be querying this data once it's in the datastore. Any advice or guidance on how to create this would be appreciated.

Answer

Using Apache Beam, you can read a CSV file using the TextIO class. See the TextIO documentation.

Pipeline p = Pipeline.create();

p.apply(TextIO.read().from("gs://path/to/file.csv"));

Next, apply a transform that will parse each row in the CSV file and return an Entity object. Depending on how you want to store each row, construct the appropriate Entity object. This page has an example of how to create an Entity object.

.apply(ParDo.of(new DoFn<String, Entity>() {
    @ProcessElement
    public void processElement(ProcessContext c) {
        // Assumes a simple two-column CSV with no quoted commas.
        String[] columns = c.element().split(",", 2);
        // The kind "CsvRow" and property name "value" are placeholders;
        // use whatever fits your data model.
        Entity.Builder entity = Entity.newBuilder();
        entity.setKey(makeKey("CsvRow", columns[0]).build());
        entity.putProperties("value", makeValue(columns[1]).build());
        c.output(entity.build());
    }
}));

Lastly, write the Entity objects to Cloud Datastore. See the DatastoreIO documentation.

.apply(DatastoreIO.v1().write().withProjectId(projectId));
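
Putting the pieces together, a complete pipeline might look like the sketch below. The project ID, GCS path, kind name (`CsvRow`), and property name (`value`) are placeholders for illustration, and the parsing assumes a simple two-column CSV with no quoted commas. It requires the Beam Java SDK plus the `beam-sdks-java-io-google-cloud-platform` dependency on the classpath.

```java
import com.google.datastore.v1.Entity;
import static com.google.datastore.v1.client.DatastoreHelper.makeKey;
import static com.google.datastore.v1.client.DatastoreHelper.makeValue;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.gcp.datastore.DatastoreIO;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;

public class CsvToDatastore {
    public static void main(String[] args) {
        String projectId = "my-project-id"; // placeholder: your GCP project

        Pipeline p = Pipeline.create();
        p.apply(TextIO.read().from("gs://path/to/file.csv"))
         .apply(ParDo.of(new DoFn<String, Entity>() {
             @ProcessElement
             public void processElement(ProcessContext c) {
                 // Split into at most two fields; assumes no quoted commas.
                 String[] columns = c.element().split(",", 2);
                 // Kind and property names are illustrative placeholders.
                 Entity.Builder entity = Entity.newBuilder();
                 entity.setKey(makeKey("CsvRow", columns[0]).build());
                 entity.putProperties("value", makeValue(columns[1]).build());
                 c.output(entity.build());
             }
         }))
         .apply(DatastoreIO.v1().write().withProjectId(projectId));
        p.run().waitUntilFinish();
    }
}
```

Note that writing to Datastore is not idempotent across retries unless each row maps to a stable key, which is why the first CSV column is used as the key name here rather than an auto-allocated ID.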
