Import CSV into Google Cloud Datastore
Question
I have a CSV file with 2 columns and 20,000 rows that I would like to import into Google Cloud Datastore. I'm new to Google Cloud and NoSQL databases. I have tried using Dataflow, but it requires a JavaScript UDF function name. Does anyone have an example of this? I will be querying this data once it's in the Datastore. Any advice or guidance on how to do this would be appreciated.
Answer
Using Apache Beam, you can read a CSV file using the TextIO class. See the TextIO documentation.
Pipeline p = Pipeline.create();
p.apply(TextIO.read().from("gs://path/to/file.csv"));
Next, apply a transform that parses each row in the CSV file and returns an Entity object. Depending on how you want to store each row, construct the appropriate Entity object. This page has an example of how to create an Entity object.
.apply(ParDo.of(new DoFn<String, Entity>() {
  @ProcessElement
  public void processElement(ProcessContext c) {
    // Assumes two comma-separated columns per row.
    String[] columns = c.element().split(",");
    // The kind "CsvRow" and property name "value" are placeholders; pick
    // names that fit your data. makeKey/makeValue are static imports from
    // com.google.datastore.v1.client.DatastoreHelper.
    Entity entity = Entity.newBuilder()
        .setKey(makeKey("CsvRow", columns[0]).build())
        .putProperties("value", makeValue(columns[1]).build())
        .build();
    c.output(entity);
  }
}));
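Note that splitting on "," alone breaks as soon as a field contains an embedded comma inside quotes. If your CSV may have quoted fields, a minimal stdlib line parser (an illustrative helper, not part of Beam or of the answer above) could look like:

```java
// Minimal CSV line parser that handles double-quoted fields and escaped
// quotes (""), using only the Java standard library.
public class CsvLineParser {
    public static java.util.List<String> parseLine(String line) {
        java.util.List<String> fields = new java.util.ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (ch == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // "" inside a quoted field is a literal quote
                    i++;
                } else {
                    inQuotes = !inQuotes; // entering or leaving a quoted field
                }
            } else if (ch == ',' && !inQuotes) {
                fields.add(cur.toString()); // field boundary
                cur.setLength(0);
            } else {
                cur.append(ch);
            }
        }
        fields.add(cur.toString()); // last field
        return fields;
    }
}
```

You could call `CsvLineParser.parseLine(c.element())` inside the DoFn instead of `split(",")`. For production use, a library such as Apache Commons CSV or OpenCSV is a more robust choice.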
Lastly, write the Entity objects to Cloud Datastore. See the DatastoreIO documentation.
.apply(DatastoreIO.v1().write().withProjectId(projectId));