Unable to read XML File stored in GCS Bucket


Problem description

I have tried to follow this documentation as precisely as I could:

https://beam.apache.org/documentation/sdks/javadoc/2.0.0/org/apache/beam/sdk/io/xml/XmlIO.html

Please find my code below:

public static void main(String[] args)
{
    DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
    options.setTempLocation("gs://balajee_test/stagging");
    options.setProject("test-1-130106");

    Pipeline p = Pipeline.create(options);

    PCollection<XMLFormatter> record = p.apply(XmlIO.<XMLFormatter>read()
            .from("gs://balajee_test/sample_3.xml")
            .withRootElement("book")
            .withRecordElement("author")
            .withRecordElement("title")
            .withRecordElement("genre")
            .withRecordElement("price")
            .withRecordElement("description")
            .withRecordClass(XMLFormatter.class));

    record.apply(ParDo.of(new DoFn<XMLFormatter, String>() {
        @ProcessElement
        public void processElement(ProcessContext c)
        {
            System.out.println(c.element().getAuthor());
        }
    }));

    p.run();
}

I'm getting a 'null' value for every XML component. Could you please review my code and suggest what I need to correct?

package com.bitwise.cloud;

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement(name = "book")
@XmlType(propOrder = {"author", "title", "genre", "price", "description"})
public class XMLFormatter {
    private String author;
    private String title;
    private String genre;
    private String price;
    private String description;

    public XMLFormatter() { }

    public XMLFormatter(String author, String title, String genre, String price, String description) {
        this.author = author;
        this.title = title;
        this.genre = genre;
        this.price = price;
        this.description = description;
    }

    @XmlElement
    public void setAuthor(String author) {
        this.author = author;
    }

    public String getAuthor() {
        return author;
    }

    @XmlElement
    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }

    @XmlElement
    public void setGenre(String genre) {
        this.genre = genre;
    }

    public String getGenre() {
        return genre;
    }

    @XmlElement
    public void setPrice(String price) {
        this.price = price;
    }

    public String getPrice() {
        return price;
    }

    @XmlElement
    public void setDescription(String description) {
        this.description = description;
    }

    public String getDescription() {
        return description;
    }
}

Recommended answer

The XmlIO.Read PTransform doesn't support providing multiple record elements (author, title, genre, etc.). You have to provide a single root element and a single record element, and your XML document has to contain records that all use that same record element. See the example given in the following location.
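To make the root element / record element distinction concrete, here is a rough sketch that uses only the JDK's StAX parser, not Beam itself. It assumes the file wraps its records in a single root element, here a hypothetical <catalog>, with one <book> per record; the actual layout of sample_3.xml is not shown in the question, and the class name RecordElementSketch and the sample data are made up for illustration. Under that assumption, the Beam read would presumably use .withRootElement("catalog") and a single .withRecordElement("book"), with author, title and the other fields mapped by the JAXB annotations on XMLFormatter rather than listed as record elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class RecordElementSketch {

    // Hypothetical sample: one root element (<catalog>) wrapping repeated
    // record elements (<book>). The real sample_3.xml may look different.
    public static final String SAMPLE =
        "<catalog>"
        + "<book><author>Gambardella, Matthew</author><title>XML Developer's Guide</title></book>"
        + "<book><author>Ralls, Kim</author><title>Midnight Rain</title></book>"
        + "</catalog>";

    // Returns one map (child element name -> text) per occurrence of recordElement.
    // Only handles records whose children contain plain text, as in the sample.
    public static List<Map<String, String>> readRecords(String xml, String recordElement) {
        List<Map<String, String>> records = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            Map<String, String> current = null;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals(recordElement)) {
                        // A new record starts here.
                        current = new LinkedHashMap<>();
                    } else if (current != null) {
                        // A field inside the current record.
                        current.put(name, reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals(recordElement)) {
                    // The record is complete.
                    records.add(current);
                    current = null;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return records;
    }

    public static void main(String[] args) {
        for (Map<String, String> book : readRecords(SAMPLE, "book")) {
            System.out.println(book.get("author") + " / " + book.get("title"));
        }
    }
}
```

The point of the sketch is that the record element names the repeating unit (one <book> becomes one object), while author, title and so on are fields inside that unit. Passing a field name such as "author" as the record element would not produce book records.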
