Convert CSV file to JSON file with Apache Camel via List of Maps

Question

The goal is to learn how to read a CSV file into a List of Maps, and then how to marshal it to JSON.

Once I understand how to do that, I will know how to define more useful routes.

I define my routes in XML. Another restriction is that I must not create any transformation beans; only the existing components may be used.

My understanding obviously lacks some concept. I understand that you have to provide a bean as a consumer and can then pass it on; but what is wrong with the List of Maps that, according to the docs, the csv data format uses?

    <dataFormats>
        <json id="jack" library="Jackson"/>
    </dataFormats>  

    <route>
        <from uri="file:///C:/tries/collApp/exchange/in?fileName=registerSampleSmaller.csv"/>
        <unmarshal>
            <csv />
        </unmarshal>            
        <marshal ref="jack">                
        </marshal>
        <to uri="file:///C:/tries/collApp/exchange/out?fileName=out.json"/>          
    </route>

This route silently does nothing. I can only see the lock file appear and disappear.

Thanks!

PS: I intend to create two routes. The first will read a CSV and transform it, mapping its flat structure onto that of my persistent beans, and then pass it to my beans. The second will just save my beans as JSON, which seems to be the easy part; but I first need to do this to understand how it works.

Recommended answer

I am providing an answer now that I have made some progress.

I was on the right track; there were just a few small errors. One of them was noticed by Jérémie B in the comments on the original question.

It failed silently because I had not enabled logging. I enabled it by adding slf4j to my pom.xml like this:

    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>${slf4j-version}</version>
    </dependency>    
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-jdk14</artifactId>
        <version>${slf4j-version}</version>
    </dependency>    
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>jcl-over-slf4j</artifactId>
        <version>${slf4j-version}</version>
    </dependency>    

I saw numerous errors, and even some buggy Camel behaviour, but I managed to make this route work:

    <dataFormats>
        <json id="jack" library="Jackson" prettyPrint="true"/>
    </dataFormats>       

    <route>

        <from uri="file:///C:/tries/collApp/exchange/in?fileName=registerSampleUtf.csv&amp;charset=UTF-8"/>
        <log message="file: ${body.class.name} ${body}" loggingLevel="WARN"/>
        <unmarshal>
            <csv delimiter=";"  useMaps="true" />
        </unmarshal>           
        <log message="unmarshalled: ${body.class.name} ${body}" loggingLevel="WARN"/>
        <marshal ref="jack"/>
        <log message="marshalled: ${body}" loggingLevel="WARN"/>
        <to uri="file:///C:/tries/collApp/exchange/out?fileName=out.json"/>         
    </route>

So basically, after cleaning up typos, I had to

  • specify the input file charset,

  • specify the delimiter that Excel used when it created my CSV,

  • tell the csv data format to put each row into a map (useMaps="true").
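To make the data shapes concrete, here is a rough plain-Java sketch of what the two steps conceptually produce: a List of Maps keyed by the header row, and then a JSON array of objects. It is an illustration only, not Camel's or Jackson's implementation. The class name `CsvMapsSketch`, its methods, and the sample data are all made up for this example, and the naive parsing ignores CSV quoting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: mimics the *shape* of the data
// after unmarshalling with useMaps="true" and after marshalling to JSON.
class CsvMapsSketch {

    // Uses the first line as the header and builds one ordered map per data row.
    // Naive: no support for quoted fields or embedded delimiters.
    static List<Map<String, String>> toMaps(String csv, String delimiter) {
        String[] lines = csv.split("\\r?\\n");
        String splitter = Pattern.quote(delimiter);
        String[] header = lines[0].split(splitter, -1);
        List<Map<String, String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = lines[i].split(splitter, -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int c = 0; c < header.length && c < cells.length; c++) {
                row.put(header[c], cells[c]);
            }
            rows.add(row);
        }
        return rows;
    }

    // Renders the rows as a compact JSON array of objects.
    static String toJson(List<Map<String, String>> rows) {
        StringJoiner array = new StringJoiner(",", "[", "]");
        for (Map<String, String> row : rows) {
            StringJoiner object = new StringJoiner(",", "{", "}");
            for (Map.Entry<String, String> entry : row.entrySet()) {
                object.add(quote(entry.getKey()) + ":" + quote(entry.getValue()));
            }
            array.add(object.toString());
        }
        return array.toString();
    }

    // Minimal escaping: backslashes and double quotes only (control characters are not handled).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        String csv = "LAST_NAME;FIRST_NAME\r\nIvanov;Ivan\r\n";
        System.out.println(toJson(toMaps(csv, ";")));
    }
}
```

In the route itself none of this code is needed: the `<csv>` and `<json>` data formats do the work. The sketch only shows why the delimiter and the map option matter for the resulting JSON.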

Unfortunately this particular code doesn't work, possibly due to a Camel bug that I reported to the developer community (no reaction so far, http://camel.465427.n5.nabble.com/A-possible-bug-in-IOConverter-with-Win-1251-charset-td5778665.html)

I have moved forward nonetheless. I am probably now bypassing Camel's flawed IOConverter, and this is the stage I am currently at (this is not an answer to the question, just an illustration of how handy Camel can be):

    <route>
        <from uri="file:///C:/tries/collApp/exchange/in?fileName=registerSampleSmaller1.csv&amp;charset=windows-1251"/>
        <split streaming="true">
            <method ref="csvSplitter" method="tokenizeReader"/>  <!-- prepends the first line of the file to every subsequent line -->
            <log message="splitted: ${body}" loggingLevel="DEBUG"/>
            <unmarshal>
                <csv delimiter=";"  useMaps="true" />
            </unmarshal>            
            <log message="unmarshalled: size: ${body.size()}, ${body}" loggingLevel="DEBUG"/>
            <filter>
                <simple>${body.size()} == 1</simple><!-- be sure to have spaces around an operator -->
                <log message="filtered: listItem: ${body[0]['PATRONYMIC']}, list: ${body}" loggingLevel="DEBUG"/>
                <transform>
                    <spel>#{
                        {
                        lastName:body[0]['LAST_NAME'],
                        firstName: body[0]['FIRST_NAME'],
                        patronymic: body[0]['PATRONYMIC'],
                        comment:body[0]['COMMENT6']
                        }
                        }</spel><!-- splitting the SpEL {:} map creation notation across multiple lines is crucial -->
                </transform>                
                <log message="transformed: ${body}" loggingLevel="DEBUG"/>
                <marshal ref="jack"/>
                <log message="marshalled: ${body}" loggingLevel="DEBUG"/>
                <to uri="file:///C:/tries/collApp/exchange/out?fileName=out${exchangeProperty.CamelSplitIndex}.json"/>          
            </filter>
        </split>
    </route>

I had to write my own CSV splitter (one that respects all Unicode code points, etc.), which basically prepends the first line(s) to every subsequent line. Now I am able to split a CSV into a set of JSON files in a streaming manner, or to handle the objects differently instead of marshalling them.
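For readers less familiar with SpEL, the `<transform>` step in the route above can be read roughly as the following plain Java. This is only a sketch of the equivalent logic, assuming the message body is a List with a single Map per split item (which is what the filter checks); the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of what the SpEL inline map in the route builds.
class RowProjectionSketch {

    // Takes the first (and only) unmarshalled row and copies four CSV columns
    // into a new map under bean-style property names.
    static Map<String, String> project(List<Map<String, String>> body) {
        Map<String, String> row = body.get(0);
        Map<String, String> result = new LinkedHashMap<>();
        result.put("lastName", row.get("LAST_NAME"));
        result.put("firstName", row.get("FIRST_NAME"));
        result.put("patronymic", row.get("PATRONYMIC"));
        result.put("comment", row.get("COMMENT6"));
        return result;
    }
}
```

Columns that are not listed are simply dropped, so the JSON written for each row contains only these four properties.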

**Update: csvSplitter code**

ReaderTokenizer, an iterator around a reader:

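import java.io.IOException;
import java.io.Reader;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class ReaderTokenizer implements Iterator<String> {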

private String _curString = null;
private boolean _endReached = false;
private final Reader _reader;
private char[] _token;

public ReaderTokenizer(Reader reader, String token) {
    setToken(token);
    _reader = reader;
}

public final void setToken(String token){
    _token = token.toCharArray();
    if(_token.length==0){
        throw new IllegalArgumentException("Can't tokenize with the empty string");
    }
}

private void _readNextToken() throws IOException {

    int curCharInt;
    char previousChar = (char) -1;
    int tokenPos = 0;
    StringBuilder sb = new StringBuilder(255);

    while (true) {
        curCharInt = _reader.read();
        if (curCharInt == -1) {
            _endReached = true;
            _reader.close();
            break;
        }
        if (curCharInt == _token[tokenPos]) {

            if (tokenPos != 0 || !Character.isHighSurrogate(previousChar)) {
                tokenPos++;

                if (tokenPos >= _token.length) {
                    tokenPos = 0;
                    previousChar = (char) curCharInt;
                    sb.append(previousChar);
                    break;
                }
            }
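        } else if (tokenPos != 0) {
            // A partial match was broken: restart matching, letting this char begin a new match.
            tokenPos = (curCharInt == _token[0]) ? 1 : 0;
        }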

        previousChar = (char) curCharInt;
        sb.append(previousChar);
    }
    _curString = sb.toString();
}

@Override
public boolean hasNext() {
    if (_curString == null) {
        if (_endReached) {
            return false;
        }
        try {
            _readNextToken();
        } catch (IOException ex) {
            throw new RuntimeException(ex);
        }

        if (_curString != null) {
            return true;
        }

        if (_endReached) {
            return false;
        }

        throw new RuntimeException("Something went wrong");

    } else {
        return true;
    }
}

@Override
public String next() {
    if (_curString != null) {
        String ret = _curString;
        _curString = null;
        return ret;
    }
    if (_endReached) {
        throw new NoSuchElementException();
    }

    try {
        _readNextToken();
    } catch (IOException ex) {
        throw new RuntimeException(ex);
    }

    if (_curString != null) {
        String ret = _curString;
        _curString = null;
        return ret;
    }

    throw new RuntimeException("Something went wrong");
}

@Override
public void remove() {
    throw new UnsupportedOperationException("Not supported.");
}

}

The splitter itself:

import java.io.IOException;
import java.io.Reader;
import java.util.Iterator;

public class CamelReaderSplitter {

private final String _token;
private final int _headerLinesNumber;

public CamelReaderSplitter(String token, int headerLinesNumber) {
    _token = token;
    _headerLinesNumber = headerLinesNumber;
}

public CamelReaderSplitter(String token) {
    _token = token;
    _headerLinesNumber = 1;
}

public CamelReaderSplitter(int headerLinesNumber) {
    _token = "\r\n";
    _headerLinesNumber = headerLinesNumber;
}

public CamelReaderSplitter() {
    _token = "\r\n";
    _headerLinesNumber = 1;
}

public Iterator<String> tokenizeReader(final Reader reader) throws IOException {

    Iterator<String> ret = new ReaderTokenizer(reader, _token) {

        private final String _firstLines;

        {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < _headerLinesNumber; i++) {
                if (super.hasNext()) {
                    sb.append(super.next());
                }
            }
            _firstLines = sb.toString();
        }

        @Override
        public String next() {
            return _firstLines + super.next();
        }

    };

    return ret;

}

}
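To see the header-prepending idea in isolation, here is a much simplified stand-in. It is not the code above: it swaps the custom tokenizer for `BufferedReader.readLine()`, assumes a single header line, and always re-appends "\r\n". The class name `HeaderPrependingLinesSketch` is made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Simplified illustration: yields "header + line" for every data line of the input.
class HeaderPrependingLinesSketch implements Iterator<String> {

    private final BufferedReader reader;
    private final String header;
    private String nextLine;

    HeaderPrependingLinesSketch(Reader source) {
        reader = new BufferedReader(source);
        try {
            String first = reader.readLine();
            header = (first == null) ? "" : first + "\r\n";
            advance();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads one line ahead and closes the reader once the input is exhausted.
    private void advance() throws IOException {
        nextLine = reader.readLine();
        if (nextLine == null) {
            reader.close();
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public String next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        String result = header + nextLine + "\r\n";
        try {
            advance();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Iterator<String> chunks =
                new HeaderPrependingLinesSketch(new StringReader("A;B\r\n1;2\r\n3;4\r\n"));
        while (chunks.hasNext()) {
            System.out.print(chunks.next());
        }
    }
}
```

Each element is a tiny self-contained CSV document (header plus one row), which is why the `<unmarshal>` inside the `<split>` can still produce a map keyed by column name for every row.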
