Execute SQL on CSV files via JDBC


Problem description

I need to run an SQL query against CSV files (comma-separated text files). The SQL is predefined by another tool and cannot be changed. It may contain embedded selects and table aliases in the FROM clause.

Open source is a project requirement. I have found three open-source libraries that provide JDBC drivers, plus a fourth approach based on a database:

1. CsvJdbc
2. XlSQL
3. JBoss Teiid
4. Create an Apache Derby DB, load all CSVs as tables and execute the query.

These are the problems I encountered:

1. It does not accept the syntax of my SQL (which uses inner selects and table aliases). Furthermore, it has not been maintained since 2004.
2. I could not get it to work: it depends on a SAX parser that causes exceptions when parsing other documents. Similarly, it has not changed since 2004.
3. I have not checked whether it supports the syntax, but it seems like overkill. It needs several entities defined (Virtual Databases, Bindings). On the mailing list I was told that the latest release supports runtime creation of the required objects. Has anyone used it for such a simple task (normally it connects to several types of data sources, such as CSV, XML or other DBs, and creates a virtual, unified one)?
4. Can this even be done easily?

Of the 4 options I considered or tried, only 3 and 4 seem viable to me. Any advice on these, or on any other way to query my CSV files?

Cheers

Solution

If your SQL is predefined and cannot be changed, your best option is to load the CSV files into a database and run the queries against it.

Apache Derby is a viable option, as are MySQL (which even has a CSV storage engine) and PostgreSQL.

Does your SQL use any proprietary functions or extensions? If so, that may limit your choices.
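
As a rough illustration of the "load the CSV into a database" approach, here is a minimal sketch using plain JDBC. It is not taken from the original answer. The class name `CsvLoader` and its methods are hypothetical, and the code makes simplifying assumptions: the first CSV line is a header whose names are valid SQL identifiers, every column is stored as `VARCHAR(255)`, and fields contain no quotes or embedded commas.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: creates one table per CSV file and copies the rows into it.
class CsvLoader {

    // Builds e.g. "CREATE TABLE people (id VARCHAR(255), name VARCHAR(255))".
    // Assumption: header names are valid, unquoted SQL identifiers.
    static String createTableSql(String table, List<String> columns) {
        List<String> defs = new ArrayList<>();
        for (String column : columns) {
            defs.add(column + " VARCHAR(255)");
        }
        return "CREATE TABLE " + table + " (" + String.join(", ", defs) + ")";
    }

    // Builds e.g. "INSERT INTO people VALUES (?, ?)" with one placeholder per column.
    static String insertSql(String table, int columnCount) {
        String marks = String.join(", ", Collections.nCopies(columnCount, "?"));
        return "INSERT INTO " + table + " VALUES (" + marks + ")";
    }

    // Naive split on commas; keeps empty fields, does not handle quoted fields.
    static List<String> splitLine(String line) {
        return Arrays.asList(line.split(",", -1));
    }

    // Creates the table from the header line and batch-inserts the remaining lines.
    static void load(Connection conn, String table, List<String> csvLines) throws SQLException {
        List<String> header = splitLine(csvLines.get(0));
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(createTableSql(table, header));
        }
        try (PreparedStatement ps = conn.prepareStatement(insertSql(table, header.size()))) {
            for (String line : csvLines.subList(1, csvLines.size())) {
                List<String> fields = splitLine(line);
                for (int i = 0; i < header.size(); i++) {
                    // Short rows are padded with SQL NULL.
                    ps.setString(i + 1, i < fields.size() ? fields.get(i) : null);
                }
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

With the Derby jar on the classpath, a connection for this could be opened with something like `DriverManager.getConnection("jdbc:derby:memory:csvdb;create=true")` (the database name `csvdb` is made up), and the lines read with `Files.readAllLines`. Once every CSV file is loaded under the table name the predefined SQL expects, the query can be executed unchanged through a normal `Statement`. Whether it runs still depends on the SQL dialect the tool generates.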
