Same Instances header (arff) for all my database queries

Question

I am using InstanceQuery with SQL queries to construct my Instances. My query results do not always come back in the same order, which is normal in SQL. Because of this, Instances constructed from different SQL queries have different headers. A simple example is shown below. I suspect my results change because of this behavior.

Header 1

@attribute duration numeric
@attribute protocol_type {tcp,udp}
@attribute service {http,domain_u}
@attribute flag {SF}

Header 2

@attribute duration numeric
@attribute protocol_type {tcp}
@attribute service {pm_dump,pop_2,pop_3}
@attribute flag {SF,S0,SH}
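
To make the difference concrete, here is a minimal sketch of how a nominal declaration ends up depending on the rows a query returns. It assumes the labels are collected from the values present in each result set, in order of first appearance, which is consistent with the two headers above. `NominalHeaderSketch` and `nominalDeclaration` are hypothetical names for illustration, not part of Weka.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Weka: mimics building a nominal
// attribute declaration from only the values seen in one result set.
class NominalHeaderSketch {

    // Collects distinct values in order of first appearance and formats
    // them as an ARFF nominal attribute declaration.
    static String nominalDeclaration(String name, List<String> columnValues) {
        Set<String> labels = new LinkedHashSet<>(columnValues);
        return "@attribute " + name + " {" + String.join(",", labels) + "}";
    }

    public static void main(String[] args) {
        // Two queries over the same column return different subsets of values,
        // so the two declarations differ.
        System.out.println(nominalDeclaration("protocol_type", List.of("tcp", "udp", "tcp")));
        System.out.println(nominalDeclaration("protocol_type", List.of("tcp", "tcp")));
    }
}
```

Two data sets built this way are not header-compatible, even though they describe the same table columns.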

My question is: how can I give the correct header information to the Instances construction?

Is something like the workflow below possible?

  1. Get pre-prepared header information from an arff file or another place.
  2. Give this header information to the Instances construction.
  3. Call the SQL function and get Instances (header + data).
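
The three steps above can be sketched in plain Java at the text level, without relying on any Weka call. This is only an illustration under stated assumptions: `KnownHeaderWorkflowSketch`, `extractHeader`, and `combine` are hypothetical names, query rows are represented as lists of strings, and values are assumed not to need ARFF quoting (no spaces, commas, or quotes).

```java
import java.util.List;

// Hypothetical text-level sketch of the workflow; not a Weka API.
class KnownHeaderWorkflowSketch {

    // Step 1: return the header part of an ARFF document, i.e. everything
    // up to and including the @data line (matched case-insensitively).
    static String extractHeader(String arffText) {
        StringBuilder header = new StringBuilder();
        for (String line : arffText.split("\\R")) {
            header.append(line).append("\n");
            if (line.trim().equalsIgnoreCase("@data")) {
                return header.toString();
            }
        }
        throw new IllegalArgumentException("No @data line found");
    }

    // Steps 2 and 3: write one comma-separated line per query row
    // below the known header.
    static String combine(String header, List<List<String>> rows) {
        StringBuilder sb = new StringBuilder(header);
        for (List<String> row : rows) {
            sb.append(String.join(",", row)).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String sample = "@relation sample\n"
                + "@attribute duration numeric\n"
                + "@attribute protocol_type {tcp,udp}\n"
                + "@data\n"
                + "0,tcp\n";
        String header = extractHeader(sample);
        System.out.print(combine(header, List.of(List.of("5", "udp"))));
    }
}
```

The combined text can then be loaded as a normal ARFF file, so every query result shares the same header.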

I am using the following function to get instances from the database.

import weka.core.Instances;
import weka.experiment.InstanceQuery;

// username and password are assumed to be fields of the enclosing class.
public static Instances getInstanceDataFromDatabase(String pSql,
                                      String pInstanceRelationName) {
    try {
        InstanceQuery query = new InstanceQuery();

        query.setUsername(username);
        query.setPassword(password);
        query.setQuery(pSql);

        // Run the query; the header is built from the returned rows.
        Instances data = query.retrieveInstances();
        data.setRelationName(pInstanceRelationName);

        // Use the last attribute as the class if none is set yet.
        if (data.classIndex() == -1) {
            data.setClassIndex(data.numAttributes() - 1);
        }
        return data;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Solution

I tried various approaches to my problem, but it seems that the Weka API does not currently offer a way to solve it. I modified the command-line append code from weka.core.Instances for my purposes. This code is also given in this answer

Based on that, here is my solution. I created a SampleWithKnownHeader.arff file, which contains the correct header values. I read this file with the following code.

import java.io.BufferedReader;
import java.io.FileReader;
import weka.core.Instances;

public static Instances getSampleInstances() {
    // try-with-resources closes the reader even if parsing fails
    try (BufferedReader reader = new BufferedReader(new FileReader(
            "datas\\SampleWithKnownHeader.arff"))) {
        Instances data = new Instances(reader);
        // setting class attribute
        data.setClassIndex(data.numAttributes() - 1);
        return data;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

After that, I use the following code to create the instances. I had to use a StringBuilder and the string value of each instance, and then save the resulting string to a file.

public static void main(String[] args) {

    Instances SampleInstance = MyUtilsForWeka.getSampleInstances();

    DataSource source1 = new DataSource(SampleInstance);

    Instances data2 = InstancesFromDatabase
            .getInstanceDataFromDatabase(DatabaseQueries.WEKALIST_QUESTION1);

    // Instances.toString() returns the full ARFF text (header + data)
    MyUtilsForWeka.saveInstancesToFile(data2.toString(), "fromDatabase.arff");

    DataSource source2 = new DataSource(data2);

    Instances structure1;
    Instances structure2;
    StringBuilder sb = new StringBuilder();
    try {
        structure1 = source1.getStructure();
        sb.append(structure1);
        structure2 = source2.getStructure();
        while (source2.hasMoreElements(structure2)) {
            String elementAsString = source2.nextElement(structure2)
                    .toString();
            sb.append(elementAsString);
            sb.append("\n");

        }

    } catch (Exception ex) {
        throw new RuntimeException(ex);
    }

    MyUtilsForWeka.saveInstancesToFile(sb.toString(), "combined.arff");

}
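
One risk of concatenating the known header with rows from the database is that a row may contain a nominal value the header does not declare, which would make combined.arff invalid. Below is a minimal plain-Java sketch of a check for that. It assumes simple unquoted labels and one `@attribute` declaration per line; `HeaderValidationSketch`, `allowedLabels`, and `rowMatchesHeader` are hypothetical names, not Weka API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: checks data rows against the nominal labels
// declared in a known ARFF header. Assumes simple, unquoted labels.
class HeaderValidationSketch {

    private static final Pattern NOMINAL =
            Pattern.compile("(?i)^@attribute\\s+(\\S+)\\s+\\{(.*)\\}\\s*$");

    // One entry per attribute, in header order: the set of allowed labels
    // for a nominal attribute, or null for any other attribute type.
    static List<Set<String>> allowedLabels(String header) {
        List<Set<String>> allowed = new ArrayList<>();
        for (String line : header.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.toLowerCase().startsWith("@attribute")) {
                continue;
            }
            Matcher m = NOMINAL.matcher(trimmed);
            if (m.matches()) {
                Set<String> labels = new LinkedHashSet<>();
                for (String label : m.group(2).split(",")) {
                    labels.add(label.trim());
                }
                allowed.add(labels);
            } else {
                allowed.add(null); // e.g. numeric: not checked here
            }
        }
        return allowed;
    }

    // True if the row has one value per attribute and every nominal value
    // is either missing ("?") or declared in the header.
    static boolean rowMatchesHeader(List<Set<String>> allowed, String[] row) {
        if (row.length != allowed.size()) {
            return false;
        }
        for (int i = 0; i < row.length; i++) {
            Set<String> labels = allowed.get(i);
            if (labels != null && !"?".equals(row[i]) && !labels.contains(row[i])) {
                return false;
            }
        }
        return true;
    }
}
```

Running such a check on each row before appending it to the StringBuilder would surface values missing from SampleWithKnownHeader.arff early, instead of when the combined file is loaded.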

My code for saving instances to a file is below.

public static void saveInstancesToFile(String contents, String filename) {
    // try-with-resources flushes and closes the writer even on failure
    try (BufferedWriter out = new BufferedWriter(new FileWriter(filename))) {
        out.write(contents);
    } catch (Exception ex) {
        throw new RuntimeException(ex);
    }
}

This solves my problem, but I wonder whether a more elegant solution exists.
