Load data infile csv file in Infobright


Problem description


I have a table (created successfully in Infobright).

I am using a Windows system.

CREATE TABLE `file_records` (
  `id` int(11) NOT NULL ,
  `file_id` int(11)  NULL,
  `file_url` varchar(255)  NULL,
  `switch_id` int(11)  NULL,
  `carrierid_supplier` int(11)  NULL,
  `technical_profileid_supplier` int(11)  NULL,
  `carrierid_customer` int(11)  NULL,
  `technical_profileid_customer` int(11)  NULL,
  `billing_increment_supplier` varchar(10)  NULL,
  `billing_increment_customer` varchar(10)  NULL,
  `billable_duration_supplier` int(11)  NULL,
  `call_duration_seconds` int(11)  NULL,
  `billable_duration_customer` int(11)  NULL,
  `destination` varchar(50)  NULL,
  `destination_country` varchar(50)  NULL,
  `destination_country_number` varchar(50)  NULL,
  `rateplanid_supplier` int(11)  NULL,
  `rateplanid_customer` int(11)  NULL,
  `rate_supplier` int(11)  NULL,
  `rate_customer` int(11)  NULL,
  `rate_total_supplier` varchar(10)  NULL,
  `rate_total_customer` varchar(10)  NULL,
  `rate_effective_date_supplier` date  NULL,
  `rate_effective_date_customer` date  NULL,
  `call_hour` varchar(10)  NULL,
  `sequence_number` int(11)  NULL,
  `version` varchar(10)  NULL,
  `record_type` varchar(1)  NULL,
  `connection_type` varchar(16)  NULL,
  `session_id` varchar(36)  NULL,
  `release_cause` smallint(6)  NULL,
  `start_time_date` datetime  NULL,
  `answer_time_date` datetime  NULL,
  `release_tod` datetime  NULL,
  `greenwich_mean_time` varchar(32)  NULL,
  `release_cause_protocol_stack` varchar(32)  NULL,
  `binary_value_protocol_stack` smallint(6)  NULL,
  `first_release_dialogue` varchar(1)  NULL,
  `origination_trunkid` int(11)  NULL,
  `voip_protocol` varchar(6)  NULL,
  `source_number` varchar(128)  NULL,
  `source_host_name` varchar(128)  NULL,
  `destination_number` varchar(128)  NULL,
  `destination_host_name` varchar(128)  NULL,
  `callid` varchar(128)  NULL,
  `remote_payload_ipaddress` varchar(16)  NULL,
  `remote_payload_udpaddress` varchar(6)  NULL,
  `local_payload_ipaddress` varchar(16)  NULL,
  `local_payload_udpaddress` varchar(6)  NULL,
  `codec_list` varchar(128)  NULL,
  `ingress_packets` int(11)  NULL,
  `egress_packets` int(11)  NULL,
  `ingress_octets` int(11)  NULL,
  `egress_octets` int(11)  NULL,
  `ingress_packet_loss` int(11)  NULL,
  `ingress_delay` int(11)  NULL,
  `ingress_packet_jitter` int(11)  NULL,
  `supplierid` mediumint(9)  NULL,
  `protocol` varchar(6)  NULL,
  `termination_source_number` varchar(128)  NULL,
  `termination_source_host` varchar(128)  NULL,
  `termination_destination_number` varchar(128)  NULL,
  `termination_destination_host_name` varchar(128)  NULL,
  `termination_callid` varchar(128)  NULL,
  `termination_remote_payload_ipaddress` varchar(16)  NULL,
  `termination_remote_payload_udpaddress` varchar(6)  NULL,
  `termination_local_payload_ipaddress` varchar(16)  NULL,
  `termination_local_payload_udpaddress` varchar(6)  NULL,
  `termination_codec_list` varchar(128)  NULL,
  `termination_ingress_packets` int(11)  NULL,
  `termination_egress_packets` int(11)  NULL,
  `termination_ingress_octets` int(11)  NULL,
  `termination_egress_octets` int(11)  NULL,
  `termination_ingress_packet_loss` int(11)  NULL,
  `termination_ingress_delay` int(11)  NULL,
  `termination_ingress_packet_jitter` int(11)  NULL,
  `final_route_indication` varchar(1)  NULL,
  `routing_digits` varchar(64)  NULL,
  `call_duration` mediumint(9)  NULL,
  `post_dial_delay` mediumint(9)  NULL,
  `ring_time` mediumint(9)  NULL,
  `call_duration_ms` int(11)  NULL,
  `confid` varchar(32)  NULL,
  `rpid` varchar(32)  NULL,
  `route_entry_index` tinyint(4)  NULL,
  `route_table_used` mediumint(9)  NULL,
  `lnp_dipped` varchar(1)  NULL,
  `ingress_lrn` varchar(32)  NULL,
  `egress_lrn` varchar(32)  NULL,
  `cnam_dipped` tinyint(4)  NULL,
  `dnc_dipped` tinyint(4)  NULL,
  `origination_device_name` varchar(15)  NULL,
  `termination_device_name` varchar(15)  NULL,
  `ers_dipped` varchar(1)  NULL,
  `oli_digits` varchar(8)  NULL
) ENGINE=Brighthouse;

And I am loading the CSV data file using:

LOAD DATA INFILE 'C:/Users/dhairya/Desktop/a.csv' INTO TABLE file_records FIELDS TERMINATED BY ';' LINES TERMINATED BY '\n' ( sequence_number, version, record_type, connection_type, session_id, release_cause, @start_time_date, @answer_time_date, @release_tod, greenwich_mean_time, release_cause_protocol_stack, binary_value_protocol_stack, first_release_dialogue, origination_trunkid, voip_protocol, source_number, source_host_name, destination_number, destination_host_name, callid, remote_payload_ipaddress, remote_payload_udpaddress, local_payload_ipaddress, local_payload_udpaddress, codec_list, ingress_packets, egress_packets, ingress_octets, egress_octets, ingress_packet_loss, ingress_delay, ingress_packet_jitter, supplierid, protocol, termination_source_number, termination_source_host, termination_destination_number, termination_destination_host_name, termination_callid, termination_remote_payload_ipaddress, termination_remote_payload_udpaddress, termination_local_payload_ipaddress, termination_local_payload_udpaddress, termination_codec_list, termination_ingress_packets, termination_egress_packets, termination_ingress_octets, termination_egress_octets, termination_ingress_packet_loss, termination_ingress_delay, termination_ingress_packet_jitter, final_route_indication, routing_digits, call_duration, post_dial_delay, ring_time, call_duration_ms, confid, rpid, route_entry_index, route_table_used, lnp_dipped, ingress_lrn, egress_lrn, cnam_dipped, dnc_dipped, origination_device_name, termination_device_name, ers_dipped, oli_digits, routing_match, termination_pdd, proxy_charge_information, jurisdictoinal_indication_parameter, matched_digits_ers, route_table_irs, route_sequence_number_ers, jurisdiction_type_ers) set start_time_date = 0, answer_time_date = 0, release_tod = 0, file_id = 1;

I always get the error 2 HY000: Wrong data or column definition.

Sample CSV data is shown below:

018002724;V1.17;R;No-Media;316773032-878328180@98.122.122.122;0001;Sat Apr 27 23:57:21 2013;Sat Apr 27 23:57:54 2013;Sat Apr 27 23:58:44 2013;0;Normal BYE; 200;O;030057;SIP;0186645212;80.84.30.12;6282330238419;98.123.123.000;7006204640450523785-1367107040-32218;80.84.30.12;39264;NA;NA;G.729,G.723.1,RFC 2833,G.711a_64k,G.711u_64k;0;0;0;0;0;0;0;090069;SIP;0186645212;98.122.122.122;6282330238419;112.123.123.123;316773032-0-878328180@98.158.145.166;123.123.123.2;24992;NA;NA;G.729,RFC 2833;0;0;0;0;0;0;0;F;6282330238419;49;1;32;49680;12E192A8 345A3974 ADF7A11C A6919E62;;1;103;0;;;0;0;SP Belgacom - B;Vinculum WHS Ve;0;;62823;1140;;;;0;0;0;

I checked that the line terminator for the CSV file is '\n'.
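Two quick sanity checks are worth running before blaming the loader (this is a hypothetical helper script, not part of the original question): what line terminator the file actually uses, and how many ';'-separated fields each row contains compared with the number of columns named in the LOAD DATA statement.

```python
# Hypothetical sanity checks for a ';'-separated CSV before loading it.
# Neither function comes from the question; they only illustrate the checks.

def detect_terminator(raw: bytes) -> str:
    """Report whether the first line of the file ends with CRLF or LF."""
    first_line = raw.split(b'\n', 1)[0]
    return r'\r\n' if first_line.endswith(b'\r') else r'\n'

def count_fields(line: str, sep: str = ';') -> int:
    """Count separator-delimited fields on one line.
    Note: a trailing separator yields one extra, empty field."""
    return line.rstrip('\r\n').count(sep) + 1

# Shortened, illustrative row (the real rows carry many more fields):
sample = b"018002724;V1.17;R;No-Media;0001;\r\n"
print('terminator:', detect_terminator(sample))   # terminator: \r\n
print('fields:', count_fields(sample.decode()))   # fields: 6
```

On Windows a '\r\n' terminator combined with LINES TERMINATED BY '\n' leaves a stray '\r' glued to the last field of every row, and a field count that differs from the number of columns the statement names is another classic cause of "Wrong data or column definition" (the sample row even ends with a trailing ';', which may produce one extra empty field).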

The above queries work fine in plain MySQL.

Solution

The problem is the BRIGHTHOUSE loader does not support specifying columns in the LOAD DATA INFILE command. It also doesn't support setting default values for columns in this command.

To use BRIGHTHOUSE tables properly in a case like this, you have four options:

  • Make the table schema match the CSV file layout,
  • Make the CSV layout match the table schema,
  • Use the setting SET @BH_DATAFORMAT = 'mysql' to use the standard mysql loader instead of the BRIGHTHOUSE one,
  • Or create a new temporary table that matches your CSV layout, load into that, then write the contents of that table to a new CSV file in a format that matches the desired CSV format for the original table.
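For example, option 3 applied to the statement from the question would look roughly like this (a sketch only; it has not been run against the asker's data):

```sql
-- Switch this session to the standard MySQL loader, which does accept
-- a column list and a SET clause in LOAD DATA INFILE.
SET @BH_DATAFORMAT = 'mysql';

LOAD DATA INFILE 'C:/Users/dhairya/Desktop/a.csv' INTO TABLE file_records
FIELDS TERMINATED BY ';' LINES TERMINATED BY '\n'
(sequence_number, version, record_type /* ...the same column list as in the question... */)
SET start_time_date = 0, answer_time_date = 0, release_tod = 0, file_id = 1;
```

This keeps the question's column list and SET clause intact; only the MySQL loader understands them.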

I know it's painful, I've been working with an Infobright database for the past 6 months and it's a real headache to use.

Also, don't forget these handy settings when working with the BRIGHTHOUSE loader:

  • SET @BH_DATAFORMAT = 'txt_variable'; This means use the BRIGHTHOUSE loader.
  • SET @BH_DATAFORMAT = 'mysql'; This means use the MySQL loader.
  • SET @BH_REJECT_FILE_PATH = '/tmp/new_file_name'; This means copy any line from your CSV file into this output file when it doesn't match the schema you're loading into. This is especially helpful for debugging, or for when you have large CSV files with headers that you cannot remove, because the BRIGHTHOUSE loader does not implement the IGNORE n LINES parameter of the LOAD DATA INFILE command.
  • SET @BH_ABORT_ON_COUNT = 1000; Aborts the file load command when this number of lines in your input CSV file does not match the schema of the table you're loading into.

Also bear in mind that there is a bug in the BRIGHTHOUSE CSV exporter that means backslashes are not escaped in the resulting CSV file. So if you want to load this CSV back into your database schema, you will need to use LOAD DATA INFILE '...' INTO TABLE ... FIELDS ESCAPED BY '', or the importer will treat each backslash as the start of an escape sequence that changes the meaning of the character following it.
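To see why the missing escaping matters, here is a small Python sketch (my own illustration, not Infobright code) that approximates the MySQL loader's default escape handling: with the default ESCAPED BY '\\', a Windows path that was exported unescaped comes back corrupted, while FIELDS ESCAPED BY '' leaves it intact.

```python
# Rough simulation of MySQL LOAD DATA escape handling (illustrative only;
# the \N NULL marker and character-set details are omitted for brevity).
MYSQL_ESCAPES = {'0': '\0', 'b': '\b', 'n': '\n', 'r': '\r', 't': '\t', 'Z': '\x1a'}

def unescape_default(field: str) -> str:
    """What the loader does with ESCAPED BY '\\' (the default)."""
    out, i = [], 0
    while i < len(field):
        if field[i] == '\\' and i + 1 < len(field):
            # A recognized sequence becomes a control character;
            # any other backslash simply drops, keeping the next char.
            out.append(MYSQL_ESCAPES.get(field[i + 1], field[i + 1]))
            i += 2
        else:
            out.append(field[i])
            i += 1
    return ''.join(out)

def unescape_none(field: str) -> str:
    """What the loader does with FIELDS ESCAPED BY ''."""
    return field  # no escape processing at all

exported = r'C:\temp\new'  # written unescaped by the buggy exporter
print(unescape_default(exported) == exported)  # False: \t became TAB, \n a newline
print(unescape_none(exported) == exported)     # True: the path survives
```

The same reasoning is why the export/re-import round trip in the block below uses ESCAPED BY '' on both sides.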

Here is a handy SQL block that might help you load a CSV into a BRIGHTHOUSE table with a mismatched schema:

CREATE TABLE `tmptable` ENGINE=MYISAM SELECT * FROM `maintable` WHERE 0 LIMIT 1;

SET @BH_DATAFORMAT = 'mysql';

LOAD DATA INFILE '/tmp/myfile' INTO TABLE `tmptable`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' 
ESCAPED BY '' 
LINES TERMINATED BY '\n' 
IGNORE 1 LINES
(field1, field2, field3) SET field4 = 'somevalue'; /*as per your CSV layout*/

SELECT * FROM tmptable INTO OUTFILE '/tmp/mynewfile.csv' 
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' 
ESCAPED BY '' 
LINES TERMINATED BY '\n';

DROP TABLE `tmptable`;

SET @BH_DATAFORMAT = 'txt_variable';
SET @BH_REJECT_FILE_PATH = '/tmp/failed_csv_lines.csv';
SET @BH_ABORT_ON_COUNT = 10;

LOAD DATA INFILE '/tmp/mynewfile.csv' INTO TABLE `maintable`
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"' 
ESCAPED BY '' 
LINES TERMINATED BY '\n';

(The WHERE 0 LIMIT 1 bit works where LIMIT 0 does not. In MySQL, LIMIT 0 works fine, but with BRIGHTHOUSE it throws a really strange error that took me a while to track down, so I use this method instead.)

I know this answer is a bit late coming, but I hope it's useful to anyone who stumbles upon this question.
