将文件加载到Teradata [英] Loading files into Teradata

查看:164
本文介绍了将文件加载到Teradata的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

每天,我目前必须:

  1. 下载50个文件,40个为.csv(逗号分隔值)格式,10个为.txt.
  2. 在每个.csv文件中,有5列(分别称为B,A,D,C,E),在.txt文件中有5个不同的列(分别称为G,H,I,J, K).
  3. 然后我必须在Excel中手动打开,并为40个文件中的每个文件将.csv列重新排列为(A,B,C,D,E). .txt文件已经是必需的格式.
  4. 然后使用unix上的BTEQ将所有这些加载到15个teradata表中

大约有3000行数据

我已经在stackoverflow和我的反复试验的帮助下创建了VBA,该VBA:

I have, with the help of stackoverflow and my trial and error, created VBA which:

  1. 下载全部50个文件,
  2. 正确重新排列csv文件的列,
  3. 然后在包含.csv文件的"INSERT INTO DBNAME.TABLENAME VALUES(A,B,C,D,E,F)"和"INSERT INTO DBNAME.TABLENAME VALUES(G,H,I)的新工作表上,.,J,K).将所有50个文件中的实际值从A替换为K
  4. 创建与teradata的ODBC连接以遍历所有3000条创建的语句.

它可以工作,但是速度很慢,因为它循环了约3000行.由于teradata数据库始终被大量人员访问,因此每次发生错误时,我的insert语句都必须转到队列的后面,从而增加了完成该过程所需的时间.

It works, but it's painfully slow because it loops through 3000 or so rows. Because the teradata database is accessed by a large number of people at all times, everytime an error occurrs, my insert statements have to go to the back of the queue, adding to the time taken to complete the procedure.

是否可以创建将批量加载50个文件的VBA,而不是创建3000个插入语句并循环的当前方法? 我已经了解了使用"Jet Provider"导入到Teradata的过程,但是对此我知之甚少,到目前为止我的努力都失败了.

Is it possible to create VBA that will batch load the 50 files rather than my current method to create 3000 insert statements and loop? I have read about a process that imports into Teradata by using 'Jet Provider', but I know very little about this and my efforts so far have failed.

有人知道这种方法还是其他更快的方法?我想避免BTEQ,MLOAD以及不能FLOAD(因为我加载到的表不是空的). 谢谢大家.

Anyone know about this method or any other quicker method? I want to avoid BTEQ, MLOAD and can't FLOAD (because tables I load into are not empty). Thanks everyone.

推荐答案

无论有多少用户,Teradata都能轻松处理大量插入.但是,Teradata在插入小刀片时非常不好.您需要将它们捆绑成一大堆,然后一次性加载. Teradata(mload)提供了一些工具.然后,您将获得最佳性能.

Teradata is easily capable of handling massive inserts, no matter how many users. However, Teradata is very bad at small inserts. You need to bundle them into large volumes that are loaded in one go. There is tooling for that delivered with Teradata (mload). Then you get top performance.

作为另一种选择,也许速度较慢,请确保在SQL中使用绑定变量而不是显式值.解析可能需要很多时间.您可能需要考虑使用 Kettle-ETL工具

As an alternative, maybe less speedy, make sure that you are using bind variables and not explicit values in your SQL. Parsing can take a lot of time. You might want to consider tools such as Kettle - ETL tool or Invantive Control - Excel add-in (warning, I work there) to load from Excel into Teradata. They can also load in parallel, improving performance.

这篇关于将文件加载到Teradata的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆