SAS中的Blob字段被截断 [英] Blob fields in SAS gets truncated

查看:324
本文介绍了SAS中的Blob字段被截断的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在从事SAS作业,该作业从SQL Server提取表,然后将该表加载到Oracle表。

I have been working on a SAS job that extracts a table from SQL server and then loads that table to an Oracle table.

SQL中的其中一个字段服务器是blob,它们可以大到1G。当我在oracle表上运行此blob似乎被截断时,我收到了长度警告,结果文件损坏了。

One of the fields there in SQL server is blob and they can be as big as 1G. I am getting length warnings when I run this blobs on oracle table seems to be truncated and as a result files there are corrupt.

我已经看到SAS指出字符变量可以最高为32K,但SAS还声明它可以访问高达2G的Blob。

I have seen SAS stating that character variable can be max 32K but SAS also states it can access blobs up to 2G.

我们如何实现?

proc sql;
create view work.W2K3NU8 as
  select
     ID,
     DNUMBER,
     FILENAME,
     FILE   
        format = $HEX2048.
        informat = $HEX2048.,
     (input(compress(DATEENTERED),YYMMDD10.)) as DATEENTERED length = 8
        format = date.
        informat = date.
        label = 'DATEENTERED',
     (input(compress(DATEADDED),YYMMDD10.)) as DATEADDED length = 8
        format = date.
        informat = date.
        label = 'DATEADDED',
     (input(compress(DATECHANGED),YYMMDD10.)) as DATECHANGED length = 8
        format = date.
        informat = date.
        label = 'DATECHANGED',
     TYPE
from &SYSLAST;
quit;

这是数据步骤

      data trd.GAFILES
          (dbnull = (
                     ID = NO
                     DNUMBER = YES
                     FILENAME = YES
                     GA_FILE = YES
                     DATEENTERED = YES
                     DATAADDED = YES
                     DATECHANGED = YES
                     TYPE = YES
                     ETL_CREATE = YES
                     ETL_UPDATE = YES));
     attrib ID length = $255
        format = $255.
        informat = $255.
        label = 'ID'; 
     attrib DNUMBER length = $10
        format = $10.
        informat = $10.
        label = 'DNUMBER'; 
     attrib FILENAME length = $255
        format = $255.
        informat = $255.
        label = 'FILENAME'; 
     attrib GA_FILE length = $4096
        format = $HEX2048.
        informat = $HEX2048.
        label = 'GA_FILE'; 
     attrib DATEENTERED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATEENTERED'; 
     attrib DATAADDED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATAADDED'; 
     attrib DATECHANGED length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'DATECHANGED'; 
     attrib TYPE length = $100
        format = $100.
        informat = $100.
        label = 'TYPE'; 
     attrib ETL_CREATE length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'ETL_CREATE'; 
     attrib ETL_UPDATE length = 8
        format = DATETIME20.
        informat = DATETIME20.
        label = 'ETL_UPDATE'; 
     call missing(of _all_);
     stop;
  run;


推荐答案

SAS数据集不支持> 32767字符大小。我不确定您在哪里看到它支持更大的功能;您可能正在阅读ACCESS参考,该参考描述了DBMS的不同数据类型(即,在DB2部分中,它描述了BLOB和CLOB允许最大2GB的大小,但这是在描述DB2支持的内容,而不是SAS的支持)。

SAS datasets won't support > 32767 character size. I'm not sure where you saw that it supports greater; you're probably reading the ACCESS reference, which is describing the different data types of the DBMSs (ie, in the DB2 section, it describes BLOBs and CLOBs as allowing up to 2GB in size, but that's describing what DB2 supports - not SAS's support).

SAS会很高兴地访问 BLOB,但最多不会花费32767。您将必须分块阅读,或者在直通会话中使用特定于DBMS的语言(这将使直通会话不接触它)。您可以像这样读取大块(填写适当的子字符串函数和连接信息):

SAS will happily access BLOBs, but it won't take more than 32767 from it. You would have to read in chunks, or use DBMS-specific language in a pass through session (which would have to pass it through without touching it). You can read in chunks like so (fill in appropriate substring function and connection information):

proc sql;
connect to <>;
create table SASTBL as 
  select * from connection to <> (
   select substring_Function(blobfield,1,32767) as blob_1,
          substring_Function(blobfield,32768,32767) as blob_2,
          substring_Function(blobfield,65535,32767) as blob_3,
(... etc ... )
  from your_tbl;
);
quit;

如果您的版本为9.4,则也可以使用FedSQL进行转换。我对FedSQL不太熟悉,但是它的目的是支持比SAS支持的更多的数据类型。它没有明确表示它可以支持BLOB(BLOB上的注释始终映射到类似的数据类型,这大概意味着char或varchar),但是如果您的版本为9.4,则可能值得一试。

If you have 9.4, you also might be able to use FedSQL to do the conversion; I'm not very familiar with FedSQL, but the purpose of it is to support more data types than SAS supports. It doesn't explicitly say that it can support BLOBs (the notes on BLOB are consistently 'mapped to a similar data type', which presumably means char or varchar) but it might be worth a shot if you have 9.4.

这篇关于SAS中的Blob字段被截断的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆