Limiting results in PROC SQL

Question

I am trying to use PROC SQL to query a DB2 table with hundreds of millions of records. During the development stage, I want to run my query on an arbitrarily small subset of those records (say, 1000). I've tried using INOBS to limit the observations, but I believe that this parameter is simply limiting the number of records which SAS is processing. I want SAS to only fetch an arbitrary number of records from the database (and then process all of them).

If I were writing a SQL query myself, I would simply use SELECT * FROM x FETCH FIRST 1000 ROWS ONLY ... (the equivalent of SELECT TOP 1000 * FROM x in SQL Server). But PROC SQL doesn't seem to have any option like this. It's taking an extremely long time to fetch the records.

The question: how can I instruct SAS to limit the number of records returned from the database to an arbitrary count?

I've read that PROC SQL uses ANSI SQL, which doesn't have any specification for a row-limiting keyword. Perhaps SAS didn't feel like making the effort to translate its SQL syntax to vendor-specific keywords? Is there no workaround?

Solution

When SAS talks to a database through SAS syntax, part of the query can be translated into the equivalent DBMS language; this is called implicit pass-through. The rest of the query is "post-processed" by SAS to produce the final result. Which parts of SAS syntax are considered translatable, and are therefore sent to the DBMS instead of being performed by SAS, depends on the SAS version, the DBMS vendor and version, and in some cases even on connection/LIBNAME options.

As for the SAS SQL options INOBS and OUTOBS: I've worked a lot with MS SQL and Oracle through different versions of SAS, and I have never seen them translated into TOP xxx-style queries. So this is probably not supported yet, although it should be quite doable when the query touches only DBMS data (no joins to SAS data, etc.).

So I think you're left with so-called explicit pass-through: specific SAS SQL syntax for connecting to the database. This type of query looks like this:

proc sql;
    connect to oracle as db1 (user=user1 pw=pasw1 path=DB1);
    create table test_table as
    select *
    from connection to db1
        ( /* here we're in oracle */
          select * from test.table1 where rownum < 20
        )
    ;
    disconnect from db1;
quit;
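Since the question is about DB2 and not Oracle, the same pattern can be sketched with the `FETCH FIRST 1000 ROWS ONLY` clause the asker mentions. This is only a sketch under assumptions: the connection options (`database=`, `user=`, `password=`), the database name `MYDB`, and the table `myschema.big_table` are hypothetical placeholders, and the exact connection options depend on the platform your SAS/ACCESS Interface to DB2 runs on.

```sas
proc sql;
    /* Hypothetical connection options and names: adjust for your site */
    connect to db2 as db1 (database=MYDB user=user1 password=pasw1);

    create table work.sample_rows as
    select *
    from connection to db1
        ( /* everything inside these parentheses is sent to DB2 as written */
          select *
          from myschema.big_table
          fetch first 1000 rows only
        );

    disconnect from db1;
quit;
```

Because the row limit is part of the text handed to DB2, only those rows should travel back to SAS, which is the behavior INOBS does not give you.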

In SAS 9.3 the syntax can be simplified: if there's already a LIBNAME connection, you can reuse it for explicit pass-through:

LIBNAME ORALIB ORACLE user=...;

PROC SQL;
    /* documented form is CONNECT USING libref <AS alias> */
    connect using ORALIB;
    create table work.test_table as
    select *
    from connection to ORALIB (
....

When connecting through a LIBNAME, be sure to set the READBUFF option (I usually use 5000 or so) for reads, or the INSERTBUFF option (1000 or more) when loading data into the database.

To see whether implicit pass-through takes place, set the sastrace option:
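As a rough illustration, both buffer options can be placed directly on the LIBNAME statement. The credentials and path below simply reuse the placeholder values from the Oracle example, and the buffer sizes are the ballpark figures mentioned above, not tuned recommendations.

```sas
/* Sketch only: placeholder credentials, illustrative buffer sizes */
libname oralib oracle user=user1 password=pasw1 path=DB1
        readbuff=5000      /* rows fetched per round trip when reading */
        insertbuff=1000;   /* rows sent per insert when loading */
```

Suitable values depend on row width and available memory, so it is worth experimenting with a few settings.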

option sastrace=',,,ds' sastraceloc=saslog nostsuffix;
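A minimal way to use this, assuming a libref named `oralib` is already assigned and that `table1` is a hypothetical table in it, is to turn tracing on, run a small query through the libref, and then read the SAS log to see what SQL was actually sent to the DBMS.

```sas
/* Turn on tracing of the SQL that SAS sends to the DBMS */
options sastrace=',,,ds' sastraceloc=saslog nostsuffix;

proc sql inobs=1000;
    /* oralib.table1 is a hypothetical table used for illustration */
    create table work.trace_test as
    select *
    from oralib.table1;
quit;

/* Switch tracing off again once the log has been checked */
options sastrace=off;
```

If the statement shown in the log carries no row limit, the limit is being applied on the SAS side and explicit pass-through remains the way to restrict what the database returns.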
