Dynamically call macro from SAS data step

Problem description

This code executes fine when run as a SAS program:

%MyMacro(foo_val, bar_val, bat_val);

I have created a table using:

DATA analyses;
   input title : $32. weight : $32. response : $32.;
   datalines;
foo1 bar1 bat1
foo2 bar2 bat2
;

I want to execute MyMacro once for each row of the analyses table.

The following code appears to pass only the literal strings title, weight and response (rather than the data values foo1 etc.) to my macro (tested with calls to %put):

DATA _NULL_ ;
    set analyses;
    %MyMacro(title, weight, response);

RUN;

How can I invoke the macro once per record of the analyses table whilst passing data values as arguments to the macro? The intention is to actually run this for a very large number of analyses so the solution must scale appropriately to many more records in the analyses table.

Recommended answer

This in part depends on what your macro is doing. If we assume that your macro does something that is intended to run outside of a data step (i.e., it's not just assigning a data step variable), then you have several options.

CALL EXECUTE has already been explained, and is a good option for some cases. It has some downsides, however, particularly with macro timing, which takes extra care to guard against in some cases - particularly when you are creating macro variables inside your macro. Quentin in his comments shows a way to get around this (adding %NRSTR to the call), but I prefer to use CALL EXECUTE only when it has an advantage over the other methods: if I want to use SAS data step techniques (FIRST or LAST processing, for example, or some form of looping) in creating my macro calls, or when I have to do things in a data step anyway and can avoid the overhead of reading the file another time. If I'm just writing a data step like yours above - data something, set something, call execute, run - I wouldn't use it.
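For reference, the %NRSTR form of CALL EXECUTE might look like the sketch below. It uses the analyses table from the question; the %MyMacro body here is a hypothetical stand-in that only echoes its arguments, so swap in the real macro. The original attempt failed because %MyMacro(title, weight, response) is resolved once while the data step compiles, before any row is read; building the call as a string at run time avoids that.

```sas
/* Hypothetical stand-in for the real macro: it only echoes its arguments. */
%macro MyMacro(title, weight, response);
  %put NOTE: title=&title weight=&weight response=&response;
%mend MyMacro;

data _null_;
  set analyses;
  /* Single quotes keep the macro processor away from %nrstr and %MyMacro
     while this step compiles; cats() strips the trailing blanks.        */
  /* %nrstr delays the macro call until after this data step finishes.   */
  call execute(cats('%nrstr(%MyMacro(', title, ',', weight, ',', response, '))'));
run;
```

Each row queues one call such as %MyMacro(foo1,bar1,bat1), and the queued calls run once the data step ends. This assumes the data values contain no commas or unbalanced parentheses; values like that would need extra quoting.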

PROC SQL SELECT INTO is typically what I use for list processing (which is largely what this is). I like SQL's simplicity a bit better when doing things that aren't too complicated; for example, you can easily get just one version of each macro call with DISTINCT, without having to explicitly write a proc sort nodupkey or use first/last processing. It also has a debugging advantage: all of your macro calls are written to the results window (if you don't add noprint), which I find a bit easier to read than the log when I'm trying to see why my calls didn't get generated properly (and it doesn't take any extra PUT statements).

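proc sql;
  /* Build one macro call per row. catx() joins only the arguments,    */
  /* so no stray comma ends up right after the opening parenthesis.    */
  select cats('%MyMacro(', catx(',', title, weight, response), ')')
    into :mvarlist separated by ' '
    from analyses;
quit;

/* Resolving the macro variable runs every generated call in turn. */
&mvarlist.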

That runs them quite simply, and has no timing issues (as you're just writing out a bunch of macro calls).

The main downside to this method is that a macro variable holds a maximum of 64k characters, so if you're writing a huge number of these calls you'll run into that limit. In that case use CALL EXECUTE or %INCLUDE files.

%INCLUDE files are largely useful either as a replacement for SELECT INTO when the calls exceed the character limit, or if you find it useful to have a text file of your calls to look at (if you're running this in batch mode, for example, this could be easier to get to and/or parse than log or listing output). You just write your calls out to a file, and then %INCLUDE that file.

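filename myfile temp; *or a real file if you want to look at it.;
data _null_;
  set analyses;
  file myfile;
  length str $200;
  /* One macro call per row; catx() joins only the arguments, so no */
  /* stray comma ends up right after the opening parenthesis.       */
  str = cats('%MyMacro(', catx(',', title, weight, response), ')');
  put str;
run;

%include myfile;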

I don't really use this much anymore, but it's a common technique, used particularly by older SAS programmers, so it's good to know.

DOSUBL is a relatively new method, and to some extent can be used to replace CALL EXECUTE, as its default behavior is typically closer to what you intuitively expect than CALL EXECUTE's. The doc page really has the best example of how it works differently; basically, it fixes the timing issue by letting each separate call import and export macro variables from/to the calling environment, meaning that each iteration of DOSUBL runs at a distinct time, versus CALL EXECUTE, where everything runs in one bunch and the macro environment is 'fixed' (i.e., any reference to a macro variable is fixed at run time, unless you escape it messily with %NRSTR).
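A minimal DOSUBL sketch, assuming %MyMacro is already defined and the analyses table from the question exists, could look like this:

```sas
data _null_;
  set analyses;
  /* Each DOSUBL call runs the generated macro call to completion */
  /* before the data step moves on to the next row.               */
  rc = dosubl(cats('%MyMacro(', title, ',', weight, ',', response, ')'));
  /* A nonzero return code means SAS could not execute the submitted code. */
  if rc ne 0 then put 'WARNING: DOSUBL could not run the call for ' title=;
run;
```

Because every call executes immediately, no %NRSTR wrapper is needed here. The trade-off, as far as I'm aware, is more overhead per call than CALL EXECUTE, which may matter for a very large analyses table.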

One more thing worth mentioning is RUN_MACRO, a part of the FCMP language. It allows you to run a macro completely and import its results back into the data step, which is an interesting option in some cases (for example, you could wrap a call around a PROC SQL that selects a count of something, and then import that count into the dataset as a variable, all in one data step). It applies when you are calling the macro in order to assign a data step variable, rather than to run a process whose results don't need to come back into the data step; it's worth considering if you do want that data returned to the dataset that called the process.
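As a rough illustration of the RUN_MACRO pattern, here is a sketch modeled on the kind of example given in the PROC FCMP documentation. The names add_macro, add_via_macro and work.funcs.demo are hypothetical, and the macro simply adds two numbers rather than running a real analysis.

```sas
/* Hypothetical macro: reads macro variables a and b, writes p. */
%macro add_macro;
  %let p = %sysevalf(&a + &b);
%mend add_macro;

proc fcmp outlib=work.funcs.demo;
  function add_via_macro(a, b);
    /* RUN_MACRO creates macro variables named after the listed FCMP    */
    /* variables, runs the macro, then copies the values back afterward. */
    rc = run_macro('add_macro', a, b, p);
    if rc eq 0 then return(p);
    else return(.);
  endsub;
run;

/* Tell the data step where to find the compiled function. */
options cmplib=work.funcs;

data _null_;
  total = add_via_macro(5.3, 0.7);
  put total=;
run;
```

The value computed inside the macro comes back as an ordinary data step variable, which is the case this option is suited to.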
