Extract substring from a string in SAS


Question

I am new to this forum, but I have read many answers on this website to SAS coding questions I have had. I have run into a SAS coding problem at work that I hope somebody can help with.

I am trying to extract a numeric substring from a text string. The numeric substring always comes right before a word like "YR" or "YEAR". Sometimes there is a space between the numeric substring and "YR" or "YEAR". Both the numeric substring and the text string vary in length from observation to observation. Here is an example of what it looks like: (screenshot of SAS dataset)

The number right before "YR" or "YEAR" is the numeric string I want to extract. I have tried using the find function to locate "YR" or "YEAR", then substrn to extract the surrounding text, and then compress to strip the alphabetic characters. The result is not ideal: sometimes it picks up a number from the first part of the string, and sometimes it does not pick up the whole number (e.g. 4.75). Here is the code that I have used:

if find(deal_type_oss, "YR","i") ne 0
then term=compress(substrn(deal_type_oss, find(deal_type_oss, "YR","i")-4,6),"","a");
if find(deal_type_oss,"Year","i") ne 0 
then term=compress(substrn(deal_type_oss, find(deal_type_oss, "Year","i")-4,6),"","a"); 

Here is the result of this code: (screenshot of the code result)

Thanks in advance!

Recommended answer

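Try a regular expression with a lookahead. Here, \s matches a whitespace character, \S+ matches one or more non-whitespace characters, \s? matches an optional whitespace character, and (?=YR|YEAR) is a lookahead that requires YR or YEAR to follow immediately without including it in the match. The leading .*\s and the trailing .* consume the rest of the string, so the substitution leaves only the captured group $1.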

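data have;
/* The & modifier lets the value contain single embedded blanks */
input string & $200.;
/* Replace the whole string with the token that directly precedes YR or YEAR */
year=prxchange('s/.*\s(\S+\s?)(?=YR|YEAR).*/$1/',-1,string);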
DATALINES ;
USD2.75BN 4.5YR REV
USD110MM 5YR REV
USD340MM 5YR REV
USE40MM 5YR REVOLVER
USD3.5BN 5YEAR REVOLVER
USD2BN 4YR REV
USD3.5BN 4.75 YEAR REVOLVER
CAD500MM REV 3YR EXP
CAD75MM 5YR REVOLVER
USD1BN 5YR REVOLVER
;
RUN ;
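
One thing to watch with prxchange: when the pattern does not match, it returns the input string unchanged, and the extracted value is still character. If a numeric term is needed, a variant along the following lines might be easier to work with. This is only a sketch under assumptions, not part of the original answer: the dataset name want, the variables rx, term_txt and term, the stricter digit-only pattern, the /i flag, and the last data line (added to show a non-match) are all illustrative choices.

```sas
data want;
  length string $200 term_txt $20;
  retain rx;
  /* Compile the pattern once; /i also accepts yr, Year, etc. (assumption) */
  if _n_ = 1 then rx = prxparse('/(\d+(?:\.\d+)?)\s?(?=YR|YEAR)/i');
  input string & $200.;
  /* prxmatch returns 0 when nothing matches, so term stays missing */
  if prxmatch(rx, string) then do;
    term_txt = prxposn(rx, 1, string);   /* text of capture group 1 */
    term = input(term_txt, ?? best32.);  /* convert to numeric, suppress notes */
  end;
  drop rx term_txt;
  datalines;
USD2.75BN 4.5YR REV
USD3.5BN 4.75 YEAR REVOLVER
CAD500MM REV 3YR EXP
USD1BN TERM LOAN
;
run;
```

With these four lines, term should come out as 4.5, 4.75, 3 and missing. Because the capture group only accepts digits with an optional decimal part, amounts such as 2.75 in USD2.75BN are skipped: they are not followed by YR or YEAR.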
