How do I get rid of trailing and embedded spaces in a string?

Question

I am writing a program that converts national and international account numbers into IBAN numbers. To start, I need to form a string: Bank ID + Branch ID + Account Number + ISO Country Code, without the trailing spaces that may be present in these fields. But not every account number has the same length, and some account numbers have branch identifiers while others don't, so I will always end up with trailing spaces from these fields.

My working storage looks like this:

      01 Input-IBAN.
          05 BANK-ID                    PIC N(10) VALUE "LOYD".
          05 BRANCH-ID                  PIC N(10) VALUE "     ".
          05 ACCOUNT-NR                 PIC N(28) VALUE "012345678912   ". 
          05 COUNTRY-CODE               PIC N(02) VALUE "GB".
      01 Output-IBAN                    PIC N(34).

I've put some values in there for the example; in reality they would depend on the input. The branch code is optional, which is why I left it empty in the example.

I basically want to go from this input strung together: "LOYD 012345678912 GB"

To this: "LOYD012345678912GB"

Does anyone know a way to do this that does not result in performance issues? I have thought of using FUNCTION REVERSE and then an INSPECT to tally the leading spaces, but I've heard that's a slow way to do it. Does anyone have any ideas, and maybe an example of how to use them?

Edit:
I've been informed that the elementary fields may contain embedded spaces.

Recommended answer

I see now that you have embedded blanks in the data. Neither of the answers you have so far works, then: Gilbert's "squeezes out" the embedded blanks, and mine would lose any data after the first blank in each field.

However, just to point out, I don't really believe you can have embedded blanks if you are in any way generating an "IBAN". See, for instance, https://en.wikipedia.org/wiki/International_Bank_Account_Number#Structure, specifically:


The IBAN should not contain spaces when transmitted electronically. When printed it is expressed in groups of four characters separated by a single space, the last group being of variable length

If your source data has embedded blanks at the field level, then you need to refer that back up the line for a decision on what to do. Presuming that the answer you get back is the right one (no embedded blanks at the field level), both existing answers are back on the table. You can amend Gilbert's by (logically) changing LENGTH OF to FUNCTION LENGTH and dealing with any possibility of overflowing the output.

With the STRING, you again have to deal with the possibility of overflowing the output.

Original answer, based on the assumption of no embedded blanks.

I'll assume you don't have embedded blanks in the elementary items which make up your structure, as they are sourced from standard values which do not contain embedded blanks.

       MOVE SPACE                   TO OUTPUT-IBAN
       STRING                       BANK-ID 
                                    BRANCH-ID 
                                    ACCOUNT-NR 
                                    COUNTRY-CODE 
         DELIMITED                  BY SPACE 
         INTO                       OUTPUT-IBAN 

STRING only copies the values until it runs out of data to copy, so it is necessary to clear OUTPUT-IBAN before the STRING.

Copying of the data from each source field will end when the first SPACE is encountered in that field. If a field is entirely space, no data will be copied from it.
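To make the fragment above concrete, here is a minimal, self-contained sketch of the same STRING. It is an illustration under assumptions, not the asker's actual program: the fields are declared PIC X rather than PIC N so that it compiles on compilers with limited national support, and the program name IBANSTR is made up.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IBANSTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Assumption: PIC X for portability; the question uses PIC N.
       01  INPUT-IBAN.
           05  BANK-ID          PIC X(10) VALUE "LOYD".
           05  BRANCH-ID        PIC X(10) VALUE SPACE.
           05  ACCOUNT-NR       PIC X(28) VALUE "012345678912".
           05  COUNTRY-CODE     PIC X(02) VALUE "GB".
       01  OUTPUT-IBAN          PIC X(34).
       PROCEDURE DIVISION.
      * Clear the target first: STRING leaves the rest untouched.
           MOVE SPACE TO OUTPUT-IBAN
      * Each source stops at its first space; an all-space
      * BRANCH-ID therefore contributes nothing.
           STRING BANK-ID
                  BRANCH-ID
                  ACCOUNT-NR
                  COUNTRY-CODE
               DELIMITED BY SPACE
               INTO OUTPUT-IBAN
           END-STRING
      * The brackets make the trailing spaces visible.
           DISPLAY ">" OUTPUT-IBAN "<"
           GOBACK.
```

With these values the displayed field should start with LOYD012345678912GB, followed by spaces padding it out to 34 characters.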

STRING will almost certainly cause a run-time routine to be executed, and there will be some overhead for that. Gilbert LeBlanc's example may be slightly faster, but with STRING the compiler deals automatically with all the lengths of all the fields. Because you have National fields, ensure you use the figurative-constant SPACE (or SPACES, they are identical), not a literal value which you think contains a space, " ". It does, but it doesn't contain a National space.

If the result of the STRING is greater than 34 characters, the excess characters will be quietly truncated. If you want to deal with that, STRING has an ON OVERFLOW phrase, where you specify what you want done in that case. If using ON OVERFLOW, or indeed NOT ON OVERFLOW, you should use the END-STRING scope-terminator. A full-stop/period will terminate the STRING statement as well, but a STRING with ON/NOT ON that is terminated that way can never be used within a conditional statement of any type.

Don't use full-stops/periods to terminate scopes.
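A small sketch of the overflow handling described above. The field names, sizes and messages are hypothetical and chosen only so that the overflow actually happens: 20 characters of source are strung into a 15-character target.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STROVFL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical fields: 10 + 10 characters into 15.
       01  FIELD-1              PIC X(10) VALUE "ABCDEFGHIJ".
       01  FIELD-2              PIC X(10) VALUE "KLMNOPQRST".
       01  SHORT-OUTPUT         PIC X(15).
       PROCEDURE DIVISION.
           MOVE SPACE TO SHORT-OUTPUT
           STRING FIELD-1
                  FIELD-2
               DELIMITED BY SPACE
               INTO SHORT-OUTPUT
               ON OVERFLOW
      * Reached when the sources do not fit in the target.
                   DISPLAY "RESULT TRUNCATED: " SHORT-OUTPUT
               NOT ON OVERFLOW
                   DISPLAY "RESULT COMPLETE:  " SHORT-OUTPUT
           END-STRING
           GOBACK.
```

Because the statement is closed with END-STRING rather than a period, the whole STRING could be nested inside an IF or a PERFORM without ending the enclosing statement early.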

COBOL doesn't have "strings". You cannot get rid of trailing spaces in fixed-length fields, unless the data fills the field. Your output IBAN will always contain trailing spaces when the data is short.

If you were to actually have embedded blanks at the field level:

Firstly, if you want to "squeeze out" embedded blanks so that they don't appear in the output, I can't think of a simpler way (using COBOL) than Gilbert's.

Otherwise, if you want to preserve embedded blanks, you have no reasonable choice other than to count the trailing blanks so that you can calculate the length of the actual data in each field.

COBOL implementations do have Language Extensions, and it is unclear which COBOL compiler you are using. If it happens to be AcuCOBOL (now from Micro Focus), then INSPECT supports TRAILING, and you can count trailing blanks that way. GnuCOBOL also supports TRAILING on INSPECT, and in addition has a useful intrinsic FUNCTION, TRIM, which you could use to do exactly what you want (trimming trailing blanks) in a STRING statement.

       move space                   to your-output-field
       string function 
               trim 
                ( your-first-national-source 
                  trailing )
              function 
               trim 
                ( your-second-national-source 
                  trailing )
              function 
               trim 
                ( your-third-national-source 
                  trailing )
              ...
         delimited                  by size
         into                       your-output-field

Note that other than the PIC N in your definitions, the code is the same as if using alphanumeric fields.
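For the other extension mentioned above, INSPECT with TRAILING, a minimal sketch might look like the following. It assumes a compiler that accepts FOR TRAILING (the answer names AcuCOBOL and GnuCOBOL); the program name, the counter names and the sample value with an embedded blank are illustrative, and PIC X is used instead of PIC N for portability.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRAILCNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Sample value with an embedded blank that must survive.
       01  ACCOUNT-NR           PIC X(28) VALUE "0123 5678".
       01  TRAILING-COUNT       PIC 9(4).
       01  DATA-LENGTH          PIC 9(4).
       PROCEDURE DIVISION.
      * TALLYING adds to the counter, so reset it first.
           MOVE ZERO TO TRAILING-COUNT
      * FOR TRAILING is an extension, not Standard COBOL 85.
           INSPECT ACCOUNT-NR
               TALLYING TRAILING-COUNT FOR TRAILING SPACE
      * Data length = field length less the trailing blanks.
           COMPUTE DATA-LENGTH =
               FUNCTION LENGTH (ACCOUNT-NR) - TRAILING-COUNT
           DISPLAY "DATA LENGTH: " DATA-LENGTH
           GOBACK.
```

The embedded blank is counted as data here, since only the blanks after the last non-blank character are tallied.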

However, for Standard COBOL 85 code...

You mentioned using FUNCTION REVERSE followed by INSPECT. INSPECT can count leading spaces but not, by the Standard, trailing spaces. So you can reverse the bytes in a field, and then count the leading spaces.

You have National data (PIC N). A difference with that is that it is not bytes you need to count, but characters, which are made up of two bytes each. Since the compiler knows you are using PIC N fields, there is only one thing to trip you up: the Special Register, LENGTH OF, counts bytes, so you need FUNCTION LENGTH to count characters.

National data is UTF-16, which happens to mean that, for a displayable character, one of the two bytes of each character is the "ASCII" value. That doesn't matter either when running on z/OS, an EBCDIC machine, as the compiler will do the necessary conversions automatically for literals or alphanumeric data-items.

       MOVE ZERO                    TO a-count-for-each-field 
       INSPECT FUNCTION 
                REVERSE 
                 ( each-source-field )
         TALLYING                   a-count-for-each-field 
          FOR LEADING               SPACE 

After doing one of those for each field, you could use reference-modification.

How to use reference-modification for this?

Firstly, you have to be careful. Secondly, you don't.

Secondly:

MOVE SPACE                   TO output-field
STRING field-1 ( 1 : length-1 )
       field-2 ( 1 : length-2 )
  DELIMITED BY               SIZE
  INTO                       output-field

Again, deal with overflow if possible/necessary.

It is also possible with plain MOVEs and reference-modification, as in this answer, https://stackoverflow.com/a/31941665/1927206, whose question is close to a duplicate of your question.

Why do you have to be careful? Again, from the answer linked previously, theoretically a reference-modification can't have a zero length.

In practice, it will probably work. COBOL programmers generally seem to be so keen on reference-modification that they don't bother to read about it fully, so they don't worry about a zero length not being Standard, and don't notice that it is non-Standard, because it "works". For now. Until the compiler changes.

If you are using Enterprise COBOL V5.2 or above (possibly V5.1 as well, I just haven't checked), then you can be sure, by compiler option if you want, that a zero-length reference-modification works as expected.
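One way to stay clear of the zero-length question altogether is to test each calculated length before using it. The sketch below is one possible arrangement, not the answer's own code: it combines the REVERSE and INSPECT count with a guarded STRING, uses a small table and WITH POINTER to append one field at a time, and all names, sizes and sample values (including the embedded blank in the first entry and the all-space second entry) are hypothetical. PIC X is again used in place of PIC N.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REFMODIB.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SOURCE-TABLE.
           05  SOURCE-FIELD     PIC X(10) OCCURS 3 TIMES.
       01  FIELD-IDX            PIC 9(4).
       01  SPACE-COUNT          PIC 9(4).
       01  DATA-LENGTH          PIC 9(4).
       01  OUT-POINTER          PIC 9(4).
       01  OUTPUT-FIELD         PIC X(30).
       PROCEDURE DIVISION.
           MOVE "LO YD" TO SOURCE-FIELD (1)
           MOVE SPACE   TO SOURCE-FIELD (2)
           MOVE "GB"    TO SOURCE-FIELD (3)
           MOVE SPACE   TO OUTPUT-FIELD
           MOVE 1       TO OUT-POINTER
           PERFORM VARYING FIELD-IDX FROM 1 BY 1
                   UNTIL FIELD-IDX > 3
      * Trailing blanks become leading blanks once reversed.
               MOVE ZERO TO SPACE-COUNT
               INSPECT FUNCTION REVERSE (SOURCE-FIELD (FIELD-IDX))
                   TALLYING SPACE-COUNT FOR LEADING SPACE
               COMPUTE DATA-LENGTH =
                   FUNCTION LENGTH (SOURCE-FIELD (FIELD-IDX))
                   - SPACE-COUNT
      * Skip empty fields: no zero-length reference-modification.
               IF DATA-LENGTH > ZERO
                   STRING SOURCE-FIELD (FIELD-IDX) (1:DATA-LENGTH)
                       DELIMITED BY SIZE
                       INTO OUTPUT-FIELD
                       WITH POINTER OUT-POINTER
                   END-STRING
               END-IF
           END-PERFORM
           DISPLAY ">" OUTPUT-FIELD "<"
           GOBACK.
```

With these sample values the output should begin with LO YDGB: the embedded blank is kept, the empty second entry is skipped, and the rest of the field stays as spaces. Overflow handling is left out to keep the sketch short.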

Some other ways to achieve your task, if embedded blanks can exist and can be significant in the output, are covered in that answer. With National, just always take care to use FUNCTION LENGTH (which counts characters), not LENGTH OF (which counts bytes). Usually LENGTH OF and FUNCTION LENGTH give the same answer. For multi-byte characters, they do not.
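A quick way to see the difference on your own compiler is to display both values for a national and an alphanumeric item of the same declared size. This is only a sketch: the program and field names are made up, and how well PIC N with USAGE NATIONAL is supported varies between compilers.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NATLEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Same declared size, different character width.
       01  NATIONAL-FIELD       PIC N(10) USAGE NATIONAL.
       01  ALPHANUM-FIELD       PIC X(10).
       PROCEDURE DIVISION.
      * LENGTH OF reports bytes.
           DISPLAY "PIC N, LENGTH OF:       "
               LENGTH OF NATIONAL-FIELD
      * FUNCTION LENGTH reports character positions.
           DISPLAY "PIC N, FUNCTION LENGTH: "
               FUNCTION LENGTH (NATIONAL-FIELD)
           DISPLAY "PIC X, LENGTH OF:       "
               LENGTH OF ALPHANUM-FIELD
           DISPLAY "PIC X, FUNCTION LENGTH: "
               FUNCTION LENGTH (ALPHANUM-FIELD)
           GOBACK.
```

On a compiler that stores each national character in two bytes, as described above, the two PIC N values should differ by a factor of two, while the two PIC X values should match.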
