Regex to find integers and decimals in string


Question

I have some strings:

$str1 = "12 ounces";
$str2 = "1.5 ounces chopped";

I'd like to get the amount from the string, whether or not it is a decimal (12 or 1.5), and then grab the measurement that immediately follows it (ounces).

I was able to grab the measurement with a fairly rudimentary regex, but capturing the number, whether decimal or integer, has been giving me problems.

Thanks for your help!

Recommended answer

If you just want to grab the data, you can use a loose regex:

([\d.]+)\s+(\S+)




  • ([\d.]+): [\d.]+ matches a sequence consisting only of digits and . characters. This means 4.5.6 or .... will also match, but such cases are uncommon, and the goal here is just to grab data. The parentheses capture the matched text. The . here is inside a character class [], so it does not need escaping.

  • \s+(\S+): the number is followed by one or more whitespace characters (\s+) and then the longest possible run of non-whitespace characters (\S+, longest because the quantifier is greedy). Non-whitespace really means non-whitespace: \S matches almost everything in Unicode except whitespace such as space, tab, newline, and carriage return.

You can get the number from the first capturing group and the unit from the second capturing group.
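As a rough illustration of the loose pattern, here is a minimal sketch in Python. Python is an assumption made for this example (the question's snippet looks like PHP or Perl), and the helper name `parse_loose` is hypothetical. The pattern itself is copied unchanged from the answer.

```python
import re

# Loose pattern from the answer: a run of digits and dots, whitespace,
# then a run of non-whitespace characters.
LOOSE = re.compile(r"([\d.]+)\s+(\S+)")


def parse_loose(text):
    """Return (amount, unit) as strings, or None when nothing matches.

    Hypothetical helper for illustration only.
    """
    match = LOOSE.search(text)
    if match is None:
        return None
    # Group 1 is the number, group 2 is the unit.
    return match.group(1), match.group(2)


print(parse_loose("12 ounces"))           # ('12', 'ounces')
print(parse_loose("1.5 ounces chopped"))  # ('1.5', 'ounces')
print(parse_loose("4.5.6 ounces"))        # ('4.5.6', 'ounces')
```

The last call shows the trade-off described above: the loose pattern accepts 4.5.6 as a "number" because it only checks that the characters are digits or dots.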

You can be a bit stricter about the number:

      (\d+(?:\.\d*)?|\.\d+)\s+(\S+)
      




  • The only change is (\d+(?:\.\d*)?|\.\d+), so I will only explain this part. It is a bit stricter, but whether stricter is better depends on the input domain and your requirements. It matches an integer such as 34 and a number with a fractional part such as 3.40000, and it also lets .5 and 34. through. Taken on its own, the number pattern does not accept a number with more than one . or a string consisting only of a . character. The | acts as OR and separates two alternative patterns: \d+(?:\.\d*)? and \.\d+.
  • \d+(?:\.\d*)?: this requires at least one digit in the integer part, followed by an optional . (which must be escaped as \. because an unescaped . means any character) and a fractional part of zero or more digits. The ? at the end makes the whole fractional group optional. () is used for grouping and capturing; when capturing is not needed, (?:) groups without capturing (which saves memory).
  • \.\d+: this handles cases such as .78. It matches a . followed by at least one digit (signified by +).
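The stricter number pattern can be tried out with a short sketch. As before, Python is an assumption for illustration, and the names `parse_strict` and `is_number` are hypothetical. The sketch also shows one caveat: when the pattern is used for an unanchored search, a malformed number such as 4.5.6 is not rejected outright, because the search can still match its tail.

```python
import re

# Stricter pattern from the answer: number in group 1, unit in group 2.
STRICT = re.compile(r"(\d+(?:\.\d*)?|\.\d+)\s+(\S+)")

# Just the number part, used below to validate a whole token.
NUMBER_ONLY = re.compile(r"\d+(?:\.\d*)?|\.\d+")


def parse_strict(text):
    """Return (amount as float, unit), or None when nothing matches."""
    match = STRICT.search(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


def is_number(token):
    """True when the whole token is a number in the stricter sense."""
    return NUMBER_ONLY.fullmatch(token) is not None


print(parse_strict("12 ounces"))           # (12.0, 'ounces')
print(parse_strict("1.5 ounces chopped"))  # (1.5, 'ounces')
print(parse_strict(".5 cup"))              # (0.5, 'cup')
print(is_number("4.5.6"))                  # False
print(is_number("."))                      # False
print(parse_strict("4.5.6 ounces"))        # (5.6, 'ounces'), the search matched the tail
```

If partial matches like the last one are a concern, one option is to split the input on whitespace first and validate the number token with a full match, as `is_number` does.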
This is not a good solution if you want to make sure you get something meaningful out of the input string. You need to define all the expected units before you can write a regex that captures only valid data.
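One possible way to act on that advice is to replace \S+ with an explicit alternation of known units. The sketch below is in Python, chosen for illustration; the unit list and the names `UNITS`, `KNOWN`, and `parse_known` are assumptions, not part of the original answer.

```python
import re

# Hypothetical unit list for illustration; a real application defines its own.
UNITS = ["ounces", "ounce", "cups", "cup", "grams", "gram"]

# Escape each unit and try longer alternatives first.
UNIT_PATTERN = "|".join(
    re.escape(unit) for unit in sorted(UNITS, key=len, reverse=True)
)

# Stricter number pattern from the answer, followed by a known unit.
# The trailing \b prevents matching a unit that is only a prefix of a longer word.
KNOWN = re.compile(r"(\d+(?:\.\d*)?|\.\d+)\s+(" + UNIT_PATTERN + r")\b")


def parse_known(text):
    """Return (amount, unit) as strings only when the unit is in UNITS."""
    match = KNOWN.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_known("12 ounces"))           # ('12', 'ounces')
print(parse_known("1.5 ounces chopped"))  # ('1.5', 'ounces')
print(parse_known("3 apples"))            # None
```

Matching here is case-sensitive; whether to ignore case or accept abbreviations depends on the input you expect.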
