Regex to extract temperatures and temperature ranges from a string

Problem description

Okay, I have this string:

-64.5(Ethylene glycol monobutyl ether acetate)- -24.4 deg C(N-Methylpyrrolidone)

And the final result I am looking for is this:

-64.5 - -24.4 deg C

The combination of dashes in the chemical names, minus signs on negative numbers, and the dash separator that indicates a temperature range is killing me!

Any help would be greatly appreciated!

Example inputs:

> 1000 °C ( > 1832 °F )
> -64,6 deg C (Ethylene glycol monobutyl ether acetate)
-30 to -15 deg C ( -22 to 5 deg F )
-64.5(Ethylene glycol monobutyl ether acetate)- -24.4 deg C(N-Methylpyrrolidone)

Expected output:

two results: > 1000 deg C and > 1832 deg F
> -64.6 deg C
-30 - -15 deg C
-64.5 - -24.4 deg C

Sorry if I am not describing what I am trying to accomplish very well!

Recommended answer

This appears to do what you want, although so far it doesn't split or remove the temperatures in parentheses, because it's not clear why example 1 should produce two results whilst example 3 produces only one. (Is it relevant that one is a range and the other is not?)

It works by removing the bits you don't want, leaving only the relevant information. It does this with a regex negative lookahead, (?!...), which specifies that if the text at the current position matches the lookahead, no match is accepted at that position.
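As a rough illustration of the negative lookahead on its own, here is a minimal sketch in Java (chosen because the CFML in this answer hands its patterns to Java's regex engine). The class and method names are hypothetical, and the pattern is a cut-down version that protects only the word "deg":

```java
import java.util.regex.Pattern;

final class LookaheadDemo {
    // Match whole words, but refuse to start a match where the
    // standalone word "deg" begins (hypothetical, simplified pattern).
    private static final Pattern NOT_DEG =
        Pattern.compile("(?!\\bdeg\\b)\\b\\w+\\b");

    // Remove every word the pattern accepts.
    static String stripWords(String s) {
        return NOT_DEG.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // "acetate" is removed, "deg" survives.
        System.out.println("[" + stripWords("deg acetate") + "]");
        // "degree" is removed: the \b after "deg" fails inside "degree",
        // so the lookahead does not block the match there.
        System.out.println("[" + stripWords("degree deg") + "]");
    }
}
```

The lookahead consumes no characters; it only vetoes positions. That is why the full pattern can list several protected tokens (numbers, deg, C, F, >) as alternatives inside a single (?!...) group.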

(Also, it changes "to" to "-" and "°C" to "deg C", as per your expected values.)

<cfsavecontent variable="TempsRx">(?x)

    ## Exclude numbers, "deg", "C", "F", and GT sign.
    (?!
        \d+(?:[.,]\d+)?
    |
        \bdeg\b
    |
        \b[CF]\b
    |
        >
    )

    ## Match words
    \b[\w]+[\w-]*\b

</cfsavecontent>

<cfsavecontent trim variable="Inputs">
> 1000 °C ( > 1832 °F )
> -64,6 deg C (Ethylene glycol monobutyl ether acetate)
-30 to -15 deg C ( -22 to 5 deg F )
-64.5(Ethylene glycol monobutyl ether acetate)- -24.4 deg C(N-Methylpyrrolidone)
</cfsavecontent>

<cfloop index="CurIn" array=#Inputs.split('\n')# >

    <!---
        Replace 1/2: Normalise to/- and °/deg as per expected values
        Replace 3: Remove unwanted words
        Replace 4: Cleanup leftover parens
    --->
    <cfset Out = CurIn
        .replaceAll(' to ',' - ')
        .replaceAll('°(?=[CF]\b)','deg ')
        .replaceAll(TempsRx,'')
        .replaceAll('\(\s*\)',' ')
         />

    <cfdump var=#[CurIn,Out]# />

</cfloop>
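For readers without a CFML engine, the same four replacements can be sketched in plain Java, since the CFML above is calling java.lang.String methods directly. This is an approximate port, not the answer's own code: the class name TempCleaner and the method clean are hypothetical, and the final trim() is an addition that the CFML version does not perform.

```java
import java.util.regex.Pattern;

final class TempCleaner {
    // Same pattern as TempsRx, in free-spacing mode (?x).
    private static final Pattern WORDS = Pattern.compile(
        "(?x)\n"
      + "# Do not start a match on a number, 'deg', 'C', 'F' or '>'.\n"
      + "(?! \\d+(?:[.,]\\d+)? | \\bdeg\\b | \\b[CF]\\b | > )\n"
      + "# Match a word, allowing embedded hyphens.\n"
      + "\\b\\w+[\\w-]*\\b\n");

    static String clean(String input) {
        String out = input
            .replace(" to ", " - ")                     // 1: to -> -
            .replaceAll("\u00B0(?=[CF]\\b)", "deg ");   // 2: degree sign -> deg
        out = WORDS.matcher(out).replaceAll("");        // 3: drop unwanted words
        return out
            .replaceAll("\\(\\s*\\)", " ")              // 4: drop emptied parens
            .trim();                                    // assumption: not in the CFML
    }

    public static void main(String[] args) {
        String[] inputs = {
            "> 1000 \u00B0C ( > 1832 \u00B0F )",
            "> -64,6 deg C (Ethylene glycol monobutyl ether acetate)",
            "-30 to -15 deg C ( -22 to 5 deg F )",
            "-64.5(Ethylene glycol monobutyl ether acetate)- -24.4 deg C(N-Methylpyrrolidone)"
        };
        for (String in : inputs) {
            System.out.println(clean(in));
        }
    }
}
```

With the four example inputs, this sketch should print:

> 1000 deg C ( > 1832 deg F )
> -64,6 deg C
-30 - -15 deg C ( -22 - 5 deg F )
-64.5 - -24.4 deg C

As noted in the answer, the parenthesised temperatures are neither split out nor removed, and the decimal comma in -64,6 is left untouched.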
