Haskell: replace multiple substrings with regex in a single pass
Question
Suppose I have a string s and a list of tuples of strings ts, where the first element of each tuple is a substring of s that I'd like to replace with the corresponding second element. In my case, the string s will always be a space-separated list of distinct words; moreover, I wish to replace each word in s in its entirety with the corresponding value in ts (I hope my intent is clear here). A first attempt at this might be:
import qualified Text.Regex as R -- from regex-compat
-- Apply each (key, value) substitution in turn, one full pass per key.
replaceAllIn :: String -> [(String, String)] -> String
replaceAllIn = foldl (\acc (k, v) -> R.subRegex (R.mkRegex k) acc v)
This, of course, doesn't work when one key is a substring of another:
λ> s = "blah blahblee"
λ> ts = [("blah", "asdf"), ("blahblee", ";lkj")]
λ> replaceAllIn s ts
"asdf asdfblee"
because the first key replaces both occurrences of "blah" on the first pass, leaving a string that no longer contains anything matching "blahblee" for the second pass.
Is there a way to achieve what I want in one pass through the string? Or is there a built-in way (in some library somewhere) to replace multiple patterns at once?
Edit: Immediately after posting I realized I don't know why I'm using regex here. But the question remains valid if I replace the regex substitution with something like replace from MissingH's Data.String.Utils.
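Since s is described as a space-separated list of words that are replaced in their entirety, one regex-free option is to split on whitespace and look each word up in ts. The following is a minimal sketch under that assumption; the name replaceWords is hypothetical and does not come from the question or from any library. Note that words and unwords normalise whitespace, so runs of spaces collapse to a single space.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical helper (not from the original post): replace whole words only.
-- Each word is looked up once in the association list; words without an
-- entry are kept unchanged. The input is traversed a single time.
replaceWords :: [(String, String)] -> String -> String
replaceWords ts = unwords . map substitute . words
  where
    substitute w = fromMaybe w (lookup w ts)
```

Because a word is only ever compared for equality with a key, "blah" can never match inside "blahblee", so the order of the entries in ts does not matter here.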
Recommended answer
Expanding on my comment from earlier, you can just process the longest string to be replaced first, so that a longer key is substituted before any shorter key that is a substring of it:
λ> s = "blah blahblee"
λ> ts = [("blah", "asdf"), ("blahblee", ";lkj")]
λ> import qualified Data.List as L
λ> replaceAllIn s (L.sortOn (negate . length . fst) ts)
"asdf ;lkj"
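The longest-first idea can also be applied within one left-to-right scan, which avoids a separate pass per key and never rescans text that has already been substituted. The sketch below is an illustration only: replaceMany is a hypothetical name, it treats keys as plain substrings rather than regexes, and it uses nothing beyond base.

```haskell
import Data.List (isPrefixOf, sortOn)

-- Hypothetical helper (not from the original answer): single-pass,
-- plain-substring replacement. At each position the keys are tried
-- longest first; the first key that is a prefix of the remaining input
-- is replaced and scanning resumes after it.
replaceMany :: [(String, String)] -> String -> String
replaceMany ts = go
  where
    -- Longest keys first; empty keys are dropped because they would
    -- match everywhere without consuming input.
    ordered = sortOn (negate . length . fst) (filter (not . null . fst) ts)

    go [] = []
    go str@(c : cs) =
      case [(k, v) | (k, v) <- ordered, k `isPrefixOf` str] of
        (k, v) : _ -> v ++ go (drop (length k) str)
        []         -> c : go cs
```

With the values from the question, replaceMany ts s yields "asdf ;lkj". Trying every key at every position is simple but not fast; for large key sets a trie or an Aho-Corasick automaton would likely be a better fit.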