Ruby Regex to capture everything between two strings (inclusive)
Question
I'm trying to sanitize some HTML and just remove a single tag (and I'd really like to avoid using nokogiri, etc). The following string appears in my input, and I want to get rid of it:
<div class="the_class">Some junk here that's different every time</div>
This appears exactly once in my string, and I'd like to find a way to remove it. I've tried coming up with a regex to capture it all, but I can't find one that works.
I've tried /<div class="the_class">(.*)<\/div>/m and that works, but it also matches up to and including any further </div> tags in the document, which I don't want.
Any ideas on how to approach this?
Recommended answer
I believe you're looking for a non-greedy regex, like this:
/<div class="the_class">(.*?)<\/div>/m
Note the added ? after the *. Now the capturing group will capture as little as possible (non-greedy) instead of as much as possible (greedy), so the match stops at the first </div> that follows the opening tag rather than at the last one in the document.
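To see the difference in practice, here is a minimal sketch. The sample HTML (the header and footer divs) and the variable names are made up for illustration; only the the_class div and the two patterns come from the question and answer above.

```ruby
# Hypothetical sample input: only the "the_class" div is from the question,
# the surrounding header and footer divs are invented for this demo.
html = <<~HTML
  <div id="header">Header</div>
  <div class="the_class">Some junk here
  that's different every time</div>
  <div id="footer">Footer</div>
HTML

# The /m flag lets . match newlines, so the junk may span several lines.
greedy = /<div class="the_class">(.*)<\/div>/m
lazy   = /<div class="the_class">(.*?)<\/div>/m

# String#[] with a regex returns the matched substring (or nil).
greedy_match = html[greedy] # runs through the LAST </div>, swallowing the footer
lazy_match   = html[lazy]   # stops at the FIRST </div> after the opening tag

# The tag appears exactly once, so sub (first match only) is enough
# to remove it together with its contents.
cleaned = html.sub(lazy, "")

puts cleaned
```

One caveat: because the non-greedy version stops at the first </div>, it would cut the match short if the_class div ever contained a nested div. For the flat case described in the question that should not matter.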