String range forward and backward lookaround

Question

I am trying to write a script that gets input from a user and returns the input in a formatted area. I have been using the string range function, but it obviously cuts the input exactly at the range that I give. Is there any way to look around the specified range, find the next space character, and cut the input at that location?

For example, if I have the input of:


Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris

My current string range function formats the input with \r\n like this:


Lorem ipsum dolor sit amet, consectetur a
dipisicing elit, sed do eiusmod tempor in
cididunt ut labore et dolore magna aliqua
. Ut enim ad minim veniam, quis nostrud e
xercitation ullamco laboris

As you can see, the word adipisicing on line 1 and the word incididunt on line 2 have been cut off. I am looking for a way to find the space closest to that location. So for line 1 the cut would have been before the a, and for line 2 it would have been before the i. …In some cases it may be after the word.

Is it clear what I am looking for? Any assistance would be great!

Recommended answer

The string range operation is pretty stupid; it doesn't know anything about the string it is splitting other than that it contains characters. To get smarter splitting, your best bet is probably an intelligently chosen regular expression:

set s "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod\
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis\
nostrud exercitation ullamco laboris."

# Up to 40 characters, from word-start, to word-start or end-of-string
set RE {\m.{1,40}(?:\m|\Z)}
# Extract the split-up list of "lines" and print them as lines
puts [join [regexp -all -inline $RE $s] "\n"]

This produces the following output for me:


Lorem ipsum dolor sit amet, consectetur 
adipisicing elit, sed do eiusmod tempor 
incididunt ut labore et dolore magna 
aliqua. Ut enim ad minim veniam, quis 
nostrud exercitation ullamco laboris.
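
The pattern can also be wrapped in a small helper so the width is not hard-coded. This is only a minimal sketch and not part of the original answer: the name wrapLines and its width parameter are made up for illustration, and the helper additionally trims the trailing space that each matched chunk keeps.

```tcl
# Hypothetical helper (not from the original answer): split text at word
# starts into chunks of at most $width characters, one list element per line.
proc wrapLines {text {width 40}} {
    # Same pattern as in the answer, with the width substituted into the bound
    set re [format {\m.{1,%d}(?:\m|\Z)} $width]
    set lines {}
    foreach chunk [regexp -all -inline $re $text] {
        # Each chunk still ends with the space it was split at; drop it
        lappend lines [string trimright $chunk]
    }
    return $lines
}

set s "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod\
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis\
nostrud exercitation ullamco laboris."

puts [join [wrapLines $s 40] "\n"]
```

With a width of 40 this should give the same five lines as above, minus the trailing spaces. One caveat, as far as I can tell: a single word longer than the width has no valid split point, so the pattern skips over it rather than breaking it.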

Implementing full justification by inserting spaces is left as an exercise for the reader (because it's really quite a lot harder than greedy line splitting!)
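
For comparison, the "look around for a space" idea from the question can also be done with plain string commands instead of a regular expression. The following is a rough sketch under some assumptions, not the answer's method: the proc name wrapBySpace is invented, words are assumed to be separated by single spaces, and a word with no space before the limit is simply left overlong. It uses string last to look backward from the cut position and falls back to string first to look forward.

```tcl
# Hypothetical helper: greedy wrapping with string last / string range only.
proc wrapBySpace {text {width 40}} {
    set lines {}
    while {[string length $text] > $width} {
        # Look backward from the cut position for the nearest space
        set cut [string last " " $text $width]
        if {$cut < 0} {
            # No space at or before the limit: look forward instead
            set cut [string first " " $text $width]
            if {$cut < 0} break
        }
        # Keep everything before the space, continue after it
        lappend lines [string range $text 0 [expr {$cut - 1}]]
        set text [string range $text [expr {$cut + 1}] end]
    }
    if {$text ne ""} {
        lappend lines $text
    }
    return $lines
}

set s "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod\
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis\
nostrud exercitation ullamco laboris."

puts [join [wrapBySpace $s 40] "\r\n"]
```

The lines are joined with \r\n here only because that is the separator the question mentions; any separator works.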
