Splitting a string into words and punctuation with Ruby
Question
I'm working in Ruby and I want to split a string and its punctuation into an array, but I want to treat apostrophes and hyphens as parts of words. For example,
s = "here...is a happy-go-lucky string that I'm writing"
should become
["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"].
The closest I've gotten so far is still inadequate, because it doesn't treat hyphens and apostrophes as part of the word:
s.scan(/\w+|\W+/).select {|x| x.match(/\S/)}
which produces
["here", "...", "is", "a", "happy", "-", "go", "-", "lucky", "string", "that", "I", "'", "m", "writing"]
Recommended answer
You can try the following:
s.scan(/[\w'-]+|[[:punct:]]+/)
#=> ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]
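To spell out why this works: the first alternative, [\w'-]+, matches a run of word characters, apostrophes, or hyphens as a single token, and the second, [[:punct:]]+, picks up any remaining run of punctuation. Whitespace matches neither alternative, so scan skips it and no select step is needed. The sketch below wraps the pattern in a helper and adds a stricter variant. The names tokenize and tokenize_strict are hypothetical, and the stricter pattern is an assumption about what you might want, not part of the answer above: it only keeps an apostrophe or hyphen inside a word when it sits between word characters.

```ruby
# Hypothetical helper names; only the first pattern comes from the answer above.

# Pattern from the answer: apostrophes and hyphens count as word characters.
def tokenize(text)
  text.scan(/[\w'-]+|[[:punct:]]+/)
end

# Stricter variant (an assumption, not from the answer): an apostrophe or
# hyphen is kept in a word only when it is followed by more word characters,
# so quotes wrapped around a word become separate punctuation tokens.
def tokenize_strict(text)
  text.scan(/\w+(?:['-]\w+)*|[[:punct:]]+/)
end

s = "here...is a happy-go-lucky string that I'm writing"
p tokenize(s)
#=> ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]

p tokenize("'quoted' words")
#=> ["'quoted'", "words"]

p tokenize_strict("'quoted' words")
#=> ["'", "quoted", "'", "words"]
```

Which variant fits depends on the input: the answer's pattern is simpler, but it also treats a leading or trailing apostrophe or hyphen as part of the word.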