Splitting a string into words and punctuation with Ruby


Question

I'm working in Ruby and I want to split a string and its punctuation into an array, but I want to treat apostrophes and hyphens as parts of words. For example,

s = "here...is a     happy-go-lucky string that I'm writing"

should become

["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"].

The closest I've gotten so far is still inadequate, because it doesn't treat hyphens and apostrophes as part of a word:

s.scan(/\w+|\W+/).select {|x| x.match(/\S/)}

which produces

["here", "...", "is", "a", "happy", "-", "go", "-", "lucky", "string", "that", "I", "'", "m", "writing"]

Recommended answer

You can try the following:

s.scan(/[\w'-]+|[[:punct:]]+/)
#=> ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]
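The character class [\w'-] in the pattern above accepts apostrophes and hyphens anywhere in a token, so a quoted word such as 'quoted' comes back as one token with its quotes attached. If that matters, one possible variation (a sketch, not part of the original answer) is to allow an apostrophe or hyphen only when it sits between word characters. The method name tokenize below is hypothetical and used only for illustration.

```ruby
# Hypothetical helper; the name `tokenize` is an assumption for this sketch.
def tokenize(text)
  # First alternative: a run of word characters, optionally continued by an
  # apostrophe or hyphen that is immediately followed by more word characters.
  # Second alternative: a run of punctuation characters.
  # The group is non-capturing so that scan returns plain strings.
  text.scan(/\w+(?:['-]\w+)*|[[:punct:]]+/)
end

s = "here...is a     happy-go-lucky string that I'm writing"
p tokenize(s)
#=> ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]

p tokenize("'quoted' words")
#=> ["'", "quoted", "'", "words"]
```

For the example string in the question both patterns give the same result; they differ only when an apostrophe or hyphen appears at the edge of a word or on its own.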
