How to change … (ellipsis) to ... (three periods) in Ruby?


Question

I'm parsing this document using nokogiri. I found that the page contains some … (ellipsis) characters that I can't remove. I want to know how to use Ruby to replace every … (ellipsis) with ... (three periods).

BTW, you can search for this string to find all the …s:

Specifies whether ALTER TABLE

I have added my program and the error message.

# encoding: UTF-8
require 'nokogiri'
require 'open-uri'
require 'terminal-table'

def change s
    {Nokogiri::HTML("&nbsp;").text => " ", 
     Nokogiri::HTML("&quot;").text => '"',
     Nokogiri::HTML("&trade;").text => '(TM)',
     Nokogiri::HTML("&amp;").text => "&",
     Nokogiri::HTML("&lt;").text => "<",
     Nokogiri::HTML("&gt;").text => ">",
     Nokogiri::HTML("&copy;").text => "(C)",
     Nokogiri::HTML("&reg;").text => "(R)",
     Nokogiri::HTML("&yen;").text => " "}.each do |k, v|
         s.gsub!(k, v)
     end
     s
end

doc = Nokogiri::HTML(open('http://msdn.microsoft.com/en-us/library/ms189782.aspx').read.tr("…","..."))
temp = []
doc.xpath('//div[@class="tableSection"]/table[position() = 1]/tr').each do |e|
    temp << e.css("td, th").map(&:text).map(&:strip).map {|x| x = change x; x.split(/\n/).map {|z| z.gsub(/.{80}/mi, "\\0\n")}.join("\n")}
end

table = Terminal::Table.new
table.headings = temp.shift
table.rows = temp


puts table

Error:

F:\dropbox\Dropbox\temp>ruby nokogiri.rb
nokogiri.rb:21: invalid multibyte char (UTF-8)
nokogiri.rb:21: invalid multibyte char (UTF-8)
nokogiri.rb:21: syntax error, unexpected $end, expecting ')'
...ary/ms189782.aspx').read.tr("í¡","..."))
...                               ^

F:\dropbox\Dropbox\temp>

Recommended answer

It probably depends on the encoding of the file you're working with, but try using

"\u2026"

for the single-character three dots, aka "horizontal ellipsis" (the one you want to replace).
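As a minimal sketch of that suggestion: the escape "\u2026" keeps the script file pure ASCII, so it parses regardless of how the editor saved it. The helper name `replace_ellipsis` is hypothetical and only for illustration. The sketch also swaps `tr` for `gsub`, because `String#tr` maps character to character and would turn the ellipsis into a single period rather than three.

```ruby
# encoding: UTF-8

# Hypothetical helper (not from the original post): replaces every
# U+2026 HORIZONTAL ELLIPSIS with three ASCII periods.
# The "\u2026" escape avoids putting a raw multibyte character in the source.
def replace_ellipsis(text)
  # gsub substitutes a whole replacement string for each match;
  # tr("\u2026", "...") would produce only one "." per ellipsis.
  text.gsub("\u2026", "...")
end

sample = "Specifies whether ALTER TABLE\u2026"
puts replace_ellipsis(sample)  # prints: Specifies whether ALTER TABLE...
```

In the program from the question, this would presumably mean calling `.gsub("\u2026", "...")` on the downloaded page text in place of `.tr("…","...")`, assuming the page is read as UTF-8.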
