How to change … (ellipsis) to ... (three periods) in Ruby?
Question
I'm parsing this document using nokogiri. I found that the page contains some … (ellipsis) characters that I can't remove. I want to know how to use Ruby to replace every … (ellipsis) with ... (three periods).
BTW, you can search the page for this string to find all the …s:
Specifies whether ALTER TABLE
I added my program and the error message.
# encoding: UTF-8
require 'nokogiri'
require 'open-uri'
require 'terminal-table'
def change s
  {Nokogiri::HTML("&nbsp;").text => " ",
   Nokogiri::HTML("&quot;").text => '"',
   Nokogiri::HTML("&trade;").text => '(TM)',
   Nokogiri::HTML("&amp;").text => "&",
   Nokogiri::HTML("&lt;").text => "<",
   Nokogiri::HTML("&gt;").text => ">",
   Nokogiri::HTML("&copy;").text => "(C)",
   Nokogiri::HTML("&reg;").text => "(R)",
   Nokogiri::HTML("&yen;").text => " "}.each do |k, v|
    s.gsub!(k, v)
  end
  s
end
doc = Nokogiri::HTML(open('http://msdn.microsoft.com/en-us/library/ms189782.aspx').read.tr("…","..."))
temp = []
doc.xpath('//div[@class="tableSection"]/table[position() = 1]/tr').each do |e|
temp << e.css("td, th").map(&:text).map(&:strip).map {|x| x = change x; x.split(/\n/).map {|z| z.gsub(/.{80}/mi, "\\0\n")}.join("\n")}
end
table = Terminal::Table.new
table.headings = temp.shift
table.rows = temp
puts table
Error:
F:\dropbox\Dropbox\temp>ruby nokogiri.rb
nokogiri.rb:21: invalid multibyte char (UTF-8)
nokogiri.rb:21: invalid multibyte char (UTF-8)
nokogiri.rb:21: syntax error, unexpected $end, expecting ')'
...ary/ms189782.aspx').read.tr("í¡","..."))
... ^
F:\dropbox\Dropbox\temp>
Recommended answer
It probably depends on the encoding of the file you're working with, but try using
"\u2026"
for the single-character three-dot "horizontal ellipsis" (the one you want to replace).
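As a minimal sketch of that suggestion: writing the character as the escape "\u2026" keeps the source file pure ASCII, so the parser should no longer trip over however the editor saved the literal …. The helper name expand_ellipses and the sample string are illustrative assumptions, not part of the original program. The sketch also uses gsub rather than tr, because String#tr maps one character to one character and so cannot turn a single ellipsis into three periods.

```ruby
# encoding: UTF-8

# U+2026 HORIZONTAL ELLIPSIS, written as an escape so the file stays ASCII.
ELLIPSIS = "\u2026"

# Hypothetical helper (not from the original post): replace every
# ellipsis character with three ASCII periods.
def expand_ellipses(text)
  text.gsub(ELLIPSIS, "...")
end

sample = "Specifies whether ALTER TABLE#{ELLIPSIS}"

# gsub substitutes the whole replacement string for each match.
puts expand_ellipses(sample)     # Specifies whether ALTER TABLE...

# tr maps character to character, so only the first "." is used.
puts sample.tr(ELLIPSIS, "...")  # Specifies whether ALTER TABLE.
```

In the program from the question, this would presumably mean calling .gsub("\u2026", "...") on the downloaded string in place of .tr("…","..."), assuming the page is read as UTF-8.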