How to read someone else's forum

Question

My friend has a forum, which is full of posts containing information. Sometimes she wants to review the posts in her forum, and come to conclusions. At the moment she reviews posts by clicking through her forum, and generates a not necessarily accurate picture of the data (in her brain) from which she makes conclusions. My thought today was that I could probably bang out a quick Ruby script that would parse the necessary HTML to give her a real idea of what the data is saying.

I am using Ruby's net/http library for the first time today, and I have encountered a problem. While my browser has no trouble viewing my friend's forum, it seems that the method Net::HTTP.new("forumname.net") produces the following error:

No connection could be made because the target machine actively refused it. - connect(2)

Googling that error, I have learned that it has to do with MySQL (or something like that) not wanting nosy guys like me remotely poking around in there, for security reasons. This makes sense to me, but it makes me wonder: how is it that my browser gets to poke around on my friend's forum, but my little Ruby script gets no poking rights? Is there some way for my script to tell the server that it is not a threat, that I only want reading rights and not writing rights?

Thanks everyone,

z.
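
A note on the error itself: "actively refused" is reported when the TCP connection is rejected, before any HTTP request is sent, so it usually points at the host, port, or proxy the script tried rather than at read or write permissions. The sketch below stays with net/http and makes the connection target explicit so it can be compared with what the browser uses. The helper names `connection_target` and `fetch` are hypothetical, and `forumname.net` is the placeholder from the question.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: split a URL into the pieces Net::HTTP needs.
def connection_target(url)
  uri = URI.parse(url)
  {
    host: uri.host,
    port: uri.port,                    # 80 for http, 443 for https unless given
    use_ssl: uri.scheme == 'https',
    path: uri.request_uri              # path plus query string, "/" if empty
  }
end

# Hypothetical helper: GET a page, returning nil (with a warning) when the
# connection is refused instead of letting the exception end the script.
def fetch(url)
  target = connection_target(url)
  Net::HTTP.start(target[:host], target[:port], use_ssl: target[:use_ssl]) do |http|
    http.get(target[:path])
  end
rescue Errno::ECONNREFUSED => e
  warn "Nothing accepted the connection at #{target[:host]}:#{target[:port]} (#{e.message})"
  nil
end
```

Calling `response = fetch("http://forumname.net/")` and then printing `response.code` when `response` is not nil would show whether the server answered. A plain GET like this only reads the page; nothing extra has to be declared to the server for that.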

Recommended answer

Scraping a website? Use Mechanize:

#!/usr/bin/env ruby

require 'mechanize'

# Current releases expose the class as Mechanize; WWW::Mechanize was the older name.
agent = Mechanize.new
page = agent.get("http://xkcd.com")
# Follow the navigation links by their visible text, as a browser user would.
page = page.link_with(:text=>'Forums').click
page = page.link_with(:text=>'Mathematics').click
page = page.link_with(:text=>'Math Books').click
#puts page.parser.to_html    # If you want to see the html you just got
# page.parser is a Nokogiri document; each post lives in a div.postbody.
posts = page.parser.xpath("//div[@class='postbody']")
posts.each do |post|
  title = post.at_xpath('h3//text()').to_s
  author = post.at_xpath("p[@class='author']//a//text()").to_s
  # Join every text node inside the post content, one per line.
  body = post.xpath("div[@class='content']//text()").collect do |text_node|
    text_node.to_s
  end.join("\n")
  puts '-' * 40
  puts "title: #{title}"
  puts "author: #{author}"
  puts "body:", body
end

The first part of the output:

----------------------------------------
title: Math Books
author: Cleverbeans
body:
This is now the official thread for questions about math books at any level, fr\
om high school through advanced college courses.
I'm looking for a good vector calculus text to brush up on what I've forgotten.\
 We used Stewart's Multivariable Calculus as a baseline but I was unable to pur\
chase the text for financial reasons at the time. I figured some things may hav\
e changed in the last 12 years, so if anyone can suggest some good texts on thi\
s subject I'd appreciate it.
----------------------------------------
title: Re: Multivariable Calculus Text?
author: ThomasS
body:
The textbooks go up in price and new pretty pictures appear. However, Calculus \
really hasn't changed all that much.
If you don't mind a certain lack of pretty pictures, you might try something li\
ke Widder's Advanced Calculus from Dover. it is much easier to carry around tha\
n Stewart. It is also written in a style that a mathematician might consider no\
rmal. If you think that you might want to move on to real math at some point, i\
t might serve as an introduction to the associated style of writing.
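
Once the title, author and body of each post are available, they can be collected into records and summarized instead of only printed, which is closer to the goal of seeing what the data says. The following is a minimal sketch under assumptions: the `posts_per_author` helper and the hash layout are hypothetical, and the sample records reuse only the two titles and authors shown in the output above.

```ruby
# Hypothetical helper: count posts per author and return [author, count]
# pairs, busiest author first, ties broken alphabetically.
def posts_per_author(posts)
  counts = posts.each_with_object(Hash.new(0)) do |post, tally|
    tally[post[:author]] += 1
  end
  counts.sort_by { |author, count| [-count, author] }
end

# Sample records built from the output above; bodies are left out here.
sample_posts = [
  { title: 'Math Books', author: 'Cleverbeans' },
  { title: 'Re: Multivariable Calculus Text?', author: 'ThomasS' }
]

posts_per_author(sample_posts).each do |author, count|
  puts "#{author}: #{count}"
end
```

In the scraping loop, each post could be pushed onto an array as a hash with these keys, and the array passed to the helper once the loop finishes.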