Fetching the page of a URL using LuaSocket and a proxy

Question

So far, I have the following piece of code:

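local socket = require "socket.http"
-- With a table argument, request() returns 1, the status code, the response headers, and the status line:
client,r,c,h = socket.request{url = "http://example.com/", proxy="<my proxy and port here>"}
-- c holds the response headers, so this prints one header per line:
for i,v in pairs( c ) do
  print( i, v )
end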

It gives me the following output:

connection  close
content-type    text/html; charset=UTF-8
location    http://www.iana.org/domains/example/
vary    Accept-Encoding
date    Tue, 24 Apr 2012 21:43:19 GMT
last-modified   Wed, 09 Feb 2011 17:13:15 GMT
transfer-encoding   chunked
server  Apache/2.2.3 (CentOS)

This means the connection was established successfully. Now I want to fetch the titles of my URLs using socket.http. I searched previous SO questions and LuaSocket's http documentation, but I still have no idea how to fetch and store all or part of the page in a variable and do something with it.

Please help.

Recommended answer

You are using the 'generic' form of http.request(), which requires storing the body via an LTN12 sink. It's not as complicated as it sounds; try this code:

local http = require "socket.http"
local ltn12 = require "ltn12" -- LTN12 library, shipped with LuaSocket

-- This table will store the body (possibly in multiple chunks):
local result_table = {}
-- The generic form returns 1 on success (nil plus an error message on failure),
-- followed by the status code, the response headers, and the status line:
local ok, code, headers, status = http.request{
    url = "http://example.com/",
    sink = ltn12.sink.table(result_table),
    proxy = "<my proxy and port here>"
}
assert(ok, code) -- on failure, the second return value is the error message
-- Join the chunks together into a string:
local result = table.concat(result_table)
-- Hacky solution to extract the title:
local title = result:match("<[Tt][Ii][Tt][Ll][Ee]>([^<]*)<")
print(title)
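
To make the sink idea less abstract, here is a minimal sketch of what a table sink does. It is a simplified stand-in written for illustration, not LuaSocket's actual source: the name `table_sink` and the sample chunks are assumptions, and no network or proxy is involved. A sink is a function that is called once per chunk and then once with nil when the data ends.

```lua
-- Simplified stand-in for ltn12.sink.table (illustration only, not LuaSocket's code).
local function table_sink(t)
    return function(chunk, err)
        -- A nil chunk signals end of data; non-nil chunks are appended in order.
        if chunk then
            t[#t + 1] = chunk
        end
        return 1 -- a truthy return value tells the producer to keep sending
    end
end

local result_table = {}
local sink = table_sink(result_table)

-- Simulate a body arriving in three pieces (made-up sample data).
for _, chunk in ipairs{"<html><head><title>Exam", "ple Domain</title>", "</head></html>"} do
    sink(chunk)
end
sink(nil) -- end of stream

-- Joining the pieces restores the full body, even though the title was split across chunks.
local body = table.concat(result_table)
print(#result_table, body:match("<title>([^<]*)<"))
```

This is why the answer concatenates the table before matching: a pattern applied to a single chunk could miss a title that straddles two chunks.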

If your proxy is constant throughout your application, a more straightforward solution is to use the simple form of http.request() and specify the proxy via http.PROXY:

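local http = require "socket.http"
http.PROXY = "<my proxy and port here>"

-- The simple form returns the body as a string (nil plus an error message on failure):
local result, err = http.request("http://www.youtube.com/watch?v=_eT40eV7OiI")
assert(result, err)
local title = result:match("<[Tt][Ii][Tt][Ll][Ee]>([^<]*)<")
print(title)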

Output:

    Flanders and Swann - A song of the weather
  - YouTube
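
The output above spans two lines because the page's title element contains a newline and indentation. A slightly more forgiving extraction could collapse that whitespace and tolerate attributes on the opening tag. The sketch below is one way to do it, under stated assumptions: `extract_title` is a hypothetical helper (not part of LuaSocket), the sample HTML is made up to mimic the output above, and it is still pattern matching rather than a real HTML parser.

```lua
-- Hypothetical helper (not part of LuaSocket): extract and tidy a page title.
local function extract_title(html)
    -- Accept any letter case and optional attributes on the opening tag.
    local raw = html:match("<[Tt][Ii][Tt][Ll][Ee][^>]*>(.-)</[Tt][Ii][Tt][Ll][Ee]%s*>")
    if not raw then
        return nil
    end
    -- Collapse runs of whitespace (including newlines) into single spaces.
    local collapsed = raw:gsub("%s+", " ")
    -- Trim leading and trailing whitespace; the parentheses keep only the first result.
    return (collapsed:match("^%s*(.-)%s*$"))
end

-- Made-up sample mimicking the multi-line title shown in the output above.
local sample = "<html><head><TITLE>\n    Flanders and Swann - A song of the weather\n  - YouTube\n</TITLE></head></html>"
print(extract_title(sample))
```

With the proxy example, the call would be `extract_title(result)` in place of the `result:match(...)` line.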
