Unable to use https proxy within urllib.request

Problem description

I've created a script in Python using urllib.request, applying an https proxy within it. I've tried the approaches below, but each of them runs into issues such as urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed----. The script is supposed to grab the IP address from that site. The IP address used in the script is a placeholder. I've complied with the suggestion made here.
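
WinError 10060 means the TCP connection itself timed out, which usually points at the proxy being unreachable rather than at the urllib code. A quick way to rule that out is a plain socket-level check; the following is a minimal sketch (the proxy address is the placeholder from the question and the 10-second timeout is an arbitrary choice):

import socket

# Placeholder proxy from the question; substitute the proxy you actually intend to use.
proxy_ip, proxy_port = '60.191.11.246', 3128

try:
    # Open a raw TCP connection to the proxy with a short timeout.
    with socket.create_connection((proxy_ip, proxy_port), timeout=10):
        print('proxy is reachable')
except OSError as e:
    # The WinError 10060 from the question surfaces here as a timeout.
    print('proxy is not reachable:', e)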

First attempt:

import urllib.request
from bs4 import BeautifulSoup

url = 'https://whatismyipaddress.com/proxy-check'

headers={'User-Agent':'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'}
proxy_host = '60.191.11.246:3128'

req = urllib.request.Request(url,headers=headers)
req.set_proxy(proxy_host, 'https')
resp = urllib.request.urlopen(req).read()
soup = BeautifulSoup(resp,"html5lib")
ip_addr = soup.select_one("td:contains('IP')").find_next('td').text
print(ip_addr)
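
One small tweak worth making while debugging this attempt: urlopen accepts a timeout, so a dead proxy fails fast instead of hanging until the OS gives up. A minimal variant of the request above, assuming the same url, headers and proxy_host values already defined (the 15-second timeout is arbitrary):

req = urllib.request.Request(url, headers=headers)
req.set_proxy(proxy_host, 'https')  # proxy type should match the URL scheme, https here
resp = urllib.request.urlopen(req, timeout=15).read()  # fail fast if the proxy never answers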

Another way (using os.environ):

import os
import urllib.request
from bs4 import BeautifulSoup

url = 'https://whatismyipaddress.com/proxy-check'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'}
proxy = '60.191.11.246:3128'

# urlopen() builds a default opener that reads https_proxy from the environment.
os.environ["https_proxy"] = f'http://{proxy}'
req = urllib.request.Request(url, headers=headers)
resp = urllib.request.urlopen(req).read()
soup = BeautifulSoup(resp, "html5lib")
ip_addr = soup.select_one("td:contains('IP')").find_next('td').text
print(ip_addr)
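
To confirm that the environment variable is actually being picked up, urllib.request.getproxies() shows what the default opener will consult; a minimal check, run after setting os.environ as above:

import urllib.request

# Should contain an 'https' entry pointing at the proxy set via os.environ above.
print(urllib.request.getproxies())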

One more approach that I've tried:

import urllib.request
from bs4 import BeautifulSoup

url = 'https://whatismyipaddress.com/proxy-check'
agent = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'
proxy_host = '205.158.57.2:53281'
proxy = {'https': f'http://{proxy_host}'}

# Route https requests through the proxy and install the opener globally.
proxy_support = urllib.request.ProxyHandler(proxy)
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)
opener.addheaders = [('User-agent', agent)]
res = opener.open(url).read()

soup = BeautifulSoup(res, "html5lib")
ip_addr = soup.select_one("td:contains('IP')").find_next('td').text
print(ip_addr)

How can I use an https proxy within urllib.request in the right way?

Recommended answer

While we were testing the proxies, Google's services reported "unusual traffic from your computer network", and that is why the response was an error, because whatismyipaddress uses Google's services. The issue did not affect other sites such as stackoverflow, though.

from urllib import request
from bs4 import BeautifulSoup

url = 'https://whatismyipaddress.com/proxy-check'

proxies = {
    # 'https': 'https://167.172.229.86:8080',
    # 'https': 'https://51.91.137.248:3128',
    'https': 'https://118.70.144.77:3128',
}

user_agent = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'
headers = {
    'User-Agent': user_agent,
    'accept-language': 'ru,en-US;q=0.9,en;q=0.8,tr;q=0.7'
}

proxy_support = request.ProxyHandler(proxies)
opener = request.build_opener(proxy_support)
# opener.addheaders = [('User-Agent', user_agent)]
request.install_opener(opener)

req = request.Request(url, headers=headers)
try:
    response = request.urlopen(req).read()
    soup = BeautifulSoup(response, "html5lib")
    ip_addr = soup.select_one("td:contains('IP')").find_next('td').text
    print(ip_addr)
except Exception as e:
    print(e)
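
Public proxies like the ones commented out above go stale quickly, so in practice it helps to try several candidates in turn and give each request a timeout. The loop below is not part of the original answer; it is a sketch that reuses the url, headers and BeautifulSoup import from the answer code, and the proxy list is just whatever candidates you have at hand:

candidate_proxies = [
    'https://118.70.144.77:3128',    # placeholder candidates; substitute live proxies
    'https://167.172.229.86:8080',
]

for candidate in candidate_proxies:
    opener = request.build_opener(request.ProxyHandler({'https': candidate}))
    req = request.Request(url, headers=headers)
    try:
        # A 15-second timeout makes a dead proxy fail quickly instead of hanging.
        response = opener.open(req, timeout=15).read()
    except OSError as e:  # urllib.error.URLError and socket timeouts both derive from OSError
        print(f'{candidate} failed: {e}')
        continue
    soup = BeautifulSoup(response, "html5lib")
    print(soup.select_one("td:contains('IP')").find_next('td').text)
    break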
