Getting a 400 Bad Request Error Using Socket in Python 3


Question

I'm just starting out with Python web data in Python 3.6.1. I was learning sockets and I had a problem with my code which I couldn't figure out. The website in my code works fine, but when I run this code I get a 400 Bad Request error. I am not really sure what the problem with my code is. Thanks in advance.

import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

mysock.connect(('data.pr4e.org', 80))

mysock.send(('GET http://data.pr4e.org/romeo.txt HTTP/1.0 \n\n').encode())

while True:
    data = mysock.recv(512)
    if ( len(data) < 1 ):
        break
    print (data)

mysock.close()

Recommended answer

GET http://data.pr4e.org/romeo.txt HTTP/1.0 \n\n

Welcome to the wonderful world of HTTP, which most users think is an easy protocol because it is human-readable, but which can in fact be very complex. Given your request above, there are several problems:

  • The path should not be a full URL but only /romeo.txt. Full URLs are used only when sending a request to a proxy.
  • The line ending must be \r\n, not \n.
  • There should be no space after HTTP/1.0 before the end of the line.
  • While a Host header is only required with HTTP/1.1, many servers (including the one you are trying to access) also need it with HTTP/1.0, since they serve multiple hostnames on the same IP address and need to know which one you want.
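
To make the four points above concrete, here is a minimal sketch of a checker that flags each of them in a raw request string. The name diagnose_request is hypothetical and exists only for this illustration; it covers only these four points and is not a real HTTP validator.

```python
import re


def diagnose_request(raw):
    """Return a list of problems found in a raw HTTP request string.

    Hypothetical helper for illustration only: it checks the four points
    from the list above and nothing else.
    """
    problems = []

    # A "\n" that is not preceded by "\r" is a bare LF line ending.
    if re.search(r"(?<!\r)\n", raw):
        problems.append("line ending is \\n instead of \\r\\n")

    # Split on LF so the request still breaks into lines either way.
    lines = raw.split("\n")
    request_line = lines[0].rstrip("\r")

    if request_line.endswith(" "):
        problems.append("space after the HTTP version")

    parts = request_line.split()
    if len(parts) > 1 and parts[1].lower().startswith(("http://", "https://")):
        problems.append("request target is a full URL, not a path")

    # Header names are case-insensitive, so compare in lowercase.
    if not any(line.lower().startswith("host:") for line in lines[1:]):
        problems.append("no Host header")

    return problems


for problem in diagnose_request("GET http://data.pr4e.org/romeo.txt HTTP/1.0 \n\n"):
    print(problem)
```

Run against the request from the question, this reports all four problems; a request that uses a plain path, \r\n line endings, no trailing space, and a Host header yields an empty list.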

With this in mind, the data you send should instead be:

GET /romeo.txt HTTP/1.0\r\nHost: data.pr4e.org\r\n\r\n

And I've tested that it works perfectly with this modification.
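
Applied to the script from the question, the corrected request might look like the sketch below. The function names build_request and fetch are hypothetical, and the use of socket.create_connection and sendall is a choice made for this sketch rather than part of the original answer.

```python
import socket


def build_request(host, path):
    """Build a minimal HTTP/1.0 GET request as bytes."""
    # Plain path, no trailing space, CRLF line endings, and a Host header.
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch(host, path, port=80):
    """Send the request over a TCP socket and return the raw response bytes."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        # sendall() keeps writing until the whole request has been sent,
        # whereas send() may write only part of it.
        sock.sendall(build_request(host, path))
        while True:
            data = sock.recv(512)
            if not data:  # empty bytes: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch('data.pr4e.org', '/romeo.txt') and decoding the result before printing shows the status line, the headers, and the body as text rather than as a bytes literal.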

However, given that HTTP is not as simple as it looks, I really recommend using a library like requests to access the target. If that looks like too much overhead, please study the HTTP standard and implement it properly instead of guessing how HTTP works from a few examples, and guessing wrong.
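
As a rough sketch of the library route: the answer names requests, which is a third-party package, so the example below swaps in the standard library's urllib.request to stay dependency-free and shows the requests call only in a comment. The helper name fetch_text is hypothetical.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=10):
    """Fetch a URL and return the response body as text.

    Hypothetical helper: the library builds the request line, the Host
    header, and the CRLF line endings, so none of that is written by hand.
    """
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared in the response, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# With the third-party requests package installed, a rough equivalent is:
#     import requests
#     text = requests.get("http://data.pr4e.org/romeo.txt").text
```

Calling fetch_text('http://data.pr4e.org/romeo.txt') returns only the body, since the library parses the status line and headers itself.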

Note also that servers differ in how forgiving they are of broken implementations like yours. What once worked with one server might not work with the next, or even with the same server after a software upgrade. Using a robust, well-tested, and maintained library instead of doing everything yourself might therefore save you a lot of trouble later.
