httplib.BadStatusLine: '' on Linux but not Mac
Question
This error has been under my skin for a few hours now. I decided to code up a separate project just to see if I can replicate it and I can, but ONLY on my server. This works on my Mac.
Mac: OSX El Capitan 10.11.6
Server: CentOS 7.2.1511
Both have PhantomJS version: 2.1.1
Python Mac: Python 2.7.11
Python Server: Python 2.7.5
Both have Selenium version: 2.53.0
Identical code was run on both:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.common.exceptions import NoSuchElementException
import time
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
dcap["phantomjs.page.customHeaders.accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
dcap["phantomjs.page.customHeaders.Accept-Language"] = "en-US,en;q=0.8"
dcap["phantomjs.page.customHeaders.connection"] = "keep-alive"
driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.set_window_size(1120, 700)
driver.get("https://www.instagram.com/espn/")
while True:
    print len(driver.find_elements_by_css_selector("a[href*='/p/']"))
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    try:
        loadMore = driver.find_element_by_link_text("Load more")
        loadMore.click()
    except NoSuchElementException:
        print "No such"
        driver.save_screenshot('none.png')
Mac output:
12
24
No such
24
No such
36
No such
48
No such
48
No such
60
No such
72
No such
84
# This goes until I end it
Server output:
12
24
No such
Traceback (most recent call last):
File "junk.py", line 27, in <module>
driver.save_screenshot('none.png')
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 790, in get_screenshot_as_file
png = self.get_screenshot_as_png()
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 809, in get_screenshot_as_png
return base64.b64decode(self.get_screenshot_as_base64().encode('ascii'))
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 819, in get_screenshot_as_base64
return self.execute(Command.SCREENSHOT)['value']
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
return self._request(command_info[0], url, body=data)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
resp = opener.open(request, timeout=self._timeout)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
response.begin()
File "/usr/lib64/python2.7/httplib.py", line 444, in begin
version, status, reason = self._read_status()
File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
Server output after removing the screenshot line:
12
24
No such
24
Traceback (most recent call last):
File "junk.py", line 23, in <module>
loadMore = driver.find_element_by_link_text("Load more")
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 744, in find_element
{'using': by, 'value': value})['value']
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
return self._request(command_info[0], url, body=data)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
resp = opener.open(request, timeout=self._timeout)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
response.begin()
File "/usr/lib64/python2.7/httplib.py", line 444, in begin
version, status, reason = self._read_status()
File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
One related answer I found was here: Can't run PhantomJS in python via Selenium
So I installed Selenium 2.37 and it gave the same error.
I read this answer about the problem perhaps being related to changing the headers, so I removed the headers by changing the driver to driver = webdriver.PhantomJS() and still get the same error.
I also installed Python 2.7.12 on the server to see if there was a difference. Output was:
# python2.7 junk.py
12
24
No such
24
Traceback (most recent call last):
File "junk.py", line 29, in <module>
loadMore = driver.find_element_by_link_text("Load more")
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 744, in find_element
{'using': by, 'value': value})['value']
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
return self._request(command_info[0], url, body=data)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
resp = opener.open(request, timeout=self._timeout)
File "/usr/local/lib/python2.7/urllib2.py", line 429, in open
response = self._open(req, data)
File "/usr/local/lib/python2.7/urllib2.py", line 447, in _open
'_open', req)
File "/usr/local/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/local/lib/python2.7/urllib2.py", line 1228, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/local/lib/python2.7/urllib2.py", line 1201, in do_open
r = h.getresponse(buffering=True)
File "/usr/local/lib/python2.7/httplib.py", line 1136, in getresponse
response.begin()
File "/usr/local/lib/python2.7/httplib.py", line 453, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python2.7/httplib.py", line 417, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
Checking space on system. It's a brand new VPS, but still, to confirm:
Recommended answer
Edit 3
Add the following:
except httplib.BadStatusLine:
    pass
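As a minimal sketch of that idea, the except clause can be folded into a small wrapper so that any driver call which ends in BadStatusLine is skipped instead of crashing the loop. The helper name call_ignoring_bad_status is hypothetical and not part of Selenium; the import fallback is only there so the snippet also runs on Python 3, where httplib is named http.client.

```python
try:
    import httplib  # Python 2, as used in the question
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def call_ignoring_bad_status(func, *args, **kwargs):
    """Call func; return (True, result), or (False, None) on BadStatusLine.

    Hypothetical helper for illustration only.
    """
    try:
        return True, func(*args, **kwargs)
    except httplib.BadStatusLine:
        # The HTTP response from the driver had no valid status line.
        return False, None
```

In the question's loop this might be used as ok, loadMore = call_ignoring_bad_status(driver.find_element_by_link_text, "Load more"). Other exceptions, such as NoSuchElementException, still propagate, so the existing try/except stays in place. Note that this only hides the symptom; if PhantomJS has actually died, every later call will fail the same way.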
Edit 2
Python WebDriver and PhantomJS have a problem with keep_alive. This could be your problem. So add keep_alive=False as follows:
driver = webdriver.PhantomJS(desired_capabilities=dcap, keep_alive=False)
End edit
Add the following:
import httplib
import socket
from selenium.webdriver.remote.command import Command
def get_status(driver):
    try:
        driver.execute(Command.STATUS)
        return "Alive"
    except (socket.error, httplib.CannotSendRequest):
        return "Dead"
Call get_status(driver) just before the save_screenshot statement and print the result. This will tell us if the driver has shut down prematurely.
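One caveat: the tracebacks in the question end in httplib.BadStatusLine, which is neither a socket.error nor a CannotSendRequest, so the status check itself could raise instead of returning "Dead". The variant below is a sketch that also treats BadStatusLine as "Dead". It assumes the driver exposes execute(command) as in the snippet above, and it takes the command as a parameter whose default "status" is assumed to match Selenium's Command.STATUS; in real use, pass Command.STATUS explicitly.

```python
import socket

try:
    import httplib  # Python 2, as used in the question
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def get_status(driver, status_command="status"):
    """Return "Alive" if the driver answers a status command, else "Dead".

    status_command is an assumption; pass selenium's Command.STATUS in real use.
    """
    try:
        driver.execute(status_command)
        return "Alive"
    except (socket.error, httplib.CannotSendRequest, httplib.BadStatusLine):
        # Any of these means the request never got a usable response.
        return "Dead"
```

Printing get_status(driver) right before driver.save_screenshot('none.png') then shows "Dead" rather than a traceback if PhantomJS has already gone away.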
Edit
Add the following after driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.implicitly_wait(10)  # wait up to 10 seconds for find_element calls before giving up