Unable to kill Chrome process and running out of memory with ChromeDriver and Chrome through Selenium in Python

Question

I have a crawling process that kicks off selenium in a custom class that looks like this:

import os
from urllib.parse import urljoin

from bs4 import BeautifulSoup
from pyvirtualdisplay import Display  # assumed source of Display
from selenium import webdriver

# settings, PROXY_URL and logger are defined elsewhere in the project.


class BrowserInterface:

    def __init__(self, base_url, proxy_settings):
        self.base_url = base_url

        # Virtual X display so Chrome can run on a machine without a screen.
        self.display = Display(visible=0, size=(1024, 768))
        self.display.start()

        proxy_argument = '--proxy-server={0}'.format(PROXY_URL.format(
            proxy_settings.get('proxy_host'),
            proxy_settings.get('proxy_port')
        ))

        logger.debug(proxy_argument)

        options = webdriver.ChromeOptions()
        options.add_argument('--no-sandbox')
        options.add_argument(proxy_argument)

        selenium_chrome_driver_path = os.path.join(settings.DEFAULT_DRIVER_PATH,
                                                   settings.CHROME_DRIVERS[settings.CURRENT_OS])

        self.driver = webdriver.Chrome(executable_path=selenium_chrome_driver_path, chrome_options=options)

    def visit(self, url):
        # Resolve relative URLs against the base URL before loading.
        url = urljoin(self.base_url, url)
        self.driver.get(url)

    def body(self):
        soup = BeautifulSoup(self.driver.page_source)
        return soup.find("body").text

    def quit(self):
        # Shut down the browser first, then the virtual display.
        self.driver.quit()
        self.display.stop()

This BrowserInterface class is initialized in a batch queue, and the quit() method is called at the end of the batch. There are no issues starting Chrome and getting the data. The trouble is that at the end of each job, when quit() is called, Chrome goes into zombie mode. When the next BrowserInterface is initialized, it starts a new Chrome instance, so the box eventually runs out of memory. I've also tried running a kill command on the Chrome process, but it stays running. Any direction would be greatly appreciated, as I'm about to pull my hair out over this.

Running on Ubuntu 18.04, Google Chrome 70.0.3538.110, ChromeDriver 2.44, Python 3.6.6.

Thanks in advance!

Recommended answer

Your code already invokes self.driver.quit(), which should have worked perfectly.
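One case where quit() never runs is a job that raises before the batch reaches its final call. As a minimal sketch, assuming a hypothetical helper named browser_session (not part of Selenium) and any factory that returns an object with a quit() method, such as the BrowserInterface class from the question, the cleanup can be tied to a with block:

```python
from contextlib import contextmanager


@contextmanager
def browser_session(factory, *args, **kwargs):
    """Create a browser via `factory` and always call its quit() on exit.

    `browser_session` is a hypothetical helper, not a Selenium API.
    """
    browser = factory(*args, **kwargs)
    try:
        yield browser
    finally:
        # Runs on normal exit and also when the body raises.
        browser.quit()
```

Inside a job it might be used as `with browser_session(BrowserInterface, base_url, proxy_settings) as browser:`, so that quit() is reached even if visit() or body() fails.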

However, since the box is still running out of memory due to zombie Chrome processes, you took the right approach in trying a kill command. You can add the following solution within the quit() method:

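from selenium import webdriver
import psutil

driver = webdriver.Chrome()
driver.get('http://google.com/')

PROCNAME = "chrome"  # to clean up zombie Chrome browser processes
# PROCNAME = "chromedriver"  # to clean up zombie ChromeDriver processes
# Note: this kills every process with that name, not only those started here.
for proc in psutil.process_iter():
    try:
        # check whether the process name matches
        if proc.name() == PROCNAME:
            proc.kill()
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        # the process exited in the meantime or belongs to another user
        pass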
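
The loop above matches on the process name alone, so it also kills Chrome instances that belong to other jobs or users on the same box. A narrower variant, sketched here under assumptions, records the process tree of one ChromeDriver before quit() and afterwards kills only the members that are still running. The function names snapshot_tree and kill_survivors are hypothetical, and reading the ChromeDriver PID from driver.service.process.pid is an assumption about Selenium's Python bindings that should be checked against the installed version.

```python
import psutil


def snapshot_tree(root_pid):
    """Return psutil.Process objects for root_pid and all its descendants."""
    try:
        root = psutil.Process(root_pid)
        return [root] + root.children(recursive=True)
    except psutil.NoSuchProcess:
        return []  # the root process has already exited


def kill_survivors(procs, timeout=3):
    """Kill any process in procs that is still running.

    Returns the PIDs that are still alive after `timeout` seconds.
    """
    for proc in procs:
        try:
            if proc.is_running():
                proc.kill()
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass  # already gone, or not ours to kill
    gone, alive = psutil.wait_procs(procs, timeout=timeout)
    return [proc.pid for proc in alive]


# Possible use inside BrowserInterface.quit() (assumed attribute path):
#     procs = snapshot_tree(self.driver.service.process.pid)
#     self.driver.quit()
#     kill_survivors(procs)
#     self.display.stop()
```

The snapshot is taken before quit() because orphaned Chrome processes are re-parented once ChromeDriver exits and can no longer be found through its PID. A process that is a true zombie in the operating-system sense (state "defunct") has already terminated and does not respond to kill; it disappears only when its parent reaps it or exits.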
