In Python, how do I make Selenium work headless with a saved browser session?

Question

I'm trying to bypass the web.whatsapp.com QR scan page. This is the code I have used so far:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--user-data-dir=./User_Data')
driver = webdriver.Chrome(options=options)
driver.get('https://web.whatsapp.com/')

On the first attempt I have to scan the QR code manually; on later attempts it no longer asks for the QR code.

However, if I try the same thing after adding the line chrome_options.add_argument("--headless"), I get "Error writing DevTools active port to file". I have tried at least a dozen solutions found through Google, but none of them work. Any help would be highly appreciated. Thank you!

So far I have tried a bunch of different arguments in different combinations, but nothing worked:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()  # uncomment for local debugging
options.add_argument('--no-sandbox')
options.add_argument('--headless')
options.add_experimental_option('useAutomationExtension', False)
options.add_argument('--disable-setuid-sandbox')
options.add_argument('--remote-debugging-port=9222')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')  # Last I checked this was necessary.
options.add_argument('start-maximized')
options.add_argument('disable-infobars')
options.add_argument('--user-data-dir=./User_Data')
driver = webdriver.Chrome('chromedriver.exe', options=options)

driver.get('https://web.whatsapp.com/')

Recommended answer

I recently made a WhatsApp bot and had the same problems. After searching for a long time, I came up with this solution:

The first problem was the browser cache: if the scanned QR session is not cached in the browser's AppData profile, WhatsApp Web keeps waiting for the QR code to be scanned.

So in my program I used the following function to get the profile path:

import os
import re

def apdata_path():
    # APPDATA normally ends in ...\AppData\Roaming
    apdata = os.getenv('APPDATA')
    # Keep everything up to and including "AppData\"
    base = re.findall(r".+\\AppData\\", apdata)[0]
    argument = "user-data-dir=" + base + r'Local\Chromium\User Data\Default\Default'
    # Double every backslash in the final argument
    argument = argument.replace("\\", "\\" * 2)
    return argument

It first finds the AppData path (C:\Users\AppData\), and then I concatenate the rest of the path to the cache folder. In this case I used Chromium. In your case it will be:

C:\Users\AppData\Local\Google\Chrome\User Data\Default
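
A possibly simpler way to build this path, sketched under the assumption that the Windows LOCALAPPDATA environment variable is set (it usually points at the AppData\Local folder of the current user). The helper name user_data_dir_argument and its parameters are hypothetical and not part of the original answer.

```python
import os
from pathlib import PureWindowsPath


def user_data_dir_argument(env=None, browser_dir=r"Google\Chrome"):
    """Build a --user-data-dir argument from LOCALAPPDATA (hypothetical helper)."""
    env = os.environ if env is None else env
    local = env.get("LOCALAPPDATA")
    if not local:
        raise RuntimeError("LOCALAPPDATA is not set; pass the profile path explicitly")
    # Same layout the answer describes: ...\AppData\Local\<browser>\User Data\Default
    profile = PureWindowsPath(local) / browser_dir / "User Data" / "Default"
    return "--user-data-dir=" + str(profile)


if __name__ == "__main__":
    # Made-up value for illustration only.
    fake_env = {"LOCALAPPDATA": r"C:\Users\me\AppData\Local"}
    print(user_data_dir_argument(fake_env))
```

For Chromium, passing browser_dir="Chromium" would follow the same pattern.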

There is probably a better way to find the profile data path. After finding it, I set up the driver:

def chrome_driver(user_agent=0):
    usr_path = apdata_path()
    # file_path() is not defined in this answer; presumably it returns the bot's folder
    chrome_path = file_path() + r'\Chromium 85\bin\chrome.exe'
    options = webdriver.ChromeOptions()
    options.binary_location = chrome_path
    if user_agent != 0:
        # Headless run: reuse the user agent captured during the visible run
        options.add_argument('--headless')
        options.add_argument('--hide-scrollbars')
        options.add_argument('--disable-gpu')
        options.add_argument("--log-level=3")
        options.add_argument('--user-agent={}'.format(user_agent))
    options.add_argument(usr_path)
    # Selenium 3 call style; Selenium 4 passes the driver path through a Service object
    driver = webdriver.Chrome('chromedriver.exe', options=options)
    return driver

Here I ran into another problem: sometimes Selenium would not work, because WhatsApp seems to validate the user agent to check whether the browser version is compatible. I don't know much about this and reached the conclusion by trial and error, so it may not be the real explanation, but it worked for me.
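
One plausible reading of this, offered as an assumption rather than a confirmed explanation: headless Chrome has historically reported "HeadlessChrome" in its user agent string, which a site can detect. The sketch below shows a hypothetical helper, normal_user_agent, that turns such a string into its regular form before it is passed to --user-agent.

```python
def normal_user_agent(user_agent):
    """Replace the 'HeadlessChrome' token with 'Chrome' (hypothetical helper)."""
    return user_agent.replace("HeadlessChrome", "Chrome")


if __name__ == "__main__":
    # Example string made up for illustration; the version numbers are arbitrary.
    headless_ua = (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) HeadlessChrome/85.0.4183.83 Safari/537.36"
    )
    print(normal_user_agent(headless_ua))
```

Capturing the user agent from a visible browser session, as the answer does, reaches the same result without any string editing.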

So in my bot I wrote a start function that gets the user agent, performs the first QR scan, and keeps the session in the browser cache:

def whatsapp_QR():
    driver = chrome_driver()  # visible (non-headless) browser
    user_agent = driver.execute_script("return navigator.userAgent;")
    driver.get("https://web.whatsapp.com/")
    print("Scan the QR code, then press Enter")
    input()
    print("Logged In")
    driver.quit()  # close the browser and end the driver session
    return user_agent
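
Because the user agent is captured in the visible run and needed again in later headless runs, it has to be stored somewhere in between. A minimal sketch, assuming a small JSON file is acceptable; the file name whatsapp_session.json and both function names are hypothetical.

```python
import json
from pathlib import Path

SESSION_FILE = Path("whatsapp_session.json")  # hypothetical file name


def save_user_agent(user_agent, path=SESSION_FILE):
    """Store the user agent captured during the first (visible) run."""
    Path(path).write_text(json.dumps({"user_agent": user_agent}), encoding="utf-8")


def load_user_agent(path=SESSION_FILE):
    """Return the stored user agent, or None if no visible run has saved one yet."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("user_agent")
```

With this, the first run could call save_user_agent(whatsapp_QR()), and later runs could pass the loaded value to chrome_driver once it is not None.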

In the end my bot worked; not perfectly, but it ran smoothly. I was able to send messages to my test group in headless mode.

To summarize: use the user profile cache in AppData to bypass the QR code (you will need to run once without headless mode to create the cache).

Then get the user agent in order to pass WhatsApp's validation. The whole option set would look like this:

options.add_argument('--headless')
options.add_argument('--hide-scrollbars')
options.add_argument('--disable-gpu')
options.add_argument("--log-level=3")
options.add_argument('--user-agent={}'.format(user_agent))  # user agent for validation
options.add_argument(usr_path)  # AppData user profile, to bypass the QR code

usr_path = r"user-data-dir=C:\Users\AppData\Local\Google\Chrome\User Data\Default"
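
The pieces above can be put together roughly as follows. This is a sketch, not the author's exact bot: the function names headless_arguments and make_headless_driver are hypothetical, and it assumes Selenium is installed and a chromedriver matching the browser can be found.

```python
def headless_arguments(user_agent, profile_dir):
    """Return the Chrome arguments listed above as a list (hypothetical helper)."""
    return [
        "--headless",
        "--hide-scrollbars",
        "--disable-gpu",
        "--log-level=3",
        "--user-agent={}".format(user_agent),  # same user agent as the visible run
        "--user-data-dir={}".format(profile_dir),  # profile that already holds the login
    ]


def make_headless_driver(user_agent, profile_dir):
    """Start Chrome with those arguments; needs Selenium and a matching chromedriver."""
    # Imported here so headless_arguments() can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in headless_arguments(user_agent, profile_dir):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

The profile directory passed in must be the same one used for the visible run in which the QR code was scanned.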
