requests module MissingSchema error in tkinter GUI due to inability to fill variable before execution of mainloop: How to resolve this?

Problem description

I'm trying to build a GUI over some existing code and I'm running into a MissingSchema error. I am aware of the general problem but not the best solution.

Basically, before the tkinter mainloop() I make a requests call to create a BeautifulSoup object that a number of functions need. To make that request I need the url variable to hold a URL of the user's choosing, but the user cannot enter one until mainloop() is running. Consequently the requests call fails because url is empty, giving me the MissingSchema error. You can run the code below to see what I mean:

from tkinter import *
from tkinter import scrolledtext as st
import requests
import re
from bs4 import BeautifulSoup

root = Tk()

url_entry = Entry(root)
url = url_entry.get()

log_text = st.ScrolledText(root, state='disabled')

start_button = Button(root, text='Run program', command=lambda: [seo_find_stopwords(urlSoup)])

url_entry.grid(column=0, row=1)
log_text.grid(column=2, row=0, rowspan=3)
start_button.grid(column=1, row=5)

agent = "Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0"
# attempts to access provided URL, returns errors if unable
try:
    # 'agent' added as part of effort to avoid HTTP Error 403: Forbidden
    url_request = requests.get(url, headers={'User-Agent': agent})
    url_request.raise_for_status()
    urlSoup = BeautifulSoup(url_request.text, 'lxml')
except requests.exceptions.MissingSchema as exc:
    log_text.insert(INSERT, "ERROR: Invalid URL provided. Please try again with a valid URL.")
    raise exc


# searches HTML page title for SEO stop words from stopwords.txt, then provides number and list of present stop words
def seo_find_stopwords(urlSoup):
    stopwords_count = 0
    stopwords_list = []
    if urlSoup.title:
        with open('stopwords.txt', 'r', encoding='utf-8') as file:
            for line in file:
                if re.search(r'\b' + line.rstrip('\n') + r'\b', urlSoup.title.text.casefold()):
                    stopwords_count += 1
                    stopwords_list.append(line.rstrip('\n'))

        if stopwords_count > 0:
            log_text.insert(INSERT, "{0} stop words were found in your page title. If possible, it would be good to "
                        "reduce them. The stop words found are: {1}".format(stopwords_count, stopwords_list))


root.mainloop()

Sorry if this is a bit long; I tried to condense it as much as possible. I'd like to know the best way to fix this error. I am under the impression that it may be to move the requests.get() call into a function and have that function return urlSoup for use in the functions that need it.

Recommended answer

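You are trying to read the URL before the user has had a chance to enter one: url_entry.get() runs once while the script is setting up the window, so it returns an empty string. Instead, place the URL request in a function and call it once the Entry widget has text, for example by binding it as an event handler to a key press or a button.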
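The timing issue can be illustrated without a GUI. The sketch below uses a hypothetical FakeEntry class as a stand-in for the tkinter Entry widget (it is not part of tkinter, and the example URL is an assumption). It only shows that a value read once at startup stays empty, while a value read inside the callback reflects what the user typed later.

```python
class FakeEntry:
    """Hypothetical stand-in for tkinter.Entry; holds text a user would type."""

    def __init__(self):
        self._text = ""

    def insert_text(self, text):
        # simulates the user typing into the widget after the window opens
        self._text = text

    def get(self):
        return self._text


url_entry = FakeEntry()

# read once at startup: the user has not typed anything yet
url_at_startup = url_entry.get()


def on_run_clicked():
    # read when the callback fires: sees the current contents
    return url_entry.get()


url_entry.insert_text("https://example.com")
url_in_callback = on_run_clicked()
```

In the real program the same principle applies: call url_entry.get() inside the function that the button or key binding triggers, not at module level.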

Here is a demo. (After entering text in the Entry widget, you can either press the Enter key or click the run button.)

from tkinter import *
from tkinter import scrolledtext as st
import re

import requests
from bs4 import BeautifulSoup

agent = "Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0"


# appends a message to the read-only log widget
def log(message):
    # a Text widget in the 'disabled' state ignores insert(), so enable it briefly
    log_text.configure(state='normal')
    log_text.insert(END, message + "\n")
    log_text.configure(state='disabled')


# searches HTML page title for SEO stop words from stopwords.txt, then provides number and list of present stop words
def seo_find_stopwords(urlSoup):
    stopwords_list = []
    if urlSoup.title:
        title = urlSoup.title.text.casefold()
        with open('stopwords.txt', 'r', encoding='utf-8') as file:
            for line in file:
                word = line.strip()
                # skip blank lines; re.escape keeps regex metacharacters in a stop word literal
                if word and re.search(r'\b' + re.escape(word) + r'\b', title):
                    stopwords_list.append(word)

        if stopwords_list:
            log("{0} stop words were found in your page title. If possible, it would be good to "
                "reduce them. The stop words found are: {1}".format(len(stopwords_list), stopwords_list))


# attempts to access the URL currently in the Entry widget; logs an error and returns None if unable
def request_url():
    try:
        # 'agent' added as part of effort to avoid HTTP Error 403: Forbidden
        url_request = requests.get(url_entry.get(), headers={'User-Agent': agent})
        url_request.raise_for_status()
    except requests.exceptions.MissingSchema:
        log("ERROR: Invalid URL provided. Please try again with a valid URL.")
        return None
    except requests.exceptions.RequestException as exc:
        log("ERROR: Request failed: {0}".format(exc))
        return None
    return BeautifulSoup(url_request.text, 'lxml')


# runs when the button is clicked or Enter is pressed in the Entry widget
def run_program(event=None):
    urlSoup = request_url()
    if urlSoup is not None:
        seo_find_stopwords(urlSoup)


root = Tk()

url_entry = Entry(root)
url_entry.bind('<Return>', run_program)

log_text = st.ScrolledText(root, state='disabled')

start_button = Button(root, text='Run program', command=run_program)

url_entry.grid(column=0, row=1)
log_text.grid(column=2, row=0, rowspan=3)
start_button.grid(column=1, row=5)

root.mainloop()
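
As an optional addition that is not part of the original answer, the URL can also be checked before it is handed to requests.get(), so that an empty or scheme-less entry is reported without relying on the MissingSchema exception. The sketch below uses urllib.parse.urlsplit from the standard library. The helper name looks_like_http_url is hypothetical, and restricting the check to http and https is an assumption about what the program should accept.

```python
from urllib.parse import urlsplit


def looks_like_http_url(url):
    """Return True if url has an http/https scheme and a host part."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

One way to use it would be to call looks_like_http_url(url_entry.get()) at the top of request_url() and write the "Invalid URL" message to the log when it returns False, skipping the network request entirely.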
