What did I forget in order to correctly send an email using Scrapy
Problem Description
I want to use Scrapy to send an email.
I read through the official website, and I found that I can do this:
from scrapy.mail import MailSender
from scrapy.utils.project import get_project_settings
settings = get_project_settings()
mailer = MailSender(mailfrom="Something@gmail.com", smtphost="smtp.gmail.com", smtpport=465, smtppass="MySecretPassword")
mailer.send(to=["AnotherMail@gmail.com"], subject="Some subject", body="Some body")
The code didn't throw any exception, but no mail was sent.
What am I missing?
I need to work with the Scrapy framework, not pure Python.
I don't want to apply the default settings by using mailer = MailSender.from_settings(settings), because as you can see I have my own custom options. By the way, I did try the default settings, but got the same result: no exception, but no email sent.
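For completeness, when I say I tried the default settings, I mean giving MailSender.from_settings(settings) its values through the project's settings.py. The setting names below are the ones documented by Scrapy; the credentials are placeholders:

```python
# settings.py -- values that MailSender.from_settings() picks up
# (setting names per the Scrapy docs; credentials are placeholders)
MAIL_FROM = "Something@gmail.com"
MAIL_HOST = "smtp.gmail.com"
MAIL_PORT = 587
MAIL_USER = "Something@gmail.com"
MAIL_PASS = "MySecretPassword"
```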
I hope you can help me.
Accepted Answer
Two things come to mind with your code. First, whether the mailer code is being executed at all, and second, the smtpuser parameter should be populated.
Here is working code to send email via Gmail using Scrapy. This answer has 4 sections: email code, complete example, logging, and Gmail configuration. The complete example is provided because a few things need to be coordinated for this to work.
Email Code
To have Scrapy send email, you can add the following in your Spider class (complete example in the next section). These examples have Scrapy send the email after crawling has completed.
There are two chunks of code to add: the first to import the modules and the second to send the email.
Import the modules:
from scrapy import signals
from scrapy.mail import MailSender
Inside your Spider class definition:
class MySpider(Spider):

    <SPIDER CODE>

    @classmethod
    def from_crawler(cls, crawler):
        spider = cls()
        # Connect the spider_closed signal so the email is sent when the crawl ends
        crawler.signals.connect(spider.spider_closed, signals.spider_closed)
        return spider

    def spider_closed(self, spider):
        mailer = MailSender(mailfrom="Something@gmail.com", smtphost="smtp.gmail.com", smtpport=587, smtpuser="Something@gmail.com", smtppass="MySecretPassword")
        return mailer.send(to=["AnotherMail@gmail.com"], subject="Some subject", body="Some body")
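A side note on the original code: port 465 is Gmail's implicit-SSL port, while 587 (used above) is the STARTTLS port. If you would rather keep port 465, Scrapy's MailSender constructor takes an smtpssl flag for this; the sketch below assumes your Scrapy version exposes that parameter (check scrapy.mail.MailSender's signature for your version):

```python
# Variant for Gmail's SSL port 465 -- assumes your Scrapy version
# supports the smtpssl constructor flag (credentials are placeholders)
mailer = MailSender(mailfrom="Something@gmail.com",
                    smtphost="smtp.gmail.com",
                    smtpport=465,
                    smtpuser="Something@gmail.com",
                    smtppass="MySecretPassword",
                    smtpssl=True)
```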
Complete Example
Putting this together, this example uses the dirbot example located at:
https://github.com/scrapy/dirbot
Only one file needs to be edited:
./dirbot/spiders/dmoz.py
Here is the entire working file, with the imports near the top and the email code at the end of the spider class:
from scrapy.spider import Spider
from scrapy.selector import Selector

from dirbot.items import Website

from scrapy import signals
from scrapy.mail import MailSender


class DmozSpider(Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        """
        The lines below is a spider contract. For more info see:
        http://doc.scrapy.org/en/latest/topics/contracts.html

        @url http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/
        @scrapes name
        """
        sel = Selector(response)
        sites = sel.xpath('//ul[@class="directory-url"]/li')
        items = []

        for site in sites:
            item = Website()
            item['name'] = site.xpath('a/text()').extract()
            item['url'] = site.xpath('a/@href').extract()
            item['description'] = site.xpath('text()').re('-\s[^\n]*\\r')
            items.append(item)

        return items

    @classmethod
    def from_crawler(cls, crawler):
        spider = cls()
        # Send the email when the spider finishes crawling
        crawler.signals.connect(spider.spider_closed, signals.spider_closed)
        return spider

    def spider_closed(self, spider):
        mailer = MailSender(mailfrom="Something@gmail.com", smtphost="smtp.gmail.com", smtpport=587, smtpuser="Something@gmail.com", smtppass="MySecretPassword")
        return mailer.send(to=["AnotherMail@gmail.com"], subject="Some subject", body="Some body")
Once this file is updated, run the standard crawl command from the project directory to crawl and send the email:
$ scrapy crawl dmoz
Logging
By returning the output of the mailer.send method from the spider_closed method, Scrapy will automatically add the result to its log. Here are examples of successes and failures:
Success Log Message:
2015-03-22 23:24:30-0000 [scrapy] INFO: Mail sent OK: To=['AnotherMail@gmail.com'] Cc=None Subject="Some subject" Attachs=0
Error Log Message - Unable to Connect:
2015-03-22 23:39:45-0000 [scrapy] ERROR: Unable to send mail: To=['AnotherMail@gmail.com'] Cc=None Subject="Some subject" Attachs=0- Unable to connect to server.
Error Log Message - Authentication Failure:
2015-03-22 23:38:29-0000 [scrapy] ERROR: Unable to send mail: To=['AnotherMail@gmail.com'] Cc=None Subject="Some subject" Attachs=0- 535 5.7.8 Username and Password not accepted. Learn more at 5.7.8 http://support.google.com/mail/bin/answer.py?answer=14257 sb4sm6116233pbb.5 - gsmtp
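If you want to watch for these outcomes programmatically, for example in a wrapper script that scans the crawl log, the lines above can be classified on their fixed marker strings. This helper is my own illustration, not part of Scrapy:

```python
# Classify Scrapy MailSender log lines (illustrative helper, not part of Scrapy).
# Matches on the fixed phrases Scrapy writes for mail success and failure.
def classify_mail_log_line(line):
    """Return 'sent', 'error', or None for an unrelated log line."""
    if "Mail sent OK" in line:
        return "sent"
    if "Unable to send mail" in line:
        return "error"
    return None  # not a MailSender status line

print(classify_mail_log_line(
    "2015-03-22 23:24:30-0000 [scrapy] INFO: Mail sent OK: "
    "To=['AnotherMail@gmail.com'] Cc=None Subject=\"Some subject\" Attachs=0"
))  # prints "sent"
```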
Gmail Configuration
To configure Gmail to accept email this way, you need to enable "Access for less secure apps", which you can do at the URL below while logged in to the account:
https://www.google.com/settings/security/lesssecureapps