Extract email addresses from a website using scripts
Question
Given a website, I wonder what is the best procedure, programmatically and/or using scripts, to extract all email addresses that are present on each page in plain text in the form XXXX@YYYYY.ZZZZ from that link and all sites underneath, recursively or until some fixed depth.
Using shell programming you can achieve your goal using 2 programs piped together:
An example:
wget -q -r -l 5 -O - http://somesite.com/ | grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b"
wget, in quiet mode (-q), fetches all pages recursively (-r) with a maximum depth of 5 (-l 5) from somesite.com and prints everything to stdout (-O -).
grep uses an extended regular expression (-E) and shows only (-o) the email addresses.
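You can check that the regular expression behaves as expected before crawling anything, by piping sample text through the same grep invocation (the sample addresses below are made up for illustration):

```shell
# Test the email regex locally, with no network access needed.
sample='Contact alice@example.com or bob.smith@mail.test.org for details.'
echo "$sample" | grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b"
# Prints each matched address on its own line:
#   alice@example.com
#   bob.smith@mail.test.org
```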
All emails are going to be printed to standard output, and you can write them to a file by appending > somefile.txt to the command.
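Since the same address usually appears on many pages, it can help to de-duplicate the output with sort -u before saving it. A self-contained sketch (the here-document stands in for the crawled pages, so no network is needed; somefile.txt is the placeholder output file from above):

```shell
# Extract and de-duplicate email addresses from text on stdin.
extract_emails() {
  grep -E -o "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" | sort -u
}

# Simulated crawler output; in practice this would be the wget pipeline.
extract_emails <<'EOF' > somefile.txt
<p>Write to alice@example.com.</p>
<p>Or to alice@example.com again, or to bob@test.org.</p>
EOF

cat somefile.txt
# Prints each unique address once:
#   alice@example.com
#   bob@test.org
```

In the real pipeline you would simply insert | sort -u between the grep stage and the > somefile.txt redirection.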
Read the man pages for more documentation on wget and grep.
This example was tested with GNU bash version 4.2.37(1)-release, GNU grep 2.12 and GNU Wget 1.13.4.