How to check whether a URL is a web page link or a file link in Python

Problem description

Suppose I have the following links:

    http://example.com/index.html
    http://example.com/stack.zip
    http://example.com/setup.exe
    http://example.com/news/

Of the links above, the first and fourth are web page links, and the second and third are file links.

These are only a few examples of file links (.zip and .exe); there may be many other file types.

Is there a standard way to tell a file URL from a web page link? Thanks in advance.

Recommended answer

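import mimetypes
import urllib.request


def guess_type_of(link, strict=True):
    # First try to guess the MIME type from the file extension in the URL.
    link_type, _ = mimetypes.guess_type(link)
    if link_type is None and strict:
        # No recognizable extension: ask the server and read its Content-Type header.
        with urllib.request.urlopen(link) as u:
            link_type = u.headers.get_content_type()  # Python 3 replacement for gettype()
    return link_type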
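
The fallback branch above opens the link with an ordinary GET request. When only the type is needed, a HEAD request asks the server for the headers alone. The following is a minimal sketch, assuming Python 3 and a server that answers HEAD requests; the helper name `head_content_type` and its `timeout` default are illustrative and not part of the original answer:

```python
import urllib.request


def head_content_type(link, timeout=10):
    """Return the Content-Type the server reports, without requesting the body."""
    # method="HEAD" asks for the response headers only.
    request = urllib.request.Request(link, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # get_content_type() drops parameters such as "; charset=utf-8".
        return response.headers.get_content_type()
```

Some servers reject HEAD requests; in that case `urlopen` raises `urllib.error.HTTPError`, and falling back to the GET-based lookup is one option.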

Demo:

links = ['http://stackoverflow.com/q/21515098/538284',  # an HTML page
         'http://upload.wikimedia.org/wikipedia/meta/6/6d/Wikipedia_wordmark_1x.png',  # a PNG file
         'http://commons.wikimedia.org/wiki/File:Typing_example.ogv',  # an HTML page
         'http://upload.wikimedia.org/wikipedia/commons/e/e6/Typing_example.ogv',  # an OGV file
]

for link in links:
    print(guess_type_of(link))

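Output (as reported in the original answer; exact values depend on the local MIME type table and on the headers each server sends):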

text/html
image/x-png
text/html
application/ogg
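
To turn a MIME type into the page-or-file answer the question asks for, one option is to treat HTML types as web pages and everything else as files. The sketch below is an assumption about where to draw that line, not part of the original answer; `PAGE_TYPES`, `classify`, and `classify_by_extension` are hypothetical names:

```python
import mimetypes

# Assumption: only these MIME types count as "web pages"; any other type is a "file".
PAGE_TYPES = {"text/html", "application/xhtml+xml"}


def classify(mime_type):
    """Map a MIME type string (or None) to 'page', 'file', or 'unknown'."""
    if mime_type is None:
        return "unknown"
    return "page" if mime_type.lower() in PAGE_TYPES else "file"


def classify_by_extension(link):
    """Classify a link using only the extension in its URL (no network access)."""
    mime_type, _ = mimetypes.guess_type(link)
    return classify(mime_type)


# The four links from the question.
for link in ["http://example.com/index.html",
             "http://example.com/stack.zip",
             "http://example.com/setup.exe",
             "http://example.com/news/"]:
    print(link, classify_by_extension(link))
```

A URL such as http://example.com/news/ has no extension, so the extension-only check returns "unknown"; those are the cases where the Content-Type header lookup from the answer is needed, and its result can be passed to `classify` in the same way.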
