Check if a URL goes to a page containing the text "404"

Problem description

I have a bash script that checks the HTTP status code of a list of URLs, but I realized that some of them, while appearing to return "200", actually display a page containing "error 404". How can I check for that?

Here is my current script:

#!/bin/bash
# Print the HTTP status code for each URL in url-list.txt (HEAD request only)
while IFS= read -r LINE; do
  curl -o /dev/null --silent --head --write-out '%{http_code}\n' "$LINE"
done < url-list.txt

(I got it from a previous question: Script to get the HTTP status code of a list of urls? http://stackoverflow.com/questions/6136022/script-to-get-the-http-status-code-of-a-list-of-urls)

EDIT: There seems to be a bug in the script: it returns "200", but if I run wget -o log on that same address I get "404 Not Found".

Recommended answer

For fun, here is a pure Bash solution:

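# Report what a given status code means for a URL
dosomething() {
        code="$1"; url="$2"
        case "$code" in
                200) echo "OK for $url";;
                302) echo "redir for $url";;
                404) echo "notfound for $url";;
                *) echo "other $code for $url";;
        esac
}

#MAIN program
while read -r url
do
        # Split the URL into host and path (plain http:// URLs only)
        uri=($(echo "$url" | sed 's~http://\([^/][^/]*\)\(.*\)~\1 \2~'))
        HOST=${uri[0]:=localhost}
        FILE=${uri[1]:=/}
        # Open a TCP connection to port 80 through bash's /dev/tcp
        exec {SOCKET}<>/dev/tcp/$HOST/80
        # HTTP needs CRLF line endings; "Connection: close" makes the server end the stream
        printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$FILE" "$HOST" >&${SOCKET}
        # Drop everything from the blank line on, then keep the status line
        res=($(<&${SOCKET} sed '/^.$/,$d' | grep '^HTTP'))
        # Close the socket before handling the next URL
        exec {SOCKET}>&-
        dosomething "${res[1]}" "$url"
done << EOF
http://stackoverflow.com
http://stackoverflow.com/some/bad/url
EOF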
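
The socket approach reports only the status code. For the case in the question, where the server answers 200 but the page itself says "error 404", one option is to download the body and search it. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function names looks_like_404 and check_url and the search pattern are introduced here for illustration, and a page that legitimately mentions 404 or "not found" would be flagged wrongly. It issues a full GET rather than curl --head, since the body is needed.

```bash
#!/bin/bash
# Succeed (exit 0) if the text on stdin looks like an error page:
# it contains "404" as a standalone number, or the phrase "not found".
# The pattern is an assumption; tune it to the error pages you actually see.
looks_like_404() {
    grep -qiE '(^|[^0-9])404([^0-9]|$)|not found'
}

# Fetch a URL with a full GET and classify it.
# Prints "soft404 <url>" when the status is 200 but the body looks like
# an error page, otherwise "<status code> <url>".
check_url() {
    local url=$1 tmp code
    tmp=$(mktemp) || return 1
    # --location follows redirects, --output saves the body,
    # --write-out prints the final status code on stdout.
    code=$(curl --silent --location --output "$tmp" \
                --write-out '%{http_code}' "$url")
    if [ "$code" = "200" ] && looks_like_404 < "$tmp"; then
        echo "soft404 $url"
    else
        echo "$code $url"
    fi
    rm -f "$tmp"
}
```

To run it over the list from the question, a loop such as `while IFS= read -r url; do check_url "$url"; done < url-list.txt` should work.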
