Are Squarespace RSS feeds blocked by PHP file pull requests?


Problem description

Squarespace provides a built-in RSS feed for every blog that uses its service. You can view the feed for any blog by appending ?format=rss to the end of the blog's URL. For example, http://denverdarling.com/home is a Squarespace blog, and its RSS feed is available at http://denverdarling.com/home?format=rss

When you type the RSS feed URL into a browser's address bar, it shows the RSS contents without any trouble. However, when I try to pull the same contents with a PHP script, I get the same error every time: "HTTP request failed! HTTP/1.0 400 Bad Request"

I have tried a few different PHP functions to pull the content, and I have tried them against several different Squarespace blogs. The functions include file_get_contents, fopen, simplexml_load_file, DOMDocument()->load(), etc. In every case the result is the same "HTTP request failed! HTTP/1.0 400 Bad Request" error.

The only thing I find when I google the topic is that you can't pull the RSS feed for a password-protected blog. Since none of the blogs I've tried are password-protected, I'm not sure what's going on.

Solution

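It is possible that they are blocking headless user agents, i.e. requests that do not identify themselves as a browser. PHP's HTTP stream wrapper sends no User-Agent header unless you configure one, so try sending browser-like headers through a stream context: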

<?php

$url = "http://denverdarling.com/home?format=rss";

// Browser-like request headers; each header line must end with \r\n.
$options = array(
  'http'=>array(
    'method'=>"GET",
    'header'=>"Accept-language: en\r\n" .
              "User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n" // i.e. an iPad
  )
);

$context = stream_context_create($options);
$file = file_get_contents($url, false, $context);

var_dump($file);

This works, so they or their host appear to be checking the headers in the request and rejecting requests that lack particular ones.
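
Since the question also mentions simplexml_load_file and DOMDocument, here is a minimal sketch of how the same idea might be carried through to parsing the feed. The function names (build_feed_context_options, fetch_feed, parse_feed_titles) are hypothetical helpers made up for illustration, and the sketch assumes the feed is RSS 2.0 with channel/item/title elements. It fetches the body with file_get_contents and parses the string, so the custom headers apply regardless of which XML API is used afterwards.

```php
<?php

// Hypothetical helper: build HTTP stream context options that identify
// the request with a User-Agent string.
function build_feed_context_options(string $userAgent): array
{
    return [
        'http' => [
            'method'        => 'GET',
            'header'        => "Accept-Language: en\r\n",
            'user_agent'    => $userAgent, // sent as the User-Agent header
            'ignore_errors' => true,       // return the body even on 4xx/5xx
        ],
    ];
}

// Hypothetical helper: fetch the raw feed body, or null if the request failed.
function fetch_feed(string $url, string $userAgent): ?string
{
    $context = stream_context_create(build_feed_context_options($userAgent));
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// Hypothetical helper: extract item titles, assuming an RSS 2.0 layout.
// Returns an empty array when the input is not parseable XML.
function parse_feed_titles(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // suppress parser warnings
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel->item)) {
        return [];
    }

    $titles = [];
    foreach ($feed->channel->item as $item) {
        $titles[] = (string) $item->title;
    }
    return $titles;
}
```

A call might look like parse_feed_titles(fetch_feed("http://denverdarling.com/home?format=rss", "Mozilla/5.0 ...") ?? ""). Whether a given User-Agent string is accepted depends on the server's filtering, which the answer above only infers from the observed behavior.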
