Referer missing in HTTP header of Selenium request
Problem description
I'm writing some tests with Selenium and noticed that Referer is missing from the headers. I wrote the following minimal example to test this with https://httpbin.org/headers:
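import selenium.webdriver
import selenium.webdriver.support.ui  # imported explicitly so WebDriverWait is available

# Run Firefox headless and show httpbin's JSON as plain text instead of the JSON viewer.
options = selenium.webdriver.FirefoxOptions()
options.add_argument('--headless')
profile = selenium.webdriver.FirefoxProfile()
profile.set_preference('devtools.jsonview.enabled', False)
options.profile = profile
driver = selenium.webdriver.Firefox(options=options)
wait = selenium.webdriver.support.ui.WebDriverWait(driver, 10)

# Load a first page so that there is a page that can act as the referrer.
driver.get('http://www.python.org')
assert 'Python' in driver.title

# Navigate via JavaScript and print the request headers echoed back by httpbin.
url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
wait.until(lambda driver: driver.current_url == url)
print(driver.page_source)
driver.quit()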
which prints:
<html><head><link rel="alternate stylesheet" type="text/css" href="resource://content-accessible/plaintext.css" title="Wrap Long Lines"></head><body><pre>{
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.5",
"Connection": "close",
"Host": "httpbin.org",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0"
}
}
</pre></body></html>
So there is no Referer. However, if I browse to any page and manually execute
window.location.href = "https://httpbin.org/headers"
in the Firefox console, Referer does appear as expected.
As pointed out in the comments below, when using
driver.get("javascript: window.location.href = '{}'".format(url))
instead of
driver.execute_script("window.location.href = '{}';".format(url))
the request does include Referer. Also, when using Chrome instead of Firefox, both methods include Referer.
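The workaround mentioned above can be wrapped in a small helper. This is only a sketch: the function names `js_location_url` and `navigate_via_js_url` are made up for illustration, the helper assumes the URL contains no single quotes or backslashes, and whether Firefox then sends Referer is simply what was observed in this question, so it may differ between browser and driver versions.

```python
def js_location_url(url: str) -> str:
    """Build the javascript: URL used in the workaround (hypothetical helper)."""
    # Assumption: the URL is embedded in a single-quoted JavaScript string,
    # so quotes and backslashes are rejected rather than escaped.
    if "'" in url or "\\" in url:
        raise ValueError("URL must not contain single quotes or backslashes")
    return "javascript: window.location.href = '{}'".format(url)


def navigate_via_js_url(driver, url: str) -> None:
    """Navigate with driver.get() and a javascript: URL instead of execute_script()."""
    # `driver` is any object with a Selenium-style get() method.
    driver.get(js_location_url(url))
```

With a real driver this would be called as `navigate_via_js_url(driver, 'https://httpbin.org/headers')` after the first page has loaded.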
So the main question still stands: why is Referer missing from the request when it is sent with Firefox as described above?
Recommended answer
Referer
The Referer request header contains the address of the previous web page from which a link to the currently requested page was followed. The Referer header allows servers to identify where people are visiting them from, and they may use that data for analytics, logging, or optimized caching, for example.
Important: Although this header has many innocent uses, it can have undesirable consequences for user security and privacy.
Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer
However:
A Referer header is not sent by browsers if:
- The referring resource is a local "file" or "data" URI.
- An unsecured HTTP request is used and the referring page was received with a secure protocol (HTTPS).
Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer
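The two conditions quoted above can be written down as a small check. This is a rough sketch, not a model of a real browser: the function name `browser_would_send_referer` is hypothetical, and it deliberately ignores everything else that can suppress the header, such as a Referrer-Policy.

```python
from urllib.parse import urlsplit


def browser_would_send_referer(referring_url: str, target_url: str) -> bool:
    """Apply only the two quoted conditions; real browsers consider more."""
    referrer = urlsplit(referring_url)
    target = urlsplit(target_url)
    # Condition 1: the referring resource is a local "file" or "data" URI.
    if referrer.scheme in ("file", "data"):
        return False
    # Condition 2: the referring page came over HTTPS, but the request is plain HTTP.
    if referrer.scheme == "https" and target.scheme == "http":
        return False
    return True
```

For the navigation in the question, from https://www.python.org/ to https://httpbin.org/headers, neither condition applies, so these two rules alone do not account for the missing header.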
There are some privacy and security risks associated with the Referer HTTP header:
The Referer header contains the address of the previous web page from which a link to the currently requested page was followed, which can be further used for analytics, logging, or optimized caching.
From the Referer header perspective, the majority of security risks can be mitigated with the following steps:
- Referrer-Policy: Use the Referrer-Policy header on your server to control what information is sent through the Referer header. A directive of no-referrer omits the Referer header entirely.
- The referrerpolicy attribute on HTML elements that are in danger of leaking such information (such as <img> and <a>). This can, for example, be set to no-referrer to stop the Referer header from being sent altogether.
- The rel attribute set to noreferrer on HTML elements that are in danger of leaking such information (such as <img> and <a>).
- The Exit Page Redirect technique: The only method that should currently work without flaw is to have an exit page that you don't mind appearing in the referer header. Many websites implement this method, including Google and Facebook. If implemented correctly, the referrer data shows only the website that the user came from instead of private information. Instead of the referrer data appearing as http://example.com/user/foobar, the new referrer data will appear as http://example.com/exit?url=http%3A%2F%2Fexample.com. The method works by having all external links on your website go to an intermediary page that then redirects to the final page. For a link to the website example.com, the full URL is URL-encoded and added to the url parameter of the exit page.
Sources:
- https://developer.mozilla.org/en-US/docs/Web/Security/Referer_header:_privacy_and_security_concerns#How_can_we_fix_this
- https://geekthis.net/post/hide-http-referer-headers/#exit-page-redirect
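The URL encoding step of the exit page redirect technique described above might look like the following. It is a minimal sketch using only the standard library; the helper names `build_exit_link` and `destination_from_exit_link` are invented for illustration, and the exit page address is the example one from the quoted text.

```python
from urllib.parse import parse_qs, quote, urlsplit


def build_exit_link(exit_page: str, destination: str) -> str:
    """Return the exit-page link that carries the real destination in `url`."""
    # safe="" also percent-encodes ":" and "/", matching the quoted example.
    return "{}?url={}".format(exit_page, quote(destination, safe=""))


def destination_from_exit_link(exit_link: str) -> str:
    """Recover the destination, as the exit page would before redirecting."""
    return parse_qs(urlsplit(exit_link).query)["url"][0]


link = build_exit_link("http://example.com/exit", "http://example.com")
print(link)
print(destination_from_exit_link(link))
```

External links on the site would point at the value returned by `build_exit_link`, so the destination only ever sees the exit page as the referrer.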
I have executed your code with both the GeckoDriver/Firefox and the ChromeDriver/Chrome combination:
from selenium.webdriver.support.ui import WebDriverWait

# `driver` is a Firefox or Chrome WebDriver instance created beforehand.
driver.get('http://www.python.org')
assert 'Python' in driver.title
url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
WebDriverWait(driver, 10).until(lambda driver: driver.current_url == url)
print(driver.page_source)
Observations:
- Using GeckoDriver/Firefox, the Referer: "https://www.python.org/" header was missing, as follows:
{
    "headers": {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.5",
        "Host": "httpbin.org",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"
    }
}
- Using ChromeDriver/Chrome, the Referer: "https://www.python.org/" header was present, as follows:
{
    "headers": {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.9",
        "Host": "httpbin.org",
        "Referer": "https://www.python.org/",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36"
    }
}
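Rather than comparing the printed output by eye, the check could be automated by pulling the JSON out of `driver.page_source`. The sketch below assumes the page wraps the JSON in a `<pre>` element, as in the output shown in the question; the function name `headers_from_httpbin_page` is hypothetical.

```python
import html
import json
import re


def headers_from_httpbin_page(page_source: str) -> dict:
    """Extract the "headers" object from the httpbin page rendered by the browser."""
    # Assumption: the JSON sits inside a <pre> element; otherwise treat the
    # whole source as JSON.
    match = re.search(r"<pre[^>]*>(.*?)</pre>", page_source, re.DOTALL)
    body = match.group(1) if match else page_source
    # page_source is serialized HTML, so entities such as &amp; are unescaped first.
    return json.loads(html.unescape(body))["headers"]
```

A test could then assert `"Referer" in headers_from_httpbin_page(driver.page_source)`, which would fail for the Firefox run and pass for the Chrome run described above.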
It seems to be an issue with GeckoDriver/Firefox in handling the Referer header.