Match pattern for all Google search pages


Problem description


I'm developing an extension which will perform a certain action on all Google search URLs - but not on other websites or Google pages. In natural language the match pattern is:

  • Any protocol ('*://')
  • Any subdomain or none ('www' or '')
  • The domain string must equal 'google'
  • Any TLD including three-letter TLDs (e.g. '.com') and multi-part country TLDs (e.g. '.co.uk')
  • The first 8 letters of the path must equal '/search?'

Many people say "to match all Google search pages, use `*://*.google.com/search?*`", but this is patently untrue, as it will not match national TLDs like google.co.uk.

Thus the following code does not work at all:

chrome.webRequest.onBeforeRequest.addListener(
  function(details) {
    alert('This never happens');
  }, {
    urls: [
        "*://*.google.*/search?*",
        "*://google.*/search?*",
    ],
    types: ["main_frame"]
  },
  ["blocking"]
);

Using "*://*.google.com/search?*" as the match pattern does work, but I fear I would need a list of every single Google localisation for that to be an effective strategy.

Solution

Unfortunately, match patterns do not allow wildcards for TLDs for security reasons.

You cannot use wildcard match patterns like http://google.*/* to match TLDs (like http://google.es and http://google.fr) due to the complexity of actually restricting such a match to only the desired domains.

For the example of http://google.*/*, the Google domains would be matched, but so would http://google.someotherdomain.com. Additionally, many sites do not own all of the TLDs for their domain. For an example, assume you want to use http://example.*/* to match http://example.com and http://example.es, but http://example.net is a hostile site. If your extension has a bug, the hostile site could potentially attack your extension in order to get access to your extension's increased privileges.

You should explicitly enumerate the TLDs that you wish to run your extension on.

A slightly unrealistic option would be to list all variants with all national TLDs.

Edit: thanks to an incredibly helpful comment by rsanchez, there is an up-to-date list of all Google domain variants, which makes this approach viable.
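As a rough sketch of the enumeration approach, the patterns can be generated from an explicit domain list. The `GOOGLE_DOMAINS` array and the `buildSearchPatterns` helper below are hypothetical names, and the four domains are only a sample, not the full list. The sketch assumes that a `*.` host prefix in a match pattern also covers the bare domain, and it drops the `"blocking"` option because it only logs the request.

```javascript
// Hypothetical sample; a real extension would need the complete list of Google domains.
const GOOGLE_DOMAINS = ["google.com", "google.co.uk", "google.es", "google.fr"];

// Build one match pattern per explicitly listed domain.
// "*." is assumed to match the domain itself as well as any subdomain.
function buildSearchPatterns(domains) {
  return domains.map((domain) => `*://*.${domain}/search?*`);
}

// Register the listener only when the extension API is available
// (the guard lets the helper be exercised outside the browser too).
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    (details) => {
      console.log("Google search request:", details.url);
    },
    { urls: buildSearchPatterns(GOOGLE_DOMAINS), types: ["main_frame"] }
  );
}
```

The same generated patterns would also have to be listed in the extension's manifest permissions, which cannot be computed at runtime.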

A realistic option is to inject into a larger set of pages (for instance, all pages), then analyze the URL (with a regexp, for example) and only execute if it matches the pattern you are looking for. Yes, it will be a scarier permissions warning, and you will have to explain it to your users.
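A minimal sketch of that URL check follows. The `isGoogleSearchUrl` function and the `GOOGLE_SEARCH_HOST` expression are illustrative, not part of any Chrome API. The regular expression assumes Google hosts end in either a single label (`.com`, `.es`) or `co.`/`com.` plus a country code (`.co.uk`, `.com.au`). This is looser than an explicit list and could match a `google.<tld>` host that Google does not own.

```javascript
// Assumption: optional subdomains, then "google", then either a single TLD
// or "co."/"com." followed by a country code.
const GOOGLE_SEARCH_HOST = /^(?:[a-z0-9-]+\.)*google\.(?:(?:co|com)\.)?[a-z]{2,}$/;

// Return true only for http(s) URLs on a Google-looking host whose path is
// exactly "/search" and which carry a non-empty query string.
function isGoogleSearchUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false;
  }
  // URL.hostname is already lower-cased; URL.pathname excludes the query string.
  return (
    GOOGLE_SEARCH_HOST.test(url.hostname) &&
    url.pathname === "/search" &&
    url.search.startsWith("?")
  );
}
```

A content script injected into all pages could call `isGoogleSearchUrl(location.href)` first and return early when the result is false, so the extension's real work runs only on search pages.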
