Prevent image hotlinking in Google Image Search

Problem description

Google recently introduced a new interface for its Image Search. Since January 25, 2013, full-size images are shown directly inside Google, without sending visitors to the source site. I came across a site that has apparently developed a sophisticated approach to prevent users from grabbing images from Google by introducing some sort of watermark dynamically. To see this, search the new Google Image Search interface for images from "fansshare.com". This link should work: Google Image Search. If not, simply enter "site:fansshare.com" in the Google search input field. Be sure to be on the new search interface, though.

How does fansshare.com achieve this? I couldn't figure it out ...

Update:

fansshare.com adds a GET param to all of their image URLs, like ?rnd=69. Example image URL: http://fansshare.com/media/content/570_Jessica-Biel-talks-Kate-Beckinsale-Total-Recall-fight-5423.jpg?rnd=62

This image URL works for a few calls or seconds, after which a redirect takes place to a cached, watermarked image: http://fansshare.com/cached/?version=media/content/570_Jessica-Biel-talks-Kate-Beckinsale-Total-Recall-fight-5423.jpg&rnd=5810

We have finally managed to fully mimic FansShare's hotlink protection and we've published our findings in the following, extensive blog post:

http://pixabay.com/en/blog/posts/hotlinking-protection-and-watermarking-for-google-32/

Recommended answer

There is a solution, but as with other solutions, Google may interpret it as cloaking and ban the site at will. This is a long answer, and it will probably need further tinkering to work for your case. (Sorry in advance for the length.)

Setup

For the sake of the example, let's just say that:

  • site: www.thesite.com and
  • ImageURL base: images.thesite.com

(but the ImageURL base could just as easily be www.thesite.com/wp-content/uploads)

Target

Our target is twofold: (1) the full-size image is shown only with a watermark/overlay when it is requested from Google Image Search, and (2) nothing that previously worked gets broken.

Solution

So the theoretical solution is the following.

1) Check the User-Agent, and if it contains Googlebot, serve the "trap" URL. The trap URL is your current image URL, slightly changed so that you can treat it differently. So instead of the current normal URL:

http://images.thesite.com/wallpapers/awesome.jpg

you should print for Googlebots:

http://cacheimages.thesite.com/wallpapers/awesome.jpg

(where cacheimages is anything you want)
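
Step 1 can be sketched roughly as follows. This is a minimal illustration in Python, not the answer's actual implementation: the function name `image_url_for` and the constants are hypothetical, and it assumes the page template calls this helper for every image URL it prints. Keep in mind that the User-Agent header is trivially spoofed, so this is a heuristic only.

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts taken from the example setup above; TRAP_HOST can be any name you control.
NORMAL_HOST = "images.thesite.com"
TRAP_HOST = "cacheimages.thesite.com"


def image_url_for(user_agent: str, image_url: str) -> str:
    """Return the trap URL for Googlebot and the unchanged URL for everyone else."""
    if "googlebot" not in (user_agent or "").lower():
        return image_url
    parts = urlsplit(image_url)
    if parts.netloc != NORMAL_HOST:
        # Not one of our image URLs, leave it alone.
        return image_url
    # Swap only the host; path and query stay identical.
    return urlunsplit(parts._replace(netloc=TRAP_HOST))
```

Regular visitors keep seeing the normal image URLs in the page source, while Googlebot indexes the trap host.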

2) Now the main dish: you should be able to target the requests to http://cacheimages.thesite.com/ and have a script that acts as follows:

 If the request comes from a bot (check the User-Agent header)
     Then serve the normal image without watermark
 Else (if the request seems to be from a normal user)
     Then check the referer: If it's from Google (but NOT http://www.google.com/blank.html)
          Redirect to the post of the image (Note 1.)
     Else if the referer is your site
          Show the raw normal image
     Else (any other referer, including http://www.google.com/blank.html)
          Show watermarked image (Note 2.)

Note 1: This will happen when people click "View original image" or the image itself

Note 2: This will happen when people try to see the full-size image from the google image search results (and if they somehow arrive to the trap url of an image)
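
The decision logic of step 2 could look something like the sketch below. It only decides which action to take; actually serving files, redirecting and watermarking are left out. The names `Action`, `decide` and `BOT_PATTERN` are hypothetical, the bot list is an assumption, and the Google check covers only google.com here. An empty referer is treated as "any other referer", which is one possible reading of the rules above.

```python
import re
from enum import Enum
from urllib.parse import urlsplit


class Action(Enum):
    SERVE_RAW = "serve_raw"
    REDIRECT_TO_POST = "redirect_to_post"
    SERVE_WATERMARKED = "serve_watermarked"


# Assumed list of crawler names; extend it as needed.
BOT_PATTERN = re.compile(r"googlebot|bingbot", re.IGNORECASE)
SITE_DOMAIN = "thesite.com"


def decide(user_agent: str, referer: str) -> Action:
    """Map a request to the trap host onto one of the three actions."""
    if BOT_PATTERN.search(user_agent or ""):
        return Action.SERVE_RAW  # bots get the clean image

    parts = urlsplit(referer or "")
    host = parts.hostname or ""

    is_google = host == "google.com" or host.endswith(".google.com")
    if is_google and parts.path != "/blank.html":
        return Action.REDIRECT_TO_POST  # Note 1

    if host == SITE_DOMAIN or host.endswith("." + SITE_DOMAIN):
        return Action.SERVE_RAW  # request comes from our own pages

    # Any other referer, including google.com/blank.html and no referer at all.
    return Action.SERVE_WATERMARKED  # Note 2
```

Keeping the decision separate from the I/O makes it easy to try out referer combinations before wiring it into a web server.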

3) You could HTTP-redirect the old image URLs to the new ImageURL base when the user-agent is Googlebot, so the overlay/watermark trick starts working on old images sooner (or even use Google Webmaster Tools if you use subdomains for images), and you are sure to preserve the SEO juice.

Additional steps

You could do more changes if you want to be serious.

  1. Instead of showing the watermarked image, redirect to a more dynamic URL such as http://cacheimages.thesite.com/preview?p=/wallpapers/awesome.jpg&r=23535, or use the more modern approach of an HTTP header to prevent indexing: X-Robots-Tag: noindex
  2. Cache the watermarked images, of course.
  3. Check the Accept HTTP header for cases that I haven't thought of, and serve the image or redirect to the image post accordingly.
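
As a rough illustration of the first item, the redirect to a dynamic preview URL might be built as in the sketch below. The helper name `preview_redirect` is hypothetical, and combining the redirect with the X-Robots-Tag header in one response is an assumption; the list above presents them as alternatives.

```python
import random
from typing import Optional
from urllib.parse import urlencode

PREVIEW_BASE = "http://cacheimages.thesite.com/preview"


def preview_redirect(image_path: str, rnd: Optional[int] = None):
    """Return (status, headers) for a redirect to the watermarked preview."""
    if rnd is None:
        # Random value so that the preview URL differs between requests.
        rnd = random.randint(0, 99999)
    # safe="/" keeps the slashes in the path readable, as in the example URL.
    query = urlencode({"p": image_path, "r": rnd}, safe="/")
    headers = {
        "Location": f"{PREVIEW_BASE}?{query}",
        "X-Robots-Tag": "noindex",  # ask crawlers not to index this response
    }
    return 302, headers
```

A call such as `preview_redirect("/wallpapers/awesome.jpg")` gives a status code and header dictionary that can be handed to whatever web framework serves the trap host.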

Note

You may also have to think about international traffic so instead of google.com you want to check for google.[a-z-\.]+/
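
One way to apply such a check is to match the referer's host name rather than the whole URL. The sketch below uses a somewhat stricter pattern than the one quoted above; that pattern and the name `is_google_referer` are assumptions for illustration. It remains a heuristic: a look-alike domain shaped like google.xyz.co would still pass.

```python
import re
from urllib.parse import urlsplit

# Optional subdomains, then "google.", a 2-3 letter label, and an optional
# two-letter country suffix (google.de, google.co.uk, google.com.au, ...).
GOOGLE_HOST = re.compile(r"^(?:[a-z0-9-]+\.)*google\.[a-z]{2,3}(?:\.[a-z]{2})?$")


def is_google_referer(referer: str) -> bool:
    """Return True if the referer's host looks like a Google domain."""
    host = (urlsplit(referer or "").hostname or "").lower()
    return bool(GOOGLE_HOST.match(host))
```

Matching on the parsed host avoids false positives from URLs that merely contain "google." somewhere in their path or query string.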

Conclusion

This could be adapted to any system. I made it for one that has images on a subdomain, so it probably won't be exactly the same for other systems such as WordPress. Also, I am sure Google will change their image search in the next couple of months to fix this issue.

An untested sample implementation of the idea can be found on GitHub.

Disclaimer

This hasn't been tested thoroughly and you could get banned. It is provided merely for research and educational purposes. I cannot be held responsible for any damages.
