Stopping scripters from slamming your website

Question

I've accepted an answer, but sadly, I believe we're stuck with our original worst case scenario: CAPTCHA everyone on purchase attempts of the crap. Short explanation: caching / web farms make it impossible to track hits, and any workaround (sending a non-cached web-beacon, writing to a unified table, etc.) slows the site down worse than the bots would. There is likely some pricey hardware from Cisco or the like that can help at a high level, but it's hard to justify the cost if CAPTCHA-ing everyone is an alternative. I'll attempt a more full explanation later, as well as cleaning this up for future searchers (though others are welcome to try, as it's community wiki).

Situation

This is about the bag o' crap sales on woot.com. I'm the president of Woot Workshop, the subsidiary of Woot that does the design, writes the product descriptions, podcasts, blog posts, and moderates the forums. I work with CSS/HTML and am only barely familiar with other technologies. I work closely with the developers and have talked through all of the answers here (and many other ideas we've had).

Usability is a massive part of my job, and making the site exciting and fun is most of the rest of it. That's where the three goals below derive. CAPTCHA harms usability, and bots steal the fun and excitement out of our crap sales.

Bots are slamming our front page tens of times a second, screen scraping (and/or scanning our RSS) for the Random Crap sale. The moment they see that, it triggers a second stage of the program that logs in, clicks "I want One", fills out the form, and buys the crap.

lc: On stackoverflow and other sites that use this method, they're almost always dealing with authenticated (logged in) users, because the task being attempted requires that.

On Woot, anonymous (non-logged) users can view our home page. In other words, the slamming bots can be non-authenticated (and essentially non-trackable except by IP address).

So we're back to scanning for IPs, which a) is fairly useless in this age of cloud networking and spambot zombies and b) catches too many innocents given the number of businesses that come from one IP address (not to mention the issues with non-static IP ISPs and potential performance hits to trying to track this).

Oh, and having people call us would be the worst possible scenario. Can we have them call you?

BradC: Ned Batchelder's methods look pretty cool, but they're pretty firmly designed to defeat bots built for a network of sites. Our problem is bots are built specifically to defeat our site. Some of these methods could likely work for a short time until the scripters evolved their bots to ignore the honeypot, screen-scrape for nearby label names instead of form ids, and use a javascript-capable browser control.
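For readers unfamiliar with the honeypot idea mentioned here, a minimal sketch follows. It assumes the order form contains an extra input that is hidden from humans with CSS, so any submitted value suggests an automated client. The function name and the field name `website` are hypothetical, not anything Woot uses, and as noted above a bot written for one specific site can simply learn to leave the field empty.

```python
def looks_like_bot(form, honeypot_field="website"):
    """Return True if the hidden honeypot field was filled in.

    `form` is a plain dict of submitted field names to values.
    The field name "website" is a hypothetical choice for illustration.
    """
    # Humans never see the field, so it should arrive empty or missing.
    return bool(form.get(honeypot_field, "").strip())
```

A server-side handler would call this before processing the order and quietly discard (or flag) submissions where it returns True.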

lc again: "Unless, of course, the hype is part of your marketing scheme." Yes, it definitely is. The surprise of when the item appears, as well as the excitement if you manage to get one is probably as much or more important than the crap you actually end up getting. Anything that eliminates first-come/first-serve is detrimental to the thrill of 'winning' the crap.

novatrust: And I, for one, welcome our new bot overlords. We actually do offer RSS feeds to allow 3rd party apps to scan our site for product info, but not ahead of the main site HTML. If I'm interpreting it right, your solution does help goal 2 (performance issues) by completely sacrificing goal 1, and just resigning ourselves to the fact that bots will be buying most of the crap. I up-voted your response, because your last paragraph pessimism feels accurate to me. There seems to be no silver bullet here.

The rest of the responses generally rely on IP tracking, which, again, seems to both be useless (with botnets/zombies/cloud networking) and detrimental (catching many innocents who come from same-IP destinations).

Any other approaches / ideas? My developers keep saying "let's just do CAPTCHA" but I'm hoping there are methods less intrusive to all the actual humans wanting some of our crap.

Say you're selling something cheap that has a very high perceived value, and you have a very limited amount. No one knows exactly when you will sell this item. And over a million people regularly come by to see what you're selling.

You end up with scripters and bots attempting to programmatically [a] figure out when you're selling said item, and [b] make sure they're among the first to buy it. This sucks for two reasons:

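  1. Your site gets slammed by non-humans, slowing everything down for everyone.
  2. The scripters end up "winning" the product, leaving the regular users feeling cheated.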

A seemingly obvious solution is to create some hoops for your users to jump through before placing their order, but there are at least three problems with this:

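  • The user experience suffers for humans, as they have to decipher a CAPTCHA, pick out the cat, or solve a math problem.
  • If the perceived benefit is high enough and the crowd large enough, some group will find its way around any tweak, leading to an arms race. (This is especially true the simpler the tweak is: a hidden "comments" form field, re-arranged form elements, mislabeled elements, and hidden "gotcha" text will each work once and then need to be changed to fight bots targeting this specific form.)
  • Even if the scripters can't "solve" your tweak, that doesn't prevent them from slamming your front page and then sounding an alarm for the scripter to fill out the order manually. Given the advantage they get from solving [a], they will likely still win [b], since they'll be the first humans to reach the order page. Additionally, 1. still happens, causing server errors and decreased performance for everyone.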

Another solution is to watch for IPs hitting too often, block them from the firewall, or otherwise prevent them from ordering. This could solve 2. and prevent [b] but the performance hit from scanning for IPs is massive and would likely cause more problems like 1. than the scripters were causing on their own. Additionally, the possibility of cloud networking and spambot zombies makes IP checking fairly useless.
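To make "watch for IPs hitting too often" concrete, here is a minimal sketch of a per-IP sliding-window counter. The class name, parameters, and thresholds are all hypothetical. It keeps its state in the memory of a single process, so on a web farm each server would only see its own share of the hits, which is exactly the tracking limitation described in this question, and it does nothing about many innocent users sharing one IP.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-IP sliding-window hit counter (hypothetical helper, in-memory only)."""

    def __init__(self, max_hits, window_seconds, clock=time.monotonic):
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # ip -> timestamps of recent allowed hits

    def allow(self, ip):
        """Record a hit from `ip`; return False once it exceeds the limit."""
        now = self.clock()
        recent = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False
        recent.append(now)
        return True
```

For example, `SlidingWindowLimiter(max_hits=10, window_seconds=1.0)` would reject the eleventh request from one address within any one-second window. Sharing the counters across servers would need a common store, which is the "unified table" cost the question is worried about.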

A third idea, forcing the order form to be loaded for some time (say, half a second) would potentially slow the progress of the speedy orders, but again, the scripters would still be the first people in, at any speed not detrimental to actual users.
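One way such a minimum load time could be enforced is sketched below, assuming the server embeds a signed timestamp in the form when it is served and checks it on submission. The secret key, function names, and token format are hypothetical. As the paragraph above says, this only slows the fastest submissions; a bot can simply wait out the delay and still arrive before any human.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; must stay server-side
MIN_DWELL_SECONDS = 0.5  # the "half a second" from the text


def _sign(issued):
    """HMAC-SHA256 signature of the issue time, as a hex string."""
    return hmac.new(SECRET_KEY, issued.encode(), hashlib.sha256).hexdigest()


def issue_form_token(now=None):
    """Return 'timestamp:signature' to embed in the order form when it is served."""
    issued = repr(time.time() if now is None else now)
    return f"{issued}:{_sign(issued)}"


def form_submitted_too_fast(token, now=None):
    """True if the token is invalid or the form was open for less than the minimum."""
    issued, _, sig = token.partition(":")
    if not hmac.compare_digest(sig.encode(), _sign(issued).encode()):
        return True  # tampered or malformed token: reject like a too-fast submit
    now = time.time() if now is None else now
    return now - float(issued) < MIN_DWELL_SECONDS
```

Signing the timestamp means the client cannot backdate it, and because the token carries its own state, no shared server-side table is needed on a web farm.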

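Goals

  1. Sell the item to non-scripting humans.
  2. Keep the site running at a speed not slowed by bots.
  3. Don't hassle the "normal" users with any tasks to complete to prove they're human.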

Recommended answer

You need to figure out a way to make the bots buy stuff that is massively overpriced: 12mm wingnut: $20. See how many the bots snap up before the script-writers decide you're gaming them.

Use the profits to buy more servers and pay for bandwidth.
