Determine unique visitors to site

Question

I'm creating a Django website with Apache2 as the server. I need a foolproof way to determine the number of unique visitors to my website (specifically, to each individual page). Unfortunately, users will have strong incentives to try to "game" the tracking system, so I'm trying to make it foolproof.

Is there any way to do this?

Currently I'm trying to use IP & Cookies to determine unique visitors, but this system can be easily fooled with a headless browser.

Recommended answer

Unless it's necessary that the data be integrated into your Django database, I'd strongly recommend "outsourcing" your traffic to another provider. I'm very happy with Google Analytics.

Failing that, there's really little you can do to keep someone from gaming the system. You could limit based on IP address but then of course you run into the problem that often many unique visitors share IPs (say, via a university, organization, or work site). Cookies are very easy to clear out, so if you go that route then it's very easy to game.
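To make the trade-off concrete, here is a minimal sketch in plain Python. The function name `count_unique`, the hit tuples, and the sample addresses are all made up for illustration. It counts "unique visitors" per page twice, once keyed on IP address and once keyed on a cookie ID, over the same log of hits.

```python
from collections import defaultdict


def count_unique(hits, key):
    """Count distinct identities per page.

    hits: iterable of (page, ip, cookie_id) tuples (hypothetical log format).
    key:  function (ip, cookie_id) -> whatever is treated as the visitor identity.
    """
    seen = defaultdict(set)
    for page, ip, cookie_id in hits:
        seen[page].add(key(ip, cookie_id))
    return {page: len(ids) for page, ids in seen.items()}


# Illustrative data: three real people produce four hits.
hits = [
    ("/item/1", "203.0.113.7", "a"),   # person 1, behind a shared campus IP
    ("/item/1", "203.0.113.7", "b"),   # person 2, same shared IP
    ("/item/1", "198.51.100.2", "c"),  # person 3
    ("/item/1", "198.51.100.2", "d"),  # person 3 again, after clearing cookies
]

by_ip = count_unique(hits, lambda ip, cookie_id: ip)
by_cookie = count_unique(hits, lambda ip, cookie_id: cookie_id)
```

With this sample data, counting by IP merges the two people behind the shared address, while counting by cookie treats the person who cleared cookies as a new visitor, so neither key matches the three real visitors.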

One thing that's harder to get rid of is files stored in the appcache, so one possible solution that would work on modern browsers is to store a file in the appcache. You'd count the first time it was loaded in as the unique visit, and after that since it's cached they don't get counted again.

Of course, you presumably need this to be backwards compatible, and that leaves it open to exactly the sorts of tools most likely to be used for gaming the system, such as curl.

You can certainly block non-browserlike user agents, which makes it slightly more difficult if some gamers don't know about spoofing browser agent strings (which most will quickly learn).
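A rough sketch of such a filter follows. The marker list and the function name `looks_like_browser` are assumptions for illustration, not a complete or authoritative list. As noted above, anyone who spoofs the header passes straight through.

```python
# Substrings that commonly appear in the User-Agent of command-line tools and
# HTTP libraries. This list is an illustrative assumption, not exhaustive.
NON_BROWSER_MARKERS = ("curl", "wget", "python-requests", "libwww", "httpclient")


def looks_like_browser(user_agent):
    """Heuristic check: True if the User-Agent string resembles a browser's."""
    if not user_agent:
        return False  # a missing or empty header is treated as non-browser
    ua = user_agent.lower()
    if any(marker in ua for marker in NON_BROWSER_MARKERS):
        return False
    # Mainstream browsers send a string that begins with "Mozilla/".
    return ua.startswith("mozilla/")
```

In a Django view the header would typically be read with `request.META.get("HTTP_USER_AGENT", "")` and the hit skipped (or the request rejected) when `looks_like_browser` returns `False`.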

Really, the best solution might be to ask: what is the outcome of a visit to a page? If it is, for example, selling a product, then don't reward the people who have the most page views; reward the people whose hits generate the most sales, or whatever other time-consuming action someone might take on the page.

If you're willing to ignore people with JavaScript disabled, you could choose to count only people who access the page and then stay on that page for a given window of time (say, 1 minute). After a given period of time, do an Ajax request back to the server. So if they tried to game by changing their cookie and loading multiple tabs at once, it wouldn't work because they'd need to have the same cookie in order to register that they'd been on that page long enough. I actually think this might work; I can't honestly see a way to game that. Basically on the server side you store a dictionary called stay_until in request.session with keys for each unique page and after 1 minute or so you run an Ajax call back to the server. If the value for stay_until[page_id] is less than or equal to the current time, then they're an active user, otherwise they're not. This means that it will take someone at least 20 minutes to generate 20 unique visitors, and so long as you make the payoff worth less than the time consumed that will be a strong disincentive.
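The server-side bookkeeping could be sketched roughly as follows. This is plain Python operating on a dict-like `session` so it runs without Django. The names `mark_page_loaded`, `confirm_visit`, `DWELL_SECONDS`, and the extra `counted` list are assumptions added for illustration; only `stay_until` comes from the description above.

```python
import time

DWELL_SECONDS = 60  # assumed minimum time on page ("1 minute or so")


def mark_page_loaded(session, page_id, now=None):
    """Call when the page is served: record when the visit becomes countable."""
    now = time.time() if now is None else now
    stay_until = session.setdefault("stay_until", {})
    # setdefault keeps the first deadline, so reloading does not reset the clock.
    stay_until.setdefault(str(page_id), now + DWELL_SECONDS)


def confirm_visit(session, page_id, now=None):
    """Call from the Ajax endpoint: True if this visit should be counted now."""
    now = time.time() if now is None else now
    key = str(page_id)
    deadline = session.get("stay_until", {}).get(key)
    if deadline is None:
        return False  # this session never loaded the page (e.g. cookie swapped)
    if deadline > now:
        return False  # the Ajax call arrived before the dwell time elapsed
    counted = session.setdefault("counted", [])
    if key in counted:
        return False  # already counted once for this session and page
    counted.append(key)
    return True
```

In a Django view, `request.session` could be passed as `session`. Because these functions mutate nested values, `request.session.modified = True` would need to be set afterwards so the change is saved. String keys and a list are used so the data stays JSON-serializable.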

I'd even make it more explicit: at the bottom of the page, in a noscript tag, put "Your access was not counted. Turn on JavaScript to be counted", with a link to a page that lays out the tracking process.
