How do I uniquely identify computers visiting my web site?


Question

I need to figure out a way to uniquely identify each computer that visits the web site I am creating. Does anybody have any advice on how to achieve this?

Because I want the solution to work on all machines and all browsers (within reason), I am trying to create a solution using JavaScript.

Cookies will not do.

I need the ability to create a GUID which is unique to a computer and repeatable, assuming no hardware changes have happened to the computer. Directions I am thinking of are getting the MAC address of the network card and other information of this nature which will identify the machine visiting the web site.

Recommended answer

Introduction

I don't know if there is or ever will be a way to uniquely identify machines using a browser alone. The main reasons are:

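  • You will need to save data on the user's computer. The user can delete this data at any time. Unless you have a way to recreate this data, which is unique to each machine, you are stuck.
  • Validation. You need to guard against spoofing, session hijacking, etc.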

Even if there are ways to track a computer without using cookies, there will always be a way to bypass them, and software that does so automatically. If you really need to track something based on a computer, you will have to write a native application (Apple Store / Android Store / Windows Program / etc).

I might not be able to give you an answer to the question you asked, but I can show you how to implement session tracking. With session tracking you try to track the browsing session instead of the computer visiting your site. By tracking the session, your database schema will look like this:

session:
  sessionID: string
  // Global session data goes here
  
  computers: [{
     BrowserID: string
     ComputerID: string
     FingerprintID: string
     userID: string
     authToken: string
     ipAddresses: ["203.525....", "203.525...", ...]
     // Computer session data goes here
  }, ...]
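
As a rough illustration, one record in this schema could be held in a plain JavaScript object. This is only a sketch: `createSession` and `addComputer` are hypothetical helper names that do not appear in the answer, and matching computers on the BrowserID/ComputerID pair is an assumption made for the example.

```javascript
// Hypothetical helpers illustrating the schema; field names follow the text.
function createSession(sessionID) {
  return { sessionID, computers: [] };
}

function addComputer(session, computer) {
  // Reuse the existing entry when the same browser/computer pair is seen again.
  let entry = session.computers.find(
    (c) => c.BrowserID === computer.BrowserID && c.ComputerID === computer.ComputerID
  );
  if (!entry) {
    entry = {
      BrowserID: computer.BrowserID,
      ComputerID: computer.ComputerID,
      FingerprintID: computer.FingerprintID || "",
      userID: computer.userID || "",
      authToken: computer.authToken || "",
      ipAddresses: [],
    };
    session.computers.push(entry);
  }
  // Record each distinct IP address once.
  if (computer.ip && !entry.ipAddresses.includes(computer.ip)) {
    entry.ipAddresses.push(computer.ip);
  }
  return entry;
}
```

Calling `addComputer` repeatedly for the same browser and computer only grows the `ipAddresses` list, which is what lets one session follow a user across changing IP addresses.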

Advantages of session based tracking:

  1. For logged in users, you can always generate the same session ID from the user's username / password / email.
  2. You can still track guest users using sessionID.
  3. Even if several people use the same computer (e.g. a cybercafe), you can track them separately if they log in.

Disadvantages of session based tracking:

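  1. Sessions are browser based, not computer based. If a user uses 2 different browsers, it will result in 2 different sessions. If this is a problem, you can stop reading here.
  2. Sessions expire if the user is not logged in. A user who is not logged in uses a guest session, which is invalidated if the user deletes cookies and browser cache.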

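Implementation

There are many ways of implementing this. I don't think I can cover them all, so I'll just list my favorite, which makes this an opinionated answer. Bear that in mind.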

I will track the session by using what is known as a forever cookie. This is data which will automagically recreate itself even if the user deletes their cookies or updates their browser. It will not, however, survive the user deleting both their cookies and their browsing cache.

To implement this I will use the browser's caching mechanism (RFC), the WebStorage API (MDN) and browser cookies (RFC, Google Analytics).

In order to utilize tracking IDs you need to add them to both your privacy policy and your terms of use, preferably under the sub-heading Tracking. We will use the following keys on both document.cookie and window.localStorage:

  • _ga: Google Analytics data
  • __utma: Google Analytics tracking cookie
  • sid: SessionID

Make sure you include links to your privacy policy and terms of use on all pages that use tracking.

You can either store your session data in your website database or on the user's computer. Since I normally work on smaller sites (less than 10 thousand continuous connections) that use 3rd party applications (Google Analytics / Clicky / etc), it's best for me to store data on the client's computer. This has the following advantages:

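  1. No database lookups / overhead / load / latency / space etc.
  2. Users can delete their data whenever they want without needing to write me annoying emails.

and disadvantages:

  1. Data has to be encrypted / decrypted and signed / verified, which creates CPU overhead on the client (not so bad) and the server (bah!).
  2. Data is deleted when the user deletes their cookies and cache. (This is what I really want.)
  3. Data is unavailable for analytics when users go offline. (Analytics for currently browsing users only.)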
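
To make the sign / verify cost concrete, here is a minimal server-side sketch using Node's built-in crypto module. It only signs the data (HMAC-SHA256) and does not encrypt it, and `signData`, `verifyData` and `SERVER_KEY` are names I made up for the example, not something from a particular library.

```javascript
const nodeCrypto = require("node:crypto");

// Assumption: a server-side secret; in practice it would come from configuration.
const SERVER_KEY = "replace-with-a-real-secret";

// Serialize the data and append an HMAC-SHA256 signature.
function signData(data, key = SERVER_KEY) {
  const payload = Buffer.from(JSON.stringify(data)).toString("base64url");
  const sig = nodeCrypto.createHmac("sha256", key).update(payload).digest("base64url");
  return payload + "." + sig;
}

// Return the parsed data if the signature matches, otherwise null.
function verifyData(token, key = SERVER_KEY) {
  const [payload, sig] = String(token).split(".");
  if (!payload || !sig) return null;
  const expected = nodeCrypto.createHmac("sha256", key).update(payload).digest();
  const given = Buffer.from(sig, "base64url");
  if (given.length !== expected.length || !nodeCrypto.timingSafeEqual(given, expected)) {
    return null; // tampered data or wrong key
  }
  return JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
}
```

The token returned by `signData` is what would be stored in the cookie or in localStorage. The server calls `verifyData` on every request and treats `null` as "no session data".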

UUIDS

  • BrowserID: Unique ID generated from the browser's user agent string. Browser|BrowserVersion|OS|OSVersion|Processor|MozzilaMajorVersion|GeckoMajorVersion
  • ComputerID: Generated from the user's IP address and HTTPS session key. getISP(requestIP)|getHTTPSClientKey()
  • FingerPrintID: JavaScript based fingerprinting based on a modified fingerprint.js. FingerPrint.get()
  • SessionID: Random key generated when the user first visits the site. BrowserID|ComputerID|randombytes(256)
  • GoogleID: Generated from the __utma cookie. getCookie(__utma).uniqueid
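
A sketch of how the first few IDs might be put together on the server, assuming Node's crypto module. Hashing the pipe-joined parts with SHA-256 is my own choice for the example. User agent parsing, the ISP lookup and the HTTPS client key are assumed to be done elsewhere and passed in as plain strings. The answer writes randombytes(256) without units, and I use 256 bits here.

```javascript
const nodeCrypto = require("node:crypto");

// Hash a pipe-joined list of components into a hex identifier.
function hashParts(parts) {
  return nodeCrypto.createHash("sha256").update(parts.join("|")).digest("hex");
}

// BrowserID: derived from already-parsed user agent fields (hypothetical field names).
function createBrowserID(ua) {
  return hashParts([ua.browser, ua.browserVersion, ua.os, ua.osVersion, ua.processor]);
}

// ComputerID: derived from the ISP name and an HTTPS client key, both supplied by the caller.
function createComputerID(isp, httpsClientKey) {
  return hashParts([isp, httpsClientKey]);
}

// SessionID: BrowserID|ComputerID|random bytes, generated on the first visit.
function createSessionID(browserID, computerID) {
  const random = nodeCrypto.randomBytes(32).toString("hex"); // 256 random bits
  return [browserID, computerID, random].join("|");
}
```

BrowserID and ComputerID are repeatable for the same inputs, while SessionID differs on every call because of the random part.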
      The other day I was watching The Wendy Williams Show with my girlfriend and was completely horrified when the host advised her viewers to delete their browser history at least once a month. Deleting browser history normally has the following effects:

      1. Deletes history of visited websites.
      2. Deletes cookies and window.localStorage (aww man).

      Most modern browsers make this option readily available, but fear not, friends, for there is a solution. The browser has a caching mechanism to store scripts, images and other things. Usually, even if we delete our history, this browser cache remains. All we need is a way to store our data there. There are 2 methods of doing this. The better one is to use an SVG image and store our data inside its tags. This way the data can still be extracted, using Flash, even if JavaScript is disabled. However, since that is a bit complicated, I will demonstrate the other approach, which uses JSONP (Wikipedia).

      example.com/assets/js/tracking.js (actually tracking.php)

      var now = new Date();
      window.__sid = "SessionID"; // Server generated

      // setFullYear returns a millisecond timestamp: one year from now, minus a day.
      setCookie("sid", window.__sid, now.setFullYear(now.getFullYear() + 1, now.getMonth(), now.getDate() - 1));

      if( "localStorage" in window ) {
        window.localStorage.setItem("sid", window.__sid);
      }
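
The snippet above relies on setCookie and getCookie helpers that are not shown. One possible shape for them is sketched below. The `formatCookie` and `readCookie` names, the `path=/` attribute and the URL-encoding are my assumptions. The `expires` argument accepts either a Date or a millisecond timestamp, since the call above passes a timestamp.

```javascript
// Build a cookie string; `expires` is a Date or a millisecond timestamp.
function formatCookie(name, value, expires) {
  const when = new Date(expires).toUTCString();
  return encodeURIComponent(name) + "=" + encodeURIComponent(value) +
    "; expires=" + when + "; path=/";
}

// Read one value out of a "k1=v1; k2=v2" cookie string; "" when absent.
function readCookie(cookieString, name) {
  for (const part of cookieString.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (decodeURIComponent(key) === name) {
      return decodeURIComponent(rest.join("="));
    }
  }
  return "";
}

// Browser-only wrappers matching the names used in tracking.js.
function setCookie(name, value, expires) {
  document.cookie = formatCookie(name, value, expires);
}

function getCookie(name) {
  return readCookie(document.cookie, name);
}
```

Keeping the string handling in `formatCookie` and `readCookie` means the logic can be exercised outside a browser, while `setCookie` and `getCookie` only touch `document.cookie`.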
      

      Now we can get our session key any time:

      window.__sid || window.localStorage.getItem("sid") || getCookie("sid") || ""

      How do I make tracking.js stay in the browser?

      We can achieve this using the Cache-Control, Last-Modified and ETag HTTP headers. We can use the SessionID as the value for the ETag header:

      setHeaders({
        "ETag": SessionID,
        "Last-Modified": new Date(0).toUTCString(),
        "Cache-Control": "private, max-age=31536000, s-maxage=31536000, must-revalidate"
      })
      

      The Last-Modified header tells the browser that this file is basically never modified. Cache-Control tells proxies and gateways not to cache the document but tells the browser to cache it for 1 year.

      The next time the browser requests the document, it will send the If-Modified-Since and If-None-Match headers. We can use these to return a 304 Not Modified response.

      example.com/assets/js/tracking.php

      $sid = getHeader("If-None-Match") ?: getHeader("if-none-match") ?: getHeader("IF-NONE-MATCH") ?: ""; 
      $ifModifiedSince = hasHeader("If-Modified-Since") ?: hasHeader("if-modified-since") ?: hasHeader("IF-MODIFIED-SINCE");
      
      if( validateSession($sid) ) {
        if( sessionExists($sid) ) {
          continueSession($sid);
          send304();
        } else {
          startSession($sid);
          send304();
        }
      } else if( $ifModifiedSince ) {
        send304();
      } else {
        startSession();
        send200();
      }
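
The same decision can be sketched in JavaScript as a pure function, which is easier to test than a full request handler. This is only an illustration: `respondToTracking` and its `isValidSession` callback are hypothetical, header names are lowercase as Node's http module delivers them, and I wrap the ETag value in double quotes because the HTTP specification defines entity tags as quoted strings.

```javascript
// Decide how to answer a request for tracking.js.
// headers: request headers with lowercase names.
// isValidSession: hypothetical validator supplied by the caller.
// newSessionID: ID to hand out when the browser has nothing cached.
function respondToTracking(headers, isValidSession, newSessionID) {
  // Strip the optional weak prefix and the surrounding quotes from the ETag.
  const sid = (headers["if-none-match"] || "").replace(/^(W\/)?"|"$/g, "");

  if (sid && isValidSession(sid)) {
    return { status: 304, sid }; // the cached copy carries a valid session
  }
  if (headers["if-modified-since"]) {
    return { status: 304, sid: "" }; // cached copy exists, keep using it
  }
  return {
    status: 200,
    sid: newSessionID,
    headers: {
      "ETag": '"' + newSessionID + '"',
      "Last-Modified": new Date(0).toUTCString(),
      "Cache-Control": "private, max-age=31536000, must-revalidate",
    },
    // The script body that ends up in the browser cache.
    body: "window.__sid = " + JSON.stringify(newSessionID) + ";",
  };
}
```

A real handler would then write `status`, `headers` and `body` to the response, and start or continue the server-side session for the returned `sid`.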
      

      Now every time the browser requests tracking.js again, our server will respond with a 304 Not Modified result, which makes the browser execute its local copy of tracking.js.

      I still don't understand. Explain it to me

      Let's suppose the user clears their browsing history and refreshes the page. The only thing left on the user's computer is a copy of tracking.js in the browser cache. When the browser requests tracking.js, it receives a 304 Not Modified response, which causes it to execute the first version of tracking.js it received. tracking.js executes and restores the SessionID that was deleted.

      Suppose Haxor X steals our customers' cookies while they are still logged in. How do we protect them? Cryptography and browser fingerprinting to the rescue. Remember, our original definition for SessionID was:

      BrowserID|ComputerID|randomBytes(256)
      

      We can change this to:

      Timestamp|BrowserID|ComputerID|encrypt(randomBytes(256), hk)|sign(Timestamp|BrowserID|ComputerID|randomBytes(256), hk)
      

      Where hk = sign(Timestamp|BrowserID|ComputerID, serverKey).

      Now we can validate our SessionID using the following algorithm:

      if( getTimestamp($sid) is older than 1 year ) return false;
      if( getBrowserID($sid) !== createBrowserID($_Request, $_Server) ) return false;
      if( getComputerID($sid) !== createComputerID($_Request, $_Server) ) return false;
      
      $hk = sign(getTimestamp($sid) + getBrowserID($sid) + getComputerID($sid), $SERVER["key"]);
      
      if( !verify(getTimestamp($sid) + getBrowserID($sid) + getComputerID($sid) + decrypt(getRandomBytes($sid), $hk), getSignature($sid), $hk) ) return false;
      
      return true; 
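
For anyone who wants something runnable, here is how the scheme could look with Node's crypto module. The concrete primitives are my assumptions, not part of the definition above: HMAC-SHA256 stands in for sign, AES-256-GCM for encrypt, and BrowserID and ComputerID are assumed to contain no "|" characters (for example because they are hex hashes).

```javascript
const nodeCrypto = require("node:crypto");

const ONE_YEAR_MS = 365 * 24 * 60 * 60 * 1000;

// sign(): HMAC-SHA256, returned as a 32-byte Buffer.
const hmac = (key, text) => nodeCrypto.createHmac("sha256", key).update(text).digest();

// encrypt(): AES-256-GCM; output is hex of iv (12) + auth tag (16) + ciphertext.
function encrypt(plain, key) {
  const iv = nodeCrypto.randomBytes(12);
  const cipher = nodeCrypto.createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plain), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("hex");
}

// decrypt(): throws if the data was tampered with or the key is wrong.
function decrypt(hex, key) {
  const raw = Buffer.from(hex, "hex");
  const decipher = nodeCrypto.createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]);
}

// Timestamp|BrowserID|ComputerID|encrypt(random, hk)|sign(head|random, hk)
function createSessionID(browserID, computerID, serverKey, now = Date.now()) {
  const head = [now, browserID, computerID].join("|");
  const hk = hmac(serverKey, head); // per-session key derived from the server key
  const random = nodeCrypto.randomBytes(32);
  const sig = hmac(hk, head + "|" + random.toString("hex")).toString("hex");
  return [head, encrypt(random, hk), sig].join("|");
}

function validateSessionID(sid, browserID, computerID, serverKey, now = Date.now()) {
  const parts = String(sid).split("|");
  if (parts.length !== 5) return false;
  const [ts, bid, cid, enc, sig] = parts;
  if (!(now - Number(ts) <= ONE_YEAR_MS)) return false; // too old, or not a number
  if (bid !== browserID || cid !== computerID) return false;
  const head = [ts, bid, cid].join("|");
  const hk = hmac(serverKey, head);
  try {
    const random = decrypt(enc, hk);
    const expected = hmac(hk, head + "|" + random.toString("hex"));
    const given = Buffer.from(sig, "hex");
    return given.length === expected.length && nodeCrypto.timingSafeEqual(given, expected);
  } catch {
    return false; // decryption failed
  }
}
```

`browserID` and `computerID` passed to `validateSessionID` are the values recomputed from the current request, so a session ID replayed from a different browser or network fails the comparison before any cryptography runs.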
      

      Now in order for Haxor's attack to work they must:

      1. Have the same ComputerID. That means they have to have the same ISP as the victim (Tricky). This will give our victim the opportunity to take legal action in their own country. Haxor must also obtain the HTTPS session key from the victim (Hard).
      2. Have the same BrowserID. Anyone can spoof the User-Agent string (Annoying).
      3. Be able to create their own fake SessionID (Very Hard). Volume attacks won't work because we use a timestamp to generate the encryption / signing key, so basically it's like generating a new key for each session. On top of that, we encrypt the random bytes, so a simple dictionary attack is also out of the question.

      We can improve validation by forwarding GoogleID and FingerprintID (via ajax or hidden fields) and matching against those.

      if( GoogleID != getStoredGoogleID($sid) ) return false;
      if( byte_difference(FingerPrintID, getStoredFingerprint($sid)) > 10% ) return false;
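
byte_difference is not defined anywhere above, so here is one guess at what it could mean: the fraction of positions at which two fingerprint strings differ. This only makes sense if the fingerprint is a concatenation of raw components and not a hash, since a hash changes completely when a single input changes. `byteDifference` and `fingerprintMatches` are names I picked for the sketch.

```javascript
// Fraction of positions at which two strings differ.
// Positions beyond the end of the shorter string count as differences.
function byteDifference(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 0;
  let diff = 0;
  for (let i = 0; i < longest; i++) {
    if (a[i] !== b[i]) diff++;
  }
  return diff / longest;
}

// Accept the fingerprint when at most `tolerance` of it has changed.
function fingerprintMatches(current, stored, tolerance = 0.10) {
  return byteDifference(current, stored) <= tolerance;
}
```

With the default tolerance, a fingerprint that differs in 1 position out of 10 still matches, while one that differs in 2 out of 10 does not.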
      
