Login to Amazon using cURL

Problem description

I'm trying to log in to Amazon using cURL. However, when I send the POST data I get nothing back. I want to use cURL only; I don't want to use any API. This is the code I tried:

<?php
$curl_crack = curl_init();
CURL_SETOPT($curl_crack,CURLOPT_URL,"https://www.amazon.com/ap/signin?_encoding=UTF8&openid.assoc_handle=usflex&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.mode=checkid_setup&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&openid.pape.max_auth_age=0&openid.return_to=https%3A%2F%2Fwww.amazon.com%2F%3Fref_%3Dnav_custrec_signin");
CURL_SETOPT($curl_crack,CURLOPT_USERAGENT,$_SERVER['HTTP_USER_AGENT']);
//CURL_SETOPT($curl_crack,CURLOPT_PROXY,trim($socks[$sockscount]));
//CURL_SETOPT($curl_crack,CURLOPT_PROXYTYPE,CURLPROXY_SOCKS5);
CURL_SETOPT($curl_crack,CURLOPT_POST,True);
CURL_SETOPT($curl_crack,CURLOPT_POSTFIELDS,"appAction=SIGNIN&email=test@hotmail.com&create=0&password=test123");
CURL_SETOPT($curl_crack,CURLOPT_RETURNTRANSFER,True);
CURL_SETOPT($curl_crack,CURLOPT_COOKIEFILE,"cookie.txt");
curl_setopt($curl_crack, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl_crack, CURLOPT_FOLLOWLOCATION, 1);
CURL_SETOPT($curl_crack,CURLOPT_TIMEOUT,30);  
echo $check = curl_exec($curl_crack);

?> 

Solution

Here you go. Tested & working.

EDIT: This code stopped working sometime before June 2016. Amazon has added client-side JavaScript browser fingerprinting that breaks automated logins like the one below. It's actually not that hard to bypass, but I haven't spent time engineering PHP code to do so, since it would be easily broken by minor changes on Amazon's side.

Instead, I've posted an example below the old PHP code that uses CasperJS to log in. PhantomJS or Selenium could also be used.

To supply a little background, JavaScript populates an extra form field called metaData1 with a base64-encoded string of obfuscated browser information. Some of it might be compared with data collected on the server side.

Here's an example string (before encoding):

9E0AC647#{"version":"2.3.6-AUI","start":1466184997409,"elapsed":5,"userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36","plugins":"Chrome PDF Viewer Shockwave Flash 2100Widevine Content Decryption Module 148885Native Client ||1600-1200-1150-24---","dupedPlugins":"Chrome PDF Viewer Shockwave Flash 2100Widevine Content Decryption Module 148885Native Client Chrome PDF Viewer ||1600-1200-1150-24---","flashVersion":"21.0.0","timeZone":-8,"lsUbid":"X69-8317848-6241674:1466184997","mercury":{"version":"2.1.0","start":1467231996334,"ubid":"X69-8317848-6241674:1466184997","trueIp":"1020304","echoLatency":831},"timeToSubmit":57868,"interaction":{"keys":47,"copies":0,"cuts":0,"pastes":0,"clicks":6}}

As you can see, the string contains some creepy information: which browser plugins are loaded, your key press and mouse click counts on the page, your time zone, your screen and viewport resolution, and how long you spent on the login page. The trueIp field is your computer's IP address as a 32-bit integer. There's quite a bit more info that it can collect, but this is a sample from my browser.
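To make the 32-bit IP representation concrete, here is a minimal sketch of converting between dotted-quad text and a 32-bit unsigned integer. The function names `ipToInt` and `intToIp` are made up for illustration, and the sketch assumes the usual big-endian (network) byte order; the byte order Amazon's script actually uses is not stated here.

```javascript
// Hypothetical helpers: convert an IPv4 address to and from a 32-bit integer.
// Assumption: big-endian (network) byte order, first octet in the top byte.
function ipToInt(ip) {
    var parts = ip.split('.').map(Number);
    // ">>> 0" forces the result to be an unsigned 32-bit value
    return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

function intToIp(n) {
    // Pull out each byte, highest first, and join with dots
    return [(n >>> 24) & 255, (n >>> 16) & 255, (n >>> 8) & 255, n & 255].join('.');
}

console.log(ipToInt('1.2.3.4'));  // 16909060
console.log(intToIp(16909060));   // 1.2.3.4
```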

The value 9E0AC647 is a CRC32 checksum of the string after the # (it won't match here because I changed trueIp and other data). This data then goes through a transformation (encoding) using some values from the JavaScript, is base64 encoded, and is then added to the login form.
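As a rough illustration of the checksum prefix only, the sketch below computes a standard CRC-32 (the same variant as zlib and PHP's crc32()) and prepends it as eight uppercase hex digits followed by #. The helper names `checksumPrefix` and `withChecksum` are hypothetical, and the zero-padded uppercase format is an assumption inferred from the sample value above. The later obfuscation step is not reproduced because its details are not given here.

```javascript
// Lookup table for standard CRC-32 (reflected polynomial 0xEDB88320).
var CRC_TABLE = (function () {
    var table = [];
    for (var n = 0; n < 256; n++) {
        var c = n;
        for (var k = 0; k < 8; k++) {
            c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
        }
        table[n] = c >>> 0;
    }
    return table;
})();

// CRC-32 of a string; assumes single-byte (ASCII/Latin-1) characters.
function crc32(str) {
    var crc = 0xFFFFFFFF;
    for (var i = 0; i < str.length; i++) {
        crc = CRC_TABLE[(crc ^ str.charCodeAt(i)) & 0xFF] ^ (crc >>> 8);
    }
    return (crc ^ 0xFFFFFFFF) >>> 0;
}

// Hypothetical helper: checksum as 8 uppercase hex digits (assumed format).
function checksumPrefix(payload) {
    var hex = crc32(payload).toString(16).toUpperCase();
    return ('00000000' + hex).slice(-8);
}

// Hypothetical helper: "<CRC32>#<payload>", the layout of the sample string.
function withChecksum(payload) {
    return checksumPrefix(payload) + '#' + payload;
}

console.log(withChecksum('{"version":"2.3.6-AUI"}'));
```

Recomputing the prefix over a captured payload is a quick way to check whether the data after the # was altered.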

Here's a permanent paste of the JS code responsible for all of this.


The steps:

  • Fetch the home page to establish cookies
  • Parse HTML to extract login URL
  • Fetch login page
  • Parse HTML and find signin form
  • Extract form inputs for login (there are quite a few required hidden fields)
  • Build post array for login
  • Submit login form
  • Check for success or failure

PHP Code (no longer working - see example below):

<?php

// amazon username & password
$username = 'you@example.com';
$password = 'yourpassword';

// http headers for requests
$headers = array(
    'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language: en-US,en;q=0.5',
    'Connection: keep-alive',
    'DNT: 1', // :)
);

// initialize curl
$ch = curl_init('https://www.amazon.com/');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, '');
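curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); // send the headers defined above (otherwise $headers is unused)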

// fetch homepage to establish cookies
$result = curl_exec($ch);

// parse HTML looking for login URL
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($result);

// find link to login page
$xpath    = new DOMXPath($dom);
$elements = $xpath->query('//*[@id="nav-link-yourAccount"]');

if ($elements->length == 0) {
    die('Did not find "sign-in" link!');
}

// get login url
$url = $elements->item(0)->attributes->getNamedItem('href')->nodeValue;

if (strpos($url, 'http') !== 0) {
    $url = 'https://www.amazon.com' . $url;
}

// fetch login page
curl_setopt($ch, CURLOPT_URL, $url);
$result = curl_exec($ch);

// parse html to get form inputs
$dom->loadHTML($result);
$xpath = new DOMXPath($dom);

// find sign in form inputs
$inputs = $xpath->query('//form[@name="signIn"]//input');

if ($inputs->length == 0) {
    die('Failed to find login form fields!');
}

// get login post url
$url = $xpath->query('//form[@name="signIn"]');
$url = $url->item(0)->attributes->getNamedItem('action')->nodeValue; // form action (login URL)

// array of form fields to submit
$fields = array();

// build list of form inputs and values
for ($i = 0; $i < $inputs->length; ++$i) {
    $attribs = $inputs->item($i)->attributes;

    if ($attribs->getNamedItem('name') !== null) {
        $val = (null !== $attribs->getNamedItem('value')) ? $attribs->getNamedItem('value')->nodeValue : '';
        $fields[$attribs->getNamedItem('name')->nodeValue] = $val;
    }
}

// populate login form fields
$fields['email']    = $username;
$fields['password'] = $password;

// prepare for login
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));

// execute login post
$result = curl_exec($ch);
$info   = curl_getinfo($ch);

// if login failed, url should be the same as the login url
if ($info['url'] == $url) {
    echo "There was a problem logging in.<br>\n";
    var_dump($result);
} else {
    // if successful, we are redirected to homepage so URL is different than login url
    echo "Should be logged in!<br>\n";
    var_dump($result);
}

Working CasperJS code:

var casper = require('casper').create();

casper.userAgent('Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0');
phantom.cookiesEnabled = true;

var AMAZON_USER = 'you@yoursite.com';
var AMAZON_PASS = 'some crazy password';

casper.start('https://www.amazon.com/').thenClick('a#nav-link-yourAccount', function() {
    this.echo('Title: ' + this.getTitle());

    var emailInput = 'input#ap_email';
    var passInput  = 'input#ap_password';

    this.mouseEvent('click', emailInput, '15%', '48%');
    this.sendKeys('input#ap_email', AMAZON_USER);

    this.wait(3000, function() {
        this.mouseEvent('click', passInput, '12%', '67%');
        this.sendKeys('input#ap_password', AMAZON_PASS);

        this.mouseEvent('click', 'input#signInSubmit', '50%', '50%');
    });
});

casper.then(function(e) {
    this.wait(5000, function() {
        this.echo('Capping');
        this.capture('amazon.png');
    });
});


casper.run(function() {
    console.log('Done');

    casper.exit(); // exit explicitly; a custom run() callback must end the script itself
});

You should really extend this code to act more like a human!
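One possible way to act more human is to replace the fixed waits and instant typing with randomized pauses. The sketch below is only an illustration: `randomDelay` and `typingPlan` are hypothetical helper names, and the timing ranges are arbitrary guesses rather than values known to pass any fingerprinting check.

```javascript
// Hypothetical helper: random integer delay in [minMs, maxMs].
// rng is injectable (defaults to Math.random) so the function can be tested.
function randomDelay(minMs, maxMs, rng) {
    var r = (rng || Math.random)();
    return Math.floor(minMs + r * (maxMs - minMs + 1));
}

// Hypothetical helper: one keystroke per character, each with its own pause.
function typingPlan(text, minMs, maxMs, rng) {
    return text.split('').map(function (ch) {
        return { key: ch, pause: randomDelay(minMs, maxMs, rng) };
    });
}

// Example: plan how to type an address with 80-250 ms between keys.
var plan = typingPlan('you@yoursite.com', 80, 250);
console.log(plan.length + ' keystrokes planned');
```

In the CasperJS script, each entry could then be sent with this.sendKeys(selector, entry.key, {keepFocus: true}) followed by this.wait(entry.pause), and the fixed 3000 and 5000 ms waits could be swapped for randomDelay calls.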
