PHP curl模拟登录带验证码的网站


Posted in PHP onNovember 30, 2015

需求是这样的,需要登录带验证码的网站,获取数据,但是不可能人为一直去记录数据,想通过自动采集的方式进行,如下是试验出来的结果代码!有需要的可以参考下!

<?php
namespace Home\Controller;
use Think\Controller;
class LoginController extends Controller
{
  protected $cookieName = array('cookie_verify', 'cookie_verify');
  protected $cookiePath = '/cookie/';
  protected $cookiePathFile = array();
  public function index()
  {
    $this->display();
  }
  public function _initialize(){
    foreach($this->cookieName as $key => $name)
    {
      $this->cookiePathFile[] = ROOT_PATH . $this->cookiePath . $this->cookieName[$key] . '_xxx.txt';
    }
  }
  /**
   * 登录xxx
   */
  public function xxxLogin()
  {
    $username = I('username');
    $password = I('password');
    $verifyCode = I('verify');
    $loginData = array(
      '__VIEWSTATE' => '/wEPDwUKMTU0MzAzOTU4NmQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFDExvZ2luX1N1Ym1pdL/yae69NsY163G3yuP0lxjz8oXu',              //不把参数补全可能会不被响应哦
      '__VIEWSTATEGENERATOR' => 'DC42DE27',
      'txt_UserName' => $username,
      'txt_PWD' => $password,
      'txt_VerifyCode' => $verifyCode,
      'SMONEY' => 'ABC',
      'Login_Submit.x' => '52',
      'Login_Submit.y' => '19',
    );
    $getBack = $this->_cookieRequest('http://xxx.com/noLogin.aspx', $loginData);
    if(preg_match('/<div[^\<div]*?id\s*=\s*[\'\"]{1}div_msg[\'\"]{1}.*?>(.*?)<\/div>/s', $getBack, $match)){
      echo 'matched\r\n';
      print_r($match);
    }else{
      echo $getBack, '<br />';
      $paramsFull = parse_url($getBack);
      parse_str($paramsFull['query'], $paramsFull['parsedQuery']);
      if(!empty($paramsFull['parsedQuery']['Warn'])) {
        $msg = "您好,欢迎来P,请先登录。";
        switch ($paramsFull['parsedQuery']['Warn'])
        {
          case '2':
            $msg = '您输入的验证码错误,请重试';
            break;
          case '3':
            $msg = '该帐号不存在,还没帐号?';
            break;
          case '5':
            $msg = '账户已注销';
            break;
          case '6':
            $msg = '密码错误,如果连续错误3次半小时内不能登录!';
            break;
          case '20':
            $msg = '今日密码错误3次及以上,请于半小时后再来登录!';
            break;
          case '21':
            $msg = '今日您所在IP的所有帐号密码错误9次以上,请于半小时后再来登录!';
            break;
          case '22':
            $msg = '登录失败,您所在IP今日登录的帐号过多!';
            break;
          case '23':
            $msg = '登录失败,验证码失效!';
            break;
          case '32':
            $msg = '该帐号已经绑定其他xx帐号!';
            break;
          case '33':
            $msg = '一台电脑一天只能注册一个帐号!';
            break;
        }
        $this->error($msg, '', 5);
      }else{
        $_SESSION['user_id'] = '123456';      //登录设置session
        $this->success('登录P网站成功', U('Index/index'), 5);
      }
    }
  }
  /**
   * 获取验证码
   */
  public function getVerifyCode()
  {
    $img = $this->_cookieRequest('http://xxx.com/VerifyCode_Login.aspx?id=' . rand(10000,999999), null, true, 1);
    echo $img;
  }
  /**
   * 删除cookie
   */
  public function clearCookie()
  {
    for($i = 0; $i <count($this->cookieName); $i++)
    {
      setcookie($this->cookieName[$i], '', time() - 3600);
    }
//    unlink($this->cookiePathFile);
    $this->success('清除cookie成功!');
  }
  /**
   * 带COOKIE的访问curl
   * @param $url 访问地址
   * @param bool|array $data 传递的数据
   * @param bool $redirect 是否获取重定向的地址
   * @return mixed 地址或者返回内容
   */
  public function _cookieRequest($url, $data = null, $redirect = false, $cookieNum = 0)
  {
    $ch = curl_init();
    $params[CURLOPT_URL] = $url;     //请求url地址
    $params[CURLOPT_HEADER] = false; //是否返回响应头信息
    $params[CURLOPT_RETURNTRANSFER] = true; //是否将结果返回
    $params[CURLOPT_FOLLOWLOCATION] = true; //是否重定向
    $params[CURLOPT_USERAGENT] = 'Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1';
    if($data)
    {
      $params[CURLOPT_POST] = true;
      $params[CURLOPT_POSTFIELDS] = http_build_query($data);
    }
    //判断是否有cookie,有的话直接使用
    if (!empty($_COOKIE[$this->cookieName[$cookieNum]]) && is_file($this->cookiePathFile[$cookieNum]))
    {
      $params[CURLOPT_COOKIEFILE] = $this->cookiePathFile[$cookieNum];   //这里判断cookie
    }
    else
    {
//      $cookie_jar = tempnam($cookie_path, 'cookie');        //产生一个cookie文件
      $params[CURLOPT_COOKIEJAR] = $this->cookiePathFile[$cookieNum];    //写入cookie信息
      setcookie($this->cookieName[$cookieNum], $this->cookiePathFile[$cookieNum], time() + 120);   //保存cookie路径
    }
    curl_setopt_array($ch, $params);                //传入curl参数
    $content = curl_exec($ch);
    $headers = curl_getinfo($ch);
//    echo $content;
    curl_close($ch);
    if ($url != $headers["url"] && $redirect == false)
 {
return $headers["url"];
 }
return $content;
 }
}

登录以后,就可以使用带cookie的访问其他页面了!

ps:php curl 登录淘宝

提交上去后显示为填写验证码,登录不上去

 填写验证码提交:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>    
   <meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
   <title></title>
  </head>
  <body>
  <iframe id='img' src="b.php" width="950" height="300" scrolling="No" frameborder="0"></iframe>
  <form action="tb.php" method="POST">
  <textarea name="vv" cols="50" rows="10">umto=&action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_635045544
70A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=¬_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424
A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxxx&need_check_code=&&TPL_checkcode=</textarea>
  <input type="submit" />
  </form>
 
  </body>
</html>
<?php
session_start();
if(empty($_SESSION['cookie_jar'])) exit();
$cookie_jar=$_SESSION['cookie_jar'];
$post_fields=$_POST["vv"];
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 1); 
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
echo $data;exit;
$ch = curl_init('http://www.taobao.com'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, TRUE); 
curl_exec($ch); 
curl_close($ch); 
?>

提取验证码

<?php
session_start();
$cookie_jar=tempnam("./temp/","cookie");
$_SESSION['cookie_jar']=$cookie_jar;
$post_fields = "action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_635045544
70A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=¬_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxx"; 
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
$data = curl_exec($ch); 
curl_close($ch); 
preg_match("/id=\"um_to\" name=\"umto\" value=\"(.*?)\"\/>/", $data, $arr); 
$post_fields = "umto=" . $arr[1] . "&" . $post_fields."&TPL_checkcode="; 
echo "<textarea cols=50 rows=10>" . $post_fields . "</textarea><br/>" ; 
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
curl_setopt($ch,CURLOPT_COOKIEJAR,$cookie_jar);
curl_setopt($ch,CURLOPT_COOKIEFILE,$cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
preg_match("/img id=\"J_StandardCode_m\" src=\"(.*?)\" data-src=/", $data, $arr1); 
echo "<img src=".$arr1[1]." />";
exit;
?>
PHP 相关文章推荐
php array_map array_multisort 高效处理多维数组排序
Jun 11 PHP
LotusPhp笔记之:Cookie组件的使用详解
May 06 PHP
PHP图片自动裁切应付不同尺寸的显示
Oct 16 PHP
php打造智能化的柱状图程序,用于报表等
Jun 19 PHP
php生成0~1随机小数的方法(必看)
Apr 05 PHP
YII2框架中excel表格导出的方法详解
Jul 21 PHP
PHP中PDO事务处理操作示例
May 02 PHP
PHP实现将多个文件压缩成zip格式并下载到本地的方法示例
May 23 PHP
PHP排序二叉树基本功能实现方法示例
May 26 PHP
PHP实现唤起微信支付功能
Feb 18 PHP
PHP设计模式之适配器模式(Adapter)原理与用法详解
Dec 12 PHP
PHP日期和时间函数的使用示例详解
Aug 06 PHP
PHP可变函数学习小结
Nov 29 #PHP
PHP可变变量学习小结
Nov 29 #PHP
PHP中对数组的一些常用的增、删、插操作函数总结
Nov 27 #PHP
详解PHP对数组的定义以及数组的创建方法
Nov 27 #PHP
实例简介PHP的一些高级面向对象编程的特性
Nov 27 #PHP
PHP编程中的__clone()方法使用详解
Nov 27 #PHP
PHP通过反射动态加载第三方类和获得类源码的实例
Nov 27 #PHP
You might like
海贼王动画变成“真人”后,凯多神还原,雷利太帅了!
2020/04/09 日漫
PHP通用检测函数集合
2006/11/25 PHP
PHP5中的时间相差8小时的解决办法
2008/03/28 PHP
浅谈PHP SHA1withRSA加密生成签名及验签
2019/03/18 PHP
些很实用且必用的小脚本代码
2006/06/26 Javascript
一个简单的Ext.XTemplate的实例代码
2012/03/18 Javascript
jQuery.extend()的实现方式详解及实例
2013/06/29 Javascript
jQuery阻止事件冒泡具体实现
2013/10/11 Javascript
在JavaScript中判断整型的N种方法示例介绍
2014/06/18 Javascript
基于JavaScript实现下拉列表左右移动代码
2017/02/07 Javascript
nodejs获取微信小程序带参数二维码实现代码
2017/04/12 NodeJs
js禁止表单重复提交
2017/08/29 Javascript
详解Node 定时器
2018/02/26 Javascript
vue elementUI tree树形控件获取父节点ID的实例
2018/09/12 Javascript
jquery 回调操作实例分析【回调成功与回调失败的情况】
2019/09/27 jQuery
Vue-drag-resize 拖拽缩放插件的使用(简单示例)
2019/12/04 Javascript
Angular+Ionic使用queryParams实现跳转页传值的方法
2020/09/05 Javascript
wxPython使用系统剪切板的方法
2015/06/16 Python
用Eclipse写python程序
2018/02/10 Python
Python pymongo模块用法示例
2018/03/31 Python
Python基于opencv的图像压缩算法实例分析
2018/05/03 Python
python3 requests中使用ip代理池随机生成ip的实例
2018/05/07 Python
Python3.6简单的操作Mysql数据库的三个实例
2018/10/17 Python
利用Python实现微信找房机器人实例教程
2019/03/10 Python
wxPython实现分隔窗口
2019/11/19 Python
python由已知数组快速生成新数组的方法
2020/04/08 Python
德国体育用品网上商店:SC24.com
2016/08/01 全球购物
水芝澳美国官网:H2O Plus
2016/10/15 全球购物
Marmot土拨鼠官网:美国专业户外运动品牌
2018/01/11 全球购物
乌克兰在线电子产品商店:MTA
2019/11/14 全球购物
下面代码从性能上考虑,有什么问题
2015/04/03 面试题
竞选演讲稿范文
2013/12/28 职场文书
汉语言文学毕业求职信
2014/07/17 职场文书
机关作风建设自查报告
2014/10/22 职场文书
教师个人发展总结
2015/02/11 职场文书
体检通知范文
2015/04/21 职场文书