Simulating a login to a captcha-protected website with PHP cURL


Posted in PHP on November 30, 2015

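The requirement was this: I needed to log in to a website protected by a captcha and pull data from it, but recording that data by hand indefinitely was not realistic, so I wanted to collect it automatically. The code below is what came out of my experiments. Feel free to use it as a reference if you need something similar.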

<?php
namespace Home\Controller;
use Think\Controller;
class LoginController extends Controller
{
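  // Both entries currently share one name, so index 0 (login) and index 1 (captcha) use the same cookie file
  protected $cookieName = array('cookie_verify', 'cookie_verify');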
  protected $cookiePath = '/cookie/';
  protected $cookiePathFile = array();
  public function index()
  {
    $this->display();
  }
  public function _initialize(){
    foreach($this->cookieName as $key => $name)
    {
      $this->cookiePathFile[] = ROOT_PATH . $this->cookiePath . $this->cookieName[$key] . '_xxx.txt';
    }
  }
  /**
   * Log in to xxx
   */
  public function xxxLogin()
  {
    $username = I('username');
    $password = I('password');
    $verifyCode = I('verify');
    $loginData = array(
      '__VIEWSTATE' => '/wEPDwUKMTU0MzAzOTU4NmQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFDExvZ2luX1N1Ym1pdL/yae69NsY163G3yuP0lxjz8oXu',  // the server may not respond unless every form parameter is supplied
      '__VIEWSTATEGENERATOR' => 'DC42DE27',
      'txt_UserName' => $username,
      'txt_PWD' => $password,
      'txt_VerifyCode' => $verifyCode,
      'SMONEY' => 'ABC',
      'Login_Submit.x' => '52',
      'Login_Submit.y' => '19',
    );
    $getBack = $this->_cookieRequest('http://xxx.com/noLogin.aspx', $loginData);
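    if(preg_match('/<div[^>]*?\bid\s*=\s*[\'"]div_msg[\'"][^>]*>(.*?)<\/div>/s', $getBack, $match)){
      echo "matched\r\n";
      print_r($match);
    }else{
      echo $getBack, '<br />';
      $paramsFull = parse_url($getBack);
      if(!is_array($paramsFull)) {
        $paramsFull = array();   // parse_url() returns false for a malformed URL
      }
      $query = isset($paramsFull['query']) ? $paramsFull['query'] : '';
      parse_str($query, $paramsFull['parsedQuery']);
      if(!empty($paramsFull['parsedQuery']['Warn'])) {
        $msg = "Hello, welcome to P. Please log in first.";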
        switch ($paramsFull['parsedQuery']['Warn'])
        {
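          case '2':
            $msg = 'The verification code you entered is incorrect, please try again';
            break;
          case '3':
            $msg = 'This account does not exist. No account yet?';
            break;
          case '5':
            $msg = 'This account has been closed';
            break;
          case '6':
            $msg = 'Wrong password. After 3 consecutive failures you cannot log in for half an hour!';
            break;
          case '20':
            $msg = 'The password was wrong 3 or more times today. Please try again in half an hour!';
            break;
          case '21':
            $msg = 'Accounts on your IP have entered wrong passwords more than 9 times today. Please try again in half an hour!';
            break;
          case '22':
            $msg = 'Login failed: too many accounts have logged in from your IP today!';
            break;
          case '23':
            $msg = 'Login failed: the verification code has expired!';
            break;
          case '32':
            $msg = 'This account is already bound to another xx account!';
            break;
          case '33':
            $msg = 'Only one account can be registered per computer per day!';
            break;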
        }
        $this->error($msg, '', 5);
      }else{
        $_SESSION['user_id'] = '123456';      // set the session on login
        $this->success('Logged in to site P successfully', U('Index/index'), 5);
      }
    }
  }
  /**
   * Fetch the captcha image
   */
  public function getVerifyCode()
  {
    $img = $this->_cookieRequest('http://xxx.com/VerifyCode_Login.aspx?id=' . rand(10000,999999), null, true, 1);
    echo $img;
  }
  /**
   * Delete the cookies
   */
  public function clearCookie()
  {
    for($i = 0; $i < count($this->cookieName); $i++)
    {
      setcookie($this->cookieName[$i], '', time() - 3600);
    }
//    unlink($this->cookiePathFile);
    $this->success('Cookies cleared successfully!');
  }
  /**
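   * cURL request that carries cookies
   * @param string $url URL to request
   * @param bool|array $data data to POST (a GET request is sent when empty)
   * @param bool $redirect when false and the request was redirected, the final URL is returned instead of the body
   * @param int $cookieNum index into $cookieName / $cookiePathFile
   * @return mixed the final URL or the response body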
   */
  public function _cookieRequest($url, $data = null, $redirect = false, $cookieNum = 0)
  {
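    $ch = curl_init();
    $params = array();
    $params[CURLOPT_URL] = $url;     // request URL
    $params[CURLOPT_HEADER] = false; // whether to include the response headers in the output
    $params[CURLOPT_RETURNTRANSFER] = true; // return the result instead of printing it
    $params[CURLOPT_FOLLOWLOCATION] = true; // follow redirects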
    $params[CURLOPT_USERAGENT] = 'Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1';
    if($data)
    {
      $params[CURLOPT_POST] = true;
      $params[CURLOPT_POSTFIELDS] = http_build_query($data);
    }
    // If the marker cookie is still set and the cookie file exists, reuse the saved cookies
    if (!empty($_COOKIE[$this->cookieName[$cookieNum]]) && is_file($this->cookiePathFile[$cookieNum]))
    {
      $params[CURLOPT_COOKIEFILE] = $this->cookiePathFile[$cookieNum];   // send the saved cookies
    }
    else
    {
//      $cookie_jar = tempnam($cookie_path, 'cookie');        // create a cookie file
      setcookie($this->cookieName[$cookieNum], $this->cookiePathFile[$cookieNum], time() + 120);   // remember the cookie file path for 120 seconds
    }
    // Always write cookies back, so cookies set by the login response are kept for later requests
    $params[CURLOPT_COOKIEJAR] = $this->cookiePathFile[$cookieNum];
    curl_setopt_array($ch, $params);                // apply the cURL options
    $content = curl_exec($ch);
    $headers = curl_getinfo($ch);
//    echo $content;
    curl_close($ch);
    if ($url != $headers["url"] && $redirect == false)
    {
      return $headers["url"];
    }
    return $content;
  }
}
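
The redirect handling in xxxLogin() above boils down to reading a "Warn" code from the query string of the URL the site redirects to. A minimal sketch of that step on its own, assuming the target site signals errors this way; the function name extractWarnCode is hypothetical and not part of the original controller:

```php
<?php
/**
 * Return the "Warn" code from a redirect URL, or null when it is absent or empty.
 * Hypothetical helper for illustration only.
 */
function extractWarnCode($redirectUrl)
{
    // PHP_URL_QUERY makes parse_url() return just the query string (null if there is none)
    $query = parse_url($redirectUrl, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string, or a malformed URL
    }
    parse_str($query, $params);
    if (isset($params['Warn']) && $params['Warn'] !== '') {
        return $params['Warn'];
    }
    return null;
}
```

The returned string can be fed straight into the switch statement that maps codes to messages, and a null result can be treated as a successful login.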

Once logged in, you can request other pages with the saved cookies!
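
As a rough sketch of such a follow-up request, the helper below builds cURL options that both send and update the cookie file written during login. The function names and the example URL are assumptions for illustration; they do not appear in the controller above.

```php
<?php
/**
 * Build cURL options for a GET request that reuses a saved cookie file.
 * Hypothetical helper, not part of the original controller.
 */
function buildCookieGetOptions($url, $cookieFile)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_COOKIEFILE     => $cookieFile, // send the cookies saved at login
        CURLOPT_COOKIEJAR      => $cookieFile, // write back any cookies the server updates
    );
}

/**
 * Fetch a page with the saved cookies; returns the body, or false on failure.
 */
function fetchWithCookies($url, $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, buildCookieGetOptions($url, $cookieFile));
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}
```

A call might look like `fetchWithCookies('http://xxx.com/data.aspx', $cookieFile)`, where `data.aspx` is a made-up page name and `$cookieFile` is the same path the login request used.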

PS: logging in to Taobao with PHP cURL

After the form is submitted, the site asks for the captcha and the login does not go through.

Fill in the captcha and submit:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>    
   <meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
   <title></title>
  </head>
  <body>
  <iframe id='img' src="b.php" width="950" height="300" scrolling="No" frameborder="0"></iframe>
  <form action="tb.php" method="POST">
  <textarea name="vv" cols="50" rows="10">umto=&action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_63504554470A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=&amp;not_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxxx&need_check_code=&&TPL_checkcode=</textarea>
  <input type="submit" />
  </form>
 
  </body>
</html>
<?php
session_start();
if(empty($_SESSION['cookie_jar'])) exit();
$cookie_jar=$_SESSION['cookie_jar'];
$post_fields=$_POST["vv"];
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
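curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);   // support for the value 1 was removed from libcurl; use 2
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar);   // send the cookies saved by the captcha request
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
echo $data;exit;   // debugging: print the login response and stop, so the request below never runs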
$ch = curl_init('http://www.taobao.com'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, TRUE); 
curl_exec($ch); 
curl_close($ch); 
?>

Extracting the captcha

<?php
session_start();
$cookie_jar=tempnam("./temp/","cookie");
$_SESSION['cookie_jar']=$cookie_jar;
$post_fields = "action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_63504554470A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=&not_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxx"; 
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
$data = curl_exec($ch); 
curl_close($ch); 
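if (!preg_match("/id=\"um_to\" name=\"umto\" value=\"(.*?)\"\/>/", $data, $arr)) {
    exit('umto field not found in the login page');
}
$post_fields = "umto=" . $arr[1] . "&" . $post_fields."&TPL_checkcode="; 
// escape the string so the browser does not turn "&not" into "¬" inside the textarea
echo "<textarea cols=50 rows=10>" . htmlspecialchars($post_fields) . "</textarea><br/>" ; 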
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
curl_setopt($ch,CURLOPT_COOKIEJAR,$cookie_jar);
curl_setopt($ch,CURLOPT_COOKIEFILE,$cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
if (preg_match("/img id=\"J_StandardCode_m\" src=\"(.*?)\" data-src=/", $data, $arr1)) {
    echo "<img src=\"" . $arr1[1] . "\" />";
} else {
    echo "captcha image not found in the response";
}
exit;
?>
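
The two preg_match() calls in the script above can be wrapped so that a missing field yields null instead of an undefined-index notice. This is only a sketch: the helper names are hypothetical, and the patterns assume the page markup still matches the attributes used in the original regular expressions.

```php
<?php
/**
 * Return the first capture group of $pattern in $html, or null if there is no match.
 * Hypothetical helper for illustration only.
 */
function extractFirstGroup($pattern, $html)
{
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

/** Value of the hidden "umto" input, assuming the attribute order id, name, value. */
function extractUmto($html)
{
    return extractFirstGroup('/id="um_to"\s+name="umto"\s+value="(.*?)"/', $html);
}

/** src of the captcha image, assuming it carries the id J_StandardCode_m. */
function extractCaptchaUrl($html)
{
    return extractFirstGroup('/<img\s+id="J_StandardCode_m"\s+src="(.*?)"/', $html);
}
```

Checking both results for null before building the POST string or the img tag makes it obvious when the page layout has changed and the patterns need updating.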