PHP curl模拟登录带验证码的网站


Posted in PHP onNovember 30, 2015

需求是这样的,需要登录带验证码的网站,获取数据,但是不可能人为一直去记录数据,想通过自动采集的方式进行,如下是试验出来的结果代码!有需要的可以参考下!

<?php
namespace Home\Controller;
use Think\Controller;
class LoginController extends Controller
{
  protected $cookieName = array('cookie_verify', 'cookie_verify');
  protected $cookiePath = '/cookie/';
  protected $cookiePathFile = array();
  public function index()
  {
    $this->display();
  }
  public function _initialize(){
    foreach($this->cookieName as $key => $name)
    {
      $this->cookiePathFile[] = ROOT_PATH . $this->cookiePath . $this->cookieName[$key] . '_xxx.txt';
    }
  }
  /**
   * 登录xxx
   */
  public function xxxLogin()
  {
    $username = I('username');
    $password = I('password');
    $verifyCode = I('verify');
    $loginData = array(
      '__VIEWSTATE' => '/wEPDwUKMTU0MzAzOTU4NmQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFDExvZ2luX1N1Ym1pdL/yae69NsY163G3yuP0lxjz8oXu',              //不把参数补全可能会不被响应哦
      '__VIEWSTATEGENERATOR' => 'DC42DE27',
      'txt_UserName' => $username,
      'txt_PWD' => $password,
      'txt_VerifyCode' => $verifyCode,
      'SMONEY' => 'ABC',
      'Login_Submit.x' => '52',
      'Login_Submit.y' => '19',
    );
    $getBack = $this->_cookieRequest('http://xxx.com/noLogin.aspx', $loginData);
    if(preg_match('/<div[^\<div]*?id\s*=\s*[\'\"]{1}div_msg[\'\"]{1}.*?>(.*?)<\/div>/s', $getBack, $match)){
      echo 'matched\r\n';
      print_r($match);
    }else{
      echo $getBack, '<br />';
      $paramsFull = parse_url($getBack);
      parse_str($paramsFull['query'], $paramsFull['parsedQuery']);
      if(!empty($paramsFull['parsedQuery']['Warn'])) {
        $msg = "您好,欢迎来P,请先登录。";
        switch ($paramsFull['parsedQuery']['Warn'])
        {
          case '2':
            $msg = '您输入的验证码错误,请重试';
            break;
          case '3':
            $msg = '该帐号不存在,还没帐号?';
            break;
          case '5':
            $msg = '账户已注销';
            break;
          case '6':
            $msg = '密码错误,如果连续错误3次半小时内不能登录!';
            break;
          case '20':
            $msg = '今日密码错误3次及以上,请于半小时后再来登录!';
            break;
          case '21':
            $msg = '今日您所在IP的所有帐号密码错误9次以上,请于半小时后再来登录!';
            break;
          case '22':
            $msg = '登录失败,您所在IP今日登录的帐号过多!';
            break;
          case '23':
            $msg = '登录失败,验证码失效!';
            break;
          case '32':
            $msg = '该帐号已经绑定其他xx帐号!';
            break;
          case '33':
            $msg = '一台电脑一天只能注册一个帐号!';
            break;
        }
        $this->error($msg, '', 5);
      }else{
        $_SESSION['user_id'] = '123456';      //登录设置session
        $this->success('登录P网站成功', U('Index/index'), 5);
      }
    }
  }
  /**
   * 获取验证码
   */
  public function getVerifyCode()
  {
    $img = $this->_cookieRequest('http://xxx.com/VerifyCode_Login.aspx?id=' . rand(10000,999999), null, true, 1);
    echo $img;
  }
  /**
   * 删除cookie
   */
  public function clearCookie()
  {
    for($i = 0; $i <count($this->cookieName); $i++)
    {
      setcookie($this->cookieName[$i], '', time() - 3600);
    }
//    unlink($this->cookiePathFile);
    $this->success('清除cookie成功!');
  }
  /**
   * 带COOKIE的访问curl
   * @param $url 访问地址
   * @param bool|array $data 传递的数据
   * @param bool $redirect 是否获取重定向的地址
   * @return mixed 地址或者返回内容
   */
  public function _cookieRequest($url, $data = null, $redirect = false, $cookieNum = 0)
  {
    $ch = curl_init();
    $params[CURLOPT_URL] = $url;     //请求url地址
    $params[CURLOPT_HEADER] = false; //是否返回响应头信息
    $params[CURLOPT_RETURNTRANSFER] = true; //是否将结果返回
    $params[CURLOPT_FOLLOWLOCATION] = true; //是否重定向
    $params[CURLOPT_USERAGENT] = 'Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1';
    if($data)
    {
      $params[CURLOPT_POST] = true;
      $params[CURLOPT_POSTFIELDS] = http_build_query($data);
    }
    //判断是否有cookie,有的话直接使用
    if (!empty($_COOKIE[$this->cookieName[$cookieNum]]) && is_file($this->cookiePathFile[$cookieNum]))
    {
      $params[CURLOPT_COOKIEFILE] = $this->cookiePathFile[$cookieNum];   //这里判断cookie
    }
    else
    {
//      $cookie_jar = tempnam($cookie_path, 'cookie');        //产生一个cookie文件
      $params[CURLOPT_COOKIEJAR] = $this->cookiePathFile[$cookieNum];    //写入cookie信息
      setcookie($this->cookieName[$cookieNum], $this->cookiePathFile[$cookieNum], time() + 120);   //保存cookie路径
    }
    curl_setopt_array($ch, $params);                //传入curl参数
    $content = curl_exec($ch);
    $headers = curl_getinfo($ch);
//    echo $content;
    curl_close($ch);
    if ($url != $headers["url"] && $redirect == false)
 {
return $headers["url"];
 }
return $content;
 }
}

登录以后,就可以使用带cookie的访问其他页面了!

ps:php curl 登录淘宝

提交上去后显示为填写验证码,登录不上去

 填写验证码提交:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>    
   <meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
   <title></title>
  </head>
  <body>
  <iframe id='img' src="b.php" width="950" height="300" scrolling="No" frameborder="0"></iframe>
  <form action="tb.php" method="POST">
  <textarea name="vv" cols="50" rows="10">umto=&action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_635045544
70A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=¬_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424
A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxxx&need_check_code=&&TPL_checkcode=</textarea>
  <input type="submit" />
  </form>
 
  </body>
</html>
<?php
session_start();
if(empty($_SESSION['cookie_jar'])) exit();
$cookie_jar=$_SESSION['cookie_jar'];
$post_fields=$_POST["vv"];
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 1); 
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
echo $data;exit;
$ch = curl_init('http://www.taobao.com'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, TRUE); 
curl_exec($ch); 
curl_close($ch); 
?>

提取验证码

<?php
session_start();
$cookie_jar=tempnam("./temp/","cookie");
$_SESSION['cookie_jar']=$cookie_jar;
$post_fields = "action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_635045544
70A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=¬_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxx"; 
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
$data = curl_exec($ch); 
curl_close($ch); 
preg_match("/id=\"um_to\" name=\"umto\" value=\"(.*?)\"\/>/", $data, $arr); 
$post_fields = "umto=" . $arr[1] . "&" . $post_fields."&TPL_checkcode="; 
echo "<textarea cols=50 rows=10>" . $post_fields . "</textarea><br/>" ; 
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
curl_setopt($ch,CURLOPT_COOKIEJAR,$cookie_jar);
curl_setopt($ch,CURLOPT_COOKIEFILE,$cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
preg_match("/img id=\"J_StandardCode_m\" src=\"(.*?)\" data-src=/", $data, $arr1); 
echo "<img src=".$arr1[1]." />";
exit;
?>
PHP 相关文章推荐
PHP5中使用PDO连接数据库的方法
Aug 01 PHP
浅析HTTP消息头网页缓存控制以及header常用指令介绍
Jun 28 PHP
php 检查电子邮件函数(自写)
Jan 16 PHP
php创建sprite
Feb 11 PHP
smarty中post用法实例
Nov 28 PHP
php生成随机颜色方法汇总
Dec 03 PHP
WordPress主题制作中自定义头部的相关PHP函数解析
Jan 08 PHP
php操纵mysqli数据库的实现方法
Sep 18 PHP
完美解决在ThinkPHP控制器中命名空间的问题
May 05 PHP
windows下的WAMP环境搭建图文教程(推荐)
Jul 27 PHP
PHP实现Snowflake生成分布式唯一ID的方法示例
Aug 30 PHP
PHP之header函数详解
Mar 02 PHP
PHP可变函数学习小结
Nov 29 #PHP
PHP可变变量学习小结
Nov 29 #PHP
PHP中对数组的一些常用的增、删、插操作函数总结
Nov 27 #PHP
详解PHP对数组的定义以及数组的创建方法
Nov 27 #PHP
实例简介PHP的一些高级面向对象编程的特性
Nov 27 #PHP
PHP编程中的__clone()方法使用详解
Nov 27 #PHP
PHP通过反射动态加载第三方类和获得类源码的实例
Nov 27 #PHP
You might like
php小偷相关截取函数备忘
2010/11/28 PHP
有关于PHP中常见数据类型的汇总分享
2014/01/06 PHP
Zend Framework教程之Zend_Form组件实现表单提交并显示错误提示的方法
2016/03/21 PHP
关于PHP文件的自动运行方法分析
2016/05/13 PHP
CI框架(CodeIgniter)实现的导入、导出数据操作示例
2018/05/24 PHP
javascript Object与Function使用
2010/01/11 Javascript
跨浏览器通用、可重用的选项卡tab切换js代码
2011/09/20 Javascript
Dom 学习总结以及实例的使用介绍
2013/04/24 Javascript
当鼠标移动到图片上时跟随鼠标显示放大的图片效果
2013/06/06 Javascript
js判断选择的时间是否大于今天的代码
2013/08/20 Javascript
jquery获取对象的方法足以应付常见的各种类型的对象
2014/05/14 Javascript
TypeScript具有的几个不同特质
2015/04/07 Javascript
vue.js学习之UI组件开发教程
2017/07/03 Javascript
jQuery remove()过滤被删除的元素(推荐)
2017/07/18 jQuery
Mui使用jquery并且使用点击跳转新窗口的实例
2017/08/19 jQuery
详解Webpack loader 之 file-loader
2018/11/07 Javascript
微信小程序实现渐入渐出动画效果
2019/06/13 Javascript
Vue 实现前端权限控制的示例代码
2019/07/09 Javascript
vue-next/runtime-core 源码阅读指南详解
2019/10/25 Javascript
微信小程序页面渲染实现方法
2019/11/06 Javascript
分析在Python中何种情况下需要使用断言
2015/04/01 Python
Python中的字符串类型基本知识学习教程
2016/02/04 Python
Python中最大最小赋值小技巧(分享)
2017/12/23 Python
Python3.6笔记之将程序运行结果输出到文件的方法
2018/04/22 Python
python linecache 处理固定格式文本数据的方法
2019/01/08 Python
利用python实现.dcm格式图像转为.jpg格式
2020/01/13 Python
python中upper是做什么用的
2020/07/20 Python
利用css3径向渐变做一张优惠券的示例
2018/03/22 HTML / CSS
财务主管的岗位职责
2013/12/30 职场文书
《一个中国孩子的呼声》教学反思
2014/02/12 职场文书
道歉的话怎么说
2015/05/12 职场文书
2016年寒假生活小结
2015/10/10 职场文书
2016关于军训的心得体会
2016/01/11 职场文书
导游词之昭君岛
2020/01/17 职场文书
Python还能这么玩之用Python做个小游戏的外挂
2021/06/04 Python
MySQL创建管理LIST分区
2022/04/13 MySQL