PHP curl模拟登录带验证码的网站


Posted in PHP onNovember 30, 2015

需求是这样的,需要登录带验证码的网站,获取数据,但是不可能人为一直去记录数据,想通过自动采集的方式进行,如下是试验出来的结果代码!有需要的可以参考下!

<?php
namespace Home\Controller;
use Think\Controller;
class LoginController extends Controller
{
  protected $cookieName = array('cookie_verify', 'cookie_verify');
  protected $cookiePath = '/cookie/';
  protected $cookiePathFile = array();
  public function index()
  {
    $this->display();
  }
  public function _initialize(){
    foreach($this->cookieName as $key => $name)
    {
      $this->cookiePathFile[] = ROOT_PATH . $this->cookiePath . $this->cookieName[$key] . '_xxx.txt';
    }
  }
  /**
   * 登录xxx
   */
  public function xxxLogin()
  {
    $username = I('username');
    $password = I('password');
    $verifyCode = I('verify');
    $loginData = array(
      '__VIEWSTATE' => '/wEPDwUKMTU0MzAzOTU4NmQYAQUeX19Db250cm9sc1JlcXVpcmVQb3N0QmFja0tleV9fFgEFDExvZ2luX1N1Ym1pdL/yae69NsY163G3yuP0lxjz8oXu',              //不把参数补全可能会不被响应哦
      '__VIEWSTATEGENERATOR' => 'DC42DE27',
      'txt_UserName' => $username,
      'txt_PWD' => $password,
      'txt_VerifyCode' => $verifyCode,
      'SMONEY' => 'ABC',
      'Login_Submit.x' => '52',
      'Login_Submit.y' => '19',
    );
    $getBack = $this->_cookieRequest('http://xxx.com/noLogin.aspx', $loginData);
    if(preg_match('/<div[^\<div]*?id\s*=\s*[\'\"]{1}div_msg[\'\"]{1}.*?>(.*?)<\/div>/s', $getBack, $match)){
      echo 'matched\r\n';
      print_r($match);
    }else{
      echo $getBack, '<br />';
      $paramsFull = parse_url($getBack);
      parse_str($paramsFull['query'], $paramsFull['parsedQuery']);
      if(!empty($paramsFull['parsedQuery']['Warn'])) {
        $msg = "您好,欢迎来P,请先登录。";
        switch ($paramsFull['parsedQuery']['Warn'])
        {
          case '2':
            $msg = '您输入的验证码错误,请重试';
            break;
          case '3':
            $msg = '该帐号不存在,还没帐号?';
            break;
          case '5':
            $msg = '账户已注销';
            break;
          case '6':
            $msg = '密码错误,如果连续错误3次半小时内不能登录!';
            break;
          case '20':
            $msg = '今日密码错误3次及以上,请于半小时后再来登录!';
            break;
          case '21':
            $msg = '今日您所在IP的所有帐号密码错误9次以上,请于半小时后再来登录!';
            break;
          case '22':
            $msg = '登录失败,您所在IP今日登录的帐号过多!';
            break;
          case '23':
            $msg = '登录失败,验证码失效!';
            break;
          case '32':
            $msg = '该帐号已经绑定其他xx帐号!';
            break;
          case '33':
            $msg = '一台电脑一天只能注册一个帐号!';
            break;
        }
        $this->error($msg, '', 5);
      }else{
        $_SESSION['user_id'] = '123456';      //登录设置session
        $this->success('登录P网站成功', U('Index/index'), 5);
      }
    }
  }
  /**
   * 获取验证码
   */
  public function getVerifyCode()
  {
    $img = $this->_cookieRequest('http://xxx.com/VerifyCode_Login.aspx?id=' . rand(10000,999999), null, true, 1);
    echo $img;
  }
  /**
   * 删除cookie
   */
  public function clearCookie()
  {
    for($i = 0; $i <count($this->cookieName); $i++)
    {
      setcookie($this->cookieName[$i], '', time() - 3600);
    }
//    unlink($this->cookiePathFile);
    $this->success('清除cookie成功!');
  }
  /**
   * 带COOKIE的访问curl
   * @param $url 访问地址
   * @param bool|array $data 传递的数据
   * @param bool $redirect 是否获取重定向的地址
   * @return mixed 地址或者返回内容
   */
  public function _cookieRequest($url, $data = null, $redirect = false, $cookieNum = 0)
  {
    $ch = curl_init();
    $params[CURLOPT_URL] = $url;     //请求url地址
    $params[CURLOPT_HEADER] = false; //是否返回响应头信息
    $params[CURLOPT_RETURNTRANSFER] = true; //是否将结果返回
    $params[CURLOPT_FOLLOWLOCATION] = true; //是否重定向
    $params[CURLOPT_USERAGENT] = 'Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1';
    if($data)
    {
      $params[CURLOPT_POST] = true;
      $params[CURLOPT_POSTFIELDS] = http_build_query($data);
    }
    //判断是否有cookie,有的话直接使用
    if (!empty($_COOKIE[$this->cookieName[$cookieNum]]) && is_file($this->cookiePathFile[$cookieNum]))
    {
      $params[CURLOPT_COOKIEFILE] = $this->cookiePathFile[$cookieNum];   //这里判断cookie
    }
    else
    {
//      $cookie_jar = tempnam($cookie_path, 'cookie');        //产生一个cookie文件
      $params[CURLOPT_COOKIEJAR] = $this->cookiePathFile[$cookieNum];    //写入cookie信息
      setcookie($this->cookieName[$cookieNum], $this->cookiePathFile[$cookieNum], time() + 120);   //保存cookie路径
    }
    curl_setopt_array($ch, $params);                //传入curl参数
    $content = curl_exec($ch);
    $headers = curl_getinfo($ch);
//    echo $content;
    curl_close($ch);
    if ($url != $headers["url"] && $redirect == false)
 {
return $headers["url"];
 }
return $content;
 }
}

登录以后,就可以使用带cookie的访问其他页面了!

ps:php curl 登录淘宝

提交上去后显示为填写验证码,登录不上去

 填写验证码提交:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>    
   <meta http-equiv="Content-Type" content="text/html; charset=gb2312" />
   <title></title>
  </head>
  <body>
  <iframe id='img' src="b.php" width="950" height="300" scrolling="No" frameborder="0"></iframe>
  <form action="tb.php" method="POST">
  <textarea name="vv" cols="50" rows="10">umto=&action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_635045544
70A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=¬_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424
A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxxx&need_check_code=&&TPL_checkcode=</textarea>
  <input type="submit" />
  </form>
 
  </body>
</html>
<?php
session_start();
if(empty($_SESSION['cookie_jar'])) exit();
$cookie_jar=$_SESSION['cookie_jar'];
$post_fields=$_POST["vv"];
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 1); 
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
echo $data;exit;
$ch = curl_init('http://www.taobao.com'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0); 
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_jar); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, TRUE); 
curl_exec($ch); 
curl_close($ch); 
?>

提取验证码

<?php
session_start();
$cookie_jar=tempnam("./temp/","cookie");
$_SESSION['cookie_jar']=$cookie_jar;
$post_fields = "action=Authenticator&event_submit_do_login=anything&from=tb&fc=default&style=default&css_style=&tid=XOR_1_000000000000000000000000000000_635045544
70A7C717F750278&support=000001&CtrlVersion=1,0,0,7&loginType=3&minititle=&minipara=&pstrong=&llnick=&sign=&need_sign=&isIgnore=&full_redirect=&popid=&callback=&guf=¬_duplite_str=&need_user_id=&poy=XOR_1_000000000000000000000000000000_625A424A45137C6F7A7F0B786D08&gvfdcname=&gvfdcre=&from_encoding=&TPL_redirect_url=http:www.taobao.com&TPL_username=xxx&TPL_password=xxx"; 
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
$data = curl_exec($ch); 
curl_close($ch); 
preg_match("/id=\"um_to\" name=\"umto\" value=\"(.*?)\"\/>/", $data, $arr); 
$post_fields = "umto=" . $arr[1] . "&" . $post_fields."&TPL_checkcode="; 
echo "<textarea cols=50 rows=10>" . $post_fields . "</textarea><br/>" ; 
$ch = curl_init('https://login.taobao.com/member/login.jhtml'); 
curl_setopt($ch, CURLOPT_HEADER, 0); 
curl_setopt($ch, CURLOPT_USERAGENT, 
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; QQWubi 133; Embedded Web Browser from: http://bsalsa.com/; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Tablet PC 2.0; .NET4.0C; .NET4.0E; InfoPath.3; Media Center PC 6.0)"); 
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($ch, CURLOPT_POST, 1); 
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields); 
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); 
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2); 
curl_setopt($ch,CURLOPT_COOKIEJAR,$cookie_jar);
curl_setopt($ch,CURLOPT_COOKIEFILE,$cookie_jar); 
$data = curl_exec($ch); 
curl_close($ch);
preg_match("/img id=\"J_StandardCode_m\" src=\"(.*?)\" data-src=/", $data, $arr1); 
echo "<img src=".$arr1[1]." />";
exit;
?>
PHP 相关文章推荐
最小化数据传输――在客户端存储数据
Oct 09 PHP
用PHP函数解决SQL injection
Dec 09 PHP
PHP+MYSQL 出现乱码的解决方法
Aug 08 PHP
常用的php对象类型判断
Aug 27 PHP
php一个找二层目录的小东东
Aug 02 PHP
smarty模板中使用get、post、request、cookies、session变量的方法
Apr 24 PHP
一个比较不错的PHP日历类分享
Nov 18 PHP
php获取从html表单传递数组的方法
Mar 20 PHP
Yii数据读取与跳转参数传递用法实例分析
Jul 12 PHP
Zend Framework实现自定义过滤器的方法
Dec 09 PHP
php数据序列化测试实例详解
Aug 12 PHP
PHP中soap用法示例【SoapServer服务端与SoapClient客户端编写】
Dec 25 PHP
PHP可变函数学习小结
Nov 29 #PHP
PHP可变变量学习小结
Nov 29 #PHP
PHP中对数组的一些常用的增、删、插操作函数总结
Nov 27 #PHP
详解PHP对数组的定义以及数组的创建方法
Nov 27 #PHP
实例简介PHP的一些高级面向对象编程的特性
Nov 27 #PHP
PHP编程中的__clone()方法使用详解
Nov 27 #PHP
PHP通过反射动态加载第三方类和获得类源码的实例
Nov 27 #PHP
You might like
php 文件上传后缀名与文件类型对照表(几乎涵盖所有文件)
2010/05/16 PHP
PHP插入排序实现代码
2013/04/04 PHP
PHP采用自定义函数实现遍历目录下所有文件的方法
2014/08/19 PHP
php 删除cookie方法详解
2014/12/01 PHP
php微信公众号js-sdk开发应用
2016/11/28 PHP
ThinkPHP框架实现的邮箱激活功能示例
2018/06/15 PHP
菜鸟javascript基础整理1
2010/12/06 Javascript
Jquery操作下拉框(DropDownList)实现取值赋值
2013/08/13 Javascript
巧用jquery解决下拉菜单被Div遮挡的相关问题
2014/02/13 Javascript
判断某个字符在一个字符串中是否存在的js代码
2014/02/28 Javascript
浅谈javascript构造函数与实例化对象
2015/06/22 Javascript
jQuery操作Table技巧大汇总
2016/01/23 Javascript
谈一谈javascript中继承的多种方式
2016/02/19 Javascript
js+flash实现的5图变换效果广告代码(附演示与demo源码下载)
2016/04/01 Javascript
分享12个非常实用的JavaScript小技巧
2016/05/11 Javascript
javascript输出AscII码扩展集中的字符方法
2016/12/26 Javascript
jQuery读取本地的json文件(实例讲解)
2017/10/31 jQuery
JS代码实现电脑配置检测功能
2018/03/21 Javascript
微信小程序之多列表的显示和隐藏功能【附源码】
2018/08/06 Javascript
生产制造追溯系统之在线打印功能
2019/06/03 Javascript
小程序实现tab标签页
2020/11/16 Javascript
简单的Python2.7编程初学经验总结
2015/04/01 Python
使用Python来编写HTTP服务器的超级指南
2016/02/18 Python
Python 使用 docopt 解析json参数文件过程讲解
2019/08/13 Python
python实现打砖块游戏
2020/02/25 Python
pyCharm 实现关闭代码检查
2020/06/09 Python
python 基于UDP协议套接字通信的实现
2021/01/22 Python
HMV日本官网:全球知名的音乐、DVD和电脑游戏零售巨头
2016/08/13 全球购物
美国第二大连锁书店:Books-A-Million
2017/12/28 全球购物
CHARLES & KEITH英国官网:新加坡时尚品牌
2018/07/04 全球购物
Shopping happy life西班牙:以最优惠的价格提供最好的时尚配饰
2020/03/13 全球购物
临床医学系毕业生推荐信
2013/11/09 职场文书
物理课外活动总结
2014/08/27 职场文书
房屋所有权证明
2014/10/20 职场文书
2014年度安全工作总结
2014/12/04 职场文书
Css预编语言及区别详解
2021/04/25 HTML / CSS