Python Script for Daily Migration of MySQL Historical Data to an Archive Database


Posted in Python on April 13, 2018

This article shares a Python script that migrates historical MySQL data to an archive database once a day. The full code is given below for reference.

#!/usr/bin/env python 
# coding:utf-8 
__author__ = 'John' 
 

import MySQLdb 
import sys 
import datetime 
import time 
 
class ClassMigrate(object): 
  def _get_argv(self): 
    self.usage = """ 
      usage(): 
      python daily_migration.py --source=192.168.1.4:3306/db_name:tab_name/proxy/password \\ 
                    --dest=192.168.1.150:13301/db_name_archive:tab_name_201601/proxy/password \\ 
                    --delete_strategy=delete --primary_key=auto_id --date_col=ut --time_interval=180 
      """ 
    if len(sys.argv) == 1: 
      print(self.usage)
      sys.exit(1)
    elif sys.argv[1] == '--help' or sys.argv[1] == '-h':
      print(self.usage)
      sys.exit()
    elif len(sys.argv) > 2: 
      for i in sys.argv[1:]: 
        _argv = i.split('=', 1) # split on the first '=' only, so a password containing '=' stays intact
        if _argv[0] == '--source': 
          _list = _argv[1].split('/') 
          self.source_host = _list[0].split(':')[0] 
          self.source_port = int(_list[0].split(':')[1]) 
          self.source_db = _list[1].split(':')[0] 
          self.source_tab = _list[1].split(':')[1] 
          self.source_user = _list[2] 
          self.source_password = _list[3] 
        elif _argv[0] == '--dest': 
          _list = _argv[1].split('/') 
          self.dest_host = _list[0].split(':')[0] 
          self.dest_port = int(_list[0].split(':')[1]) 
          self.dest_db = _list[1].split(':')[0] 
          self.dest_tab = _list[1].split(':')[1] 
          self.dest_user = _list[2] 
          self.dest_password = _list[3] 
        elif _argv[0] == '--delete_strategy': 
          self.deleteStrategy = _argv[1] 
          if self.deleteStrategy not in ('delete', 'drop'): 
            print (self.usage) 
            sys.exit(1) 
        elif _argv[0] == '--primary_key': 
          self.pk = _argv[1] 
        elif _argv[0] == '--date_col': 
          self.date_col = _argv[1] 
        elif _argv[0] == '--time_interval': 
          self.interval = int(_argv[1]) # number of days; int() rejects non-numeric input before it reaches the SQL text
        else: 
          print (self.usage) 
          sys.exit(1) 
 
  def __init__(self): 
    self._get_argv() 
## -------------------------------------------------------------------- 
    self.sourcedb_conn_str = MySQLdb.connect(host=self.source_host, port=self.source_port, user=self.source_user, passwd=self.source_password, db=self.source_db, charset='utf8') 
    self.sourcedb_conn_str.autocommit(True) 
    self.destdb_conn_str = MySQLdb.connect(host=self.dest_host, port=self.dest_port, user=self.dest_user, passwd=self.dest_password, db=self.dest_db, charset='utf8') 
    self.destdb_conn_str.autocommit(True) 
## -------------------------------------------------------------------- 
    self.template_tab = self.source_tab + '_template' 
    self.step_size = 20000 
## -------------------------------------------------------------------- 
    self._migCompleteState = False 
    self._deleteCompleteState = False 
## -------------------------------------------------------------------- 
    self.source_cnt = '' 
    self.source_min_id = '' 
    self.source_max_id = '' 
    self.source_checksum = '' 
    self.dest_cnt = ''
## -------------------------------------------------------------------- 
    self.today = time.strftime("%Y-%m-%d") 
    # self.today = '2016-05-30 09:59:40' 

  def sourcedb_query(self, sql, sql_type):
    try: 
      cr = self.sourcedb_conn_str.cursor() 
      cr.execute(sql) 
      if sql_type == 'select': 
        return cr.fetchall() 
      elif sql_type == 'dml': 
        rows = self.sourcedb_conn_str.affected_rows() 
        return rows 
      else: 
        return True 
    except Exception as e:
      print (str(e) + "<br>") 
      return False 
    finally: 
      cr.close() 

  def destdb_query(self, sql, sql_type, values=''): 
    try: 
      cr = self.destdb_conn_str.cursor() 
      if sql_type == 'select': 
        cr.execute(sql) 
        return cr.fetchall() 
      elif sql_type == 'insertmany': 
        cr.executemany(sql, values) 
        rows = self.destdb_conn_str.affected_rows() 
        return rows 
      else: 
        cr.execute(sql) 
        return True 
    except Exception as e:
      print (str(e) + "<br>") 
      return False 
    finally: 
      cr.close() 
 
  def create_table_from_source(self):
    '''Because the tab_name data must go into an ARCHIVE-engine table, this approach is not suitable here. Reserved for other uses.'''
    try: 
      sql = "show create table %s;" % self.source_tab 
      create_str = self.sourcedb_query(sql, 'select')[0][1] 
      create_str = create_str.replace('CREATE TABLE', 'CREATE TABLE IF NOT EXISTS') 
      self.destdb_query(create_str, 'ddl') 
      return True 
    except Exception as e:
      print (str(e) + "<br>") 
      return False 

  def create_table_from_template(self): 
    try: 
      sql = 'CREATE TABLE IF NOT EXISTS %s like %s;' % (self.dest_tab, self.template_tab) 
      state = self.destdb_query(sql, 'ddl') 
      if state: 
        return True 
      else: 
        return False 
    except Exception as e:
      print (str(e) + "<br>")
      return False 

  def get_min_max(self):
    """Get the total row count, minimum id and maximum id of the source rows to migrate."""
    try:
      print("\nStarting Migrate at -- %s <br>" % datetime.datetime.now())
      sql = """select count(*),IFNULL(min(%s),-1),IFNULL(max(%s),-1) from %s where %s >= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 00:00:00')
           and %s <= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 23:59:59') """ % (
             self.pk, self.pk, self.source_tab, self.date_col, self.today, self.interval, self.date_col, self.today, self.interval)
      q = self.sourcedb_query(sql, 'select') 
      self.source_cnt = q[0][0] 
      self.source_min_id = q[0][1] 
      self.source_max_id = q[0][2] 
      self.source_checksum = str(self.source_cnt) + '_' + str(self.source_min_id) + '_' + str(self.source_max_id) 
      if self.source_cnt == 0 or self.source_min_id == -1 or self.source_max_id == -1: 
        print ("There is 0 record in source table been matched! <br>") 
        return False 
      else: 
        return True 
    except Exception as e:
      print (str(e) + "<br>") 
      return False 
 
  def migrate_2_destdb(self): 
    try: 
      get_min_max_id = self.get_min_max() 
      if get_min_max_id: 
        k = self.source_min_id 
        desc_sql = "desc %s;" % self.source_tab 
        # self.filed = [] 
        cols = self.sourcedb_query(desc_sql, 'select') 
        # for j in cols: 
        #   self.filed.append(j[0]) 
        fileds = "%s," * len(cols) # one %s placeholder per source column, spliced into the insert statement
        fileds = fileds.rstrip(',')
        while k <= self.source_max_id:
          sql = """select * from %s where %s >= %d and %s< %d
               and %s >= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 00:00:00')
               and %s <= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 23:59:59') """ % (
                 self.source_tab, self.pk, k, self.pk, k+self.step_size, self.date_col, self.today, self.interval, self.date_col, self.today, self.interval)
          print("\n%s <br>" % sql)
          starttime = datetime.datetime.now()
          results = self.sourcedb_query(sql, 'select')
          insert_sql = "insert into " + self.dest_tab + " values (%s)" % fileds
          rows = self.destdb_query(insert_sql, 'insertmany', results)
          if rows is False: # a row count of 0 is not a failure, so test for False explicitly
            print("Insert failed!! <br>")
          else:
            print("Inserted %s rows. <br>" % rows)
          endtime = datetime.datetime.now()
          timeinterval = endtime - starttime
          print("Elapsed : %.6f seconds <br>" % timeinterval.total_seconds())
          k += self.step_size
        print("\nInsert complete at -- %s <br>" % datetime.datetime.now())
        return True
      else:
        return False
    except Exception as e:
      print (str(e) + "<br>") 
      return False 
  
  def verify_total_cnt(self): 
    try: 
      sql = """select count(*),IFNULL(min(%s),-1),IFNULL(max(%s),-1) from %s where %s >= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 00:00:00')
           and %s <= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 23:59:59') """ % (
             self.pk, self.pk, self.dest_tab, self.date_col, self.today, self.interval, self.date_col, self.today, self.interval)
      dest_result = self.destdb_query(sql, 'select')
      self.dest_cnt = dest_result[0][0]
      dest_checksum = str(self.dest_cnt) + '_' + str(dest_result[0][1]) + '_' + str(dest_result[0][2])
      print("source_checksum: %s, dest_checksum: %s <br>" % (self.source_checksum, dest_checksum))
      if self.source_cnt == dest_result[0][0] and dest_result[0][0] != 0 and self.source_checksum == dest_checksum:
        self._migCompleteState = True
        print ("Verify successfully !!<br>")
      else:
        print ("Verify failed !!<br>")
        sys.exit(77)
    except Exception as e:
      print (str(e) + "<br>") 
  
  def drop_daily_partition(self): 
    try: 
      if self._migCompleteState: 
        sql = """explain partitions select * from %s where %s >= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 00:00:00')
               and %s <= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 23:59:59') """ % (
                 self.source_tab, self.date_col, self.today, self.interval, self.date_col, self.today, self.interval)
        partition_name = self.sourcedb_query(sql, 'select')
        partition_name = partition_name[0][3]

        sql = """select count(*),IFNULL(min(%s),-1),IFNULL(max(%s),-1) from %s partition (%s)""" % (
              self.pk, self.pk, self.source_tab, partition_name)
        q = self.sourcedb_query(sql, 'select') 
        source_cnt = q[0][0] 
        source_min_id = q[0][1] 
        source_max_id = q[0][2] 
        checksum = str(source_cnt) + '_' + str(source_min_id) + '_' + str(source_max_id) 
        if source_cnt == 0 or source_min_id == -1 or source_max_id == -1: 
          print ("There is 0 record in source PARTITION been matched! <br>") 
        else: 
          if checksum == self.source_checksum: 
            drop_par_sql = "alter table %s drop partition %s;" % (self.source_tab, partition_name) 
            droped = self.sourcedb_query(drop_par_sql, 'ddl') 
            if droped: 
              print (drop_par_sql + " <br>") 
              print("\nDrop partition complete at -- %s <br>" % datetime.datetime.now())
              self._deleteCompleteState = True
            else:
              print (drop_par_sql + " <br>")
              print ("Drop partition failed.. <br>")
          else:
            print("The partition %s checksum failed !! Drop failed !!" % partition_name)
            sys.exit(77)
    except Exception as e:
      print (str(e) + "<br>") 
 
  def delete_data(self): 
    try: 
      if self._migCompleteState: 
        k = self.source_min_id 
        while k <= self.source_max_id: 
          sql = """delete from %s where %s >= %d and %s< %d
               and %s >= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 00:00:00')
               and %s <= CONCAT(DATE_FORMAT(DATE_ADD('%s', INTERVAL -%s day),'%%Y-%%m-%%d'), ' 23:59:59') """ % (
                 self.source_tab, self.pk, k, self.pk, k+self.step_size, self.date_col, self.today, self.interval, self.date_col, self.today, self.interval)
          print("\n%s <br>" % sql)
          starttime = datetime.datetime.now()
          rows = self.sourcedb_query(sql, 'dml')
          if rows is False: # a chunk that deletes 0 rows is not a failure
            print("Delete failed!! <br>")
          else:
            print("Deleted %s rows. <br>" % rows)
          endtime = datetime.datetime.now()
          timeinterval = endtime - starttime
          print("Elapsed : %.6f seconds <br>" % timeinterval.total_seconds())
          time.sleep(1)
          k += self.step_size
        print("\nDelete complete at -- %s <br>" % datetime.datetime.now())
        self._deleteCompleteState = True
    except Exception as e:
      print (str(e) + "<br>") 
       
  def do(self): 
    tab_create = self.create_table_from_template() 
    if tab_create: 
      migration = self.migrate_2_destdb() 
      if migration: 
        self.verify_total_cnt() 
        if self._migCompleteState: 
          if self.deleteStrategy == 'drop': 
            self.drop_daily_partition() 
          else: 
            self.delete_data() 
          print ("\n<br>") 
          print ("====="*5 + '<br>') 
          print("source_total_cnt: %s <br>" % self.source_cnt)
          print("dest_total_cnt: %s <br>" % self.dest_cnt)
          print ("====="*5 + '<br>') 
          if self._deleteCompleteState: 
            print ("\nFinal result: Successfully !! <br>") 
            sys.exit(88) 
          else: 
            print ("\nFinal result: Failed !! <br>") 
            sys.exit(254) 
    else: 
      print ("Create table failed ! Exiting. . .") 
      sys.exit(255) 
  
if __name__ == '__main__':
  f = ClassMigrate()
  f.do()
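
The script above relies on three small pieces of logic: splitting the --source/--dest string, picking the one-day window that lies --time_interval days in the past, and walking the primary key in steps of step_size. The following is a minimal standalone sketch of those three pieces, written for illustration only and not part of the original script. The function names parse_endpoint, day_window and id_chunks are hypothetical, and the window is computed in Python here instead of with MySQL's DATE_ADD as the script does.

```python
import datetime


def parse_endpoint(spec):
    """Split 'host:port/db:table/user/password' into a dict (hypothetical helper)."""
    # maxsplit=3 keeps any '/' inside the password intact
    hostport, dbtab, user, password = spec.split('/', 3)
    host, port = hostport.split(':')
    db, table = dbtab.split(':')
    return {'host': host, 'port': int(port), 'db': db, 'table': table,
            'user': user, 'password': password}


def day_window(today, interval_days):
    """Return the first and last second of the day interval_days before today."""
    day = today - datetime.timedelta(days=interval_days)
    start = datetime.datetime.combine(day, datetime.time(0, 0, 0))
    end = datetime.datetime.combine(day, datetime.time(23, 59, 59))
    return start, end


def id_chunks(min_id, max_id, step):
    """Yield half-open [low, high) id ranges that together cover min_id..max_id."""
    low = min_id
    while low <= max_id:
        yield low, low + step
        low += step


if __name__ == '__main__':
    src = parse_endpoint('192.168.1.4:3306/db_name:tab_name/proxy/password')
    start, end = day_window(datetime.date(2016, 5, 30), 180)
    print(src['host'], src['port'], src['db'], src['table'])
    print(start, end)
    print(list(id_chunks(1, 45000, 20000)))
```

With today set to 2016-05-30 and an interval of 180 days, the window covers 2015-12-02 00:00:00 through 2015-12-02 23:59:59, which is the same single day the script selects, verifies, and then deletes or drops. Each id range is half-open, matching the script's "pk >= k and pk < k + step_size" condition, so no row is copied twice.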

That is all for this article. I hope it helps with your studies, and I hope you will continue to support 三水点靠木.
