Python实现网页截图(PyQT5)过程解析


Posted in Python onAugust 12, 2019

方案说明

功能要求:实现网页加载后将页面截取成长图片

涉及模块:PyQT5 PIL

逻辑说明:

1:完成窗口设置,利用PyQT5 QWebEngineView加载网页地址,待网页加载完成后,调用check_pag;

class MainWindow(QMainWindow):
  def __init__(self, parent=None):
    super(MainWindow, self).__init__(parent)
    self.setWindowTitle('易哈佛')
    self.temp_height = 0
    self.setWindowFlag(Qt.WindowMinMaxButtonsHint, False) # 禁用最大化,最小化
    # self.setWindowFlag(Qt.WindowStaysOnTopHint, True) # 窗口顶置
    self.setWindowFlag(Qt.FramelessWindowHint, True) # 窗口无边框
  def urlScreenShot(self, url):
    self.browser = QWebEngineView()
    self.browser.load(QUrl(url))
    geometry = self.chose_screen()
    self.setGeometry(geometry)
    self.browser.loadFinished.connect(self.check_page)
    self.setCentralWidget(self.browser)
  def get_page_size(self):
    size = self.browser.page().contentsSize()
    self.set_height = size.height()
    self.set_width = size.width()
    return size.width(), size.height()
  def chose_screen(self):
    width, height = 750, 1370
    desktop = QApplication.desktop()
    screen_count = desktop.screenCount()
    for i in range(0, screen_count):
      rect = desktop.availableGeometry(i)
      s_width, s_height = rect.width(), rect.height()
      if s_width > width and s_height > height:
        return QRect(rect.left(), rect.top(), width, height)
    return QRect(0, 0, width, height)
if __name__ == '__main__':
  app = QApplication(sys.argv)
  win = MainWindow()
  win.show()
  app.exit(app.exec_())

2:收集页面高度,并计算分次截屏的次数和余量高度;实例化图片合并工具,设置定时器,超时信号发出后,执行exe_command;

def check_page(self):
    p_width, p_height = self.get_page_size()
    self.page, self.over_flow_size = divmod(p_height, self.height())
    if self.page == 0:
      self.page = 1
    self.ssm = ScreenShotMerge(self.page, self.over_flow_size)
    self.timer = QTimer(self)
    self.timer.timeout.connect(self.exe_command)
    self.timer.setInterval(400)
    self.timer.start()

3:exe_command用来控制截图次数,并在每次截图完成后控制网页向下滑屏幕的高度;所有的页面都已截取时,完成图片合并。

def exe_command(self):
    if self.page > 0:
      self.screen_shot()
      self.run_js()
    elif self.page < 0:
      self.timer.stop()
      self.ssm.image_merge()
      self.close()
    elif self.over_flow_size > 0:
      self.screen_shot()
    self.page -= 1    
  def run_js(self):
    script = """
      var scroll = function (dHeight) {
      var t = document.documentElement.scrollTop
      var h = document.documentElement.scrollHeight
      dHeight = dHeight || 0
      var current = t + dHeight
      if (current > h) {
        window.scrollTo(0, document.documentElement.clientHeight)
       } else {
        window.scrollTo(0, current)
       }
      }
    """
    command = script + '\n scroll({})'.format(self.height())
    self.browser.page().runJavaScript(command)

4:screen_shot在每次截图完成后将图片保存,并将图片对象由图片合并根据保存到列表中。

def screen_shot(self):
    screen = QApplication.primaryScreen()
    winid = self.browser.winId()
    pix = screen.grabWindow(int(winid))
    name = '{}/temp.png'.format(self.ssm.root_path)
    pix.save(name)
    self.ssm.add_im(name)

5:截图合并工具,在每次截图完成后将图片对象保存,完成余量截图的重绘和截图的合并。

class ScreenShotMerge():
  def __init__(self, page, over_flow_size):
    self.im_list = []
    self.page = page
    self.over_flow_size = over_flow_size
    self.get_path()

  def get_path(self):
    self.root_path = Path(__file__).parent.joinpath('temp')
    if not self.root_path.exists():
      self.root_path.mkdir(parents=True)
    self.save_path = self.root_path.joinpath('merge.png')

  def add_im(self, path):
    if len(self.im_list) == self.page:
      im = self.reedit_image(path)
    else:
      im = Image.open(path)
    im.save('{}/{}.png'.format(self.root_path, len(self.im_list) + 1))
    self.im_list.append(im)

  def get_new_size(self):
    max_width = 0
    total_height = 0
    # 计算合成后图片的宽度(以最宽的为准)和高度
    for img in self.im_list:
      width, height = img.size
      if width > max_width:
        max_width = width
      total_height += height
    return max_width, total_height

  def image_merge(self, ):
    if len(self.im_list) > 1:
      max_width, total_height = self.get_new_size()
      # 产生一张空白图
      new_img = Image.new('RGB', (max_width - 15, total_height), 255)
      x = y = 0
      for img in self.im_list:
        width, height = img.size
        new_img.paste(img, (x, y))
        y += height
      new_img.save(self.save_path)
      print('截图成功:', self.save_path)
    else:
      obj = self.im_list[0]
      width, height = obj.size
      left, top, right, bottom = 0, 0, width, height
      box = (left, top, right, bottom)
      region = obj.crop(box)
      new_img = Image.new('RGB', (width, height), 255)
      new_img.paste(region, box)
      new_img.save(self.save_path)
      print('截图成功:', self.save_path)

  def reedit_image(self, path):
    obj = Image.open(path)
    width, height = obj.size
    left, top, right, bottom = 0, height - self.over_flow_size, width, height
    box = (left, top, right, bottom)
    region = obj.crop(box)
    return region

截图功能完整代码

#!/usr/bin/env python
# -*- coding:UTF-8 -*-
# Author:Leslie-x
import sys
from PyQt5.QtCore import *
from PyQt5.QtWidgets import *
from PyQt5.QtWebEngineWidgets import *
from PIL import Image
from pathlib import Path
class ScreenShotMerge():
  def __init__(self, page, over_flow_size):
    self.im_list = []
    self.page = page
    self.over_flow_size = over_flow_size
    self.get_path()
  def get_path(self):
    self.root_path = Path(__file__).parent.joinpath('temp')
    if not self.root_path.exists():
      self.root_path.mkdir(parents=True)
    self.save_path = self.root_path.joinpath('merge.png')
  def add_im(self, path):
    if len(self.im_list) == self.page:
      im = self.reedit_image(path)
    else:
      im = Image.open(path)
    im.save('{}/{}.png'.format(self.root_path, len(self.im_list) + 1))
    self.im_list.append(im)
  def get_new_size(self):
    max_width = 0
    total_height = 0
    # 计算合成后图片的宽度(以最宽的为准)和高度
    for img in self.im_list:
      width, height = img.size
      if width > max_width:
        max_width = width
      total_height += height
    return max_width, total_height
  def image_merge(self, ):
    if len(self.im_list) > 1:
      max_width, total_height = self.get_new_size()
      # 产生一张空白图
      new_img = Image.new('RGB', (max_width - 15, total_height), 255)
      x = y = 0
      for img in self.im_list:
        width, height = img.size
        new_img.paste(img, (x, y))
        y += height
      new_img.save(self.save_path)
      print('截图成功:', self.save_path)
    else:
      obj = self.im_list[0]
      width, height = obj.size
      left, top, right, bottom = 0, 0, width, height
      box = (left, top, right, bottom)
      region = obj.crop(box)
      new_img = Image.new('RGB', (width, height), 255)
      new_img.paste(region, box)
      new_img.save(self.save_path)
      print('截图成功:', self.save_path)
  def reedit_image(self, path):
    obj = Image.open(path)
    width, height = obj.size
    left, top, right, bottom = 0, height - self.over_flow_size, width, height
    box = (left, top, right, bottom)
    region = obj.crop(box)
    return region
class MainWindow(QMainWindow):
  def __init__(self, parent=None):
    super(MainWindow, self).__init__(parent)
    self.setWindowTitle('易哈佛')
    self.temp_height = 0
    self.setWindowFlag(Qt.WindowMinMaxButtonsHint, False) # 禁用最大化,最小化
    # self.setWindowFlag(Qt.WindowStaysOnTopHint, True) # 窗口顶置
    self.setWindowFlag(Qt.FramelessWindowHint, True) # 窗口无边框
  def urlScreenShot(self, url):
    self.browser = QWebEngineView()
    self.browser.load(QUrl(url))
    geometry = self.chose_screen()
    self.setGeometry(geometry)
    self.browser.loadFinished.connect(self.check_page)
    self.setCentralWidget(self.browser)
  def get_page_size(self):
    size = self.browser.page().contentsSize()
    self.set_height = size.height()
    self.set_width = size.width()
    return size.width(), size.height()
  def chose_screen(self):
    width, height = 750, 1370
    desktop = QApplication.desktop()
    screen_count = desktop.screenCount()
    for i in range(0, screen_count):
      rect = desktop.availableGeometry(i)
      s_width, s_height = rect.width(), rect.height()
      if s_width > width and s_height > height:
        return QRect(rect.left(), rect.top(), width, height)
    return QRect(0, 0, width, height)
  def check_page(self):
    p_width, p_height = self.get_page_size()
    self.page, self.over_flow_size = divmod(p_height, self.height())
    if self.page == 0:
      self.page = 1
    self.ssm = ScreenShotMerge(self.page, self.over_flow_size)
    self.timer = QTimer(self)
    self.timer.timeout.connect(self.exe_command)
    self.timer.setInterval(400)
    self.timer.start()
  def exe_command(self):
    if self.page > 0:
      self.screen_shot()
      self.run_js()

    elif self.page < 0:
      self.timer.stop()
      self.ssm.image_merge()
      self.close()

    elif self.over_flow_size > 0:
      self.screen_shot()
    self.page -= 1

  def run_js(self):
    script = """
      var scroll = function (dHeight) {
      var t = document.documentElement.scrollTop
      var h = document.documentElement.scrollHeight
      dHeight = dHeight || 0
      var current = t + dHeight
      if (current > h) {
        window.scrollTo(0, document.documentElement.clientHeight)
       } else {
        window.scrollTo(0, current)
       }
      }
    """
    command = script + '\n scroll({})'.format(self.height())
    self.browser.page().runJavaScript(command)

  def screen_shot(self):
    screen = QApplication.primaryScreen()
    winid = self.browser.winId()
    pix = screen.grabWindow(int(winid))
    name = '{}/temp.png'.format(self.ssm.root_path)
    pix.save(name)
    self.ssm.add_im(name)

if __name__ == '__main__':
  url = 'http://blog.sina.com.cn/lm/rank/focusbang//'
  app = QApplication(sys.argv)
  win = MainWindow()
  win.urlScreenShot(url)
  win.show()
  app.exit(app.exec_())

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持三水点靠木。

Python 相关文章推荐
为Python的web框架编写MVC配置来使其运行的教程
Apr 30 Python
Windows下实现Python2和Python3两个版共存的方法
Jun 12 Python
python中利用Future对象异步返回结果示例代码
Sep 07 Python
对DataFrame数据中的重复行,利用groupby累加合并的方法详解
Jan 30 Python
django基于cors解决跨域请求问题详解
Aug 06 Python
python中文分词库jieba使用方法详解
Feb 11 Python
keras的siamese(孪生网络)实现案例
Jun 12 Python
python怎么自定义捕获错误
Jun 29 Python
python中数字是否为可变类型
Jul 08 Python
Python学习工具jupyter notebook安装及用法解析
Oct 23 Python
五分钟学会怎么用Pygame做一个简单的贪吃蛇
Jan 06 Python
Python实现DBSCAN聚类算法并样例测试
Jun 22 Python
python实现知乎高颜值图片爬取
Aug 12 #Python
python3 enum模块的应用实例详解
Aug 12 #Python
Python一键查找iOS项目中未使用的图片、音频、视频资源
Aug 12 #Python
django+echart数据动态显示的例子
Aug 12 #Python
Flask框架学习笔记之使用Flask实现表单开发详解
Aug 12 #Python
Flask框架学习笔记之表单基础介绍与表单提交方式
Aug 12 #Python
python内存管理机制原理详解
Aug 12 #Python
You might like
php中json_decode()和json_encode()的使用方法
2012/06/04 PHP
PHP将字符分解为多个字符串的方法
2014/11/22 PHP
php打印输出棋盘的实现方法
2014/12/23 PHP
支付宝接口开发集成支付环境小结
2015/03/17 PHP
深入解析PHP的Yii框架中的event事件机制
2016/03/17 PHP
CI框架(ajax分页,全选,反选,不选,批量删除)完整代码详解
2016/11/01 PHP
TNC vs BOOM BO3 第一场2.13
2021/03/10 DOTA
javascript实现的动态添加表单元素input,button等(appendChild)
2007/11/24 Javascript
关于jQuery参考实例 1.0 jQuery的哲学
2013/04/07 Javascript
jQuery 生成svg矢量二维码
2016/08/09 Javascript
jquery实现静态搜索功能(可输入搜索文字)
2017/03/28 jQuery
Vue.js进阶知识点总结
2018/04/01 Javascript
Vue.js 事件修饰符的使用教程
2018/11/01 Javascript
layui动态渲染生成select的option值方法
2019/09/23 Javascript
使用Taro实现小程序商城的购物车功能模块的实例代码
2020/06/05 Javascript
[01:28:43]2014 DOTA2华西杯精英邀请赛5 24 DK VS CIS
2014/05/25 DOTA
Python面向对象编程基础解析(二)
2017/10/26 Python
Python类的继承和多态代码详解
2017/12/27 Python
python selenium UI自动化解决验证码的4种方法
2018/01/05 Python
Python实现Pig Latin小游戏实例代码
2018/02/02 Python
Sanic框架基于类的视图用法示例
2018/07/18 Python
用Python编写一个高效的端口扫描器的方法
2018/12/20 Python
Python 生成器,迭代,yield关键字,send()传参给yield语句操作示例
2019/10/12 Python
python实现tail -f 功能
2020/01/17 Python
Python networkx包的实现
2020/02/14 Python
html5 跨文档消息传输示例探讨
2013/04/01 HTML / CSS
SheIn俄罗斯:时尚女装网上商店
2017/02/28 全球购物
365 Tickets英国:全球景点门票
2019/07/06 全球购物
Parfumdreams芬兰:购买香水和化妆品
2021/02/13 全球购物
介绍下Java中==和equals的区别
2013/09/01 面试题
服装销售人员求职自我评价
2013/09/26 职场文书
2015年小学二年级班主任工作总结
2015/05/21 职场文书
Django展示可视化图表的多种方式
2021/04/08 Python
天谕手游15杯全调酒配方和调酒券的获得方式
2022/04/06 其他游戏
从零开始在Centos7上部署SpringBoot项目
2022/04/07 Servers
解决 Redis 秒杀超卖场景的高并发
2022/04/12 Redis