Python实现GIF动图以及视频卡通化详解


Posted in Python onDecember 06, 2021

前言

参考文章:Python实现照片卡通化

我继续魔改一下,让该模型可以支持将gif动图或者视频,也做成卡通化效果。毕竟一张图可以那就带边视频也可以,没毛病。所以继给次元壁来了一拳,我在加两脚。

项目github地址:github地址

环境依赖

除了参考文章中的依赖,还需要加一些其他依赖,requirements.txt如下:

Python实现GIF动图以及视频卡通化详解

其他环境不太清楚的,可以看我前言链接地址的文章,有具体说明。

核心代码

不废话了,先上gif代码。

gif动图卡通化

实现代码如下:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2021/12/5 18:10
# @Author  : 剑客阿良_ALiang
# @Site    : 
# @File    : gif_cartoon_tool.py
# !/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2021/12/5 0:26
# @Author  : 剑客阿良_ALiang
# @Site    :
# @File    : video_cartoon_tool.py
 
# !/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2021/12/4 22:34
# @Author  : 剑客阿良_ALiang
# @Site    :
# @File    : image_cartoon_tool.py
 
from PIL import Image, ImageEnhance, ImageSequence
import torch
from torchvision.transforms.functional import to_tensor, to_pil_image
from torch import nn
import os
import torch.nn.functional as F
import uuid
import imageio
 
 
# -------------------------- hy add 01 --------------------------
class ConvNormLReLU(nn.Sequential):
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1, pad_mode="reflect", groups=1, bias=False):
        pad_layer = {
            "zero": nn.ZeroPad2d,
            "same": nn.ReplicationPad2d,
            "reflect": nn.ReflectionPad2d,
        }
        if pad_mode not in pad_layer:
            raise NotImplementedError
 
        super(ConvNormLReLU, self).__init__(
            pad_layer[pad_mode](padding),
            nn.Conv2d(in_ch, out_ch, kernel_size=kernel_size, stride=stride, padding=0, groups=groups, bias=bias),
            nn.GroupNorm(num_groups=1, num_channels=out_ch, affine=True),
            nn.LeakyReLU(0.2, inplace=True)
        )
 
 
class InvertedResBlock(nn.Module):
    def __init__(self, in_ch, out_ch, expansion_ratio=2):
        super(InvertedResBlock, self).__init__()
 
        self.use_res_connect = in_ch == out_ch
        bottleneck = int(round(in_ch * expansion_ratio))
        layers = []
        if expansion_ratio != 1:
            layers.append(ConvNormLReLU(in_ch, bottleneck, kernel_size=1, padding=0))
 
        # dw
        layers.append(ConvNormLReLU(bottleneck, bottleneck, groups=bottleneck, bias=True))
        # pw
        layers.append(nn.Conv2d(bottleneck, out_ch, kernel_size=1, padding=0, bias=False))
        layers.append(nn.GroupNorm(num_groups=1, num_channels=out_ch, affine=True))
 
        self.layers = nn.Sequential(*layers)
 
    def forward(self, input):
        out = self.layers(input)
        if self.use_res_connect:
            out = input + out
        return out
 
 
class Generator(nn.Module):
    def __init__(self, ):
        super().__init__()
 
        self.block_a = nn.Sequential(
            ConvNormLReLU(3, 32, kernel_size=7, padding=3),
            ConvNormLReLU(32, 64, stride=2, padding=(0, 1, 0, 1)),
            ConvNormLReLU(64, 64)
        )
 
        self.block_b = nn.Sequential(
            ConvNormLReLU(64, 128, stride=2, padding=(0, 1, 0, 1)),
            ConvNormLReLU(128, 128)
        )
 
        self.block_c = nn.Sequential(
            ConvNormLReLU(128, 128),
            InvertedResBlock(128, 256, 2),
            InvertedResBlock(256, 256, 2),
            InvertedResBlock(256, 256, 2),
            InvertedResBlock(256, 256, 2),
            ConvNormLReLU(256, 128),
        )
 
        self.block_d = nn.Sequential(
            ConvNormLReLU(128, 128),
            ConvNormLReLU(128, 128)
        )
 
        self.block_e = nn.Sequential(
            ConvNormLReLU(128, 64),
            ConvNormLReLU(64, 64),
            ConvNormLReLU(64, 32, kernel_size=7, padding=3)
        )
 
        self.out_layer = nn.Sequential(
            nn.Conv2d(32, 3, kernel_size=1, stride=1, padding=0, bias=False),
            nn.Tanh()
        )
 
    def forward(self, input, align_corners=True):
        out = self.block_a(input)
        half_size = out.size()[-2:]
        out = self.block_b(out)
        out = self.block_c(out)
 
        if align_corners:
            out = F.interpolate(out, half_size, mode="bilinear", align_corners=True)
        else:
            out = F.interpolate(out, scale_factor=2, mode="bilinear", align_corners=False)
        out = self.block_d(out)
 
        if align_corners:
            out = F.interpolate(out, input.size()[-2:], mode="bilinear", align_corners=True)
        else:
            out = F.interpolate(out, scale_factor=2, mode="bilinear", align_corners=False)
        out = self.block_e(out)
 
        out = self.out_layer(out)
        return out
 
 
# -------------------------- hy add 02 --------------------------
 
def handle(gif_path: str, output_dir: str, type: int, device='cpu'):
    _ext = os.path.basename(gif_path).strip().split('.')[-1]
    if type == 1:
        _checkpoint = './weights/paprika.pt'
    elif type == 2:
        _checkpoint = './weights/face_paint_512_v1.pt'
    elif type == 3:
        _checkpoint = './weights/face_paint_512_v2.pt'
    elif type == 4:
        _checkpoint = './weights/celeba_distill.pt'
    else:
        raise Exception('type not support')
    os.makedirs(output_dir, exist_ok=True)
    net = Generator()
    net.load_state_dict(torch.load(_checkpoint, map_location="cpu"))
    net.to(device).eval()
    result = os.path.join(output_dir, '{}.{}'.format(uuid.uuid1().hex, _ext))
    img = Image.open(gif_path)
    out_images = []
    for frame in ImageSequence.Iterator(img):
        frame = frame.convert("RGB")
        with torch.no_grad():
            image = to_tensor(frame).unsqueeze(0) * 2 - 1
            out = net(image.to(device), False).cpu()
            out = out.squeeze(0).clip(-1, 1) * 0.5 + 0.5
            out = to_pil_image(out)
            out_images.append(out)
    # out_images[0].save(result, save_all=True, loop=True, append_images=out_images[1:], duration=100)
    imageio.mimsave(result, out_images, fps=15)
    return result
 
 
if __name__ == '__main__':
    print(handle('samples/gif/128.gif', 'samples/gif_result/', 3, 'cuda'))

代码说明:

1、主要的handle方法入参分别为:gif地址、输出目录、类型、设备使用(默认cpu,可选cuda使用显卡)。

2、类型主要是选择模型,最好用3,人像处理更生动一些。

执行验证一下

下面是我准备的gif素材

Python实现GIF动图以及视频卡通化详解

执行结果如下

Python实现GIF动图以及视频卡通化详解

看一下效果

Python实现GIF动图以及视频卡通化详解

哈哈,有点意思哦。

视频卡通化

实现代码如下:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2021/12/5 0:26
# @Author  : 剑客阿良_ALiang
# @Site    : 
# @File    : video_cartoon_tool.py
 
# !/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2021/12/4 22:34
# @Author  : 剑客阿良_ALiang
# @Site    :
# @File    : image_cartoon_tool.py
 
from PIL import Image, ImageEnhance
import torch
from torchvision.transforms.functional import to_tensor, to_pil_image
from torch import nn
import os
import torch.nn.functional as F
import uuid
import cv2
import numpy as np
import time
from ffmpy import FFmpeg
 
 
# -------------------------- hy add 01 --------------------------
class ConvNormLReLU(nn.Sequential):
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1, pad_mode="reflect", groups=1, bias=False):
        pad_layer = {
            "zero": nn.ZeroPad2d,
            "same": nn.ReplicationPad2d,
            "reflect": nn.ReflectionPad2d,
        }
        if pad_mode not in pad_layer:
            raise NotImplementedError
 
        super(ConvNormLReLU, self).__init__(
            pad_layer[pad_mode](padding),
            nn.Conv2d(in_ch, out_ch, kernel_size=kernel_size, stride=stride, padding=0, groups=groups, bias=bias),
            nn.GroupNorm(num_groups=1, num_channels=out_ch, affine=True),
            nn.LeakyReLU(0.2, inplace=True)
        )
 
 
class InvertedResBlock(nn.Module):
    def __init__(self, in_ch, out_ch, expansion_ratio=2):
        super(InvertedResBlock, self).__init__()
 
        self.use_res_connect = in_ch == out_ch
        bottleneck = int(round(in_ch * expansion_ratio))
        layers = []
        if expansion_ratio != 1:
            layers.append(ConvNormLReLU(in_ch, bottleneck, kernel_size=1, padding=0))
 
        # dw
        layers.append(ConvNormLReLU(bottleneck, bottleneck, groups=bottleneck, bias=True))
        # pw
        layers.append(nn.Conv2d(bottleneck, out_ch, kernel_size=1, padding=0, bias=False))
        layers.append(nn.GroupNorm(num_groups=1, num_channels=out_ch, affine=True))
 
        self.layers = nn.Sequential(*layers)
 
    def forward(self, input):
        out = self.layers(input)
        if self.use_res_connect:
            out = input + out
        return out
 
 
class Generator(nn.Module):
    def __init__(self, ):
        super().__init__()
 
        self.block_a = nn.Sequential(
            ConvNormLReLU(3, 32, kernel_size=7, padding=3),
            ConvNormLReLU(32, 64, stride=2, padding=(0, 1, 0, 1)),
            ConvNormLReLU(64, 64)
        )
 
        self.block_b = nn.Sequential(
            ConvNormLReLU(64, 128, stride=2, padding=(0, 1, 0, 1)),
            ConvNormLReLU(128, 128)
        )
 
        self.block_c = nn.Sequential(
            ConvNormLReLU(128, 128),
            InvertedResBlock(128, 256, 2),
            InvertedResBlock(256, 256, 2),
            InvertedResBlock(256, 256, 2),
            InvertedResBlock(256, 256, 2),
            ConvNormLReLU(256, 128),
        )
 
        self.block_d = nn.Sequential(
            ConvNormLReLU(128, 128),
            ConvNormLReLU(128, 128)
        )
 
        self.block_e = nn.Sequential(
            ConvNormLReLU(128, 64),
            ConvNormLReLU(64, 64),
            ConvNormLReLU(64, 32, kernel_size=7, padding=3)
        )
 
        self.out_layer = nn.Sequential(
            nn.Conv2d(32, 3, kernel_size=1, stride=1, padding=0, bias=False),
            nn.Tanh()
        )
 
    def forward(self, input, align_corners=True):
        out = self.block_a(input)
        half_size = out.size()[-2:]
        out = self.block_b(out)
        out = self.block_c(out)
 
        if align_corners:
            out = F.interpolate(out, half_size, mode="bilinear", align_corners=True)
        else:
            out = F.interpolate(out, scale_factor=2, mode="bilinear", align_corners=False)
        out = self.block_d(out)
 
        if align_corners:
            out = F.interpolate(out, input.size()[-2:], mode="bilinear", align_corners=True)
        else:
            out = F.interpolate(out, scale_factor=2, mode="bilinear", align_corners=False)
        out = self.block_e(out)
 
        out = self.out_layer(out)
        return out
 
 
# -------------------------- hy add 02 --------------------------
 
def handle(video_path: str, output_dir: str, type: int, fps: int, device='cpu'):
    _ext = os.path.basename(video_path).strip().split('.')[-1]
    if type == 1:
        _checkpoint = './weights/paprika.pt'
    elif type == 2:
        _checkpoint = './weights/face_paint_512_v1.pt'
    elif type == 3:
        _checkpoint = './weights/face_paint_512_v2.pt'
    elif type == 4:
        _checkpoint = './weights/celeba_distill.pt'
    else:
        raise Exception('type not support')
    os.makedirs(output_dir, exist_ok=True)
    # 获取视频音频
    _audio = extract(video_path, output_dir, 'wav')
    net = Generator()
    net.load_state_dict(torch.load(_checkpoint, map_location="cpu"))
    net.to(device).eval()
    result = os.path.join(output_dir, '{}.{}'.format(uuid.uuid1().hex, _ext))
    capture = cv2.VideoCapture(video_path)
    size = (int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    print(size)
    videoWriter = cv2.VideoWriter(result, cv2.VideoWriter_fourcc(*'mp4v'), fps, size)
    cul = 0
    with torch.no_grad():
        while True:
            ret, frame = capture.read()
            if ret:
                print(ret)
                image = to_tensor(frame).unsqueeze(0) * 2 - 1
                out = net(image.to(device), False).cpu()
                out = out.squeeze(0).clip(-1, 1) * 0.5 + 0.5
                out = to_pil_image(out)
                contrast_enhancer = ImageEnhance.Contrast(out)
                img_enhanced_image = contrast_enhancer.enhance(2)
                enhanced_image = np.asarray(img_enhanced_image)
                videoWriter.write(enhanced_image)
                cul += 1
                print('第{}张图'.format(cul))
            else:
                break
    videoWriter.release()
    # 视频添加原音频
    _final_video = video_add_audio(result, _audio, output_dir)
    return _final_video
 
 
# -------------------------- hy add 03 --------------------------
def extract(video_path: str, tmp_dir: str, ext: str):
    file_name = '.'.join(os.path.basename(video_path).split('.')[0:-1])
    print('文件名:{},提取音频'.format(file_name))
    if ext == 'mp3':
        return _run_ffmpeg(video_path, os.path.join(tmp_dir, '{}.{}'.format(uuid.uuid1().hex, ext)), 'mp3')
    if ext == 'wav':
        return _run_ffmpeg(video_path, os.path.join(tmp_dir, '{}.{}'.format(uuid.uuid1().hex, ext)), 'wav')
 
 
def _run_ffmpeg(video_path: str, audio_path: str, format: str):
    ff = FFmpeg(inputs={video_path: None},
                outputs={audio_path: '-f {} -vn'.format(format)})
    print(ff.cmd)
    ff.run()
    return audio_path
 
 
# 视频添加音频
def video_add_audio(video_path: str, audio_path: str, output_dir: str):
    _ext_video = os.path.basename(video_path).strip().split('.')[-1]
    _ext_audio = os.path.basename(audio_path).strip().split('.')[-1]
    if _ext_audio not in ['mp3', 'wav']:
        raise Exception('audio format not support')
    _codec = 'copy'
    if _ext_audio == 'wav':
        _codec = 'aac'
    result = os.path.join(
        output_dir, '{}.{}'.format(
            uuid.uuid4(), _ext_video))
    ff = FFmpeg(
        inputs={video_path: None, audio_path: None},
        outputs={result: '-map 0:v -map 1:a -c:v copy -c:a {} -shortest'.format(_codec)})
    print(ff.cmd)
    ff.run()
    return result
 
 
if __name__ == '__main__':
    print(handle('samples/video/981.mp4', 'samples/video_result/', 3, 25, 'cuda'))

代码说明

1、主要的实现方法入参分别为:视频地址、输出目录、类型、fps(帧率)、设备类型(默认cpu,可选择cuda显卡模式)。

2、类型主要是选择模型,最好用3,人像处理更生动一些。

3、代码设计思路:先将视频音频提取出来、将视频逐帧处理后写入新视频、新视频和原视频音频融合。

4、视频中间会产生临时文件,没有清理,如需要可以修改代码自行清理。

验证一下

下面是我准备的视频素材截图,我会上传到github上。

Python实现GIF动图以及视频卡通化详解

执行结果

Python实现GIF动图以及视频卡通化详解

看看效果截图

Python实现GIF动图以及视频卡通化详解

还是很不错的哦。

总结

这次可不是没什么好总结的,总结的东西蛮多的。首先我说一下这个开源项目目前模型的一些问题。

1、我测试了不少图片,总的来说对亚洲人的脸型不能很好的卡通化,但是欧美的脸型都比较好。所以还是训练的数据不是很够,但是能理解,毕竟要专门做卡通化的标注数据想想就是蛮头疼的事。所以我建议大家在使用的时候,多关注一下项目是否更新了最新的模型。

2、视频一但有字幕,会对字幕也做处理。所以可以考虑找一些视频和字幕分开的素材,效果会更好一些。

以上就是Python实现GIF动图以及视频卡通化详解的详细内容,更多关于Python 动图 视频卡通化的资料请关注三水点靠木其它相关文章!

Python 相关文章推荐
web.py在模板中输出美元符号的方法
Aug 26 Python
在Python中使用HTML模版的教程
Apr 29 Python
使用Python程序抓取新浪在国内的所有IP的教程
May 04 Python
全面了解python字符串和字典
Jul 07 Python
查看django执行的sql语句及消耗时间的两种方法
May 29 Python
python输出100以内的质数与合数实例代码
Jul 08 Python
Python Pywavelet 小波阈值实例
Jan 09 Python
Python 类的魔法属性用法实例分析
Nov 21 Python
Tensorflow实现在训练好的模型上进行测试
Jan 20 Python
如何通过python实现全排列
Feb 11 Python
jupyter 导入csv文件方式
Apr 21 Python
python中not、and和or的优先级与详细用法介绍
Nov 03 Python
Python实现照片卡通化
用Python爬取英雄联盟的皮肤详细示例
Python+腾讯云服务器实现每日自动健康打卡
Dec 06 #Python
python 管理系统实现mysql交互的示例代码
Python中super().__init__()测试以及理解
Dec 06 #Python
浅析Python中的随机采样和概率分布
Dec 06 #Python
python程序的组织结构详解
You might like
关于页面优化和伪静态
2009/10/11 PHP
php ftp文件上传函数(基础版)
2010/06/03 PHP
PHP的autoload机制的实现解析
2012/09/15 PHP
PHP面向对象程序设计方法实例详解
2016/12/24 PHP
JavaScript 提升运行速度之循环篇 译文
2009/08/15 Javascript
javascript parseInt与Number函数的区别
2010/01/21 Javascript
网页中CDATA标记的说明
2010/09/12 Javascript
jQuery产品间断向下滚动效果核心代码
2014/05/08 Javascript
jQuery获得指定元素坐标的方法
2015/04/14 Javascript
探究Vue.js 2.0新增的虚拟DOM
2016/10/20 Javascript
js实现返回顶部效果
2017/03/10 Javascript
浅谈js使用in和hasOwnProperty获取对象属性的区别
2017/04/27 Javascript
小程序实现新用户判断并跳转激活的方法
2019/05/20 Javascript
从零搭一个自用的前端脚手架的方法步骤
2019/09/23 Javascript
Vue2.X和Vue3.0数据响应原理变化的区别
2019/11/07 Javascript
python中urllib模块用法实例详解
2014/11/19 Python
Python将多份excel表格整理成一份表格
2018/01/03 Python
Python3中的json模块使用详解
2018/05/05 Python
Python基于最小二乘法实现曲线拟合示例
2018/06/14 Python
python pcm音频添加头转成Wav格式文件的方法
2019/01/09 Python
python实现高斯判别分析算法的例子
2019/12/09 Python
Python While循环语句实例演示及原理解析
2020/01/03 Python
matplotlib阶梯图的实现(step())
2021/03/02 Python
个人简历的自荐信
2013/10/23 职场文书
违反工作纪律检讨书
2014/02/15 职场文书
《美丽的小路》教学反思
2014/02/26 职场文书
期末学生评语大全
2014/04/24 职场文书
拉歌口号大全
2014/06/13 职场文书
教师党员个人整改措施
2014/10/27 职场文书
银行先进个人总结
2015/02/15 职场文书
培训班开班主持词
2015/07/02 职场文书
小学语文新课改心得体会
2016/01/22 职场文书
大学生创业,为什么都会选择快餐饮?
2019/08/08 职场文书
Mysql 用户权限管理实现
2021/05/25 MySQL
如何用PHP实现分布算法之一致性哈希算法
2021/05/26 PHP
Python 中面向接口编程
2022/05/20 Python