Building an Online Watermark-Free Douyin Video Downloader with Django (Source Code and Link Included)


Posted in Python on May 06, 2021


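The project is live at: https://www.chenshiyang.com/dytk

Let's walk through the source code and take a brief look at how it works.

How it works

The project needs no models. At its core there are only two pages: a home page (home) that shows a form for submitting the URL of the video to download, and a download page (download) that handles the form request and shows the address and size of the watermark-free video file, along with a QR code for previewing it on a phone.

The routes for these two pages are shown below; each URL maps to one view function.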

# urls.py

from django.urls import path

from web.views import home, download

urlpatterns = [
    path('home', home),
    path('downloader', download),
]

# web/views.py

from django.http import HttpResponse
from django.shortcuts import render, redirect

# Create your views here.
from common.utils import format_duration, load_media
from common.DouYin import DY

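def home(request):
    """Home page: render the URL submission form."""
    return render(request, 'home.html')

def download(request):
    """Download page: parse the submitted URL and fetch the media files."""
    url = request.POST.get('url', None)
    if url is None:
        # Reject requests without a URL instead of failing on an assert
        return HttpResponse('Missing "url" parameter', status=400)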

    dy = DY()
    data = dy.parse(url)

    mp4_path, mp4_content_length = load_media(data['mp4'], 'mp4')
    mp3_path, mp3_content_length = load_media(data['mp3'], 'mp3')

    realpath = ''.join(['https://www.chenshiyang.com', mp4_path])

    print('realpath---------------------', realpath)

    if len(data['desc'].split('#')) > 2:
        topic = data['desc'].split('#')[2].rstrip('#')

    return render(request, 'download.html', locals())

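As you can see, the URL submitted from the home page form is handled by the download function. The DY class defined in DouYin.py in the common module parses the URL further and scrapes the video addresses; the load_media function in the custom utils.py then downloads the files and returns each file's path and size.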
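
The view above also derives `topic` by splitting the description on `#` and taking the text after the second `#`, and it leaves `topic` unset when the description has fewer than two. A more forgiving way to pull hashtags out of a description might look like the sketch below. The helper names `extract_topics` and `first_topic` are hypothetical and not part of the original project.

```python
import re

# Hypothetical helpers (not in the original project): collect hashtag topics
# from a description such as "some text #topic1 #topic2".
HASHTAG = re.compile(r'#([^#\s]+)')


def extract_topics(desc):
    """Return every hashtag in desc, in order, without the leading '#'."""
    return HASHTAG.findall(desc or '')


def first_topic(desc, default=''):
    """Return the first hashtag, or default when the description has none."""
    topics = extract_topics(desc)
    return topics[0] if topics else default
```

With this approach the template always receives a string for the topic, even for descriptions without any hashtag.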

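Because the code that parses the download URL and scrapes data from Douyin is all wrapped in the DY class, that class is worth listing here, together with the load_media function.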

# common/DouYin.py

# -*- coding: utf-8 -*-
# @Time    : 2020-07-03 13:10
# @Author  : chenshiyang
# @Email   : chenshiyang@blued.com
# @File    : DouYin.py
# @Software: PyCharm


import re
from urllib.parse import urlparse
import requests
from common.utils import format_duration


class DY(object):

    def __init__(self, app=None):
        self.app = app
        if app is not None:
            self.init_app(app)

        self.headers = {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            # 'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'cache-control': 'no-cache',
            'cookie': 'sid_guard=2e624045d2da7f502b37ecf72974d311%7C1591170698%7C5184000%7CSun%2C+02-Aug-2020+07%3A51%3A38+GMT; uid_tt=0033579d9229eec4a4d09871dfc11271; sid_tt=2e624045d2da7f502b37ecf72974d311; sessionid=2e624045d2da7f502b37ecf72974d311',
            'pragma': 'no-cache',
            'sec-fetch-dest': 'document',
            'sec-fetch-mode': 'navigate',
            'sec-fetch-site': 'none',
            'sec-fetch-user': '?1',
            'upgrade-insecure-requests': '1',
            'user-agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1'
        }

        self.domain = ['www.douyin.com',
                       'v.douyin.com',
                       'www.snssdk.com',
                       'www.amemv.com',
                       'www.iesdouyin.com',
                       'aweme.snssdk.com']

    def init_app(self, app):
        self.app = app

    def parse(self, url):
        share_url = self.get_share_url(url)
        share_url_parse = urlparse(share_url)

        if share_url_parse.netloc not in self.domain:
            raise Exception("Invalid link")
        dytk = None
        vid = None
        # Extract the numeric video id from a path like /share/video/<id>/
        match = re.search(r'/share/video/(\d+)', share_url_parse.path)
        if match:
            vid = match.group(1)

        response = requests.get(
            share_url,
            headers=self.headers,
            allow_redirects=False)

        match = re.search('dytk: "(.*?)"', response.text)

        if match:
            dytk = match.group(1)

        if vid:
            return self.get_data(vid, dytk)
        else:
            raise Exception("Failed to parse the link")

    def get_share_url(self, url):
        response = requests.get(url,
                                headers=self.headers,
                                allow_redirects=False)

        if 'location' in response.headers.keys():
            return response.headers['location']
        elif '/share/video/' in url:
            return url
        else:
            raise Exception("Failed to parse the link")

    def get_data(self, vid, dytk):
        url = f"https://www.iesdouyin.com/web/api/v2/aweme/iteminfo/?item_ids={vid}&dytk={dytk}"
        response = requests.get(url, headers=self.headers)
        # Check the status before decoding, so an error page is not parsed as JSON
        if response.status_code != 200:
            raise Exception("Failed to parse the link")
        result = response.json()
        item = result.get("item_list")[0]
        author = item.get("author").get("nickname")
        mp4 = item.get("video").get("play_addr").get("url_list")[0]
        cover = item.get("video").get("cover").get("url_list")[0]
        mp4 = mp4.replace("playwm", "play")
        res = requests.get(mp4, headers=self.headers, allow_redirects=True)
        mp4 = res.url
        desc = item.get("desc")
        mp3 = item.get("music").get("play_url").get("url_list")[0]

        data = dict()
        data['mp3'] = mp3
        data['mp4'] = mp4
        data['cover'] = cover
        data['nickname'] = author
        data['desc'] = desc
        data['duration'] = format_duration(item.get("duration"))
        return data

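As the code shows, the returned data dictionary contains the source addresses of the mp3 and mp4 files, along with the video's cover image, the author's nickname, the description, and the formatted duration.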
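
The two string operations that do most of the work in the DY class above are extracting the video id from the share URL and rewriting the `playwm` address to `play`. They can be isolated into a small sketch that runs without any network access. The function names `extract_video_id` and `unwatermarked_url` are hypothetical, and the URLs in the usage lines are made-up examples, not real videos.

```python
import re
from urllib.parse import urlparse

# Host allowlist copied from the DY class above.
ALLOWED_HOSTS = {
    'www.douyin.com', 'v.douyin.com', 'www.snssdk.com',
    'www.amemv.com', 'www.iesdouyin.com', 'aweme.snssdk.com',
}

VIDEO_PATH = re.compile(r'/share/video/(\d+)')


def extract_video_id(share_url):
    """Return the numeric id from a .../share/video/<id>/ URL."""
    parsed = urlparse(share_url)
    if parsed.netloc not in ALLOWED_HOSTS:
        raise ValueError('unsupported host: %r' % parsed.netloc)
    match = VIDEO_PATH.search(parsed.path)
    if match is None:
        raise ValueError('no video id in path: %r' % parsed.path)
    return match.group(1)


def unwatermarked_url(play_url):
    """Swap the watermarked 'playwm' endpoint for the plain 'play' one."""
    return play_url.replace('playwm', 'play')


# Made-up example URLs, for illustration only.
print(extract_video_id('https://www.iesdouyin.com/share/video/1234567890/?region=CN'))
print(unwatermarked_url('https://aweme.snssdk.com/aweme/v1/playwm/?video_id=abc'))
```

Raising ValueError instead of a bare Exception is a choice made for this sketch; it lets a caller tell a bad link apart from other failures.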

Next, load_media downloads the video to local storage and returns the new path and the file size.
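
Two details of the utils.py listing are easier to see in isolation: the timestamped save path and the Range header used to resume a partial download. The sketch below restates both with the standard library only. The names `build_media_path` and `resume_headers` are hypothetical, and unlike the original, this version sends no Range header for an empty file.

```python
import os
import time


def build_media_path(kind, now=None):
    """Return a site-relative path such as /media/mp4/<timestamp>.mp4.

    kind is the file type ('mp4' or 'mp3'); now is an optional epoch time
    in seconds, accepted so that the result can be reproduced in tests.
    """
    now = time.time() if now is None else now
    stamp = str(int(round(now * 1000)))  # millisecond timestamp, 13 digits today
    return '/media/{0}/{1}.{0}'.format(kind, stamp)


def resume_headers(path):
    """Return a Range header asking only for the bytes not yet on disk."""
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return {'Range': 'bytes=%d-' % os.path.getsize(path)}
    return {}


print(build_media_path('mp4'))
```

Because every call gets a fresh timestamp in its file name, the target file normally does not exist yet, so the resume branch only matters when the same path is downloaded again.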

# common/utils.py

# -*- coding: utf-8 -*-
# @Time    : 2020-06-29 17:26
# @Author  : chenshiyang
# @Email   : chenshiyang@blued.com
# @File    : utils.py
# @Software: PyCharm
import os
import time

import requests


def format_duration(duration):
    """
    Format a duration as MM:SS.
    :param duration: duration in milliseconds
    """

    total_seconds = int(duration / 1000)
    minute = total_seconds // 60
    seconds = total_seconds % 60
    return f'{minute:02}:{seconds:02}'

SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
    1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}


def approximate_size(size, a_kilobyte_is_1024_bytes=True):

    '''Convert a file size to human-readable form.
    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000
    Returns: string
    '''

    if size < 0:
        raise ValueError('number must be non-negative')

    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return '{0:.1f} {1}'.format(size, suffix)

    raise ValueError('number too large')


def do_load_media(url, path):
    """
    Download a media file.
    :param url:         media URL
    :param path:        path to save the file to
    :return:            human-readable size string, or None on failure
    """
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1"}
        pre_content_length = 0

        # Loop to receive the video data
        while True:
            # If the file already exists, resume the download by requesting
            # only the bytes that are still missing
            if os.path.exists(path):
                headers['Range'] = 'bytes=%d-' % os.path.getsize(path)
            res = requests.get(url, stream=True, headers=headers)

            content_length = int(res.headers['content-length'])
            # If this response is shorter than the previous one, or the file on
            # disk already matches the reported length, treat the download as done
            if content_length < pre_content_length or (
                    os.path.exists(path) and os.path.getsize(path) == content_length):
                break
            pre_content_length = content_length

            # Append the received data to the file
            with open(path, 'ab') as file:
                file.write(res.content)
                file.flush()
                print('receive data,file size : %d   total size:%d' % (os.path.getsize(path), content_length))
                # Returns after the first successful write, so the loop body
                # runs at most once
                return approximate_size(content_length, a_kilobyte_is_1024_bytes=False)

    except Exception as e:
        print('Video download failed: {}'.format(e))


def load_media(url, path):
    basepath = os.path.abspath(os.path.dirname(os.path.dirname(__file__)))

    # Generate a 13-digit millisecond timestamp
    suffixes = str(int(round(time.time() * 1000)))
    path = ''.join(['/media/', path, '/', '.'.join([suffixes, path])])
    targetpath = ''.join([basepath, path])
    content_length = do_load_media(url, targetpath)
    return path, content_length


def main(url, path):
    # load_media takes two arguments: the media URL and the file type
    load_media(url, path)


if __name__ == "__main__":
    # url = 'https://aweme.snssdk.com/aweme/v1/play/?video_id=v0200fe70000br155v26tgq06h08e0lg&ratio=720p&line=0'
    # main(url, 'mp4')

    print(approximate_size(3726257, a_kilobyte_is_1024_bytes=False))

Next come the templates, which need little explanation.

# templates/home.html

{% extends "base.html" %}

{% block content %}
  <div class="jumbotron custom-jum no-mrg">
    <div class="container">
      <div class="row">
        <div class="col-md-12">
          <div class="center">
            <div class="home-search">
              <h1>Watermark-Free Douyin Video Downloader</h1>
              <h2>Download watermark-free Douyin videos as MP4 and MP3</h2>
            </div>
            <div class="form-home-search">
              <form id="form_download" action='https://www.chenshiyang.com/dytk/downloader' method='POST'>
                <div class="input-group col-lg-10 col-md-10 col-sm-10">
                  <input name="url" class="form-control input-md ht58" placeholder="Enter a Douyin video URL ..." type="text"
                    required="" value="">
                  <span class="input-group-btn"><button class="btn btn-primary input-md btn-download ht58" type="submit"
                      id="btn_submit">Download</button></span>
                </div>
              </form>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
  </div>

  {% endblock %}

# templates/download.html

{% extends "base.html" %}

{% block content %}
  <div class="page-content">
  <div class="container">
    <div class="row">
      <div class="col-lg-12 col-centered">
        <div class="ads mrg-bt20 text-center">
          <ins class="adsbygoogle" style="display:inline-block;width:728px;height:90px"
            data-ad-client="ca-pub-2984659695526033" data-ad-slot="5734284394"></ins>

        </div>
        <div class="card">
          <div class="row">
            <div class="col-md-4 col-sm-4">
              <a href="{{mp4_path}}" rel="external nofollow" data-toggle="modal" class="card-aside-column img-video"
                style="height: 252px; background: url(&quot;{{data.cover}}&quot;) 0% 0% / cover;" title="">
                <span class="btn-play-video"><i class="glyphicon glyphicon-play"></i></span>
                <p class="time-video" id="time">{{data.duration}}</p>
              </a>
              <h5>Author: {{data.nickname}}</h5>
              <h5><a href="#" rel="external nofollow" >{{topic}} <i class="open-new-window"></i></a></h5>
              <p class="card-text">{{data.desc}}</p>
            </div>
            <div class="col-md-8 col-sm-8 col-table">
              <table class="table">
                <thead>
                  <tr>
                    <th>format</th>
                    <th>size</th>
                    <th>Downloads</th>
                  </tr>
                </thead>
                <tbody>
                  <tr>

                    <td>mp4</td>
                    <td>{{mp4_content_length}}</td>
                    <td>
                      <a href="{{mp4_path}}" rel="external nofollow" class="btn btn-download"  download="">Download</a>
                    </td>
                  </tr>
                  <tr>

                    <td>mp3</td>
                    <td>{{mp3_content_length}}</td>
                    <td>
                      <a href="{{mp3_path}}" rel="external nofollow"  class="btn btn-download"  download="">Download</a>
                    </td>
                  </tr>

                </tbody>

              </table>
            </div>
          </div>
        </div>

        <div class="card card-qrcode">
          <div class="row">
            <div class="col-md-12 qrcode">
              <div class="text-center">
                <p class="qrcode-p">Scan the QR code below to download directly to your smartphone or tablet!</p>
              </div>
            </div>
            <div class="col-md-4 col-centered qrcode">
              <div id="qrcode" title="{{realpath}}">
                <script src="/static/js/qrcode.min.js"></script>
                <script type="text/javascript">
                  new QRCode(document.getElementById("qrcode"), {
                    text: "{{realpath}}",
                    width: 120,
                    height: 120,
                    correctLevel: QRCode.CorrectLevel.L
                  });
</script>
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>

{% endblock %}

Full source code:

https://github.com/tinysheepyang/python_api
