Downloading Watermark-Free Douyin Videos Online with Django (Source Code and Link Included)


Posted in Python on May 06, 2021


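The project is live at: https://www.chenshiyang.com/dytk

Let's walk through the source code and briefly look at how it works.

How it works

The project needs no models. At its core are just two pages: a home page (home) that shows a form for the URL of the video to download, and a download page (download) that handles the form submission and shows the address and size of the watermark-free video file, along with a QR code for previewing on a phone.

The routes for the two core pages are shown below; each URL maps to one view function.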

# urls.py

from django.urls import path

from web.views import home, download

urlpatterns = [
    path('home', home),
    path('downloader', download),
]

# web/views.py

from django.http import HttpResponse
from django.shortcuts import render, redirect

# Create your views here.
from common.utils import format_duration, load_media
from common.DouYin import DY

def home(request):
    """Home page."""
    return render(request, 'home.html')

def download(request):
    """Download page."""
    url = request.POST.get('url', None)
    assert url is not None

    dy = DY()
    data = dy.parse(url)

    mp4_path, mp4_content_length = load_media(data['mp4'], 'mp4')
    mp3_path, mp3_content_length = load_media(data['mp3'], 'mp3')

    realpath = ''.join(['https://www.chenshiyang.com', mp4_path])

    print('realpath---------------------', realpath)

    if len(data['desc'].split('#')) > 2:
        topic = data['desc'].split('#')[2].rstrip('#')

    return render(request, 'download.html', locals())

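As you can see, the download URL submitted through the form on the home page is handled by the download function. The DY class defined in DouYin.py in the common module parses the URL further and scrapes the video addresses; the load_media function in the custom utils.py then downloads the files and returns each file's path and size.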
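
The topic lookup in download takes the text after the second `#` in the description, and leaves `topic` undefined when the description contains fewer than two `#` characters. Below is a minimal sketch of a more explicit alternative, assuming a hashtag is `#` followed by characters other than `#` and whitespace. `extract_topics` and `first_topic` are hypothetical helpers and are not part of the project.

```python
import re

# Assumption: a hashtag is "#" followed by one or more characters
# that are neither "#" nor whitespace.
HASHTAG_RE = re.compile(r"#([^#\s]+)")


def extract_topics(desc):
    """Return every hashtag in a description, in order, without the '#'."""
    return HASHTAG_RE.findall(desc or "")


def first_topic(desc, default=""):
    """Return the first hashtag, or `default` when there is none."""
    topics = extract_topics(desc)
    return topics[0] if topics else default


if __name__ == "__main__":
    print(extract_topics("funny clip #comedy #daily"))
    print(repr(first_topic("no hashtags here")))
```

With a helper like this the view could always pass a defined `topic` value to the template, even for descriptions without hashtags.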

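Since the code for parsing the download URL and scraping data from Douyin is all wrapped in the DY class, it is worth showing that class here, along with the code for load_media.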

# common/DouYin.py

# -*- coding: utf-8 -*-
# @Time    : 2020-07-03 13:10
# @Author  : chenshiyang
# @Email   : chenshiyang@blued.com
# @File    : DouYin.py
# @Software: PyCharm


import re
from urllib.parse import urlparse
import requests
from common.utils import format_duration


class DY(object):

    def __init__(self, app=None):
        self.app = app
        if app is not None:
            self.init_app(app)

        self.headers = {
            'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
            # 'accept-encoding': 'gzip, deflate, br',
            'accept-language': 'zh-CN,zh;q=0.9',
            'cache-control': 'no-cache',
            'cookie': 'sid_guard=2e624045d2da7f502b37ecf72974d311%7C1591170698%7C5184000%7CSun%2C+02-Aug-2020+07%3A51%3A38+GMT; uid_tt=0033579d9229eec4a4d09871dfc11271; sid_tt=2e624045d2da7f502b37ecf72974d311; sessionid=2e624045d2da7f502b37ecf72974d311',
            'pragma': 'no-cache',
            'sec-fetch-dest': 'document',
            'sec-fetch-mode': 'navigate',
            'sec-fetch-site': 'none',
            'sec-fetch-user': '?1',
            'upgrade-insecure-requests': '1',
            'user-agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1'
        }

        self.domain = ['www.douyin.com',
                       'v.douyin.com',
                       'www.snssdk.com',
                       'www.amemv.com',
                       'www.iesdouyin.com',
                       'aweme.snssdk.com']

    def init_app(self, app):
        self.app = app

    def parse(self, url):
        share_url = self.get_share_url(url)
        share_url_parse = urlparse(share_url)

        if share_url_parse.netloc not in self.domain:
            raise Exception("Invalid link")
        dytk = None
        vid = None
        # Extract the numeric video id from the share path, if present
        match = re.search(r'/share/video/(\d+)', share_url_parse.path)
        if match:
            vid = match.group(1)

        response = requests.get(
            share_url,
            headers=self.headers,
            allow_redirects=False)

        match = re.search('dytk: "(.*?)"', response.text)

        if match:
            dytk = match.group(1)

        if vid:
            return self.get_data(vid, dytk)
        else:
            raise Exception("Parse failed")

    def get_share_url(self, url):
        response = requests.get(url,
                                headers=self.headers,
                                allow_redirects=False)

        if 'location' in response.headers.keys():
            return response.headers['location']
        elif '/share/video/' in url:
            return url
        else:
            raise Exception("Parse failed")

    def get_data(self, vid, dytk):
        url = f"https://www.iesdouyin.com/web/api/v2/aweme/iteminfo/?item_ids={vid}&dytk={dytk}"
        response = requests.get(url, headers=self.headers)
        # Check the status before trying to decode the body as JSON
        if response.status_code != 200:
            raise Exception("Parse failed")
        result = response.json()
        item = result.get("item_list")[0]
        author = item.get("author").get("nickname")
        mp4 = item.get("video").get("play_addr").get("url_list")[0]
        cover = item.get("video").get("cover").get("url_list")[0]
        mp4 = mp4.replace("playwm", "play")
        res = requests.get(mp4, headers=self.headers, allow_redirects=True)
        mp4 = res.url
        desc = item.get("desc")
        mp3 = item.get("music").get("play_url").get("url_list")[0]

        data = dict()
        data['mp3'] = mp3
        data['mp4'] = mp4
        data['cover'] = cover
        data['nickname'] = author
        data['desc'] = desc
        data['duration'] = format_duration(item.get("duration"))
        return data

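As the code shows, the returned data dict contains the mp3 and mp4 source addresses, plus the video's cover, the author's nickname, the description, and the formatted duration.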
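
The two string-level steps in the DY class, validating the share link and swapping the `playwm` endpoint for `play`, can be tried without any network access. The sketch below mirrors that logic under a few assumptions: the host list is copied from `DY.domain`, the function names `extract_video_id` and `strip_watermark` are hypothetical, and the URLs in the usage lines are made-up examples. Whether the `play` endpoint still serves a watermark-free file depends on Douyin's current behaviour, which this sketch does not check.

```python
import re
from urllib.parse import urlparse

# Hosts copied from DY.domain in the listing above.
ALLOWED_HOSTS = {
    "www.douyin.com",
    "v.douyin.com",
    "www.snssdk.com",
    "www.amemv.com",
    "www.iesdouyin.com",
    "aweme.snssdk.com",
}
VIDEO_ID_RE = re.compile(r"/share/video/(\d+)")


def extract_video_id(share_url):
    """Return the numeric video id from a share URL, or raise ValueError."""
    parsed = urlparse(share_url)
    if parsed.netloc not in ALLOWED_HOSTS:
        raise ValueError("invalid link")
    match = VIDEO_ID_RE.search(parsed.path)
    if match is None:
        raise ValueError("no video id in path")
    return match.group(1)


def strip_watermark(play_url):
    """Replace 'playwm' with 'play' in the URL, as get_data does."""
    return play_url.replace("playwm", "play")


if __name__ == "__main__":
    # Hypothetical example URLs, used only to exercise the helpers.
    print(extract_video_id("https://www.iesdouyin.com/share/video/1234567890/?region=CN"))
    print(strip_watermark("https://aweme.snssdk.com/aweme/v1/playwm/?video_id=abc"))
```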

Next, load_media downloads the video to local storage and returns the new path and the file size.
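
load_media, listed in full further down, names each file after a 13-digit millisecond timestamp and stores it under `/media/<type>/`. The following is a minimal sketch of that naming scheme on its own. `build_media_paths` and its `now` parameter are hypothetical (the parameter exists only to make the result reproducible), and `os.path.join` is used here in place of the string concatenation in the original.

```python
import os
import time


def build_media_paths(base_dir, kind, now=None):
    """Return (relative_url_path, absolute_file_path) for a new media file.

    kind is the file type, e.g. 'mp4' or 'mp3'; it is used both as the
    sub-directory name and as the file extension.
    """
    seconds = time.time() if now is None else now
    timestamp_ms = int(round(seconds * 1000))  # 13-digit millisecond timestamp
    filename = f"{timestamp_ms}.{kind}"
    relative = f"/media/{kind}/{filename}"
    absolute = os.path.join(base_dir, "media", kind, filename)
    return relative, absolute


if __name__ == "__main__":
    rel, absolute = build_media_paths("/srv/app", "mp4", now=1600000000.5)
    print(rel)
    print(absolute)
```

The original code opens the target file directly, so it relies on the `media/mp4` and `media/mp3` directories already existing; calling `os.makedirs(os.path.dirname(absolute), exist_ok=True)` before writing would remove that requirement.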

#common/utils.py

# -*- coding: utf-8 -*-
# @Time    : 2020-06-29 17:26
# @Author  : chenshiyang
# @Email   : chenshiyang@blued.com
# @File    : utils.py
# @Software: PyCharm
import os
import time

import requests


def format_duration(duration):
    """
    Format a duration as MM:SS.
    :param duration: duration in milliseconds
    """

    total_seconds = int(duration / 1000)
    minute = total_seconds // 60
    seconds = total_seconds % 60
    return f'{minute:02}:{seconds:02}'

SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
    1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}


def approximate_size(size, a_kilobyte_is_1024_bytes=True):

    '''Convert a file size to human-readable form.
    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000
    Returns: string
    '''

    if size < 0:
        raise ValueError('number must be non-negative')

    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return '{0:.1f} {1}'.format(size, suffix)

    raise ValueError('number too large')


def do_load_media(url, path):
    """
    Download a media file.
    :param url:         media URL
    :param path:        path to save the file to
    :return:            human-readable size string on success, otherwise None
    """
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 13_2_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Mobile/15E148 Safari/604.1"}
        pre_content_length = 0

        # Loop to receive the video data
        while True:
            # If the file already exists, resume from the current file size
            if os.path.exists(path):
                headers['Range'] = 'bytes=%d-' % os.path.getsize(path)
            res = requests.get(url, stream=True, headers=headers)

            content_length = int(res.headers['content-length'])
            # If this response is shorter than the previous one, or the file on disk
            # already matches the reported length, treat the download as complete
            if content_length < pre_content_length or (
                    os.path.exists(path) and os.path.getsize(path) == content_length):
                break
            pre_content_length = content_length

            # Write the received video data
            with open(path, 'ab') as file:
                file.write(res.content)
                file.flush()
                print('receive data,file size : %d   total size:%d' % (os.path.getsize(path), content_length))
                return approximate_size(content_length, a_kilobyte_is_1024_bytes=False)

    except Exception as e:
        print('Video download failed: {}'.format(e))


def load_media(url, path):
    basepath = os.path.abspath(os.path.dirname(os.path.dirname(__file__)))

    # Generate a 13-digit (millisecond) timestamp
    suffixes = str(int(round(time.time() * 1000)))
    path = ''.join(['/media/', path, '/', '.'.join([suffixes, path])])
    targetpath = ''.join([basepath, path])
    content_length = do_load_media(url, targetpath)
    return path, content_length


def main(url, path):
    # load_media takes only the URL and the file type ('mp4' or 'mp3')
    load_media(url, path)


if __name__ == "__main__":
    # url = 'https://aweme.snssdk.com/aweme/v1/play/?video_id=v0200fe70000br155v26tgq06h08e0lg&ratio=720p&line=0'
    # main(url, 'mp4')

    print(approximate_size(3726257, a_kilobyte_is_1024_bytes=False))
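
do_load_media passes `stream=True` but then reads `res.content`, which loads the whole body into memory, and it sets a `Range` header only when the target file already exists. The sketch below isolates those two pieces without touching the network. `resume_headers` and `append_chunks` are hypothetical helpers; `append_chunks` accepts any iterable of byte chunks, for example the one produced by `Response.iter_content(chunk_size=8192)` in requests.

```python
import os


def resume_headers(path, base_headers=None):
    """Return request headers, adding a Range header if `path` has data."""
    headers = dict(base_headers or {})
    if os.path.exists(path):
        size = os.path.getsize(path)
        if size > 0:
            # Ask the server for the bytes after what is already on disk.
            headers["Range"] = f"bytes={size}-"
    return headers


def append_chunks(path, chunks):
    """Append byte chunks to `path` and return the number of bytes written."""
    written = 0
    with open(path, "ab") as fh:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fh.write(chunk)
                written += len(chunk)
    return written
```

A caller would combine the two roughly as `res = requests.get(url, stream=True, headers=resume_headers(path, headers))` followed by `append_chunks(path, res.iter_content(chunk_size=8192))`. A server that ignores `Range` returns the full body again, so the status code (206 versus 200) would need checking before appending.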

Next come the templates, which need little explanation.

# templates/home.html

{% extends "base.html" %}

{% block content %}
  <div class="jumbotron custom-jum no-mrg">
    <div class="container">
      <div class="row">
        <div class="col-md-12">
          <div class="center">
            <div class="home-search">
              <h1>Douyin Watermark-Free Video Downloader</h1>
              <h2>Download watermark-free Douyin videos as MP4 and MP3</h2>
            </div>
            <div class="form-home-search">
              <form id="form_download" action='https://www.chenshiyang.com/dytk/downloader' method='POST'>
                <div class="input-group col-lg-10 col-md-10 col-sm-10">
                  <input name="url" class="form-control input-md ht58" placeholder="Enter a Douyin video URL ..." type="text"
                    required="" value="">
                  <span class="input-group-btn"><button class="btn btn-primary input-md btn-download ht58" type="submit"
                      id="btn_submit">Download</button></span>
                </div>
              </form>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
  </div>

  {% endblock %}

# templates/download.html

{% extends "base.html" %}

{% block content %}
  <div class="page-content">
  <div class="container">
    <div class="row">
      <div class="col-lg-12 col-centered">
        <div class="ads mrg-bt20 text-center">
          <ins class="adsbygoogle" style="display:inline-block;width:728px;height:90px"
            data-ad-client="ca-pub-2984659695526033" data-ad-slot="5734284394"></ins>

        </div>
        <div class="card">
          <div class="row">
            <div class="col-md-4 col-sm-4">
              <a href="{{mp4_path}}" rel="external nofollow" data-toggle="modal" class="card-aside-column img-video"
                style="height: 252px; background: url(&quot;{{data.cover}}&quot;) 0% 0% / cover;" title="">
                <span class="btn-play-video"><i class="glyphicon glyphicon-play"></i></span>
                <p class="time-video" id="time">{{data.duration}}</p>
              </a>
              <h5>Author: {{data.nickname}}</h5>
              <h5><a href="#" rel="external nofollow" >{{topic}} <i class="open-new-window"></i></a></h5>
              <p class="card-text">{{data.desc}}</p>
            </div>
            <div class="col-md-8 col-sm-8 col-table">
              <table class="table">
                <thead>
                  <tr>
                    <th>format</th>
                    <th>size</th>
                    <th>Downloads</th>
                  </tr>
                </thead>
                <tbody>
                  <tr>

                    <td>mp4</td>
                    <td>{{mp4_content_length}}</td>
                    <td>
                      <a href="{{mp4_path}}" rel="external nofollow" class="btn btn-download"  download="">Download</a>
                    </td>
                  </tr>
                  <tr>

                    <td>mp3</td>
                    <td>{{mp3_content_length}}</td>
                    <td>
                      <a href="{{mp3_path}}" rel="external nofollow"  class="btn btn-download"  download="">Download</a>
                    </td>
                  </tr>

                </tbody>

              </table>
            </div>
          </div>
        </div>

        <div class="card card-qrcode">
          <div class="row">
            <div class="col-md-12 qrcode">
              <div class="text-center">
                <p class="qrcode-p">Scan the QR code below to download directly to your smartphone or tablet!</p>
              </div>
            </div>
            <div class="col-md-4 col-centered qrcode">
              <div id="qrcode" title="{{realpath}}">
                <script src="/static/js/qrcode.min.js"></script>
                <script type="text/javascript">
                  new QRCode(document.getElementById("qrcode"), {
                    text: "{{realpath}}",
                    width: 120,
                    height: 120,
                    correctLevel: QRCode.CorrectLevel.L
                  });
</script>
              </div>
            </div>
          </div>
        </div>
      </div>
    </div>
  </div>
</div>

{% endblock %}

Full source code:

https://github.com/tinysheepyang/python_api
