
Tutorial: Network Requests with Python's urllib Package

Posted in Python on April 19, 2022

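1. Introduction
2. Making a Request
3. Sending Parameters with a Request
4. Reading the Response
5. Setting Headers
6. Using a Proxy
7. Authenticated Login
8. Working with Cookies
9. Exception Handling
10. HTTP Errors
11. Timeouts
12. URL Encoding and Decoding
13. Building Query Strings
14. Parsing a URL
15. Joining URLs
16. Converting a Dictionary to Query Parameters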

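1. Introduction

urllib is a package in Python's standard library for working with URLs and making network requests. It ships with Python, so nothing extra needs to be installed. It contains four modules:
urllib.request: opens and reads URLs; use it to send requests and fetch page content
urllib.error: defines the exceptions raised by urllib.request, so failures can be handled without crashing the program
urllib.parse: parses URLs, including splitting them into components and joining them back together
urllib.robotparser: parses robots.txt files to determine whether a site allows a given URL to be crawled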
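
The rest of this tutorial covers the first three modules, so here is a minimal sketch of the fourth. It feeds a hand-written set of robots.txt rules to RobotFileParser instead of downloading a real file; the rules, the crawler name 'my-crawler', and the example.com URLs are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied inline so no network access is needed
rules = [
    'User-agent: *',
    'Disallow: /private/',
]

parser = RobotFileParser()
parser.parse(rules)

# can_fetch(useragent, url) reports whether the rules allow the URL
allowed = parser.can_fetch('my-crawler', 'http://example.com/index.html')
blocked = parser.can_fetch('my-crawler', 'http://example.com/private/data.html')
print(allowed)  # True
print(blocked)  # False
```

Against a live site you would normally call set_url() with the robots.txt address and then read() instead of parse().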

2. Making a Request

import urllib.request

# Option 1: pass the URL straight to urlopen
resp = urllib.request.urlopen('http://www.baidu.com', timeout=1)
print(resp.read().decode('utf-8'))

# Option 2: build a Request object first, then open it
request = urllib.request.Request('http://www.baidu.com')
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))

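3. Sending Parameters with a Request

Some pages expect data to be sent along with the request.

import urllib.parse
import urllib.request

params = {
    'name': 'autofelix',
    'age': '25'
}

# urlopen requires the request body as bytes, so encode the query string
data = bytes(urllib.parse.urlencode(params), encoding='utf8')
response = urllib.request.urlopen("http://www.baidu.com/", data=data)
print(response.read().decode('utf-8'))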
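
One detail worth knowing: supplying data makes urllib send a POST request rather than a GET. The sketch below shows this by building two Request objects and inspecting them without sending anything. Putting the same parameters in the query string for the GET variant is an assumption about what the server expects.

```python
import urllib.parse
import urllib.request

params = {'name': 'autofelix', 'age': '25'}
encoded = urllib.parse.urlencode(params)

# With a body, the request method defaults to POST
post_req = urllib.request.Request('http://www.baidu.com/', data=encoded.encode('utf-8'))

# Without a body, the parameters travel in the URL and the method stays GET
get_req = urllib.request.Request('http://www.baidu.com/?' + encoded)

print(post_req.get_method())  # POST
print(get_req.get_method())   # GET
print(get_req.full_url)       # http://www.baidu.com/?name=autofelix&age=25
```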

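4. Reading the Response

import urllib.request

resp = urllib.request.urlopen('http://www.baidu.com')
print(type(resp))                # the response class (http.client.HTTPResponse for HTTP URLs)
print(resp.status)               # HTTP status code
print(resp.geturl())             # final URL, after any redirects
print(resp.getcode())            # status code again, via the older accessor
print(resp.info())               # response headers as a message object
print(resp.getheaders())         # headers as a list of (name, value) tuples
print(resp.getheader('Server'))  # value of a single header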
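
The examples in this tutorial decode every body as UTF-8, which fails on pages served in another encoding. A minimal sketch of one alternative is to read the charset from the Content-Type header and fall back to UTF-8 when none is declared. The helper name read_text is hypothetical, and a data: URL stands in for a real page so the snippet runs without network access.

```python
import urllib.request

def read_text(resp, default='utf-8'):
    """Decode the body using the charset declared in Content-Type, if any."""
    charset = resp.headers.get_content_charset() or default
    return resp.read().decode(charset)

# urlopen also accepts data: URLs, which makes for an offline stand-in
with urllib.request.urlopen('data:text/plain;charset=utf-8,hello') as resp:
    text = read_text(resp)

print(text)  # hello
```

The with block closes the response when it finishes, which is a good habit for real HTTP responses too.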

5. Setting Headers

import urllib.request

# Send a browser-like User-Agent instead of urllib's default one
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'
}
request = urllib.request.Request(url="http://tieba.baidu.com/", headers=headers)
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))
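
Headers can also be attached after the Request object is created, using add_header. The sketch below does that and inspects the result without sending anything; the shortened User-Agent string and the Referer value are placeholders chosen for illustration. Note that Request normalizes header names with str.capitalize(), so 'User-Agent' is stored as 'User-agent'.

```python
import urllib.request

request = urllib.request.Request('http://tieba.baidu.com/')
request.add_header('User-Agent', 'Mozilla/5.0')
request.add_header('Referer', 'http://tieba.baidu.com/')

# Lookups must use the normalized name
print(request.get_header('User-agent'))  # Mozilla/5.0
print(request.has_header('Referer'))     # True
print(request.header_items())
```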

6. Using a Proxy

import urllib.request

# Map each URL scheme to the proxy that should handle it
proxys = urllib.request.ProxyHandler({
    'http': 'proxy.cn:8080',
    'https': 'proxy.cn:8080'
})

# install_opener makes this opener the default for every later urlopen call
opener = urllib.request.build_opener(proxys)
urllib.request.install_opener(opener)

request = urllib.request.Request(url="http://www.baidu.com/")
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))

7. Authenticated Login

Some sites require a username and password (HTTP Basic authentication) before they will serve a page.

import urllib.request

url = "http://www.baidu.com/"
user = 'autofelix'
password = '123456'

# Passing None as the realm applies these credentials to any realm at this URL
pwdmgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
pwdmgr.add_password(None, url, user, password)

auth_handler = urllib.request.HTTPBasicAuthHandler(pwdmgr)
opener = urllib.request.build_opener(auth_handler)
response = opener.open(url)
print(response.read().decode('utf-8'))

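8. Working with Cookies

If a page requires authentication on every request, cookies let us stay logged in automatically and skip repeated login checks. The example here captures the cookies a server sets and writes them to a file.

import http.cookiejar
import urllib.request

# Collect the cookies set by the server in a CookieJar
cookie = http.cookiejar.CookieJar()
handler = urllib.request.HTTPCookieProcessor(cookie)
opener = urllib.request.build_opener(handler)
response = opener.open("http://www.baidu.com/")

# Append each cookie's name and value to a text file
with open('cookie.txt', 'a') as f:
    for item in cookie:
        f.write(item.name + " = " + item.value + '\n')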

9. Exception Handling

from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.URLError as e:
    # URLError covers failures such as DNS errors or refused connections
    print(e.reason)

10. HTTP Errors

from urllib import error, request

try:
    resp = request.urlopen('http://www.baidu.com')
except error.HTTPError as e:
    # HTTPError is a subclass of URLError, so it must be caught first
    print(e.reason, e.code, e.headers, sep='\n')
except error.URLError as e:
    print(e.reason)
else:
    print('request succeeded')
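
The order of the except clauses matters because HTTPError inherits from URLError. The sketch below makes that relationship visible with a small helper that classifies an exception; the name describe and the hand-built exception objects are illustrative stand-ins for errors a real request would raise.

```python
import io
from email.message import Message
from urllib import error

def describe(exc):
    """Return a short description, checking the more specific class first."""
    if isinstance(exc, error.HTTPError):
        return f'HTTP {exc.code}: {exc.reason}'
    if isinstance(exc, error.URLError):
        return f'URL error: {exc.reason}'
    return f'unexpected: {exc!r}'

# Hand-built exceptions, so nothing is actually requested
http_exc = error.HTTPError('http://www.baidu.com/missing', 404, 'Not Found',
                           Message(), io.BytesIO())
url_exc = error.URLError('connection refused')

print(issubclass(error.HTTPError, error.URLError))  # True
print(describe(http_exc))  # HTTP 404: Not Found
print(describe(url_exc))   # URL error: connection refused
```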

11. Timeouts

import socket
import urllib.error
import urllib.request

try:
    resp = urllib.request.urlopen('http://www.baidu.com', timeout=0.01)
except urllib.error.URLError as e:
    print(type(e.reason))
    # A connection timeout arrives wrapped in URLError, with socket.timeout as the reason
    if isinstance(e.reason, socket.timeout):
        print('time out')
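
A timeout does not always arrive wrapped in URLError: one that occurs while reading the body can surface as a bare socket.timeout (an alias of the built-in TimeoutError since Python 3.10). A minimal sketch of a check that handles both shapes follows; the helper name is_timeout is hypothetical, and the exceptions are constructed by hand rather than triggered by a real request.

```python
import socket
from urllib import error

def is_timeout(exc):
    """True for a bare socket.timeout or a URLError whose reason is one."""
    if isinstance(exc, socket.timeout):
        return True
    return isinstance(exc, error.URLError) and isinstance(exc.reason, socket.timeout)

wrapped = error.URLError(socket.timeout('timed out'))
bare = socket.timeout('timed out')
refused = error.URLError('connection refused')

print(is_timeout(wrapped))  # True
print(is_timeout(bare))     # True
print(is_timeout(refused))  # False
```

In a real handler, catching OSError covers both cases, since URLError and socket.timeout are both subclasses of it.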

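12. URL Encoding and Decoding

from urllib import parse

# Percent-encode non-ASCII text so it is safe to place in a URL
name = parse.quote('飞兔小哥')
print(name)

# Decode it back to the original text
print(parse.unquote(name))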
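
quote has a few behaviors that are easy to trip over: it leaves '/' unescaped by default, and it encodes a space as %20 rather than '+'. The sketch below compares it with quote_plus, which is intended for query-string values. The sample string 'a b/c' is an arbitrary choice.

```python
from urllib.parse import quote, quote_plus, unquote

encoded = quote('飞兔小哥')
round_trip = unquote(encoded)
print(round_trip)  # 飞兔小哥

print(quote('a b/c'))           # a%20b/c   ('/' is kept by default)
print(quote('a b/c', safe=''))  # a%20b%2Fc (nothing is treated as safe)
print(quote_plus('a b/c'))      # a+b%2Fc   (space becomes '+')
```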

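13. Building Query Strings

URLs often need to carry many query parameters, and assembling them by hand with string concatenation is tedious and error-prone. urlencode builds the query string from a dictionary and percent-encodes the values for us.

from urllib import parse

params = {'name': '飞兔', 'age': '27', 'height': '178'}
print(parse.urlencode(params))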
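
To see that nothing is lost in the encoding, the query string can be parsed back with parse_qs. The sketch below also shows the doseq flag for keys that carry several values; the 'tag' key and its values are made up for illustration.

```python
from urllib.parse import urlencode, parse_qs

params = {'name': '飞兔', 'age': '27', 'height': '178'}
query = urlencode(params)
print(query)  # name=%E9%A3%9E%E5%85%94&age=27&height=178

# parse_qs reverses the encoding; every value comes back as a list
decoded = parse_qs(query)
print(decoded)  # {'name': ['飞兔'], 'age': ['27'], 'height': ['178']}

# doseq=True expands a sequence into one key=value pair per element
multi = urlencode({'tag': ['python', 'urllib']}, doseq=True)
print(multi)  # tag=python&tag=urllib
```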

14. Parsing a URL

from urllib.parse import urlparse

result = urlparse('http://www.baidu.com/index.html?user=autofelix')
print(type(result))
print(result)
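
The value returned by urlparse is a named tuple, so each component can be read by name. A short sketch using the same URL, with parse_qs added to turn the query component into a dictionary:

```python
from urllib.parse import urlparse, parse_qs

result = urlparse('http://www.baidu.com/index.html?user=autofelix')

print(result.scheme)  # http
print(result.netloc)  # www.baidu.com
print(result.path)    # /index.html
print(result.query)   # user=autofelix

# The query string itself can be parsed one step further
query_args = parse_qs(result.query)
print(query_args)     # {'user': ['autofelix']}
```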

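15. Joining URLs

urljoin resolves its second argument against a base URL.
If the second argument is itself an absolute URL, it is returned unchanged.
If it is a relative path or a query string, it is combined with the base URL.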

from urllib.parse import urljoin

print(urljoin('http://www.baidu.com', 'index.html'))
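
The different cases are easier to see side by side. In this sketch the base URL has a path so the effect of each kind of second argument is visible; the example.com address and the file names are placeholders.

```python
from urllib.parse import urljoin

base = 'http://www.baidu.com/a/b.html'

absolute = urljoin(base, 'https://example.com/x')  # absolute URL replaces the base
sibling = urljoin(base, 'c.html')                  # relative path replaces the last segment
rooted = urljoin(base, '/c.html')                  # leading slash resolves from the site root
with_query = urljoin(base, '?user=autofelix')      # query string is appended to the base path

print(absolute)    # https://example.com/x
print(sibling)     # http://www.baidu.com/a/c.html
print(rooted)      # http://www.baidu.com/c.html
print(with_query)  # http://www.baidu.com/a/b.html?user=autofelix
```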

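16. Converting a Dictionary to Query Parameters

from urllib.parse import urlencode

params = {
    'name': 'autofelix',
    'age': 27
}

# The base URL already ends with '?', so the encoded parameters can be appended directly
baseUrl = 'http://www.baidu.com?'
print(baseUrl + urlencode(params))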
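
Plain concatenation works when the base URL ends in '?', but it produces a malformed URL if the base already carries a query string. One possible way around that, sketched here under the assumption that existing parameters should be kept, is to split the URL, merge the parameters, and reassemble it. The helper name with_query and the 'wd=python' parameter are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit, parse_qsl

def with_query(url, params):
    """Return url with params appended to whatever query it already has."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.extend(params.items())
    return urlunsplit(parts._replace(query=urlencode(pairs)))

full_url = with_query('http://www.baidu.com/s?wd=python',
                      {'name': 'autofelix', 'age': 27})
print(full_url)  # http://www.baidu.com/s?wd=python&name=autofelix&age=27
```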

That concludes this tutorial on network requests with Python's urllib package.


- Author -

autofelix

