Python: Sending HTTP Requests Through a Specific Network Interface


Posted in Python on August 21, 2019

Requirement: a machine has several network interfaces. How do we make requests to a given URL go out through a specific interface?

$ curl --interface eth0 www.baidu.com # curl's --interface option selects the network interface

Reading the source of urllib.py and following the call chain leads to open_http -> httplib.HTTP -> httplib.HTTP._connection_class = HTTPConnection

HTTPConnection accepts a source_address when it is created.

When HTTPConnection.connect runs, it calls HTTPConnection._create_connection, which is socket.create_connection
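The same source_address parameter still exists in Python 3, where httplib became http.client. The following is only a minimal sketch under that assumption: open_bound_connection and fetch are made-up helper names, not part of any library, and the local IP you pass in has to be one that is actually assigned to an interface on your machine.

```python
import http.client


def open_bound_connection(host, local_ip, port=80, timeout=10):
    """Build an HTTPConnection whose socket will be bound to local_ip.

    Hypothetical helper. Nothing is sent yet: the bind happens when the
    connection is actually opened by the first request.
    """
    # Port 0 asks the OS to pick a free ephemeral source port.
    return http.client.HTTPConnection(
        host, port, timeout=timeout, source_address=(local_ip, 0)
    )


def fetch(host, local_ip, path="/", port=80):
    """GET path from host, sending from local_ip. Returns (status, body)."""
    conn = open_bound_connection(host, local_ip, port)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Calling fetch("ip.haschek.at", "192.168.20.2") would be the rough equivalent of the curl command above, with the interface identified by its IP address rather than its name.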

# First, look at the local network interfaces
$ ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
  options=3<RXCSUM,TXCSUM>
  inet6 ::1 prefixlen 128 
  inet 127.0.0.1 netmask 0xff000000 
  inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 
  nd6 options=1<PERFORMNUD>
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
  ether c8:e0:eb:17:3a:73 
  inet6 fe80::cae0:ebff:fe17:3a73%en0 prefixlen 64 scopeid 0x4 
  inet 192.168.20.2 netmask 0xffffff00 broadcast 192.168.20.255
  nd6 options=1<PERFORMNUD>
  media: autoselect
  status: active
en1: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
  options=4<VLAN_MTU>
  ether 0c:5b:8f:27:9a:64 
  inet6 fe80::e5b:8fff:fe27:9a64%en8 prefixlen 64 scopeid 0xa 
  inet 192.168.8.100 netmask 0xffffff00 broadcast 192.168.8.255
  nd6 options=1<PERFORMNUD>
  media: autoselect (100baseTX <full-duplex>)
  status: active

Both en0 and en1 can reach the public internet; lo0 is the local loopback interface.

To test, modify socket.py directly.

def create_connection(address, timeout=_GLOBAL_DEFAULT_TIMEOUT,
           source_address=None):
  """If *source_address* is set it must be a tuple of (host, port)
  for the socket to bind as a source address before making the connection.
  An host of '' or port 0 tells the OS to use the default.
  If source_address is set, it must be passed as a tuple (host, port); the default is ("", 0)
  """

  host, port = address
  err = None
  for res in getaddrinfo(host, port, 0, SOCK_STREAM):
    af, socktype, proto, canonname, sa = res
    sock = None
    try:
      sock = socket(af, socktype, proto)
      # sock.bind(("192.168.20.2", 0)) # en0
      # sock.bind(("192.168.8.100", 0)) # en1
      # sock.bind(("127.0.0.1", 0)) # lo0
      if timeout is not _GLOBAL_DEFAULT_TIMEOUT:
        sock.settimeout(timeout)
      if source_address:
        print "socket bind source_address: %s" % source_address
        sock.bind(source_address)
      sock.connect(sa)
      return sock

    except error as _:
      err = _
      if sock is not None:
        sock.close()
  if err is not None:
    raise err
  else:
    raise error("getaddrinfo returns an empty list")

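Following the docstring, bind the socket to the IP address of each interface in turn (three separate runs, enabling one sock.bind line each time), with the port set to 0.

# Test en0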
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
.148.245.16

# Test en1
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
.94.115.227

# Test lo0
$ python -c 'import urllib as u;print u.urlopen("http://ip.haschek.at").read()'
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
  return opener.open(url)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
  return getattr(self, name)(url)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
  h.endheaders(data)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
  self._send_output(message_body)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
  self.send(msg)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
  self.connect()
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
  self.timeout, self.source_address)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 578, in create_connection
  raise err
IOError: [Errno socket error] [Errno 49] Can't assign requested address

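The test passes: on a machine with several network interfaces, binding the socket to the IP of the chosen interface when it is created is enough, and the port must be set to 0. (Binding to lo0 fails, as expected, because the loopback address cannot reach a public host.) If the port is not 0, the second request raises an exception because the port is already in use.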
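The difference between port 0 and a fixed port can be reproduced with plain sockets on the loopback address, without any HTTP involved. This is a Python 3 sketch with made-up helper names (bind_ephemeral, try_bind, demo). It triggers the conflict with a second socket while the first is still open, which is the same class of failure as the traceback below rather than an exact replay of it.

```python
import errno
import socket


def bind_ephemeral(local_ip):
    """Bind a TCP socket to local_ip with port 0 and return it (still open)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((local_ip, 0))  # port 0: the OS picks a free port
    return sock


def try_bind(local_ip, port):
    """Try to bind a new socket to a fixed port.

    Returns None if the bind succeeded, otherwise the errno of the failure.
    The socket is closed again in both cases.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((local_ip, port))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


def demo(local_ip="127.0.0.1"):
    """Return (port chosen by the OS, errno from reusing that port)."""
    first = bind_ephemeral(local_ip)
    try:
        port = first.getsockname()[1]
        # A second socket asking for the very same port is refused
        # while the first one still holds it.
        return port, try_bind(local_ip, port)
    finally:
        first.close()
```

With port 0 every new connection gets its own source port, so repeated requests never compete for the same one.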

Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 87, in urlopen
  return opener.open(url)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 213, in open
  return getattr(self, name)(url)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 350, in open_http
  h.endheaders(data)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
  self._send_output(message_body)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 893, in _send_output
  self.send(msg)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 855, in send
  self.connect()
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 832, in connect
  self.timeout, self.source_address)
 File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 577, in create_connection
  raise err
IOError: [Errno socket error] [Errno 48] Address already in use

In a real project, it is enough to set the source_address parameter of socket.create_connection to (IP, 0) for the chosen interface.

# test-interface_urllib.py
import socket
import urllib, urllib2

_create_socket = socket.create_connection

SOURCE_ADDRESS = ("127.0.0.1", 0)
#SOURCE_ADDRESS = ("172.28.153.121", 0)
#SOURCE_ADDRESS = ("172.16.30.41", 0)

def create_connection(*args, **kwargs):
  in_args = False
  if len(args) >=3:
    args = list(args)
    args[2] = SOURCE_ADDRESS
    args = tuple(args)
    in_args = True
  if not in_args:
    kwargs["source_address"] = SOURCE_ADDRESS
  print "args", args
  print "kwargs", str(kwargs)
  return _create_socket(*args, **kwargs)

socket.create_connection = create_connection

print urllib.urlopen("http://ip.haschek.at").read()

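Testing shows that data is now sent through the specified interface, and the reported IP address corresponds to the IP assigned to that interface.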
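The test script above is Python 2. A rough Python 3 equivalent of the same monkeypatch is sketched below, wrapped in a context manager so the original function is restored afterwards. The names force_source_address and bound_to are invented for this illustration. It relies on the Python 3 signature socket.create_connection(address, timeout, source_address, ...), where source_address is the second argument after address.

```python
import contextlib
import socket


def force_source_address(original, source_address):
    """Wrap a create_connection-style callable so it always binds to source_address."""
    def wrapper(address, *args, **kwargs):
        if len(args) >= 2:
            # source_address was passed positionally: overwrite it.
            args = args[:1] + (source_address,) + args[2:]
        else:
            # Otherwise set (or override) it as a keyword argument.
            kwargs["source_address"] = source_address
        return original(address, *args, **kwargs)
    return wrapper


@contextlib.contextmanager
def bound_to(source_address):
    """Temporarily patch socket.create_connection; always restore the original."""
    original = socket.create_connection
    socket.create_connection = force_source_address(original, source_address)
    try:
        yield
    finally:
        socket.create_connection = original
```

Code such as urllib.request.urlopen("http://ip.haschek.at") placed inside a "with bound_to(("192.168.20.2", 0)):" block should then connect from that address, provided the library looks up socket.create_connection when it creates its connection, as http.client does.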

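Next question: crawlers often use requests. Does requests support this as well? Testing shows that requests does not go through the standard library's socket.create_connection.

So how does requests create its socket connections? Read the source, using the same approach as for urllib; I won't write out the details here.

Since I am on Python 2.7, the trail leads to the create_connection function in urllib3.util.connection.

The patch is much the same as for urllib.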

import urllib3.connection
_create_socket = urllib3.connection.connection.create_connection
# pass

urllib3.connection.connection.create_connection = create_connection
# pass

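After running this, an exception may be raised: requests.exceptions.ConnectionError: Max retries exceeded with .. Invalid argument

This exception does not occur every time. It depends on the IP range and appears to be caused by too many levels of redirect recursion; removing socket_options from kwargs is enough to fix it. With 127.0.0.1 the exception always occurs.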

import socket
import urllib
import urllib2
import urllib3.connection

import requests as req

_default_create_socket = socket.create_connection
_urllib3_create_socket = urllib3.connection.connection.create_connection


SOURCE_ADDRESS = ("127.0.0.1", 0)
#SOURCE_ADDRESS = ("172.28.153.121", 0)
#SOURCE_ADDRESS = ("172.16.30.41", 0)

def default_create_connection(*args, **kwargs):
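  # socket_options is a urllib3-only keyword that socket.create_connection
  # does not accept, so drop it before delegating
  kwargs.pop("socket_options", None)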
  in_args = False
  if len(args) >=3:
    args = list(args)
    args[2] = SOURCE_ADDRESS
    args = tuple(args)
    in_args = True
  if not in_args:
    kwargs["source_address"] = SOURCE_ADDRESS
  print "args", args
  print "kwargs", str(kwargs)
  return _default_create_socket(*args, **kwargs)

def urllib3_create_connection(*args, **kwargs):
  in_args = False
  if len(args) >=3:
    args = list(args)
    args[2] = SOURCE_ADDRESS
    in_args = True
    args = tuple(args)
  if not in_args:
    kwargs["source_address"] = SOURCE_ADDRESS
  print "args", args
  print "kwargs", str(kwargs)
  return _urllib3_create_socket(*args, **kwargs)

socket.create_connection = default_create_connection
# The urllib3 version occasionally misbehaves, so use the wrapper around the default socket.create_connection instead
# urllib3.connection.connection.create_connection = urllib3_create_connection
urllib3.connection.connection.create_connection = default_create_connection

print " *** test requests: " + req.get("http://ip.haschek.at").content
print " *** test urllib: " + urllib.urlopen("http://ip.haschek.at").read()
print " *** test urllib2: " + urllib2.urlopen("http://ip.haschek.at").read()

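Note: patching through urllib3.utils.connection did not seem to work (the module's actual name is urllib3.util.connection, without the trailing "s").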
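For requests specifically, a less invasive option than patching module-level functions is a transport adapter that passes source_address down to urllib3's connection pools. The sketch below follows the pattern used by SourceAddressAdapter in the requests_toolbelt package. It is written for Python 3 and a reasonably recent requests, and session_bound_to is a made-up helper name. I have not run it against the Python 2.7 setup used in this post.

```python
import requests
from requests.adapters import HTTPAdapter


class SourceAddressAdapter(HTTPAdapter):
    """Transport adapter that binds every outgoing connection to one local address."""

    def __init__(self, source_address, **kwargs):
        # Must be set before super().__init__(), which calls init_poolmanager().
        self.source_address = source_address
        super().__init__(**kwargs)

    def init_poolmanager(self, connections, maxsize, block=False, **pool_kwargs):
        # urllib3 forwards unknown pool keywords to its connections,
        # and its connections accept source_address.
        pool_kwargs["source_address"] = self.source_address
        super().init_poolmanager(connections, maxsize, block=block, **pool_kwargs)


def session_bound_to(local_ip):
    """Return a requests.Session whose HTTP and HTTPS traffic leaves from local_ip."""
    session = requests.Session()
    adapter = SourceAddressAdapter((local_ip, 0))  # port 0: OS picks the port
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Only sessions created this way are affected, so other code in the same process keeps its default routing. Requests sent through a proxy are not covered by this sketch.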

As a small refinement, look up the IP automatically from the interface name.

import random
import subprocess

def get_all_net_devices():
  # List the interface names exposed by sysfs (Linux only)
  sub = subprocess.Popen("ls /sys/class/net", shell=True, stdout=subprocess.PIPE)
  sub.wait()
  net_devices = sub.stdout.read().strip().splitlines()
  # e.g. ['eth0', 'eth1', 'lo']
  # Simple filter on the interface name; adjust to your needs
  net_devices = [i for i in net_devices if "ppp" in i]
  return net_devices
ALL_DEVICES = get_all_net_devices()

def get_local_ip(device_name):
  # Query the named interface and print the address found on its "inet " line
  sub = subprocess.Popen("/sbin/ifconfig %s | grep 'inet ' | awk '{print $2}'" % device_name, shell=True, stdout=subprocess.PIPE)
  sub.wait()
  ip = sub.stdout.read().strip()
  return ip

def random_local_ip():
  return get_local_ip(random.choice(ALL_DEVICES))

# code ...

Then simply replace SOURCE_ADDRESS in args[2] = SOURCE_ADDRESS and kwargs["source_address"] = SOURCE_ADDRESS with (random_local_ip(), 0) or (get_local_ip("eth0"), 0). source_address must remain an (IP, port) tuple.
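The grep and awk pipeline can also be replaced by parsing the ifconfig output in Python, which makes the lookup easy to test against captured output. This is a sketch in Python 3. The function names parse_ifconfig and source_address_for are invented here, it assumes an ifconfig binary on PATH, and it only handles the two common "inet" line formats.

```python
import re
import subprocess

# Interface header lines start in column 0, e.g. "en0: flags=..." or "eth0  Link encap:..."
_HEADER = re.compile(r"^([^\s:]+)")
# Matches "inet 192.168.20.2" and the older "inet addr:192.168.20.2", but not "inet6 ..."
_INET = re.compile(r"\binet (?:addr:)?(\d{1,3}(?:\.\d{1,3}){3})")


def parse_ifconfig(text):
    """Map each interface name in ifconfig-style output to its first IPv4 address."""
    addresses = {}
    current = None
    for line in text.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new interface block.
            header = _HEADER.match(line)
            current = header.group(1) if header else None
        inet = _INET.search(line)
        if current and inet and current not in addresses:
            addresses[current] = inet.group(1)
    return addresses


def source_address_for(device_name, ifconfig_output=None):
    """Return (ip, 0) for device_name, ready to be used as source_address."""
    if ifconfig_output is None:
        # Assumption: "ifconfig" with no arguments lists the interfaces.
        ifconfig_output = subprocess.run(
            ["ifconfig"], capture_output=True, text=True, check=True
        ).stdout
    addresses = parse_ifconfig(ifconfig_output)
    if device_name not in addresses:
        raise LookupError("no IPv4 address found for %r" % device_name)
    return (addresses[device_name], 0)
```

Because the result is already an (IP, 0) tuple, source_address_for("en0") can be used directly wherever SOURCE_ADDRESS appears in the scripts above.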

As for what this is good for, that is left to your imagination.
