用Python编写一个高效的端口扫描器的方法


Posted in Python onDecember 20, 2018

PyPortScanner

python多线程端口扫描器。

输出示例:

用Python编写一个高效的端口扫描器的方法

Github

此端口扫描器的源码,文档及详细调用方法见Github PythonPortScanner by Yaokai。

背景

有时候,在进行网络相关的研究的时候,我们需要执行一些有目的的参数测量。而端口扫描就是其中比较普遍也比较重要的一项。所谓的端口扫描,就是指通过TCP握手或者别的方式来判别一个给定主机上的某些端口是否处理开放,或者说监听的状态。现有的使用比较广泛的端口扫描工具是nmap。毋庸置疑,nmap是一款非常强大且易于使用的软件。但nmap是一款运行于terminal中的软件,有时在别的代码中调用并不是很方便,甚至没有相应的库。另外,nmap依赖的其他库较多,在较老的系统中可能无法使用较新的nmap,这样会造成扫描的不便。另外,nmap在扫描时需要root权限。基于这个原因,我用python2.7自带的库开发了一款高效的多线程端口扫描器来满足使用需要。

具体实现

I. 利用TCP握手连接扫描一个给定的(ip,port)地址对

为了实现端口扫描,我们首先明白如何使用python socket与给定的(ip, port)进行TCP握手。为了完成TCP握手,我们需要先初始化一个TCP socket。在python中新建一个TCP socket的代码如下:

TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) #(1)
TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1) #(2)
TCP_sock.settimeout(delay) #(3)

其中(1)是初始化socket的代码,socket.AF_INTE参数表示IPv4 socketsocket.SOCK_STREAM参数表示TCP socket。这样我们就初始化了一个使用IPv4,TCP协议的socket。

(2)使用了socket.setsockopt()来设置socket的另一些参数。socket.SOL_SOCKET指定当前socket将使用setsockopt()中后面的参数。socket.SO_REUSEPORT表明当前socket使用了可复用端口的设置。socket.SO_REUSEPORT具体含义可以参考我的另一篇文章。

(3)将socket的连接超时时间设置为delay变量所对应的时间(以秒为单位)。这么做是为了防止我们在一个连接上等待太久。
了解了如何新建一个socket,我们就可以开始对给定的(ip,port)对进行TCP连接。代码如下:

try:
  result = TCP_sock.connect_ex((ip, int(port_number)))

  # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
  if result == 0:
    output[port_number] = 'OPEN'
  else:
    output[port_number] = 'CLOSE'

    TCP_sock.close()

except socket.error as e:
  output[port_number] = 'CLOSE'
  pass

因为这是一个I/O操作,为了处理可能出现的异常,我们需要在try,except块处理这部分操作。其次,我们根据socket.connect_ex()方法连接目标地址,通过该方法返回的状态代码来判断连接是否成功。该方法返回0代表连接成功。所以当返回值为0的时候将当前端口记录为打开状态。反之记录为关闭。另外,当连接操作出现异常的时候,我们也将端口记录为关闭状态,因为其并不能被成功连接(可能因为防火墙或者数据包被过滤等原因)。

需要注意的是,在连接完成后我们一定要调用socket.close()方法来关闭与远程端口之间的TCP连接。否则的话我们的扫描操作可能会引起所谓的TCP连接悬挂问题(Hanging TCP connection)。

总结起来,TCP握手扫描的整体代码如下:

"""
Perform status checking for a given port on a given ip address using TCP handshake

Keyword arguments:
ip -- the ip address that is being scanned
port_number -- the port that is going to be checked
delay -- the time in seconds that a TCP socket waits until timeout
output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
"""
def __TCP_connect(ip, port_number, delay, output):
  # Initilize the TCP socket object
  TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
  TCP_sock.settimeout(delay)

  try:
    result = TCP_sock.connect_ex((ip, int(port_number)))

    # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
    if result == 0:
      output[port_number] = 'OPEN'
    else:
      output[port_number] = 'CLOSE'

    TCP_sock.close()

  except socket.error as e:

    output[port_number] = 'CLOSE'
    pass

II. 多线程扫描端口

单线程扫描虽然逻辑简单,但无疑是及其低效的。因为在扫描过程中要进行大量的数据包的发送和接受,所以这是一个I/O密集型的操作。如果只是用单线程进行扫描的话,程序会在等待回复的过程中浪费大量的时间。因此多线程的操作是很有必要的。这里,一个很自然的思路就是为每一个端口单独开一个线程进行扫描。

在这里我们将需要扫描的端口列表定为从Nmap中得到的前1000个使用频率最高的端口:

__port_list = [1,3,6,9,13,17,19,20,21,22,23,24,25,30,32,37,42,49,53,70,79,80,81,82,83,84,88,89,99,106,109,110,113,119,125,135,139,143,146,161,163,179,199,211,222,254,255,259,264,280,301,306,311,340,366,389,406,416,425,427,443,444,458,464,481,497,500,512,513,514,524,541,543,544,548,554,563,...]

对于一个给定的ip地址,扫描的过程是这样的:

1. 取出一个端口
2. 新建一条线程,利用__TCP_connect()函数对该(ip,port)进行连接操作。
3. 调用thread.start()thread.join()方法,使扫描的子线程开始工作并且命令主线程等待子线程死亡后再结束。
4. 重复这个过程直到所有的端口都被扫描过。

根据以上思路,多线程扫描的代码如下:

"""
Open multiple threads to perform port scanning

Keyword arguments:
ip -- the ip address that is being scanned
delay -- the time in seconds that a TCP socket waits until timeout
output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
"""
def __scan_ports_helper(ip, delay, output):

  '''
  Multithreading port scanning
  '''

  port_index = 0

  while port_index < len(__port_list):

    # Ensure that the number of cocurrently running threads does not exceed the thread limit
    while threading.activeCount() < __thread_limit and port_index < len(__port_list):

      # Start threads
      thread = threading.Thread(target = __TCP_connect, args = (ip, __port_list[port_index], delay, output))
      thread.start()
      # lock the thread until all threads complete
      thread.join()
      port_index = port_index + 1

其中__thread_limit参数是用来限制线程数目的。output是一个字典,以(port: status)的形式保存了扫描的结果。
thread.join()保证了主线程只有在所有子线程都结束之后才会继续执行,从而确保了我们一定会扫描全部的端口。

III. 多线程扫描多个网站

在多线程扫描端口的同时,如果我们能够多线程扫描多个网站,那么扫描的效率还将进一步提高。为了达到这个目的,我们需要另一个线程去管理一个网站对应的对其端口进行扫描的所有子线程。

除此之外,在这种情况下,我们必须删去__scan_ports_helper()中的thread.join()。否则主线程就会被端口扫描子线程阻塞,我们也就无法多线程扫描多个网站了。

在不使用join()的情况下,我们如何确保一个网站的扫描线程只有在完成对其全部端口的扫描之后才会返回呢?这里我使用的方法是检测output字典的长度。因为在全部扫描完成后,output的长度一定与__port_list的长度一致。

改变后的代码如下:

def __scan_ports_helper(ip, delay, output):

  '''
  Multithreading port scanning
  '''

  port_index = 0

  while port_index < len(__port_list):

    # Ensure that the number of cocurrently running threads does not exceed the thread limit
    while threading.activeCount() < __thread_limit and port_index < len(__port_list):

      # Start threads
      thread = threading.Thread(target = __TCP_connect, args = (ip, __port_list[port_index], delay, output))
      thread.start()
      port_index = port_index + 1

  while (len(output) < len(self.target_ports)):
    continue

根据以上扫描线程的代码,端口扫描的管理线程的代码如下所示:

"""
Controller of the __scan_ports_helper() function

Keyword arguments:
ip -- the ip address that is being scanned
delay -- the time in seconds that a TCP socket waits until timeout
"""    

def __scan_ports(websites, output_ip, delay):

  scan_result = {}

  for website in websites:
    website = str(website)
    scan_result[website] = {}

    thread = threading.Thread(target = __scan_ports_helper, args = (ip, delay, scan_result[website]))
    thread.start()
    # lock the script until all threads complete
    thread.join()

  return scan_result

至此,我们就完成了一个多线程端口扫描器的全部代码。

IV. 总结!利用这些代码扫描给定网站并输出结果

处于输出方便的考虑,我并没有使用多线程扫描多个网站,同时对每个网站多线程扫描多个端口的方法。在这个例子中只进行了多线程扫描端口,但同时只扫描一个网站的操作。整合起来的代码如下:

import sys
import subprocess
import socket
import threading
import time

class PortScanner:

  # default ports to be scanned
  # or put any ports you want to scan here!
  __port_list = [1,3,6,9,13,17,19,20,21,22,23,24,25,30,32,37,42,49,53,70,79,80,81,82,83,84,88,89,99,106,109,110,113,119,125,135,139,143,146,161,163,179,199,211,222,254,255,259,264,280,301,306,311,340,366,389,406,416,425,427,443,444,458,464,481,497,500,512,513,514,524,541,543,544,548,554,563]
  # default thread number limit
  __thread_limit = 1000
  # default connection timeout time inseconds
  __delay = 10


  """
  Constructor of a PortScanner object

  Keyword arguments:
  target_ports -- the list of ports that is going to be scanned (default self.__port_list)
  """
  def __init__(self, target_ports = None):
    # If target ports not given in the arguments, use default ports
    # If target ports is given in the arguments, use given port lists
    if target_ports is None:
      self.target_ports = self.__port_list
    else:
      self.target_ports = target_ports


  """
  Return the usage information for invalid input host name. 
  """
  def __usage(self):
    print('python Port Scanner v0.1')
    print('please make sure the input host name is in the form of "something.com" or "http://something.com!"\n')


  """
  This is the function need to be called to perform port scanning

  Keyword arguments:
  host_name -- the hostname that is going to be scanned
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def scan(self, host_name, message = ''):

    if 'http://' in host_name or 'https://' in host_name:
      host_name = host_name[host_name.find('://') + 3 : ]

    print('*' * 60 + '\n')
    print('start scanning website: ' + str(host_name))

    try:
      server_ip = socket.gethostbyname(str(host_name))
      print('server ip is: ' + str(server_ip))

    except socket.error as e:
      # If the DNS resolution of a website cannot be finished, abort that website.

      #print(e)
      print('hostname %s unknown!!!' % host_name)

      self.__usage()

      return {}

      # May need to return specificed values to the DB in the future

    start_time = time.time()
    output = self.__scan_ports(server_ip, self.__delay, message)
    stop_time = time.time()

    print('host %s scanned in %f seconds' %(host_name, stop_time - start_time))

    print('finish scanning!\n')

    return output


  """
  Set the maximum number of thread for port scanning

  Keyword argument:
  num -- the maximum number of thread running concurrently (default 1000)
  """
  def set_thread_limit(self, num):
    num = int(num)

    if num <= 0 or num > 50000:

      print('Warning: Invalid thread number limit! Please make sure the thread limit is within the range of (1, 50,000)!')
      print('The scanning process will use default thread limit!')

      return

    self.__thread_limit = num


  """
  Set the time out delay for port scanning in seconds

  Keyword argument:
  delay -- the time in seconds that a TCP socket waits until timeout (default 10)
  """
  def set_delay(self, delay):

    delay = int(delay)
    if delay <= 0 or delay > 100:

      print('Warning: Invalid delay value! Please make sure the input delay is within the range of (1, 100)')
      print('The scanning process will use the default delay time')

      return 

    self.__delay = delay


  """
  Print out the list of ports being scanned
  """
  def show_target_ports(self):
    print ('Current port list is:')
    print (self.target_ports)


  """
  Print out the delay in seconds that a TCP socket waits until timeout
  """
  def show_delay(self):
    print ('Current timeout delay is :%d' %(int(self.__delay)))


  """
  Open multiple threads to perform port scanning

  Keyword arguments:
  ip -- the ip address that is being scanned
  delay -- the time in seconds that a TCP socket waits until timeout
  output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def __scan_ports_helper(self, ip, delay, output, message):

    '''
    Multithreading port scanning
    '''

    port_index = 0

    while port_index < len(self.target_ports):

      # Ensure that the number of cocurrently running threads does not exceed the thread limit
      while threading.activeCount() < self.__thread_limit and port_index < len(self.target_ports):

        # Start threads
        thread = threading.Thread(target = self.__TCP_connect, args = (ip, self.target_ports[port_index], delay, output, message))
        thread.start()
        port_index = port_index + 1


  """
  Controller of the __scan_ports_helper() function

  Keyword arguments:
  ip -- the ip address that is being scanned
  delay -- the time in seconds that a TCP socket waits until timeout
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """    
  def __scan_ports(self, ip, delay, message):

    output = {}

    thread = threading.Thread(target = self.__scan_ports_helper, args = (ip, delay, output, message))
    thread.start()

    # Wait until all port scanning threads finished
    while (len(output) < len(self.target_ports)):
      continue

    # Print openning ports from small to large
    for port in self.target_ports:
      if output[port] == 'OPEN':
        print(str(port) + ': ' + output[port] + '\n')

    return output



  """
  Perform status checking for a given port on a given ip address using TCP handshake

  Keyword arguments:
  ip -- the ip address that is being scanned
  port_number -- the port that is going to be checked
  delay -- the time in seconds that a TCP socket waits until timeout
  output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def __TCP_connect(self, ip, port_number, delay, output, message):
    # Initilize the TCP socket object
    TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    TCP_sock.settimeout(delay)


    # Initilize a UDP socket to send scanning alert message if there exists an non-empty message
    if message != '':
      UDP_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      UDP_sock.sendto(str(message), (ip, int(port_number)))

    try:
      result = TCP_sock.connect_ex((ip, int(port_number)))
      if message != '':
        TCP_sock.sendall(str(message))

      # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
      if result == 0:
        output[port_number] = 'OPEN'
      else:
        output[port_number] = 'CLOSE'

      TCP_sock.close()

    except socket.error as e:

      output[port_number] = 'CLOSE'
      pass

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持三水点靠木。

Python 相关文章推荐
linux环境下安装pyramid和新建项目的步骤
Nov 27 Python
python调用java的Webservice示例
Mar 10 Python
在Python3中初学者应会的一些基本的提升效率的小技巧
Mar 31 Python
在Python中操作字典之clear()方法的使用
May 21 Python
Python scikit-learn 做线性回归的示例代码
Nov 01 Python
Python 批量合并多个txt文件的实例讲解
May 08 Python
解决python3 安装完Pycurl在import pycurl时报错的问题
Oct 15 Python
python Matplotlib底图中鼠标滑过显示隐藏内容的实例代码
Jul 31 Python
python实现翻译word表格小程序
Feb 27 Python
pycharm下配置pyqt5的教程(anaconda虚拟环境下+tensorflow)
Mar 25 Python
k-means 聚类算法与Python实现代码
Jun 01 Python
python使用ctypes库调用DLL动态链接库
Oct 22 Python
python re正则匹配网页中图片url地址的方法
Dec 20 #Python
python使用pdfminer解析pdf文件的方法示例
Dec 20 #Python
python爬取指定微信公众号文章
Dec 20 #Python
在Django中URL正则表达式匹配的方法
Dec 20 #Python
python采集微信公众号文章
Dec 20 #Python
Linux下Pycharm、Anaconda环境配置及使用踩坑
Dec 19 #Python
python爬虫之urllib,伪装,超时设置,异常处理的方法
Dec 19 #Python
You might like
一个取得文件扩展名的函数
2006/10/09 PHP
PHP调试函数和日志记录函数分享
2015/01/31 PHP
PHP框架自动加载类文件原理详解
2017/06/06 PHP
用JavaScript玩转游戏物理(一)运动学模拟与粒子系统
2010/06/19 Javascript
jquery load()在firefox(火狐)下显示不正常的解决方法
2011/04/05 Javascript
JS判断页面加载状态以及添加遮罩和缓冲动画的代码
2012/10/11 Javascript
JS中令人发指的valueOf方法介绍
2013/02/22 Javascript
jQuery输入城市查看地图使用介绍
2013/05/08 Javascript
div模拟选择框示例代码
2013/11/03 Javascript
JS的事件绑定深入认识
2014/06/26 Javascript
js检查是否关闭浏览器的方法
2016/08/02 Javascript
如何实现星星评价(jquery.raty.js插件)
2016/12/21 Javascript
VsCode插件整理(小结)
2017/09/14 Javascript
基于JSONP原理解析(推荐)
2017/12/04 Javascript
vue全局组件与局部组件使用方法详解
2018/03/29 Javascript
微信小程序自定义可滑动日历界面
2018/12/28 Javascript
jQuery pager.js 插件动态分页功能实例分析
2019/08/02 jQuery
解决layer 关闭当前弹窗 关闭遮罩层 input值获取不到的问题
2019/09/25 Javascript
从表单校验看JavaScript策略模式的使用详解
2020/10/17 Javascript
Python  连接字符串(join %)
2008/09/06 Python
python获得linux下所有挂载点(mount points)的方法
2015/04/29 Python
Python enumerate索引迭代代码解析
2018/01/19 Python
Tensorflow的可视化工具Tensorboard的初步使用详解
2018/02/11 Python
python实现各种插值法(数值分析)
2019/07/30 Python
Python3实现配置文件差异对比脚本
2019/11/18 Python
Jupyter Notebook 实现正常显示中文和负号
2020/04/24 Python
详解修改Anaconda中的Jupyter Notebook默认工作路径的三种方式
2021/01/24 Python
深入浅析HTML5中的SVG
2015/11/27 HTML / CSS
阿里健康官方海外旗舰店:阿里健康国际自营
2017/11/24 全球购物
数控机械专业个人的自我评价
2014/01/02 职场文书
商超业务员岗位职责
2014/03/12 职场文书
HR求职自荐信范文
2014/06/21 职场文书
酒店周年庆活动方案
2014/08/21 职场文书
临时租车协议范本
2014/09/23 职场文书
父亲婚礼答谢词
2015/01/04 职场文书
2015年度学校应急管理工作总结
2015/10/22 职场文书