用Python编写一个高效的端口扫描器的方法


Posted in Python onDecember 20, 2018

PyPortScanner

python多线程端口扫描器。

输出示例:

用Python编写一个高效的端口扫描器的方法

Github

此端口扫描器的源码,文档及详细调用方法见Github PythonPortScanner by Yaokai。

背景

有时候,在进行网络相关的研究的时候,我们需要执行一些有目的的参数测量。而端口扫描就是其中比较普遍也比较重要的一项。所谓的端口扫描,就是指通过TCP握手或者别的方式来判别一个给定主机上的某些端口是否处理开放,或者说监听的状态。现有的使用比较广泛的端口扫描工具是nmap。毋庸置疑,nmap是一款非常强大且易于使用的软件。但nmap是一款运行于terminal中的软件,有时在别的代码中调用并不是很方便,甚至没有相应的库。另外,nmap依赖的其他库较多,在较老的系统中可能无法使用较新的nmap,这样会造成扫描的不便。另外,nmap在扫描时需要root权限。基于这个原因,我用python2.7自带的库开发了一款高效的多线程端口扫描器来满足使用需要。

具体实现

I. 利用TCP握手连接扫描一个给定的(ip,port)地址对

为了实现端口扫描,我们首先明白如何使用python socket与给定的(ip, port)进行TCP握手。为了完成TCP握手,我们需要先初始化一个TCP socket。在python中新建一个TCP socket的代码如下:

TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) #(1)
TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1) #(2)
TCP_sock.settimeout(delay) #(3)

其中(1)是初始化socket的代码,socket.AF_INTE参数表示IPv4 socketsocket.SOCK_STREAM参数表示TCP socket。这样我们就初始化了一个使用IPv4,TCP协议的socket。

(2)使用了socket.setsockopt()来设置socket的另一些参数。socket.SOL_SOCKET指定当前socket将使用setsockopt()中后面的参数。socket.SO_REUSEPORT表明当前socket使用了可复用端口的设置。socket.SO_REUSEPORT具体含义可以参考我的另一篇文章。

(3)将socket的连接超时时间设置为delay变量所对应的时间(以秒为单位)。这么做是为了防止我们在一个连接上等待太久。
了解了如何新建一个socket,我们就可以开始对给定的(ip,port)对进行TCP连接。代码如下:

try:
  result = TCP_sock.connect_ex((ip, int(port_number)))

  # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
  if result == 0:
    output[port_number] = 'OPEN'
  else:
    output[port_number] = 'CLOSE'

    TCP_sock.close()

except socket.error as e:
  output[port_number] = 'CLOSE'
  pass

因为这是一个I/O操作,为了处理可能出现的异常,我们需要在try,except块处理这部分操作。其次,我们根据socket.connect_ex()方法连接目标地址,通过该方法返回的状态代码来判断连接是否成功。该方法返回0代表连接成功。所以当返回值为0的时候将当前端口记录为打开状态。反之记录为关闭。另外,当连接操作出现异常的时候,我们也将端口记录为关闭状态,因为其并不能被成功连接(可能因为防火墙或者数据包被过滤等原因)。

需要注意的是,在连接完成后我们一定要调用socket.close()方法来关闭与远程端口之间的TCP连接。否则的话我们的扫描操作可能会引起所谓的TCP连接悬挂问题(Hanging TCP connection)。

总结起来,TCP握手扫描的整体代码如下:

"""
Perform status checking for a given port on a given ip address using TCP handshake

Keyword arguments:
ip -- the ip address that is being scanned
port_number -- the port that is going to be checked
delay -- the time in seconds that a TCP socket waits until timeout
output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
"""
def __TCP_connect(ip, port_number, delay, output):
  # Initilize the TCP socket object
  TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
  TCP_sock.settimeout(delay)

  try:
    result = TCP_sock.connect_ex((ip, int(port_number)))

    # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
    if result == 0:
      output[port_number] = 'OPEN'
    else:
      output[port_number] = 'CLOSE'

    TCP_sock.close()

  except socket.error as e:

    output[port_number] = 'CLOSE'
    pass

II. 多线程扫描端口

单线程扫描虽然逻辑简单,但无疑是及其低效的。因为在扫描过程中要进行大量的数据包的发送和接受,所以这是一个I/O密集型的操作。如果只是用单线程进行扫描的话,程序会在等待回复的过程中浪费大量的时间。因此多线程的操作是很有必要的。这里,一个很自然的思路就是为每一个端口单独开一个线程进行扫描。

在这里我们将需要扫描的端口列表定为从Nmap中得到的前1000个使用频率最高的端口:

__port_list = [1,3,6,9,13,17,19,20,21,22,23,24,25,30,32,37,42,49,53,70,79,80,81,82,83,84,88,89,99,106,109,110,113,119,125,135,139,143,146,161,163,179,199,211,222,254,255,259,264,280,301,306,311,340,366,389,406,416,425,427,443,444,458,464,481,497,500,512,513,514,524,541,543,544,548,554,563,...]

对于一个给定的ip地址,扫描的过程是这样的:

1. 取出一个端口
2. 新建一条线程,利用__TCP_connect()函数对该(ip,port)进行连接操作。
3. 调用thread.start()thread.join()方法,使扫描的子线程开始工作并且命令主线程等待子线程死亡后再结束。
4. 重复这个过程直到所有的端口都被扫描过。

根据以上思路,多线程扫描的代码如下:

"""
Open multiple threads to perform port scanning

Keyword arguments:
ip -- the ip address that is being scanned
delay -- the time in seconds that a TCP socket waits until timeout
output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
"""
def __scan_ports_helper(ip, delay, output):

  '''
  Multithreading port scanning
  '''

  port_index = 0

  while port_index < len(__port_list):

    # Ensure that the number of cocurrently running threads does not exceed the thread limit
    while threading.activeCount() < __thread_limit and port_index < len(__port_list):

      # Start threads
      thread = threading.Thread(target = __TCP_connect, args = (ip, __port_list[port_index], delay, output))
      thread.start()
      # lock the thread until all threads complete
      thread.join()
      port_index = port_index + 1

其中__thread_limit参数是用来限制线程数目的。output是一个字典,以(port: status)的形式保存了扫描的结果。
thread.join()保证了主线程只有在所有子线程都结束之后才会继续执行,从而确保了我们一定会扫描全部的端口。

III. 多线程扫描多个网站

在多线程扫描端口的同时,如果我们能够多线程扫描多个网站,那么扫描的效率还将进一步提高。为了达到这个目的,我们需要另一个线程去管理一个网站对应的对其端口进行扫描的所有子线程。

除此之外,在这种情况下,我们必须删去__scan_ports_helper()中的thread.join()。否则主线程就会被端口扫描子线程阻塞,我们也就无法多线程扫描多个网站了。

在不使用join()的情况下,我们如何确保一个网站的扫描线程只有在完成对其全部端口的扫描之后才会返回呢?这里我使用的方法是检测output字典的长度。因为在全部扫描完成后,output的长度一定与__port_list的长度一致。

改变后的代码如下:

def __scan_ports_helper(ip, delay, output):

  '''
  Multithreading port scanning
  '''

  port_index = 0

  while port_index < len(__port_list):

    # Ensure that the number of cocurrently running threads does not exceed the thread limit
    while threading.activeCount() < __thread_limit and port_index < len(__port_list):

      # Start threads
      thread = threading.Thread(target = __TCP_connect, args = (ip, __port_list[port_index], delay, output))
      thread.start()
      port_index = port_index + 1

  while (len(output) < len(self.target_ports)):
    continue

根据以上扫描线程的代码,端口扫描的管理线程的代码如下所示:

"""
Controller of the __scan_ports_helper() function

Keyword arguments:
ip -- the ip address that is being scanned
delay -- the time in seconds that a TCP socket waits until timeout
"""    

def __scan_ports(websites, output_ip, delay):

  scan_result = {}

  for website in websites:
    website = str(website)
    scan_result[website] = {}

    thread = threading.Thread(target = __scan_ports_helper, args = (ip, delay, scan_result[website]))
    thread.start()
    # lock the script until all threads complete
    thread.join()

  return scan_result

至此,我们就完成了一个多线程端口扫描器的全部代码。

IV. 总结!利用这些代码扫描给定网站并输出结果

处于输出方便的考虑,我并没有使用多线程扫描多个网站,同时对每个网站多线程扫描多个端口的方法。在这个例子中只进行了多线程扫描端口,但同时只扫描一个网站的操作。整合起来的代码如下:

import sys
import subprocess
import socket
import threading
import time

class PortScanner:

  # default ports to be scanned
  # or put any ports you want to scan here!
  __port_list = [1,3,6,9,13,17,19,20,21,22,23,24,25,30,32,37,42,49,53,70,79,80,81,82,83,84,88,89,99,106,109,110,113,119,125,135,139,143,146,161,163,179,199,211,222,254,255,259,264,280,301,306,311,340,366,389,406,416,425,427,443,444,458,464,481,497,500,512,513,514,524,541,543,544,548,554,563]
  # default thread number limit
  __thread_limit = 1000
  # default connection timeout time inseconds
  __delay = 10


  """
  Constructor of a PortScanner object

  Keyword arguments:
  target_ports -- the list of ports that is going to be scanned (default self.__port_list)
  """
  def __init__(self, target_ports = None):
    # If target ports not given in the arguments, use default ports
    # If target ports is given in the arguments, use given port lists
    if target_ports is None:
      self.target_ports = self.__port_list
    else:
      self.target_ports = target_ports


  """
  Return the usage information for invalid input host name. 
  """
  def __usage(self):
    print('python Port Scanner v0.1')
    print('please make sure the input host name is in the form of "something.com" or "http://something.com!"\n')


  """
  This is the function need to be called to perform port scanning

  Keyword arguments:
  host_name -- the hostname that is going to be scanned
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def scan(self, host_name, message = ''):

    if 'http://' in host_name or 'https://' in host_name:
      host_name = host_name[host_name.find('://') + 3 : ]

    print('*' * 60 + '\n')
    print('start scanning website: ' + str(host_name))

    try:
      server_ip = socket.gethostbyname(str(host_name))
      print('server ip is: ' + str(server_ip))

    except socket.error as e:
      # If the DNS resolution of a website cannot be finished, abort that website.

      #print(e)
      print('hostname %s unknown!!!' % host_name)

      self.__usage()

      return {}

      # May need to return specificed values to the DB in the future

    start_time = time.time()
    output = self.__scan_ports(server_ip, self.__delay, message)
    stop_time = time.time()

    print('host %s scanned in %f seconds' %(host_name, stop_time - start_time))

    print('finish scanning!\n')

    return output


  """
  Set the maximum number of thread for port scanning

  Keyword argument:
  num -- the maximum number of thread running concurrently (default 1000)
  """
  def set_thread_limit(self, num):
    num = int(num)

    if num <= 0 or num > 50000:

      print('Warning: Invalid thread number limit! Please make sure the thread limit is within the range of (1, 50,000)!')
      print('The scanning process will use default thread limit!')

      return

    self.__thread_limit = num


  """
  Set the time out delay for port scanning in seconds

  Keyword argument:
  delay -- the time in seconds that a TCP socket waits until timeout (default 10)
  """
  def set_delay(self, delay):

    delay = int(delay)
    if delay <= 0 or delay > 100:

      print('Warning: Invalid delay value! Please make sure the input delay is within the range of (1, 100)')
      print('The scanning process will use the default delay time')

      return 

    self.__delay = delay


  """
  Print out the list of ports being scanned
  """
  def show_target_ports(self):
    print ('Current port list is:')
    print (self.target_ports)


  """
  Print out the delay in seconds that a TCP socket waits until timeout
  """
  def show_delay(self):
    print ('Current timeout delay is :%d' %(int(self.__delay)))


  """
  Open multiple threads to perform port scanning

  Keyword arguments:
  ip -- the ip address that is being scanned
  delay -- the time in seconds that a TCP socket waits until timeout
  output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def __scan_ports_helper(self, ip, delay, output, message):

    '''
    Multithreading port scanning
    '''

    port_index = 0

    while port_index < len(self.target_ports):

      # Ensure that the number of cocurrently running threads does not exceed the thread limit
      while threading.activeCount() < self.__thread_limit and port_index < len(self.target_ports):

        # Start threads
        thread = threading.Thread(target = self.__TCP_connect, args = (ip, self.target_ports[port_index], delay, output, message))
        thread.start()
        port_index = port_index + 1


  """
  Controller of the __scan_ports_helper() function

  Keyword arguments:
  ip -- the ip address that is being scanned
  delay -- the time in seconds that a TCP socket waits until timeout
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """    
  def __scan_ports(self, ip, delay, message):

    output = {}

    thread = threading.Thread(target = self.__scan_ports_helper, args = (ip, delay, output, message))
    thread.start()

    # Wait until all port scanning threads finished
    while (len(output) < len(self.target_ports)):
      continue

    # Print openning ports from small to large
    for port in self.target_ports:
      if output[port] == 'OPEN':
        print(str(port) + ': ' + output[port] + '\n')

    return output



  """
  Perform status checking for a given port on a given ip address using TCP handshake

  Keyword arguments:
  ip -- the ip address that is being scanned
  port_number -- the port that is going to be checked
  delay -- the time in seconds that a TCP socket waits until timeout
  output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def __TCP_connect(self, ip, port_number, delay, output, message):
    # Initilize the TCP socket object
    TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    TCP_sock.settimeout(delay)


    # Initilize a UDP socket to send scanning alert message if there exists an non-empty message
    if message != '':
      UDP_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      UDP_sock.sendto(str(message), (ip, int(port_number)))

    try:
      result = TCP_sock.connect_ex((ip, int(port_number)))
      if message != '':
        TCP_sock.sendall(str(message))

      # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
      if result == 0:
        output[port_number] = 'OPEN'
      else:
        output[port_number] = 'CLOSE'

      TCP_sock.close()

    except socket.error as e:

      output[port_number] = 'CLOSE'
      pass

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持三水点靠木。

Python 相关文章推荐
Windows下python2.7.8安装图文教程
May 26 Python
使用Python监控文件内容变化代码实例
Jun 04 Python
Python操作rabbitMQ的示例代码
Mar 19 Python
Ubuntu18.04下python版本完美切换的解决方法
Jun 14 Python
对python特殊函数 __call__()的使用详解
Jul 02 Python
python实现LBP方法提取图像纹理特征实现分类的步骤
Jul 11 Python
python2.7实现复制大量文件及文件夹资料
Aug 31 Python
Python读取YAML文件过程详解
Dec 30 Python
如何使用selenium和requests组合实现登录页面
Feb 03 Python
Python响应对象text属性乱码解决方案
Mar 31 Python
浅谈python 中的 type(), dtype(), astype()的区别
Apr 09 Python
python spilt()分隔字符串的实现示例
May 21 Python
python re正则匹配网页中图片url地址的方法
Dec 20 #Python
python使用pdfminer解析pdf文件的方法示例
Dec 20 #Python
python爬取指定微信公众号文章
Dec 20 #Python
在Django中URL正则表达式匹配的方法
Dec 20 #Python
python采集微信公众号文章
Dec 20 #Python
Linux下Pycharm、Anaconda环境配置及使用踩坑
Dec 19 #Python
python爬虫之urllib,伪装,超时设置,异常处理的方法
Dec 19 #Python
You might like
全国FM电台频率大全 - 28 甘肃省
2020/03/11 无线电
用php获取远程图片并把它保存到本地的代码
2008/04/07 PHP
域名和cookie问题(域名后缀)
2012/10/10 PHP
深入分析使用mysql_fetch_object()以对象的形式返回查询结果
2013/06/05 PHP
smarty自定义函数用法示例
2016/05/20 PHP
PHP从零开始打造自己的MVC框架之路由类实现方法分析
2019/06/03 PHP
JTrackBar水平拖动效果
2007/07/15 Javascript
固定表格行列(expression)在IE下适用
2013/07/25 Javascript
CSS+jQuery实现的一个放大缩小动画效果
2013/09/24 Javascript
一个网页标题title的闪动提示效果实现思路
2014/03/22 Javascript
js+ajax实现获取文件大小的方法
2015/12/08 Javascript
jquery实现文字单行横移或翻转(上下、左右跳转)
2017/01/08 Javascript
微信小程序 点击控件后选中其它反选实例详解
2017/02/21 Javascript
详解用webpack2搭建angular2的项目
2017/06/22 Javascript
jQuery实现仿京东防抖动菜单效果示例
2018/07/06 jQuery
基于jQuery ztree实现表格风格的树状结构
2018/08/31 jQuery
详解vue+axios给开发环境和生产环境配置不同的接口地址
2019/08/16 Javascript
JS实现压缩上传图片base64长度功能
2019/12/03 Javascript
Python Socketserver实现FTP文件上传下载代码实例
2020/03/27 Python
opencv-python的RGB与BGR互转方式
2020/06/02 Python
pytorch判断是否cuda 判断变量类型方式
2020/06/23 Python
CSS3 实现发光边框特效
2020/11/11 HTML / CSS
找到不普通的东西:Bonanza
2016/10/20 全球购物
New Balance澳大利亚官网:运动鞋和健身服装
2019/02/23 全球购物
编码转换,怎样实现将GB2312编码的字符串转换为ISO-8859-1编码的字符串
2014/01/07 面试题
MYSQL相比于其他数据库有哪些特点
2013/07/19 面试题
求职推荐信
2013/10/28 职场文书
拾金不昧的表扬信
2014/01/16 职场文书
县优秀教师事迹材料
2014/01/31 职场文书
法定代表人授权委托书
2014/04/04 职场文书
毕业论文评语大全
2014/04/29 职场文书
投资建议书模板
2014/05/12 职场文书
班级元旦晚会开幕词
2015/01/29 职场文书
辞职信的写法
2015/02/27 职场文书
幼儿园安全管理制度
2015/08/05 职场文书
新学期家长寄语2016
2015/12/03 职场文书