用Python编写一个高效的端口扫描器的方法


Posted in Python onDecember 20, 2018

PyPortScanner

python多线程端口扫描器。

输出示例:

用Python编写一个高效的端口扫描器的方法

Github

此端口扫描器的源码,文档及详细调用方法见Github PythonPortScanner by Yaokai。

背景

有时候,在进行网络相关的研究的时候,我们需要执行一些有目的的参数测量。而端口扫描就是其中比较普遍也比较重要的一项。所谓的端口扫描,就是指通过TCP握手或者别的方式来判别一个给定主机上的某些端口是否处理开放,或者说监听的状态。现有的使用比较广泛的端口扫描工具是nmap。毋庸置疑,nmap是一款非常强大且易于使用的软件。但nmap是一款运行于terminal中的软件,有时在别的代码中调用并不是很方便,甚至没有相应的库。另外,nmap依赖的其他库较多,在较老的系统中可能无法使用较新的nmap,这样会造成扫描的不便。另外,nmap在扫描时需要root权限。基于这个原因,我用python2.7自带的库开发了一款高效的多线程端口扫描器来满足使用需要。

具体实现

I. 利用TCP握手连接扫描一个给定的(ip,port)地址对

为了实现端口扫描,我们首先明白如何使用python socket与给定的(ip, port)进行TCP握手。为了完成TCP握手,我们需要先初始化一个TCP socket。在python中新建一个TCP socket的代码如下:

TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) #(1)
TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1) #(2)
TCP_sock.settimeout(delay) #(3)

其中(1)是初始化socket的代码,socket.AF_INTE参数表示IPv4 socketsocket.SOCK_STREAM参数表示TCP socket。这样我们就初始化了一个使用IPv4,TCP协议的socket。

(2)使用了socket.setsockopt()来设置socket的另一些参数。socket.SOL_SOCKET指定当前socket将使用setsockopt()中后面的参数。socket.SO_REUSEPORT表明当前socket使用了可复用端口的设置。socket.SO_REUSEPORT具体含义可以参考我的另一篇文章。

(3)将socket的连接超时时间设置为delay变量所对应的时间(以秒为单位)。这么做是为了防止我们在一个连接上等待太久。
了解了如何新建一个socket,我们就可以开始对给定的(ip,port)对进行TCP连接。代码如下:

try:
  result = TCP_sock.connect_ex((ip, int(port_number)))

  # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
  if result == 0:
    output[port_number] = 'OPEN'
  else:
    output[port_number] = 'CLOSE'

    TCP_sock.close()

except socket.error as e:
  output[port_number] = 'CLOSE'
  pass

因为这是一个I/O操作,为了处理可能出现的异常,我们需要在try,except块处理这部分操作。其次,我们根据socket.connect_ex()方法连接目标地址,通过该方法返回的状态代码来判断连接是否成功。该方法返回0代表连接成功。所以当返回值为0的时候将当前端口记录为打开状态。反之记录为关闭。另外,当连接操作出现异常的时候,我们也将端口记录为关闭状态,因为其并不能被成功连接(可能因为防火墙或者数据包被过滤等原因)。

需要注意的是,在连接完成后我们一定要调用socket.close()方法来关闭与远程端口之间的TCP连接。否则的话我们的扫描操作可能会引起所谓的TCP连接悬挂问题(Hanging TCP connection)。

总结起来,TCP握手扫描的整体代码如下:

"""
Perform status checking for a given port on a given ip address using TCP handshake

Keyword arguments:
ip -- the ip address that is being scanned
port_number -- the port that is going to be checked
delay -- the time in seconds that a TCP socket waits until timeout
output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
"""
def __TCP_connect(ip, port_number, delay, output):
  # Initilize the TCP socket object
  TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
  TCP_sock.settimeout(delay)

  try:
    result = TCP_sock.connect_ex((ip, int(port_number)))

    # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
    if result == 0:
      output[port_number] = 'OPEN'
    else:
      output[port_number] = 'CLOSE'

    TCP_sock.close()

  except socket.error as e:

    output[port_number] = 'CLOSE'
    pass

II. 多线程扫描端口

单线程扫描虽然逻辑简单,但无疑是及其低效的。因为在扫描过程中要进行大量的数据包的发送和接受,所以这是一个I/O密集型的操作。如果只是用单线程进行扫描的话,程序会在等待回复的过程中浪费大量的时间。因此多线程的操作是很有必要的。这里,一个很自然的思路就是为每一个端口单独开一个线程进行扫描。

在这里我们将需要扫描的端口列表定为从Nmap中得到的前1000个使用频率最高的端口:

__port_list = [1,3,6,9,13,17,19,20,21,22,23,24,25,30,32,37,42,49,53,70,79,80,81,82,83,84,88,89,99,106,109,110,113,119,125,135,139,143,146,161,163,179,199,211,222,254,255,259,264,280,301,306,311,340,366,389,406,416,425,427,443,444,458,464,481,497,500,512,513,514,524,541,543,544,548,554,563,...]

对于一个给定的ip地址,扫描的过程是这样的:

1. 取出一个端口
2. 新建一条线程,利用__TCP_connect()函数对该(ip,port)进行连接操作。
3. 调用thread.start()thread.join()方法,使扫描的子线程开始工作并且命令主线程等待子线程死亡后再结束。
4. 重复这个过程直到所有的端口都被扫描过。

根据以上思路,多线程扫描的代码如下:

"""
Open multiple threads to perform port scanning

Keyword arguments:
ip -- the ip address that is being scanned
delay -- the time in seconds that a TCP socket waits until timeout
output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
"""
def __scan_ports_helper(ip, delay, output):

  '''
  Multithreading port scanning
  '''

  port_index = 0

  while port_index < len(__port_list):

    # Ensure that the number of cocurrently running threads does not exceed the thread limit
    while threading.activeCount() < __thread_limit and port_index < len(__port_list):

      # Start threads
      thread = threading.Thread(target = __TCP_connect, args = (ip, __port_list[port_index], delay, output))
      thread.start()
      # lock the thread until all threads complete
      thread.join()
      port_index = port_index + 1

其中__thread_limit参数是用来限制线程数目的。output是一个字典,以(port: status)的形式保存了扫描的结果。
thread.join()保证了主线程只有在所有子线程都结束之后才会继续执行,从而确保了我们一定会扫描全部的端口。

III. 多线程扫描多个网站

在多线程扫描端口的同时,如果我们能够多线程扫描多个网站,那么扫描的效率还将进一步提高。为了达到这个目的,我们需要另一个线程去管理一个网站对应的对其端口进行扫描的所有子线程。

除此之外,在这种情况下,我们必须删去__scan_ports_helper()中的thread.join()。否则主线程就会被端口扫描子线程阻塞,我们也就无法多线程扫描多个网站了。

在不使用join()的情况下,我们如何确保一个网站的扫描线程只有在完成对其全部端口的扫描之后才会返回呢?这里我使用的方法是检测output字典的长度。因为在全部扫描完成后,output的长度一定与__port_list的长度一致。

改变后的代码如下:

def __scan_ports_helper(ip, delay, output):

  '''
  Multithreading port scanning
  '''

  port_index = 0

  while port_index < len(__port_list):

    # Ensure that the number of cocurrently running threads does not exceed the thread limit
    while threading.activeCount() < __thread_limit and port_index < len(__port_list):

      # Start threads
      thread = threading.Thread(target = __TCP_connect, args = (ip, __port_list[port_index], delay, output))
      thread.start()
      port_index = port_index + 1

  while (len(output) < len(self.target_ports)):
    continue

根据以上扫描线程的代码,端口扫描的管理线程的代码如下所示:

"""
Controller of the __scan_ports_helper() function

Keyword arguments:
ip -- the ip address that is being scanned
delay -- the time in seconds that a TCP socket waits until timeout
"""    

def __scan_ports(websites, output_ip, delay):

  scan_result = {}

  for website in websites:
    website = str(website)
    scan_result[website] = {}

    thread = threading.Thread(target = __scan_ports_helper, args = (ip, delay, scan_result[website]))
    thread.start()
    # lock the script until all threads complete
    thread.join()

  return scan_result

至此,我们就完成了一个多线程端口扫描器的全部代码。

IV. 总结!利用这些代码扫描给定网站并输出结果

处于输出方便的考虑,我并没有使用多线程扫描多个网站,同时对每个网站多线程扫描多个端口的方法。在这个例子中只进行了多线程扫描端口,但同时只扫描一个网站的操作。整合起来的代码如下:

import sys
import subprocess
import socket
import threading
import time

class PortScanner:

  # default ports to be scanned
  # or put any ports you want to scan here!
  __port_list = [1,3,6,9,13,17,19,20,21,22,23,24,25,30,32,37,42,49,53,70,79,80,81,82,83,84,88,89,99,106,109,110,113,119,125,135,139,143,146,161,163,179,199,211,222,254,255,259,264,280,301,306,311,340,366,389,406,416,425,427,443,444,458,464,481,497,500,512,513,514,524,541,543,544,548,554,563]
  # default thread number limit
  __thread_limit = 1000
  # default connection timeout time inseconds
  __delay = 10


  """
  Constructor of a PortScanner object

  Keyword arguments:
  target_ports -- the list of ports that is going to be scanned (default self.__port_list)
  """
  def __init__(self, target_ports = None):
    # If target ports not given in the arguments, use default ports
    # If target ports is given in the arguments, use given port lists
    if target_ports is None:
      self.target_ports = self.__port_list
    else:
      self.target_ports = target_ports


  """
  Return the usage information for invalid input host name. 
  """
  def __usage(self):
    print('python Port Scanner v0.1')
    print('please make sure the input host name is in the form of "something.com" or "http://something.com!"\n')


  """
  This is the function need to be called to perform port scanning

  Keyword arguments:
  host_name -- the hostname that is going to be scanned
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def scan(self, host_name, message = ''):

    if 'http://' in host_name or 'https://' in host_name:
      host_name = host_name[host_name.find('://') + 3 : ]

    print('*' * 60 + '\n')
    print('start scanning website: ' + str(host_name))

    try:
      server_ip = socket.gethostbyname(str(host_name))
      print('server ip is: ' + str(server_ip))

    except socket.error as e:
      # If the DNS resolution of a website cannot be finished, abort that website.

      #print(e)
      print('hostname %s unknown!!!' % host_name)

      self.__usage()

      return {}

      # May need to return specificed values to the DB in the future

    start_time = time.time()
    output = self.__scan_ports(server_ip, self.__delay, message)
    stop_time = time.time()

    print('host %s scanned in %f seconds' %(host_name, stop_time - start_time))

    print('finish scanning!\n')

    return output


  """
  Set the maximum number of thread for port scanning

  Keyword argument:
  num -- the maximum number of thread running concurrently (default 1000)
  """
  def set_thread_limit(self, num):
    num = int(num)

    if num <= 0 or num > 50000:

      print('Warning: Invalid thread number limit! Please make sure the thread limit is within the range of (1, 50,000)!')
      print('The scanning process will use default thread limit!')

      return

    self.__thread_limit = num


  """
  Set the time out delay for port scanning in seconds

  Keyword argument:
  delay -- the time in seconds that a TCP socket waits until timeout (default 10)
  """
  def set_delay(self, delay):

    delay = int(delay)
    if delay <= 0 or delay > 100:

      print('Warning: Invalid delay value! Please make sure the input delay is within the range of (1, 100)')
      print('The scanning process will use the default delay time')

      return 

    self.__delay = delay


  """
  Print out the list of ports being scanned
  """
  def show_target_ports(self):
    print ('Current port list is:')
    print (self.target_ports)


  """
  Print out the delay in seconds that a TCP socket waits until timeout
  """
  def show_delay(self):
    print ('Current timeout delay is :%d' %(int(self.__delay)))


  """
  Open multiple threads to perform port scanning

  Keyword arguments:
  ip -- the ip address that is being scanned
  delay -- the time in seconds that a TCP socket waits until timeout
  output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def __scan_ports_helper(self, ip, delay, output, message):

    '''
    Multithreading port scanning
    '''

    port_index = 0

    while port_index < len(self.target_ports):

      # Ensure that the number of cocurrently running threads does not exceed the thread limit
      while threading.activeCount() < self.__thread_limit and port_index < len(self.target_ports):

        # Start threads
        thread = threading.Thread(target = self.__TCP_connect, args = (ip, self.target_ports[port_index], delay, output, message))
        thread.start()
        port_index = port_index + 1


  """
  Controller of the __scan_ports_helper() function

  Keyword arguments:
  ip -- the ip address that is being scanned
  delay -- the time in seconds that a TCP socket waits until timeout
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """    
  def __scan_ports(self, ip, delay, message):

    output = {}

    thread = threading.Thread(target = self.__scan_ports_helper, args = (ip, delay, output, message))
    thread.start()

    # Wait until all port scanning threads finished
    while (len(output) < len(self.target_ports)):
      continue

    # Print openning ports from small to large
    for port in self.target_ports:
      if output[port] == 'OPEN':
        print(str(port) + ': ' + output[port] + '\n')

    return output



  """
  Perform status checking for a given port on a given ip address using TCP handshake

  Keyword arguments:
  ip -- the ip address that is being scanned
  port_number -- the port that is going to be checked
  delay -- the time in seconds that a TCP socket waits until timeout
  output -- a dict() that stores result pairs in {port, status} style (status = 'OPEN' or 'CLOSE')
  message -- the message that is going to be included in the scanning packets, in order to prevent
    ethical problem (default: '')
  """
  def __TCP_connect(self, ip, port_number, delay, output, message):
    # Initilize the TCP socket object
    TCP_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    TCP_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    TCP_sock.settimeout(delay)


    # Initilize a UDP socket to send scanning alert message if there exists an non-empty message
    if message != '':
      UDP_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      UDP_sock.sendto(str(message), (ip, int(port_number)))

    try:
      result = TCP_sock.connect_ex((ip, int(port_number)))
      if message != '':
        TCP_sock.sendall(str(message))

      # If the TCP handshake is successful, the port is OPEN. Otherwise it is CLOSE
      if result == 0:
        output[port_number] = 'OPEN'
      else:
        output[port_number] = 'CLOSE'

      TCP_sock.close()

    except socket.error as e:

      output[port_number] = 'CLOSE'
      pass

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持三水点靠木。

Python 相关文章推荐
简单文件操作python 修改文件指定行的方法
May 15 Python
Python标准库之Sys模块使用详解
May 23 Python
常用python编程模板汇总
Feb 12 Python
浅谈python中的实例方法、类方法和静态方法
Feb 17 Python
python logging模块的使用总结
Jul 09 Python
Django框架模板用法入门教程
Nov 04 Python
python argparser的具体使用
Nov 10 Python
浅析python表达式4+0.5值的数据类型
Feb 26 Python
Django 实现对已存在的model进行更改
Mar 28 Python
详解python with 上下文管理器
Sep 02 Python
Pandas直接读取sql脚本的方法
Jan 21 Python
Pytorch之扩充tensor的操作
Mar 04 Python
python re正则匹配网页中图片url地址的方法
Dec 20 #Python
python使用pdfminer解析pdf文件的方法示例
Dec 20 #Python
python爬取指定微信公众号文章
Dec 20 #Python
在Django中URL正则表达式匹配的方法
Dec 20 #Python
python采集微信公众号文章
Dec 20 #Python
Linux下Pycharm、Anaconda环境配置及使用踩坑
Dec 19 #Python
python爬虫之urllib,伪装,超时设置,异常处理的方法
Dec 19 #Python
You might like
ThinkPHP模板引擎之导入资源文件方法详解
2014/06/18 PHP
ThinkPHP视图查询详解
2014/06/30 PHP
PHP全功能无变形图片裁剪操作类与用法示例
2017/01/10 PHP
JavaScript的目的分析
2007/01/05 Javascript
JQuery中SetTimeOut传参问题探讨
2013/05/10 Javascript
jquery ui dialog实现弹窗特效的思路及代码
2013/08/03 Javascript
你不需要jQuery(三) 新AJAX方法fetch()
2016/06/14 Javascript
关于动态生成dom绑定事件失效的原因及解决方法
2016/08/06 Javascript
自动化测试读写64位操作系统的注册表
2016/08/15 Javascript
最好用的Bootstrap fileinput.js文件上传组件
2016/12/12 Javascript
angularjs实现下拉列表的选中事件示例
2017/03/03 Javascript
vue.js利用defineProperty实现数据的双向绑定
2017/04/28 Javascript
Angular.JS中指令ng-if的注意事项小结
2017/06/21 Javascript
js原生方法被覆盖,从新赋值原生的方法
2018/01/02 Javascript
node前端开发模板引擎Jade的入门
2018/05/11 Javascript
vue实现引入本地json的方法分析
2018/07/12 Javascript
关于vue3默认把所有onSomething当作v-on事件绑定的思考
2020/05/15 Javascript
JavaScript实现浏览器网页自动滚动并点击的示例代码
2020/12/05 Javascript
绘制微信小程序验证码功能的实例代码
2021/01/05 Javascript
JavaScript canvas实现雨滴特效
2021/01/10 Javascript
pycharm 使用心得(九)解决No Python interpreter selected的问题
2014/06/06 Python
使用wxPython获取系统剪贴板中的数据的教程
2015/05/06 Python
Python中分支语句与循环语句实例详解
2018/09/13 Python
python实现爬虫抓取小说功能示例【抓取金庸小说】
2019/08/09 Python
python3多线程知识点总结
2019/09/26 Python
html5的画布canvas——画出弧线、旋转的图形实例代码+效果图
2013/06/09 HTML / CSS
丝芙兰波兰:Sephora.pl
2018/03/25 全球购物
String和StringBuffer的区别
2015/08/13 面试题
中专毕业个人的自荐信格式
2013/09/21 职场文书
国际政治个人自荐信范文
2013/11/26 职场文书
捐助倡议书范文
2014/04/15 职场文书
流动人口婚育证明范本
2014/09/26 职场文书
党的群众路线教育实践活动个人批评与自我批评
2014/10/16 职场文书
工程部岗位职责
2015/02/10 职场文书
蔬果开业典礼发言稿应该怎么写?
2019/09/03 职场文书
特别篇动画《总之就是非常可爱 ~制服~》PV公开,2022年夏季播出
2022/04/04 日漫