
Setting Up an Elasticsearch Environment for Python: A Detailed Guide

Posted in Python on September 02, 2019

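On Windows, download the zip archive.

On Linux, download the tar archive.

Download page: https://www.elastic.co/downloads/elasticsearch

After extracting, run: bin/elasticsearch (or bin\elasticsearch.bat on Windows)

To check that the node started successfully, open http://localhost:9200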
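The same check can be scripted. The following is a minimal sketch in Python 3, assuming the node listens on the default http://localhost:9200 without authentication. The helper names parse_root_response and check_es are made up for illustration. The root endpoint answers with a JSON document that includes the cluster name and a version object.

```python
import json
from urllib.request import urlopen


def parse_root_response(body):
    """Extract (cluster_name, version_number) from the JSON returned by GET /."""
    info = json.loads(body)
    return info.get("cluster_name"), info.get("version", {}).get("number")


def check_es(url="http://localhost:9200", timeout=5):
    """Return (cluster_name, version) if the node answers, else None."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return parse_root_response(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError subclasses it);
        # ValueError covers a response body that is not valid JSON.
        return None
```

Calling check_es() returns None while the node is still starting or is not running, so it can be polled in a loop before indexing any data.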

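On Linux, Elasticsearch cannot be run as the root user.

Running it as a regular user may fail with:

java.nio.file.AccessDeniedException

Cause: the current user does not have the required permissions on the installation directory.

Fix: chown -R <linux-user> <elasticsearch-install-dir>

For example: chown -R elasticsearch /data/wwwroot/elasticsearch-6.2.4

PS: AccessDeniedException errors from other Java software can be fixed the same way, by giving the executing user permissions on the relevant directory.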
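Before starting the node, it can help to list the paths the current user cannot access. The sketch below (Python 3) walks a directory tree and reports them. find_inaccessible is a hypothetical helper, not part of any Elasticsearch tooling, and os.access reports every path as accessible when the script itself runs as root.

```python
import os


def find_inaccessible(root):
    """Return paths under root that the current user cannot read and write.

    Directories are additionally required to be searchable (execute bit).
    """
    dir_mode = os.R_OK | os.W_OK | os.X_OK
    file_mode = os.R_OK | os.W_OK
    problems = []
    # os.walk silently skips directories it cannot list, so each directory
    # is checked from its parent (and the root is checked up front).
    if not os.access(root, dir_mode):
        problems.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.access(path, dir_mode):
                problems.append(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, file_mode):
                problems.append(path)
    return problems
```

Run as the user that will start Elasticsearch, for example find_inaccessible("/data/wwwroot/elasticsearch-6.2.4"); an empty list suggests the chown step is not needed.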

Code example

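The code below implements a residential-community search similar to the one on Lianjia.

It reads community names and addresses from files, writes them to ES, and then matches communities by the city code of the community and a search keyword.

The code has three main parts:

1. Creating the index

2. Storing data in ES in batches with bulk

3. Searching the data

Note:

The code targets an older ES version (2.x) and is written in Python 2 syntax (print statements, str.decode); in newer ES versions the field data types used when creating the index are different.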
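To illustrate the version note: starting with Elasticsearch 5.x the "string" field type was replaced by "text" (analyzed) and "keyword" (exact value), and 7.x dropped mapping types from the index-creation body. The sketch below builds a mapping body in either style. build_mappings and its legacy flag are illustrative names, and choosing "keyword" for city_code is an assumption based on its use in a term filter in the search method.

```python
def build_mappings(index_type="community", legacy=True):
    """Return an index-creation body for the three fields used in this article.

    legacy=True  -> ES 2.x style: "string" fields nested under a mapping type.
    legacy=False -> ES 7.x style: "text"/"keyword" fields, no mapping type.
    """
    if legacy:
        props = {field: {"type": "string"} for field in ("city_code", "name", "address")}
        return {"mappings": {index_type: {"properties": props}}}
    props = {
        "city_code": {"type": "keyword"},  # assumed: matched exactly by a term filter
        "name": {"type": "text"},
        "address": {"type": "text"},
    }
    return {"mappings": {"properties": props}}
```

The returned dictionary has the same shape as the body passed to indices.create in the class that follows, so only the body would need to change when moving to a newer cluster, along with dropping doc_type from the other calls.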

#coding:utf8
from __future__ import unicode_literals
import os
import time
import config
from datetime import datetime
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

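class ElasticSearch():
  def __init__(self, index_name,index_type,ip ="127.0.0.1"):
    '''
    :param index_name: index name
    :param index_type: index (mapping) type
    :param ip: host of the Elasticsearch node
    '''
    self.index_name =index_name
    self.index_type = index_type
    # Without username/password:
    #self.es = Elasticsearch([ip])
    # With username/password:
    self.es = Elasticsearch([ip],http_auth=('elastic', 'password'),port=9200)

  def create_index(self,index_name="ftech360",index_type="community"):
    '''
    Create the index self.index_name with a mapping for self.index_type.
    An existing index with the same name is deleted first.
    Note: the index_name/index_type arguments are not used here; the
    values passed to the constructor apply.
    :return:
    '''
    # Build the mapping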
    _index_mappings = {
      "mappings": {
        self.index_type: {
          "properties": {
            "city_code": {
              "type": "string",
              # "index": "not_analyzed"
            },
            "name": {
              "type": "string",
              # "index": "not_analyzed"
            },
            "address": {
              "type": "string",
              # "index": "not_analyzed"
            }
          }
        }

      }
    }
    if self.es.indices.exists(index=self.index_name) is True:
      self.es.indices.delete(index=self.index_name)
    res = self.es.indices.create(index=self.index_name, body=_index_mappings)
    print res

  def build_data_dict(self):
    name_dict = {}
    with open(os.path.join(config.datamining_dir,'data_output','house_community.dat')) as f:
      for line in f:
        line_list = line.decode('utf-8').split('\t')
        community_code = line_list[6]
        name = line_list[7]
        city_code = line_list[0]
        name_dict[community_code] = (name,city_code)

    address_dict = {}
    with open(os.path.join(config.datamining_dir,'data_output','house_community_detail.dat')) as f:
      for line in f:
        line_list = line.decode('utf-8').split('\t')
        community_code = line_list[6]
        address = line_list[10]
        address_dict[community_code] = address

    return name_dict,address_dict

  def bulk_index_data(self,name_dict,address_dict):
    '''
    Store the data in ES in batches using bulk
    :return:
    '''
    list_data = []
    for community_code, data in name_dict.items():
      tmp = {}
      tmp['code'] = community_code
      tmp['name'] = data[0]
      tmp['city_code'] = data[1]
      
      if community_code in address_dict:
        tmp['address'] = address_dict[community_code]
      else:
        tmp['address'] = ''

      list_data.append(tmp)
    ACTIONS = []
    for line in list_data:
      action = {
        "_index": self.index_name,
        "_type": self.index_type,
        "_id": line['code'], # _id is the community code
        "_source": {
          "city_code": line['city_code'],
          "name": line['name'],
          "address": line['address']
          }
      }
      ACTIONS.append(action)
    # Batch write
    success, _ = bulk(self.es, ACTIONS, index=self.index_name, raise_on_error=True)
    # Writing documents one at a time is much slower:
    #self.es.index(index=self.index_name,doc_type="doc_type_test",body = action)

    print('Performed %d actions' % success)

  def delete_index_data(self,id):
    '''
    Delete a single document from the index
    :param id: document id (the community code)
    :return:
    '''
    res = self.es.delete(index=self.index_name, doc_type=self.index_type, id=id)
    print res

  def get_data_id(self,id):
    res = self.es.get(index=self.index_name, doc_type=self.index_type,id=id)
    # Print the retrieved document
    print res['_source']['city_code'], res['_id'], res['_source']['name'], res['_source']['address']

  def get_data_by_body(self, name, city_code):
    # doc = {'query': {'match_all': {}}}
    doc = {
      "query": {
        "bool":{
          "filter":{
            "term":{
            "city_code": city_code
            }
          },
          "must":{
            "multi_match": {
              "query": name,
              "type":"phrase_prefix",
              "fields": ['name^3', 'address'],
              "slop":1,
              
              }

          }
        }
      }
    }
    _searched = self.es.search(index=self.index_name, doc_type=self.index_type, body=doc)
    data = _searched['hits']['hits']
    return data
     

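if __name__=='__main__':
  # Insert the data into ES
  obj = ElasticSearch("ftech360","community")
  obj.create_index()
  name_dict, address_dict = obj.build_data_dict()
  obj.bulk_index_data(name_dict,address_dict)

  # Read data back from ES and print the matching communities
  obj2 = ElasticSearch("ftech360","community")
  for hit in obj2.get_data_by_body(u'保利','510100'):
    print hit['_id'], hit['_source']['name'], hit['_source']['address']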
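
The action-building step in bulk_index_data can also be written as a generator, so the full list of actions never has to be held in memory; elasticsearch.helpers.bulk accepts any iterable of actions. This is a sketch in Python 3, with iter_actions as a hypothetical helper that takes the two dictionaries returned by build_data_dict.

```python
def iter_actions(name_dict, address_dict, index_name, index_type):
    """Yield one bulk action per community, keyed by community code.

    name_dict maps community_code -> (name, city_code);
    address_dict maps community_code -> address (entries may be missing).
    """
    for community_code, (name, city_code) in name_dict.items():
        yield {
            "_index": index_name,
            "_type": index_type,  # mapping types only apply to older ES versions
            "_id": community_code,
            "_source": {
                "city_code": city_code,
                "name": name,
                # Fall back to an empty address, as the original loop does.
                "address": address_dict.get(community_code, ""),
            },
        }
```

With an Elasticsearch client es, it would be passed to the helper as bulk(es, iter_actions(name_dict, address_dict, "ftech360", "community")).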

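The hits returned by get_data_by_body are raw dictionaries. As a rough sketch in Python 3, the query body can be built by a standalone function and the hits reduced to the fields of interest. build_query and summarize_hits are illustrative names rather than part of the elasticsearch package.

```python
def build_query(name, city_code):
    """Build the bool query from the article: exact city filter plus a
    phrase-prefix match on name and address."""
    return {
        "query": {
            "bool": {
                "filter": {"term": {"city_code": city_code}},
                "must": {
                    "multi_match": {
                        "query": name,
                        "type": "phrase_prefix",
                        "fields": ["name^3", "address"],  # name weighted 3x over address
                        "slop": 1,
                    }
                },
            }
        }
    }


def summarize_hits(response):
    """Return (id, name, address, score) tuples from a search response dict."""
    return [
        (hit["_id"], hit["_source"].get("name"), hit["_source"].get("address"), hit.get("_score"))
        for hit in response.get("hits", {}).get("hits", [])
    ]
```

Keeping the query in a plain function makes it easy to inspect or unit-test without a running cluster; the filter clause restricts results to one city without affecting scoring, while the must clause ranks matches on name above matches on address.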

- Author -

古月月月胡
