Esengine

ElasticSearch ODM (Object Document Mapper) for Python - pip install esengine


<img src="https://raw.githubusercontent.com/catholabs/esengine/master/octosearch.gif" align="left" width="192px" height="132px"/> <img align="left" width="0" height="192px" hspace="10"/>

esengine - The Elasticsearch Object Document Mapper


esengine is an ODM (Object Document Mapper): it maps Python classes to Elasticsearch index/doc_type pairs and object instances to Elasticsearch documents.

<br><br>

Modeling

Out of the box, ESengine takes care only of modeling and CRUD operations, including:

  • Index, DocType and Mapping specification
  • Fields and their type coercion
  • Basic CRUD operations (Create, Read, Update, Delete)

Communication

ESengine does not communicate directly with Elasticsearch; it only creates the basic structure. To communicate, it relies on an ES client that provides the transport methods (index, delete, update, etc.).

ES client

ESengine does not enforce the use of the official Elasticsearch client, but you are encouraged to use it because it is well maintained and supports bulk operations. You are still free to use another client or create your own (useful for tests).
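To illustrate the "create your own" option, here is a minimal sketch of an in-memory stand-in client for tests. The class name `InMemoryClient` is hypothetical, and the keyword arguments (`index`, `doc_type`, `id`, `body`) simply mirror those used by elasticsearch-py 1.x/2.x. Exactly which methods ESengine calls on its client is an assumption here, so check the library before relying on this.

```python
import uuid


class InMemoryClient(object):
    """Hypothetical test double: keeps documents in a dict instead of a cluster."""

    def __init__(self):
        # Maps (index, doc_type, id) to the stored document body.
        self._docs = {}

    def index(self, index, doc_type, body, id=None, **kwargs):
        # Generate a random id when the caller does not provide one.
        if id is None:
            id = uuid.uuid4().hex
        key = (index, doc_type, str(id))
        created = key not in self._docs
        self._docs[key] = dict(body)
        return {"_index": index, "_type": doc_type, "_id": str(id), "created": created}

    def get(self, index, doc_type, id, **kwargs):
        key = (index, doc_type, str(id))
        if key not in self._docs:
            raise KeyError("document not found: %r" % (key,))
        return {"_index": index, "_type": doc_type, "_id": str(id),
                "found": True, "_source": dict(self._docs[key])}

    def delete(self, index, doc_type, id, **kwargs):
        key = (index, doc_type, str(id))
        found = self._docs.pop(key, None) is not None
        return {"_index": index, "_type": doc_type, "_id": str(id), "found": found}


es = InMemoryClient()
es.index(index="universe", doc_type="person", id=1234, body={"name": "Gonzo"})
doc = es.get(index="universe", doc_type="person", id=1234)
print(doc["_source"])
```

A double like this keeps unit tests fast and independent of a running cluster; search and bulk support would have to be added separately.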

Querying the data

ESengine does not enforce or encourage the use of a DSL for queries. Out of the box, you write the Elasticsearch payload as a raw Python dictionary. However, ESengine comes with the utils.payload helper module to help you build payloads in a less verbose, more Pythonic way.

Why not elasticsearch_dsl?

Elasticsearch DSL is an excellent tool and a very nice effort by the maintainers of the official ES library. It is handy in most cases, but because it is built on top of operator overriding, it sometimes leads to confusing query building. Sometimes it is better to write raw queries or use a simpler payload builder, with more control over and visibility into what is being generated.

elasticsearch_dsl, as a high-level abstraction, promotes "Think only of Python objects, don't worry about Elastic queries", while ESengine promotes "Know the Elastic queries well and then write them as Python objects".

elasticsearch_dsl is more powerful, more complete and more tightly coupled to the ES specifications, while ESengine is simpler and lightweight, shipping only the basics.

Project Stage

It is in beta release and working in production, but it is missing a lot of features. You can help by using, testing, discussing or coding!

Getting started

Installation

ESengine needs a client to communicate with E.S. You can use one of the following:

  • ElasticSearch-py (official)
  • Py-Elasticsearch (unofficial)
  • Create your own implementing the same api-protocol
  • Use the MockES provided as py.test fixture (only for tests)

Because of bulk operations, you are recommended to use elasticsearch-py (the official E.S Python library), so the installation depends on the version of Elasticsearch you are using.

In short

Install the client and then install ESEngine

  • for 2.0 + use "elasticsearch>=2.0.0,<3.0.0"
  • for 1.0 + use "elasticsearch>=1.0.0,<2.0.0"
  • under 1.0 use "elasticsearch<1.0.0"
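The version rules above can be sketched as a small helper that picks the requirement string from the server version. The function name `client_requirement` is hypothetical, and the fallback to an unpinned client for versions newer than 2.x is an assumption, since the list only covers up to 2.x.

```python
def client_requirement(es_version):
    """Return the pip requirement for the client matching an ES server version."""
    major = int(es_version.split(".")[0])
    if major == 0:
        return "elasticsearch<1.0.0"
    if major == 1:
        return "elasticsearch>=1.0.0,<2.0.0"
    if major == 2:
        return "elasticsearch>=2.0.0,<3.0.0"
    # Assumption: anything newer falls back to the unpinned latest client.
    return "elasticsearch"


print(client_requirement("1.7.3"))
```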

For the latest use:

$ pip install elasticsearch
$ pip install esengine

Or install them together

Elasticsearch 2.x

pip install esengine[es2]

Elasticsearch 1.x

pip install esengine[es1]

Elasticsearch 0.90.x

pip install esengine[es0]

The above commands will install esengine and the elasticsearch library specific to your ES version.

Usage

# importing

from elasticsearch import Elasticsearch
from esengine import Document, StringField

# Defining a document
class Person(Document):
    # define _meta attributes
    _doctype = "person"  # optional, it can be set later using the "having" method
    _index = "universe"  # optional, it can be set later using the "having" method
    _es = Elasticsearch(host='host', port=9200)  # optional, it can be explicitly passed to methods
    
    # define fields
    name = StringField()

# Initializing mappings and settings
Person.init()

If you do not specify an "id" field, ESengine will automatically add "id" as a StringField. When you do specify one, it is recommended that you use StringField for ids.

TIP: import base module

A good practice is to import the base module. Here is the same example:

import esengine as ee

class Person(ee.Document):
    name = ee.StringField()

Fields

Base Fields

name = StringField()
age = IntegerField()
weight = FloatField()
factor = LongField()
active = BooleanField()
birthday = DateField()

Special Fields

GeoPointField

A field to hold GeoPoint with modes dict|array|string and its mappings

class Obj(Document):
    location = GeoPointField(mode='dict')  # default
    # An object representation with lat and lon explicitly named

Obj.init() # important to put the proper mapping for geo location

obj = Obj()

obj.location = {"lat": 40.722, "lon": -73.989}

class Obj(Document):
    location = GeoPointField(mode='string')
    # A string representation, with "lat,lon"

obj.location = "40.715, -74.011"

class Obj(Document):
    location = GeoPointField(mode='array')
    # An array representation with [lon,lat].

obj.location = [-73.983, 40.719]
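The three modes differ in ordering: the string form is "lat,lon" while the array form is [lon, lat], which is easy to mix up. As a minimal sketch (the helper `to_lat_lon` is hypothetical and not part of ESengine), the representations can be normalized to a single (lat, lon) tuple like this:

```python
def to_lat_lon(value):
    """Normalize a dict, string or array geo point to a (lat, lon) tuple."""
    if isinstance(value, dict):
        # dict mode: lat and lon are explicitly named
        return (float(value["lat"]), float(value["lon"]))
    if isinstance(value, str):
        # string mode: "lat,lon"
        lat, lon = value.split(",")
        return (float(lat), float(lon))
    if isinstance(value, (list, tuple)):
        # array mode: [lon, lat], the reverse of the string order
        lon, lat = value
        return (float(lat), float(lon))
    raise TypeError("unsupported geo point representation: %r" % (value,))


print(to_lat_lon({"lat": 40.722, "lon": -73.989}))
print(to_lat_lon("40.715, -74.011"))
print(to_lat_lon([-73.983, 40.719]))
```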

ObjectField

A field to hold nested one-dimensional objects, schema-less or with properties validation.

# accepts only dictionaries having strictly "street" and "number" keys
address = ObjectField(properties={"street": "string", "number": "integer"})

# Accepts any Python dictionary
extravalues = ObjectField() 
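To make the difference between the two declarations concrete, here is one plausible reading of "properties validation" as a standalone sketch. It is not ESengine's implementation: the function `validate_object` and the `PYTHON_TYPES` mapping from type names to Python types are assumptions made for illustration.

```python
# Hypothetical mapping from the type names used in `properties` to Python types.
PYTHON_TYPES = {"string": str, "integer": int}


def validate_object(value, properties=None):
    """Accept any dict when schema-less, else require exactly the declared keys."""
    if not isinstance(value, dict):
        raise TypeError("expected a dictionary")
    if properties is None:
        return value  # schema-less: any dictionary is accepted
    if set(value) != set(properties):
        raise ValueError("expected exactly the keys %s" % sorted(properties))
    for key, type_name in properties.items():
        # Assumption: each value is also checked against the declared type.
        if not isinstance(value[key], PYTHON_TYPES[type_name]):
            raise ValueError("%r must be of type %s" % (key, type_name))
    return value


address_schema = {"street": "string", "number": "integer"}
ok = validate_object({"street": "Main St", "number": 42}, address_schema)
anything = validate_object({"free": ["form", "data"]})
print(ok, anything)
```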

ArrayField

A field to hold arrays (Python lists)

At its base, any field accepts the multi parameter

colors = StringField(multi=True)   # accepts ["blue", "green", "yellow", ....]

But sometimes (especially for nested objects) it is better to be explicit, and it also generates a better mapping

# accepts an array of strings ["blue", "green", "yellow", ....]
colors = ArrayField(StringField()) 

It is available for any other field

locations = ArrayField(GeoPointField())
numbers = ArrayField(IntegerField())
fractions = ArrayField(FloatField())
addresses = ArrayField(ObjectField(properties={"street": "string", "number": "integer"}))
list_of_lists_of_strings = ArrayField(ArrayField(StringField()))

Indexing

person = Person(id=1234, name="Gonzo")
person.save()  # or pass .save(es=es_client_instance) if not specified in model 

Getting by id

Person.get(id=1234)

Filtering by IDs

ids = [1234, 5678, 9101]
power_trio = Person.filter(ids=ids)

Filtering by fields

Person.filter(name="Gonzo")

Searching

ESengine does not try to create an abstraction for query building. By default, ESengine only implements the search transport, which receives a raw ES query in the form of a Python dictionary.

query = {
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            },
            "filter": {
                "ids": {
                    "values": [1, 2]
                }
            }
        }
    }
}
Person.search(query, size=10)
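Because the payload is a plain dictionary, repeated queries can be wrapped in ordinary functions. The sketch below rebuilds the query above with a hypothetical helper, `ids_filtered_query`. Note that the `filtered` query belongs to Elasticsearch 1.x/2.x; it was deprecated in 2.0 and removed in 5.0.

```python
import json


def ids_filtered_query(ids):
    """Build a match_all query restricted to the given document ids."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"ids": {"values": list(ids)}},
            }
        }
    }


payload = ids_filtered_query([1, 2])
# Dump the payload to inspect exactly what would be sent to Elasticsearch.
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed to `Person.search` in the same way as the hand-written one.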

Getting all documents (match_all)

Person.all()

# with more arguments

Person.all(size=20)

Counting

Person.count(name='Gonzo')

Updating

A single document

A single document can be updated simply by calling the .save() method


person = Person.get(id=1234)
person.name = "Another Name"
person.save()

Updating a Resultset

The Document methods .get, .filter and .search return an instance of the ResultSet object. This object is an iterator containing the hits reached by the filtering or search process, and it exposes some CRUD methods (update, delete and reload) to deal with its results.

people = Person.filter(field='value')
people.update(another_field='another_value')

When updating documents, you sometimes need the changes made in the E.S index to be reflected in the objects of the ResultSet iterator. Use the .reload method to do that.

The use of reload method

people = Person.filter(field='value')
print(people)
... <Resultset: [{'field': 'value', 'another_field': None}, 
                 {'field': 'value', 'another_field': None}]>

# Updating another field on both instances
people.update(another_field='another_value')
print(people)
... <Resultset: [{'field': 'value', 'another_field': None},
                 {'field': 'value', 'another_field': None}]>

# Note that the values were changed in the E.S index, but the current ResultSet is not updated by default;
# you have to call reload to refresh it
people.reload()

print(people)
... <Resultset: [{'field': 'value', 'another_field': 'another_value'},
                 {'field': 'value', 'another_field': 'another_value'}]>
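The behaviour above follows from the result set holding copies of the hits taken at query time. The toy model below reproduces it with a plain dictionary standing in for the index. The class `SnapshotResults` is hypothetical and only illustrates the idea; it is not ESengine's ResultSet.

```python
class SnapshotResults(object):
    """Toy result set: holds copies of the documents as they were when queried."""

    def __init__(self, store, ids):
        self._store = store
        self._ids = list(ids)
        # Copies are taken once, at query time.
        self.hits = [dict(store[i]) for i in self._ids]

    def update(self, **changes):
        # Writes go to the store only; the local copies are left untouched.
        for i in self._ids:
            self._store[i].update(changes)

    def reload(self):
        # Fetch the current state of every document again.
        self.hits = [dict(self._store[i]) for i in self._ids]


store = {
    1: {"field": "value", "another_field": None},
    2: {"field": "value", "another_field": None},
}
results = SnapshotResults(store, ids=[1, 2])
results.update(another_field="another_value")
stale = [hit["another_field"] for hit in results.hits]  # copies are still old
results.reload()
fresh = [hit["another_field"] for hit in results.hits]  # copies now match the store
print(stale, fresh)
```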


Deleting documents

A ResultSet

people = Person.all()
people.delete()

A single document

Person.get(id=123).delete()

Bulk operations

ESengine takes advantage of the elasticsearch-py helpers for bulk actions; the ResultSet object uses the bulk method to update and delete documents.

But you can
