WARNING: UNMAINTAINED
=====================

This package is no longer maintained. You may want to check out
`elasticsearch-dsl-py <https://github.com/elasticsearch/elasticsearch-dsl-py>`__ or
`django-haystack <https://github.com/django-haystack/django-haystack>`__.
Bungiesearch
============
|Build Status| |Coverage Status|
.. contents:: Table of contents
   :depth: 2
Purpose
=======

Bungiesearch is a Django wrapper for
`elasticsearch-dsl-py <https://github.com/elasticsearch/elasticsearch-dsl-py>`__.
It inherits from elasticsearch-dsl-py's ``Search`` class, so all the
fabulous features developed by the elasticsearch-dsl-py team are also
available in Bungiesearch. In addition, just like ``Search``,
Bungiesearch is a lazy searching class (and iterable), meaning you can
chain calls, or do something like the following.
.. code:: python

    lazy = Article.objects.search.query('match', _all='Description')
    print(len(lazy))  # Prints the number of hits by only fetching the number of items.
    for item in lazy[5:10]:
        print(item)
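The laziness described above can be pictured with a small toy class. This
is only an illustrative sketch, not the implementation used by Bungiesearch
or elasticsearch-dsl-py: ``LazyHits`` and ``FakeBackend`` are hypothetical
names, and the fake backend stands in for an Elasticsearch server. It shows
``len()`` asking only for a count, and a slice deferring the fetch until
iteration starts.

.. code:: python

    class FakeBackend(object):
        """Hypothetical stand-in for a search server; records every call."""

        def __init__(self, items):
            self.items = list(items)
            self.calls = []

        def count(self):
            self.calls.append('count')
            return len(self.items)

        def fetch(self, start, stop):
            self.calls.append(('fetch', start, stop))
            return self.items[start:stop]


    class LazyHits(object):
        """Toy lazy search object (illustration only, not Bungiesearch code)."""

        def __init__(self, backend, start=0, stop=None):
            self.backend = backend
            self.start = start
            self.stop = stop

        def __len__(self):
            # Only the number of hits is requested; no item is fetched.
            return self.backend.count()

        def __getitem__(self, key):
            if isinstance(key, slice):
                # Slicing only records the window and returns a new lazy object.
                return LazyHits(self.backend, key.start or 0, key.stop)
            # A single index fetches exactly one item.
            return self.backend.fetch(key, key + 1)[0]

        def __iter__(self):
            # The backend is queried only once iteration starts.
            return iter(self.backend.fetch(self.start, self.stop))


    backend = FakeBackend(range(20))
    lazy = LazyHits(backend)
    window = lazy[5:10]                # nothing has been fetched yet
    total = len(lazy)                  # one count request, no items fetched
    items = [item for item in window]  # one fetch request, for the window only

After these lines, ``backend.calls`` holds one count request followed by
one fetch request, which is the behaviour the real lazy search object is
described as having.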
Features
========

-  Core Python friendly

   -  Iteration (``[x for x in lazy_search]``)
   -  Get items (``lazy_search[10]``)
   -  Number of hits via ``len(lazy_search)``

-  Index management

   -  Creating and deleting an index.
   -  Creating, updating and deleting doctypes and their mappings.
   -  Update index doctypes.

-  Django Model Mapping

   -  Very easy mapping (no lies).
   -  Automatic model mapping (and supports undefined models by
      returning a ``Result`` instance of ``elasticsearch-dsl-py``).
   -  Efficient database fetching:

      -  One fetch for all items of a given model.
      -  Fetches only desired fields.

-  Django Manager

   -  Easy model integration:
      ``MyModel.objects.search.query("match", _all="something to search")``.
   -  Search aliases (search shortcuts with as many parameters as
      wanted): ``Tweet.objects.bungie_title_search("bungie")`` or
      ``Article.objects.bungie_title_search("bungie")``, where
      ``bungie_title_search`` is uniquely defined.

-  Django signals

   -  Connect to post save and pre delete signals for the elasticsearch
      index to correctly reflect the database (almost) at all times.
Requirements
============
- Django >= 1.8
- Python 2.7, 3.4, 3.5
Feature examples
================

See the "Full example" section at the bottom of the page for the code
needed to run the following examples.

Query a word (or list thereof) on a managed model.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``Article.objects.search.query('match', _all='Description')``
Use a search alias on a model's manager.
``Article.objects.bsearch_title_search('title')``
Use a search alias on a bungiesearch instance.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``Article.objects.search.bsearch_title_search('title').bsearch_titlefilter('filter this title')``
Iterate over search results
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: python

    # Will print the Django model instance.
    for result in Article.objects.search.query('match', _all='Description'):
        print(result)
Fetch a single item
~~~~~~~~~~~~~~~~~~~
.. code:: python

    Article.objects.search.query('match', _all='Description')[0]
Get the number of returned items
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: python

    print(len(Article.objects.search.query('match', _all='Description')))
Deferred model instantiation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: python

    # Will print the Django model instance's primary key. Will only fetch the `pk` field from the database.
    for result in Article.objects.search.query('match', _all='Description').only('pk'):
        print(result.pk)
Elasticsearch limited field fetching
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: python

    # Will print the Django model instance. However, elasticsearch's response only has the `_id` field.
    for result in Article.objects.search.query('match', _all='Description').fields('_id'):
        print(result)
Get a specific number of items with an offset.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is actually elasticsearch-dsl-py functionality, but it's
demonstrated here because we can iterate over the results via
Bungiesearch.

.. code:: python

    for item in Article.objects.bsearch_title_search('title').only('pk').fields('_id')[5:7]:
        print(item)
Lazy objects
~~~~~~~~~~~~
.. code:: python

    lazy = Article.objects.bsearch_title_search('title')
    print(len(lazy))
    for item in lazy.filter('range', effective_date={'lte': '2014-09-22'}):
        print(item)
Installation
============
Unless noted otherwise, each step is required.
Install the package
-------------------
The easiest way is to install the package from PyPI:
``pip install bungiesearch``
**Note:** Check your version of Django after installing bungiesearch. It
was reported to me directly that installing bungiesearch may upgrade
your version of Django, although I haven't been able to confirm that
myself. Bungiesearch depends on Django 1.8 and above, as listed under
Requirements.
In Django
---------
Updating your Django models
~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Note:** this part is only needed if you want to be able to use search
aliases, which allow you to define shortcuts to complex queries,
available directly from your Django models. I think it's extremely
practical.
1. Open your ``models.py`` file.
2. Add the bungiesearch manager import:
   ``from bungiesearch.managers import BungiesearchManager``
3. Find the model, or models, you wish to index on Elasticsearch and set
   them to be managed by Bungiesearch by adding the ``objects`` field to
   them, as such: ``objects = BungiesearchManager()``. You should now
   have a Django model `similar to
   this <https://github.com/ChristopherRabotin/bungiesearch#django-model>`__.
Creating bungiesearch search indexes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A search index defines how bungiesearch serializes each of a model's
objects and how the ES index is structured. Search indexes are referred
to as
`ModelIndex <https://github.com/ChristopherRabotin/bungiesearch#modelindex-1>`__\ es.
A good practice here is to have all the bungiesearch stuff in its own
package. For example, for the section of the Sparrho platform that uses
Django, we have a package called ``search`` where we define the search
indexes, and a subpackage called ``aliases`` which has the many aliases
we use (more on that later).
1. Create a subclass of ``ModelIndex``, which you can import with
   ``from bungiesearch.indices import ModelIndex``, preferably in a new
   module.
2. In this class, define a class called ``Meta``: it will hold meta
   information of this search index for bungiesearch's internal working.
3. Import the Django model you want to index (from your models file)
   and, in the ``Meta`` class, define a field called ``model``, which
   must be set to the model you want indexed.
4. By default, bungiesearch will index every field of your model. This
   may not always be desired, so you can define which fields must be
   excluded in this ``Meta`` class, via the ``exclude`` field.
5. There are plenty of options, so definitely have a read through the
   documentation for
   `ModelIndex <https://github.com/ChristopherRabotin/bungiesearch#modelindex-1>`__.
Here's `an
example <https://github.com/ChristopherRabotin/bungiesearch#modelindex>`__ of a
search index. There can be many such definitions in a file.
Django settings
~~~~~~~~~~~~~~~
This is the final required step. Here's the `full
documentation <https://github.com/ChristopherRabotin/bungiesearch#settings>`__ of
this step.
1. Open your settings file and add a ``BUNGIESEARCH`` variable, which
   must be a dictionary.
2. Define ``URLS`` as a list of URLs (which can contain only one) of
   your ES servers.
3. Define the ``INDICES`` key as a dictionary where the key is the name
   of the index on ES that you want, and the value is the full Python
   path to the module which has all the ``ModelIndex`` classes to be
   indexed on that index name.
4. Set ``ALIASES`` to an empty dictionary (until you define any search
   aliases).
5. You can keep other values as their defaults.
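As a rough sketch of the steps above, a minimal ``BUNGIESEARCH`` setting
could look like the following. The ``URLS``, ``INDICES`` and ``ALIASES``
keys come from the steps above; the server URL, the index name
``my_index`` and the module path ``myproject.search.indexes`` are
placeholders assumed for illustration only and must be replaced with your
own values.

.. code:: python

    # settings.py (fragment)
    # NOTE: the URL, index name and module path are hypothetical placeholders.
    BUNGIESEARCH = {
        # One or more Elasticsearch servers.
        'URLS': ['http://localhost:9200'],
        # ES index name -> full Python path of the module holding the
        # ModelIndex classes to be indexed under that name.
        'INDICES': {'my_index': 'myproject.search.indexes'},
        # Left empty until search aliases are defined.
        'ALIASES': {},
    }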
In your shell
-------------
Create the ES indexes
~~~~~~~~~~~~~~~~~~~~~
From your shell, in the Django environment, run the following:
``python manage.py search_index --create``
Start populating the index
--------------------------
Run the following, which takes each of the objects in your model,
serializes it, and adds it to the elasticsearch index.
``python manage.py search_index --update``
**Note:** With additional parameters, you can limit the number of
documents to be indexed, as well as set conditions on whether they
should be indexed based on updated time for example.
In Elasticsearch
----------------
You can now open your elasticsearch dashboard, such as Elastic HQ, and
see that your index is created with the appropriate mapping and has
items that are indexed.
Quick start example
===================
This example is from the ``test`` folder. It may be partially outdated,
so please refer to the ``test`` folder for the latest version.
Procedure
---------
1. In your ``models.py`` file (or your ``managers.py``), import
   bungiesearch and use it as a model manager.
2. Define one or more ``ModelIndex`` subclasses which define the mapping
   between your Django model and elasticsearch.
3. (Optional) Define ``SearchAlias`` subclasses which make it trivial to
   call complex elasticsearch-dsl-py functions.
4. Add a ``BUNGIESEARCH`` variable in your Django settings, which must
   contain the elasticsearch URL(s), the modules for the indices, the
   modules for the search aliases and the signal definitions.
Example
-------
Here's the code which is applicable to the previous examples.

Django Model
~~~~~~~~~~~~
.. code:: python

    from django.db import models
    from bungiesearch.managers import BungiesearchManager

    class Article(models.Mod
