Riakkit

Google Groups/Mailing list: https://groups.google.com/forum/#!forum/riakkit or riakkit@googlegroups.com

What is this?..

Riakkit is an object mapper and a RAD tool for Riak.

Riakkit changed significantly from 0.5 to 0.6 (notice we went back to Alpha). There are a lot of backend changes, although most front-end methods are unchanged. Additional API has also been introduced.

Just FYI: The project tries to follow the Google Python style guide.

Licensed under LGPLv3

Installation

Requires version 1.4.0 or above of riak-python-client. Probably best if you just grab the repository version.

Also, if you want to use search, you need to enable it in your app.config:

{riak_search, [
  {enabled, true}
]}

Then install with pip install riakkit or easy_install riakkit.

Concept

There are 2 parts to Riakkit: riakkit.SimpleDocument and riakkit.Document. Document is a subclass of SimpleDocument and it provides the RAD capabilities. This means Document helps you track dependencies and keep related objects consistent, and lets you write a prototype in days without much effort, since it deals with the issues of tracking different models and how they relate to each other.

This, however, comes with a cost. Document has a lot of overhead. There's a lot of code present that slows the program down (I'm working on improving it, but it's not the topmost priority right now). This is why when you need to start scaling, it's recommended that you stop using Document for your models.

This is where SimpleDocument comes into play. SimpleDocument does not communicate with Riak. That means convenience methods such as save and delete are no longer available. However, many methods are still available, like addLinks, indexes and so on, along with methods you may not even know exist, such as toRiakObject. If you use SimpleDocument, it's your responsibility to track relationships (this may change if an efficient way to track relationships comes along) and to save all the objects to the database.

Since all it really does is validate and convert values if necessary, SimpleDocument is very fast. If you run the unit tests yourself, you will see the difference. This also comes at a cost: you will have to track the relationships yourself.

You could also subclass SimpleDocument and make it like Document, but with less overhead. If you come up with something that's almost exactly like Document but faster, please make a pull request! (The LGPL doesn't require you to do so, but hey, it's nice to do it.)

"Fast Track"

Using riakkit with the higher level API should be simple. Here's how to get started:

>>> from riakkit import *
>>> import riak
>>> some_client = riak.RiakClient()
>>> class BlogPost(Document):
...     # bucket name is required for each subclass of Document, unless you
...     # are extending Document.
...     # Each class gets their unique bucket_name.
...     bucket_name = "test_blog"
...
...     # Client is required for each subclass of Document
...     client = some_client
...
...     title = StringProperty(required=True) # StringProperty auto converts all strings to unicode
...     content = StringProperty() # let's say content is not required.
...     some_cool_attribute = FloatProperty() # Just a random attribute for demo purpose
...     def __str__(self): # Totally optional..
...         return "%s:%s" % (self.title, self.content)

Makes sense, right? We imported riakkit and riak, created a connection, and defined a Document subclass.

>>> post = BlogPost(title="hi")
>>> print post
hi:None
>>> post.save() #doctest: +ELLIPSIS
<__main__.BlogPost object at ...>

Saving is easy, but how do we modify?

>>> post.title = "Hello"
>>> post.content = "mrrow"
>>> post.save() #doctest: +ELLIPSIS
<__main__.BlogPost object at ...>
>>> print post
Hello:mrrow
>>> key = post.key # Stores a key...

Since the title is required.. we cannot save if it's not filled out.

>>> another_post = BlogPost(content="lolol")
>>> another_post.save()
Traceback (most recent call last):
    ...
ValidationError: None doesn't pass validation for property 'title'
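
To make the failure above less mysterious, here is a minimal sketch of how a required-property check could work. It does not use Riakkit itself: SketchProperty, validate_all and this local ValidationError are hypothetical names invented for illustration, and Riakkit's real implementation may differ.

```python
class ValidationError(Exception):
    """Local stand-in exception, not Riakkit's own class."""


class SketchProperty(object):
    """Hypothetical property: holds a 'required' flag and validates values."""

    def __init__(self, required=False):
        self.required = required

    def validate(self, value):
        # A required property rejects None; optional ones accept anything.
        return not (self.required and value is None)


def validate_all(schema, values):
    """Check every schema property before anything would be written."""
    for name, prop in schema.items():
        value = values.get(name)
        if not prop.validate(value):
            raise ValidationError(
                "%r doesn't pass validation for property '%s'" % (value, name))


schema = {"title": SketchProperty(required=True), "content": SketchProperty()}
validate_all(schema, {"title": "hi"})  # passes silently

try:
    validate_all(schema, {"content": "lolol"})  # title is missing
    failed = False
except ValidationError as error:
    failed = True
    message = str(error)
```

The point of validating everything up front is that a bad object never reaches the database half-written.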

What about getting it from the database?

>>> same_post = BlogPost.get(key, False) # False means that we don't want the cached copy, but reload the object from the database even if it's available in cache
>>> print same_post
Hello:mrrow

All Document objects obtained through get for the same key are the same instance. There's one object per key, so any change to the object is reflected in all the references to it. A WeakValueDictionary is used to cache all the objects.

>>> same_post is post
True
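
The one-object-per-key behaviour can be sketched with the standard library's weakref.WeakValueDictionary. This is only an illustration of the caching idea, assuming a made-up CachedDoc class; it is not Riakkit's code and skips the database entirely.

```python
import weakref


class CachedDoc(object):
    """Hypothetical stand-in for a document class with a per-key cache."""

    # Values are held weakly: an entry vanishes once nothing else
    # references the object, so the cache never keeps objects alive.
    _instances = weakref.WeakValueDictionary()

    def __init__(self, key):
        self.key = key
        self._instances[key] = self

    @classmethod
    def get(cls, key):
        # Return the live instance for this key if there is one.
        obj = cls._instances.get(key)
        if obj is None:
            obj = cls(key)  # a real mapper would fetch from the database here
        return obj


first = CachedDoc("abc")
second = CachedDoc.get("abc")
same = first is second
```

Because the dictionary holds weak references only, objects that the program no longer uses can still be garbage collected.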

However, if your data was modified outside of riakkit, you can use the .reload() method on document objects.

>>> same_post.reload() # Obviously we haven't changed anything, but if we did, this would get those changes
>>> print same_post.title
Hello

You can also use dictionary notation. However, Document is not a subclass of dict!

>>> print same_post.title
Hello
>>> print same_post["title"]
Hello
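
Dictionary notation without inheriting from dict usually comes down to defining __getitem__ and __setitem__. The sketch below shows the idea with a hypothetical ItemAccessDoc class; how Riakkit actually implements it is an assumption here.

```python
class ItemAccessDoc(object):
    """Hypothetical stand-in: supports doc["name"] without subclassing dict."""

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)

    def __getitem__(self, name):
        # Delegate item lookup to ordinary attribute lookup.
        return getattr(self, name)

    def __setitem__(self, name, value):
        setattr(self, name, value)


doc = ItemAccessDoc(title="Hello")
doc["content"] = "mrrow"  # same effect as doc.content = "mrrow"
```

In this sketch a missing name raises AttributeError rather than KeyError, since the lookup is just getattr underneath.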

Need another attribute not in your schema? No problem.

>>> same_post.random_attr = 42
>>> same_post.save() # doctest: +ELLIPSIS
<__main__.BlogPost object at ...>
>>> print same_post.random_attr
42

Again, you can see the changes are instantly reflected on the other reference to it:

>>> print post.random_attr
42

While setting an attribute that is not in your schema is allowed, getting one that is not in the schema AND not already set will raise an AttributeError.

>>> same_post.none_existent
Traceback (most recent call last):
  ...
AttributeError: Attribute none_existent not found with BlogPost.

Accessing an attribute that's IN your schema but NOT set will return None, or the default value of that property if it has one. (Some properties already have a default value. Example: if you don't set a ListProperty, getting it will return [].)

>>> print same_post.some_cool_attribute  # Remember? We never set this
None
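
The three lookup rules above (set attributes are returned, unset schema attributes give a default, everything else raises) can be sketched with __getattr__. SchemaDoc and schema_defaults are hypothetical names for illustration only, not part of Riakkit.

```python
class SchemaDoc(object):
    """Hypothetical stand-in for a document with a declared schema."""

    # Maps each schema attribute to a factory producing its default value.
    schema_defaults = {
        "some_cool_attribute": lambda: None,
        "tags": list,  # list-like properties default to a fresh empty list
    }

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. the attribute is unset.
        defaults = type(self).schema_defaults
        if name in defaults:
            return defaults[name]()
        raise AttributeError(
            "Attribute %s not found with %s." % (name, type(self).__name__))


doc = SchemaDoc()
doc.random_attr = 42  # not in the schema, but setting is always allowed

try:
    doc.none_existent  # neither in the schema nor set
    raised = False
except AttributeError:
    raised = True
```

Using a factory per attribute avoids sharing one mutable default, such as a list, between different objects.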

Deleting objects is just as easy.

>>> same_post.delete()
>>> BlogPost.get(key) #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
    ...
NotFoundError: Key '<yourkey>' not found!

Referencing Documents

You can link to a "foreign" document very easily. Let me illustrate:

>>> class User(Document):
...     bucket_name = "doctest_users"
...     client = some_client
...
...     name = StringProperty(required=True)
...     post = ReferenceProperty(reference_class=BlogPost)
>>> user = User(name="mrrow")
>>> some_post = BlogPost(title="Hello", content="World")
>>> user.post = some_post
>>> user.save() # doctest: +ELLIPSIS
<__main__.User object at ...>
>>> print user.post.title
Hello
>>> same_user = User.load(user.key)
>>> print same_user.post.title
Hello

You can also "back reference" these documents. The API is similar to Google App Engine's ReferenceProperty.

>>> class Comment(Document):
...     bucket_name = "doctest_comments"
...     client = some_client
...
...     title = StringProperty()
...     owner = ReferenceProperty(reference_class=User,
...                               collection_name="comments")

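Note how we specified the reference_class. This will activate additional validation. Also, collection_name is the name of the attribute on the referenced class (User here) through which the referencing documents can be reached.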

>>> a_comment = Comment(title="Riakkit ftw!")
>>> a_comment.owner = user
>>> a_comment.save() # doctest: +ELLIPSIS
<__main__.Comment object at ...>

This should save both a_comment and the user object, so there is no need to call user.reload(). Since same_user refers to the same object as user, there is no need to reload that either (behaviour introduced after v0.3.2a).

>>> print user.comments[0].title
Riakkit ftw!
>>> print same_user.comments[0].title
Riakkit ftw!
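
The back reference shown above can be pictured as follows: saving the referencing object also registers it in a list on the referenced object, under the name given by collection_name. This is a sketch under that assumption; Owner and Note are hypothetical stand-ins for User and Comment, and nothing is written to Riak.

```python
class Owner(object):
    """Hypothetical stand-in for the referenced document (User above)."""

    def __init__(self, name):
        self.name = name


class Note(object):
    """Hypothetical stand-in for the referencing document (Comment above)."""

    collection_name = "comments"  # attribute to create on the owner

    def __init__(self, title, owner=None):
        self.title = title
        self.owner = owner

    def save(self):
        # On save, register this object in the owner's back-reference list.
        if self.owner is not None:
            collection = self.owner.__dict__.setdefault(
                self.collection_name, [])
            if self not in collection:
                collection.append(self)
        return self


alice = Owner("mrrow")
note = Note("Riakkit ftw!", owner=alice).save()
note.save()  # saving twice must not duplicate the back reference
```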

Let's add another comment.

>>> another_comment = Comment(title="Moo")

Note that ReferenceProperty and MultiReferenceProperty both require a reference_class.

Let's look at MultiReferenceProperty. It's very simple, as it's just a list of Documents.

>>> class Cake(Document):
...     bucket_name = "test_cake"
...     client = some_client
...
...     type = EnumProperty(["chocolate", "icecream"])
...     owner = MultiReferenceProperty(reference_class=User, collection_name="cakes")
>>> person = User(name="John")
>>> cake = Cake(type="chocolate", owner=[])
>>> cake.owner.append(person)
>>> cake.save() #doctest: +ELLIPSIS
<__main__.Cake object at ...>
>>> print cake.owner[0].name
John
>>> print person.cakes[0].type
chocolate
>>> cake.owner = [user]
>>> cake.save() #doctest: +ELLIPSIS
<__main__.Cake object at ...>
>>> print person.cakes
[]
>>> print cake.owner[0].name
mrrow
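
Notice that person.cakes became empty once cake.owner was reassigned. One way to picture this is a step on save that compares the old and new owner lists and updates each owner's collection. The helper below is a hypothetical sketch of that idea, not Riakkit's implementation; sync_back_references and Thing are invented names.

```python
def sync_back_references(doc, old_owners, new_owners, collection_name):
    """Hypothetical helper: make owners' collections match doc's owner list."""
    for owner in old_owners:
        if owner not in new_owners:
            # This owner was dropped, so remove doc from its collection.
            getattr(owner, collection_name).remove(doc)
    for owner in new_owners:
        collection = owner.__dict__.setdefault(collection_name, [])
        if doc not in collection:
            collection.append(doc)


class Thing(object):
    """Bare object used for both owners and the owned document."""

    def __init__(self, name):
        self.name = name


john, mrrow, cake = Thing("John"), Thing("mrrow"), Thing("chocolate")
sync_back_references(cake, [], [john], "cakes")       # cake.owner = [john]
sync_back_references(cake, [john], [mrrow], "cakes")  # cake.owner = [mrrow]
```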

Advanced Query

Searching

You've seen how to get a document by its key. What about searching and map reduce?

Searching is done through Document.search(querytext). This requires enable_search to be on. Otherwise you're limited to map reduce.

Also, you're required to install the search onto buckets that you will use f
