Riakkit
DO NOT USE. Riakkit. An ORM for Python Riak for RAD. Similar to mongokit and couchdbkit. This is about to be deprecated by kvkit
Google Groups/Mailing list: https://groups.google.com/forum/#!forum/riakkit or riakkit@googlegroups.com
What is this?..
Riakkit is an object mapper and a RAD tool for Riak.
Riakkit underwent quite a significant change from 0.5 to 0.6 (notice we went back to Alpha). There are a lot of backend changes, although most front end methods are unchanged. Some additional API is also introduced.
Just FYI: The project tries to follow the Google Python style guide.
Licensed under LGPLv3
Installation
Requires version 1.4.0 or above of riak-python-client. Probably best if you just grab the repository version.
Also, if you want to use search, you need to enable it in your app.config:
{riak_search, [
    {enabled, true}
]}
Then, proceed to do pip install riakkit or easy_install riakkit.
Concept
There are 2 parts to Riakkit: riakkit.SimpleDocument and riakkit.Document.
Document is a subclass of SimpleDocument and it provides the RAD capabilities.
This means Document helps you track dependencies, makes sure everything is
consistent, and allows you to write a prototype in days without much effort,
dealing with all the issues of tracking different models and how they relate
to each other.
This, however, comes with a cost. Document has a lot of overhead. There's a
lot of code present that slows the program down (I'm working on improving it,
but it's not the topmost priority right now). This is why when you need to start
scaling, it's recommended that you stop using Document for your models.
This is where SimpleDocument comes into play. SimpleDocument does not
communicate with Riak. That means all the convenience methods such as save and
delete are no longer available. However, many methods are still available,
such as addLinks and indexes, along with many methods you may not
even know exist, such as toRiakObject. If you use this, it's your
responsibility to track relationships (this may change, depending on whether
an efficient way to track relationships comes along) and to save all the
objects to the database.
Since all it really does is validate and convert values if necessary,
SimpleDocument is very fast. If you run the unit tests yourself, you
will actually see the difference. This speed comes at a cost, though: you have
to track the relationships yourself.
You could also subclass SimpleDocument and make it like Document, but with
less overhead. If you come up with something that's almost exactly like
Document but faster, please make a pull request! (The LGPL doesn't require you to
do so, but hey, it's nice to do it.)
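To make the split described above more concrete, here is a minimal sketch in plain Python. It is not Riakkit's code: `PlainRecord`, its `schema` layout, and `fake_store` are hypothetical names made up for illustration. The class only validates and converts values, and the caller decides when and where the result gets stored.

```python
class PlainRecord(object):
    """Hypothetical stand-in for a document that never talks to a database."""

    # field name -> (converter, required)
    schema = {"title": (str, True), "rating": (float, False)}

    def __init__(self, **values):
        self.values = values

    def serialize(self):
        """Validate required fields and convert values; no I/O happens here."""
        out = {}
        for name, (convert, required) in self.schema.items():
            raw = self.values.get(name)
            if raw is None:
                if required:
                    raise ValueError("%s is required" % name)
                out[name] = None
            else:
                out[name] = convert(raw)
        return out


# The caller owns persistence; a plain dict stands in for the real store.
fake_store = {}
record = PlainRecord(title="hi", rating="4.5")
fake_store["some-key"] = record.serialize()
```

Because nothing in the class touches the network, the only work done per object is the loop over the schema, which is roughly why an approach like this can stay fast.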
"Fast Track"
Using riakkit with the higher level API should be simple. Here's how to get started:
>>> from riakkit import *
>>> import riak
>>> some_client = riak.RiakClient()
>>> class BlogPost(Document):
...     # bucket name is required for each subclass of Document, unless you
...     # are extending Document.
...     # Each class gets their unique bucket_name.
...     bucket_name = "test_blog"
...
...     # Client is required for each subclass of Document
...     client = some_client
...
...     title = StringProperty(required=True) # StringProperty auto converts all strings to unicode
...     content = StringProperty() # let's say content is not required.
...     some_cool_attribute = FloatProperty() # Just a random attribute for demo purpose
...     def __str__(self): # Totally optional..
...         return "%s:%s" % (self.title, self.content)
Makes sense, right? We imported riakkit and riak, created a connection, and defined a Document subclass.
>>> post = BlogPost(title="hi")
>>> print post
hi:None
>>> post.save() #doctest: +ELLIPSIS
<__main__.BlogPost object at ...>
Saving is easy, but how do we modify?
>>> post.title = "Hello"
>>> post.content = "mrrow"
>>> post.save() #doctest: +ELLIPSIS
<__main__.BlogPost object at ...>
>>> print post
Hello:mrrow
>>> key = post.key # Stores a key...
Since the title is required, we cannot save if it's not filled out.
>>> another_post = BlogPost(content="lolol")
>>> another_post.save()
Traceback (most recent call last):
...
ValidationError: None doesn't pass validation for property 'title'
What about getting it from the database?
>>> same_post = BlogPost.get(key, False) # False means that we don't want the cached copy, but reload the object from the database even if it's available in cache
>>> print same_post
Hello:mrrow
All Document objects that have been fetched with the same key are the
same instance. There's one object per key. Any changes to
the object will be reflected in all the references to it. A
WeakValueDictionary is used to cache all the objects.
>>> same_post is post
True
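The one-object-per-key behaviour can be sketched with the standard library alone. The following is an illustration of the general technique, not Riakkit's implementation; `CachedThing` and its `get` classmethod are hypothetical names.

```python
import weakref


class CachedThing(object):
    """Hypothetical class that hands out one instance per key."""

    # Maps key -> live instance; holds only weak references to the instances.
    _instances = weakref.WeakValueDictionary()

    def __init__(self, key):
        self.key = key

    @classmethod
    def get(cls, key):
        obj = cls._instances.get(key)
        if obj is None:
            # A real mapper would load the data from the database here.
            obj = cls(key)
            cls._instances[key] = obj
        return obj


first = CachedThing.get("abc")
second = CachedThing.get("abc")
same = first is second  # both names refer to the one cached instance
```

Since the dictionary holds only weak references, an entry can be dropped once nothing else refers to the object, so the cache does not keep unused objects alive by itself.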
However, if your data was modified outside of riakkit, you can use the
.reload() method of document objects.
>>> same_post.reload() # Obviously we haven't changed anything, but if we did, this would get those changes
>>> print same_post.title
Hello
You can also use dictionary notation. However, Document is not a subclass of dict!
>>> print same_post.title
Hello
>>> print same_post["title"]
Hello
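Supporting dictionary notation without subclassing dict usually comes down to defining `__getitem__` and `__setitem__`. The sketch below shows that general Python technique; `Record` is a hypothetical class, and nothing here is taken from Riakkit's source.

```python
class Record(object):
    """Hypothetical object that supports obj["name"] without being a dict."""

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)

    def __getitem__(self, name):
        # Item access simply delegates to attribute access.
        return getattr(self, name)

    def __setitem__(self, name, value):
        setattr(self, name, value)


record = Record(title="Hello")
record["content"] = "mrrow"   # same effect as record.content = "mrrow"
```

One consequence of this design is that dict-only methods are not available on such an object unless the class defines them explicitly.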
Need another attribute not in your schema? No problem.
>>> same_post.random_attr = 42
>>> same_post.save() # doctest: +ELLIPSIS
<__main__.BlogPost object at ...>
>>> print same_post.random_attr
42
Again, you can see the changes are instantly reflected on the other reference to it:
>>> print post.random_attr
42
While setting an attribute that's not in your schema is allowed, getting one that's not in the schema AND not already set will raise an AttributeError.
>>> same_post.none_existent
Traceback (most recent call last):
...
AttributeError: Attribute none_existent not found with BlogPost.
Accessing an attribute that's IN your schema but NOT set will return
None, or whatever default value you got. (Some properties already have a
default value. Example: if you don't set a ListProperty, it will return [] if
you get it)
>>> print same_post.some_cool_attribute # Remember? We never set this
None
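The lookup rules above (free-form attributes can be set, unset schema attributes fall back to a default, anything else raises) can be sketched with `__getattr__`. This is an illustration under assumptions, not Riakkit's code; `SchemaRecord` and its `defaults` table are hypothetical.

```python
class SchemaRecord(object):
    """Hypothetical object with schema defaults and free-form extra attributes."""

    # schema attribute -> factory producing its default value
    defaults = {"rating": lambda: None, "tags": list}

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails,
        # i.e. when the attribute was never set on the instance.
        if name in self.defaults:
            return self.defaults[name]()
        raise AttributeError(
            "Attribute %s not found with %s." % (name, type(self).__name__))


record = SchemaRecord()
record.random_attr = 42   # not in the schema, but setting is allowed
```

Using a factory per attribute, rather than a shared default value, avoids handing the same mutable list to every object.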
Deleting objects is equally easy.
>>> same_post.delete()
>>> BlogPost.get(key) #doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
NotFoundError: Key '<yourkey>' not found!
Referencing Documents
You can link to a "foreign" document very easily. Let me illustrate:
>>> class User(Document):
...     bucket_name = "doctest_users"
...     client = some_client
...
...     name = StringProperty(required=True)
...     post = ReferenceProperty(reference_class=BlogPost)
>>> user = User(name="mrrow")
>>> some_post = BlogPost(title="Hello", content="World")
>>> user.post = some_post
>>> user.save() # doctest: +ELLIPSIS
<__main__.User object at ...>
>>> print user.post.title
Hello
>>> same_user = User.load(user.key)
>>> print same_user.post.title
Hello
You can also "back reference" these documents. The API is similar to
Google App Engine's ReferenceProperty.
>>> class Comment(Document):
...     bucket_name = "doctest_comments"
...     client = some_client
...
...     title = StringProperty()
...     owner = ReferenceProperty(reference_class=User,
...                               collection_name="comments")
Note how we specified the reference_class. This will activate additional
validation. The collection_name is the name of the attribute on the referenced
User under which the comments pointing to it are collected.
>>> a_comment = Comment(title="Riakkit ftw!")
>>> a_comment.owner = user
>>> a_comment.save() # doctest: +ELLIPSIS
<__main__.Comment object at ...>
This should save both a_comment and the user object, so there's no need to
call user.reload(). Since the same_user variable is just a reference to user,
there is no need to reload that, either (behaviour introduced after v0.3.2a).
>>> print user.comments[0].title
Riakkit ftw!
>>> print same_user.comments[0].title
Riakkit ftw!
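The back-reference idea can be pictured as the referring object registering itself in a list on the object it points to. The sketch below is a simplified illustration in plain Python, not how Riakkit implements ReferenceProperty; `Owner`, `Note`, and `set_owner` are hypothetical names.

```python
class Owner(object):
    """Hypothetical referenced object; its collections are filled in by referrers."""

    def __init__(self, name):
        self.name = name


class Note(object):
    """Hypothetical referring object with a single `owner` reference."""

    collection_name = "notes"   # attribute that will appear on the owner

    def __init__(self, title):
        self.title = title
        self.owner = None

    def set_owner(self, owner):
        self.owner = owner
        # Create the collection on first use, then register this note in it.
        if not hasattr(owner, self.collection_name):
            setattr(owner, self.collection_name, [])
        getattr(owner, self.collection_name).append(self)


alice = Owner("alice")
note = Note("Riakkit ftw!")
note.set_owner(alice)
```

After `set_owner`, the note is reachable from both directions: `note.owner` and `alice.notes`.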
Let's add another comment.
>>> another_comment = Comment(title="Moo")
Note that ReferenceProperty and MultiReferenceProperty require a
reference_class.
Let's look at MultiReferenceProperty. It's very simple, as it's just a list of
Documents:
>>> class Cake(Document):
...     bucket_name = "test_cake"
...     client = some_client
...
...     type = EnumProperty(["chocolate", "icecream"])
...     owner = MultiReferenceProperty(reference_class=User, collection_name="cakes")
>>> person = User(name="John")
>>> cake = Cake(type="chocolate", owner=[])
>>> cake.owner.append(person)
>>> cake.save() #doctest: +ELLIPSIS
<__main__.Cake object at ...>
>>> print cake.owner[0].name
John
>>> print person.cakes[0].type
chocolate
>>> cake.owner = [user]
>>> cake.save() #doctest: +ELLIPSIS
<__main__.Cake object at ...>
>>> print person.cakes
[]
>>> print cake.owner[0].name
mrrow
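The session above shows that replacing the owner list also removes the cake from the previous owner's collection. One way to picture that bookkeeping is a helper that compares the old and new owner lists. This is a sketch under assumptions, not Riakkit's code; `sync_back_references` and `Person` are hypothetical.

```python
def sync_back_references(item, old_owners, new_owners, collection_name):
    """Hypothetical helper: update each owner's collection after a reassignment."""
    # Owners that were dropped lose the item from their collection.
    for owner in old_owners:
        if owner not in new_owners:
            getattr(owner, collection_name).remove(item)
    # Current owners gain the item, creating the collection if needed.
    for owner in new_owners:
        collection = vars(owner).setdefault(collection_name, [])
        if item not in collection:
            collection.append(item)


class Person(object):
    def __init__(self, name):
        self.name = name


john, mrrow = Person("John"), Person("mrrow")
cake = object()
sync_back_references(cake, [], [john], "cakes")       # cake.owner = [john]
sync_back_references(cake, [john], [mrrow], "cakes")  # cake.owner = [mrrow]
```

After the second call, `john.cakes` is empty and `mrrow.cakes` holds the cake, mirroring the output shown above.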
Advanced Query
Searching
You've seen how to get a document by its key. What about searching and map reduce?
Searching is done through Document.search(querytext). This requires enable_search to be on. Otherwise you're limited to map reduce.
Also, you're required to install the search onto buckets that you will use f

