Dictshield

A typed dictionary for Python... sorta.

DictShield

This project is now called Schematics and exists over here: https://github.com/j2labs/schematics

This code and documentation remain here to preserve the amusing history of this project.

Legacy Docs

Aside from being a cheeky excuse to make people say things that sound sorta dirty, DictShield is a database-agnostic modeling system. It provides a way to model, validate and reshape data easily. All without requiring any particular database.

A blog model might look like this:

from dictshield.document import Document
from dictshield.fields import StringField

class BlogPost(Document):
    title = StringField(max_length=40)
    body = StringField(max_length=4096)

DictShield objects serialize to JSON by default. Store them in Memcached, MongoDB, Riak, whatever you need.

>>> from dictshield.document import Document
>>> from dictshield.fields import StringField
>>> class Comment(Document):
...   name = StringField(max_length=10)
...   body = StringField(max_length=4000)
...
>>> data = {'name':'a hacker', 'body':'DictShield makes validation easy'}
>>> Comment(**data).validate()
True

Let's see what happens if we try using invalid data.

>>> data['name'] = 'a hacker with a name that is too long'
>>> Comment(**data).validate()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/to/site-packages/dictshield/document.py", line 280, in validate
    field._validate(value)
  File "/path/to/site-packages/dictshield/fields/base.py", line 99, in _validate
    self.validate(value)
  File "/path/to/site-packages/dictshield/fields/base.py", line 224, in validate
    self.field_name, value)
dictshield.base.ShieldException: String value is too long - name:a hacker with a name that is too long

Combining DictShield with JSON coming from a web request is quite natural as well. Say we have some data coming in from an iPhone:

import json

json_data = request.post.get('data')
data = json.loads(json_data)

Validating the data then looks like this: Comment(**data).validate().

Easy.
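
In a real request handler the payload may not be valid JSON at all, and validate() raises an exception rather than returning False, as the traceback above shows. Below is a minimal sketch of handling both cases. So that it runs without DictShield installed, ShieldException and validate_comment are stand-ins written for this example, and handle_payload is a hypothetical name; none of this is DictShield's own code.

```python
import json


class ShieldException(Exception):
    """Stand-in for dictshield.base.ShieldException so the sketch runs alone."""


def validate_comment(data):
    """Stand-in for Comment(**data).validate(): enforce the two max lengths."""
    limits = {'name': 10, 'body': 4000}
    for field, max_length in limits.items():
        value = data.get(field)
        if value is not None and len(value) > max_length:
            raise ShieldException(
                'String value is too long - %s:%s' % (field, value))
    return True


def handle_payload(json_data):
    """Return (ok, message) instead of letting exceptions escape the handler."""
    try:
        data = json.loads(json_data)
    except ValueError as exc:  # raised for malformed JSON
        return False, 'malformed JSON: %s' % exc
    if not isinstance(data, dict):
        return False, 'expected a JSON object'
    try:
        validate_comment(data)
    except ShieldException as exc:
        return False, str(exc)
    return True, 'ok'
```

With the real library, the body of validate_comment would presumably be replaced by the Comment(**data).validate() call shown above.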

The Design

DictShield aims to provide helpers for a few common modeling needs. It has been useful on the server side so far, but I believe it could also serve for building an RPC.

  1. Creating Flexible Documents

  2. Easy To Use With Databases Or Caches

  3. A Type System

  4. Validation Of Types

  5. Input / Output Shaping

DictShield also allows object hierarchies to be mapped into dictionaries. This is useful primarily to those who use DictShield to instantiate classes representing their data instead of just filtering dictionaries through the class's static methods.

Example Uses

There are a few ways to use DictShield. A simple case is to create a class structure that has typed fields. DictShield offers multiple types in fields.py, like an EmailField or DecimalField.

Creating Flexible Documents

Below is an example of a Media class with a single field, the title.

from dictshield.document import Document
from dictshield.fields import StringField

class Media(Document):
    """Simple document that has one StringField member
    """
    title = StringField(max_length=40)

You instantiate the class just like you would any Python class. Here is how an instance is represented when serialized to a Python dictionary.

m = Media()
m.title = 'Misc Media'
m.to_python()

The output from this looks like:

{
    '_types': ['Media'],
    '_cls': 'Media',
    'title': u'Misc Media'
}

All the meta information is removed and we have just a barebones representation of our data. Notice that the class information is still there as _cls and _types.

More On Object Modeling

We see two keys that come from Media's meta class: _types and _cls. _types stores the hierarchy of Document classes used to create the document. _cls stores the name of the specific class. This becomes more obvious when I subclass Media to create the Movie document below.

import datetime
from dictshield.fields import IntField

class Movie(Media):
    """Subclass of Media. Adds year and personal_thoughts and limits
    publicly shareable fields to only 'title' and 'year'.
    """
    _public_fields = ['title','year']
    year = IntField(min_value=1950,
                    max_value=datetime.datetime.now().year)
    personal_thoughts = StringField(max_length=255)

Here's an instance of the Movie class:

mv = Movie()
mv.title = u'Total Recall'
mv.year = 1990
mv.personal_thoughts = u'I wish I had three hands...'

This is the document serialized to a Python dictionary:

{
    'personal_thoughts': u'I wish I had three hands...',
    '_types': ['Media', 'Media.Movie'],
    'title': u'Total Recall',
    '_cls': 'Media.Movie',
    'year': 1990
}

Notice that _types has kept track of the relationship between Movie and Media.
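
To make the bookkeeping concrete, here is a minimal sketch of how values like these could be derived from a class hierarchy by walking the method resolution order. It assumes single inheritance, and Document, class_path and type_info are stand-ins written for this example; it is not how DictShield's meta class is actually implemented.

```python
class Document(object):
    """Stand-in base class for the sketch; not DictShield's Document."""


class Media(Document):
    pass


class Movie(Media):
    pass


def lineage(cls):
    """Document subclasses from the root of the hierarchy down to cls."""
    # __mro__ runs from cls up to object, so reverse it and drop
    # everything that is not a proper subclass of Document.
    return [c for c in reversed(cls.__mro__)
            if issubclass(c, Document) and c is not Document]


def class_path(cls):
    """Dotted path such as 'Media.Movie'."""
    return '.'.join(c.__name__ for c in lineage(cls))


def type_info(cls):
    """Build the _cls and _types entries for a class."""
    return {
        '_cls': class_path(cls),
        '_types': [class_path(c) for c in lineage(cls)],
    }
```

Calling type_info(Movie) gives a _cls of 'Media.Movie' and a _types list of 'Media' followed by 'Media.Movie', matching the serialized document above.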

Easy To Use With Databases Or Caches

We could pass this directly to Mongo to save it.

>>> db.test_collection.save(m.to_python())

Or if we were using Riak.

>>> media = bucket.new('test_key', data=m.to_python())
>>> media.store()

Or maybe we're storing JSON in memcached.

>>> mc["test_key"] = m.to_json()
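
Whichever store is used, what goes in is a plain dictionary or a JSON string, so reading it back is an ordinary decode. The sketch below shows the round trip using only the standard library. The cache dictionary is a stand-in for a memcached client, and treating to_json() as equivalent to json.dumps() of the to_python() output is an assumption made for this example.

```python
import json

# Dictionary form of the Movie document shown earlier.
movie = {
    '_cls': 'Media.Movie',
    '_types': ['Media', 'Media.Movie'],
    'title': u'Total Recall',
    'year': 1990,
    'personal_thoughts': u'I wish I had three hands...',
}

cache = {}  # stand-in for a memcached client

# Store the document as a JSON string, then read it back.
cache['test_key'] = json.dumps(movie)
restored = json.loads(cache['test_key'])
```

Because _cls travels with the data, a reader of the cache can tell which document class the dictionary came from.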

A Type System

DictShield has its own type system - every field within a Document is defined with a specific type, for example a string will be defined as StringField. This "strong typing" makes serialising/deserialising semi-structured data to and from Python much more robust.

All Types

A complete list of the types supported by DictShield:

| TYPE | DESCRIPTION |
|------------------------:|:--------------------------------------------------------------------------|
| Text fields | |
| StringField | A unicode string |
| URLField | A valid URL |
| EmailField | A valid email address |
| ID fields | |
| UUIDField | A valid UUID value, optionally auto-populates empty values with new UUIDs |
| ObjectIDField | Wraps a MongoDB "BSON" ObjectId |
| Numeric fields | |
| NumberField | Any number (the parent of all the other numeric fields) |
| IntField | An integer |
| LongField | A long |
| FloatField | A float |
| DecimalField | A fixed-point decimal number |
| Hashing fields | |
| MD5Field | An MD5 hash |
| SHA1Field | An SHA1 hash |
| 'Native type' fields | |
| BooleanField | A boolean |
| DateTimeField | A datetime |
| GeoPointField | A geo-value of the form x, y (latitude, longitude) |
| Containers | |
| ListField | Wraps a standard field, so multiple instances of the field can be used |
| SortedListField | A ListField which sorts the list before saving, so list is always sorted |
| DictField | Wraps a standard Python dictionary |
| MultiValueDictField | Django's implementation of a MultiValueDict. |
| EmbeddedDocumentField | Stores a DictShield EmbeddedDocument |

Fields can also receive some arguments for customizing their behavior. The currently accepted arguments are:

| ARGUMENT | DESCRIPTION |
|--------------------------:|:-------------------------------------------------------------------------|
| field_name=None | The name of the field in serialized form. |
| required=False | This field must have a value or validation and serialization will fail. |
| default=None | Either a default value or callable that produces a default. |
| id_field=False | Set to True if this field should be used as the id field. |
| validation=None | Supply an alternate function for validation for this field. |
| choices=None | Limit the possible values for this field by passing a list. |
| description=None | Set an alternate field description for serialization to jsonschema. |
| minimized_field_name=None | Name of the field to use when serializing the document with short names. |
| uniq_field=None | Legacy arg. Will be removed soon. |
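
As a rough illustration of how three of these arguments interact, the sketch below applies default (a plain value or a callable), required and choices to an incoming value. SketchField and its resolve method are hypothetical names invented for this example. The real fields raise ShieldException and may order these checks differently, so treat this as a model of the idea rather than of the library.

```python
import uuid


class SketchField(object):
    """Hypothetical stand-in for a field; not a DictShield class."""

    def __init__(self, required=False, default=None, choices=None):
        self.required = required
        self.default = default
        self.choices = choices

    def resolve(self, value=None):
        """Apply the default, then check required and choices."""
        if value is None:
            # A callable default is invoked on each use, so every
            # document gets a freshly produced value.
            value = self.default() if callable(self.default) else self.default
        if value is None:
            if self.required:
                raise ValueError('a value is required')
            return None
        if self.choices is not None and value not in self.choices:
            raise ValueError('%r is not one of %r' % (value, self.choices))
        return value


status = SketchField(default='draft', choices=['draft', 'published'])
token = SketchField(default=lambda: uuid.uuid4().hex)
```

Here status.resolve() falls back to 'draft', status.resolve('deleted') raises because the value is not among the choices, and each call to token.resolve() produces a new hex string.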

A Close Look at the MD5Field

This is what the MD5Field
