moma-django overview --> Django + MongoDB: building a custom ORM layer
Moma-Django Overview
Django Boston meetup, 02-27-2014
Django + MongoDB: building a custom ORM layer
Overview of the talk:
moma-django is a MongoDB manager for Django. It provides native Django ORM support for MongoDB documents, including the query API and the admin interface. It was developed as part of two commercial products and released as open source. In this talk we will review the motivation behind its development and its features, and go through two or three examples of how to use some of them: migrating an existing model, advanced queries, and the admin interface. If time permits, we will discuss unit testing and South migrations.
Who are we?
Company: Cloudoscope.com
What we do:
– Cloudoscope’s product enables IT vendors to automate the pre-sales process by collecting and analyzing prospects' IT performance
– Previous product - Lucidel: B2C marketing analytics based on website data
– Data intensive projects / sites, NoSQL, analytics focus (as a way of funding)
Gadi Oren: @gadioren, gadioren
Why moma-django?
Certain problems can be addressed well with NoSQL
The team wants to experiment with a NoSQL database
HOWEVER:
A lot of code needs to be rewritten
The team has to learn a new API
Some of the tools and procedures no longer function and should be replaced
– Admin interface
– Unit testing environment
Some of the data needs to be somewhat de-normalized*
Why moma-django? (our example)
Needed a very efficient way of processing timeseries
The timeseries were constantly growing
We required very detailed search/slice/dice capabilities to find the timeseries to be processed
Some of the data was optional (e.g. demographics information was never complete)
Document size, content and structure varied widely
However, we have a small distributed team and we did not want to create a massive project
We started experimenting using a stub Manager, doing small iterations and adding functionality as we needed over nine months
Other packages
PyMongo – a dependency for moma-django
MongoEngine – somewhat similar concepts in terms of models
Non relational versions of Django
“Native” - advantages
Django packages and plugins (e.g. Admin functionality)
Using similar code conventions
Easier to bring in new team members
Use the same unit testing frameworks (e.g. Jenkins)
Simple experimentation and migration path
Let’s make it interactive
Questions Anyone??? (Example Application)
Small question asking application
Allows voting and adding images
Implemented as a Django application over MongoDB, using moma-django
Register and login at http://momadjango.org
Ask away!
Migrating an existing model
# Before: models based on models.Model
class TstBook(models.Model):
    name = models.CharField(max_length=64)
    publish_date = MongoDateTimeField()
    author = models.ForeignKey('testing.TstAuthor')

    class Meta:
        unique_together = ['name', 'author']

class TstAuthor(models.Model):
    first_name = models.CharField(max_length=32)
    last_name = models.CharField(max_length=32)

# After: the same models with MongoModel as the base class
class TstBook(MongoModel):
    name = models.CharField(max_length=64)
    publish_date = MongoDateTimeField()
    author = models.ForeignKey('testing.TstAuthor')

    class Meta:
        unique_together = ['name', 'author']

class TstAuthor(MongoModel):
    first_name = models.CharField(max_length=32)
    last_name = models.CharField(max_length=32)

# Connect the moma-django handler to the post_syncdb signal
models.signals.post_syncdb.connect(post_syncdb_mongo_handler)
Migrating an existing model (2)
Syncdb:
Add objects
>>> TstBook(name="Good night half moon", publish_date=datetime.datetime(2014, 2, 20), author=TstAuthor.objects.get(first_name="Gadi")).save()
Migrating an existing model (3)
Breaching uniqueness: try to save the same object again
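The screenshot for this slide did not survive in the transcript. As a rough illustration of the behaviour being described, here is a toy in-memory model of a `unique_together` constraint. The class names `UniqueTogetherStore` and `DuplicateKeyError` are hypothetical stand-ins; this is not moma-django code, which presumably relies on a unique MongoDB index instead.

```python
class DuplicateKeyError(Exception):
    """Hypothetical stand-in for the error a unique index would raise."""


class UniqueTogetherStore:
    """Toy document store that enforces one unique_together constraint."""

    def __init__(self, unique_together):
        self.unique_together = tuple(unique_together)
        self._seen = set()      # key tuples already stored
        self.documents = []

    def save(self, document):
        # Build the compound key from the constrained fields.
        key = tuple(document[field] for field in self.unique_together)
        if key in self._seen:
            raise DuplicateKeyError(
                "duplicate value for %s: %r" % (self.unique_together, key))
        self._seen.add(key)
        self.documents.append(document)


store = UniqueTogetherStore(["name", "author"])
book = {"name": "Good night half moon", "author": "Gadi"}
store.save(book)
try:
    store.save(dict(book))  # same name and author: rejected
except DuplicateKeyError as exc:
    print("rejected:", exc)
```

The second `save` fails because the (name, author) pair is already present, which is the situation the slide demonstrates.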
Migrating an existing model (4)
In Mongo: content, indexes
Admin
class Meta:
    unique_together = ['name', 'author']
New field types
MongoIDField – Internal. Used to hold the MongoDB object ID
MongoDateTimeField – Used for datetime values
ValuesField – Used to represent a list of objects of any type
StringListField – Used for a list of strings
DictionaryField – Used as a dictionary
Current limitation: nested structures have limited support
Queries and update – 1: bulk insert
records.append({
    "_id": ObjectId("502abdabf7f16836f100285a"),
    "time_on_site": 290,
    "user_id": 1154449631,
    "account_id": NumberLong(5),
    "campaign": "(not set)",
    "first_visit_date": ISODate("2012-07-30T17:10:06Z"),
    "referral_path": "(not set)",
    "source": "google",
    "exit_page_path": "/some-analysis/lion-king/",
    "landing_page_path": "(not set)",
    "keyword": "wikipedia lion king",
    "date": ISODate("2012-07-30T00:00:00Z"),
    "visit_count": 1,
    "page_views": 3,
    "visit_id": "false---------------1154449631.1343668206",
    "goal_values": {},
    "goal_starts": {},
    "demographics": {},
    "goal_completions": {},
    "location": {"cr": "United States", "rg": "California", "ct": "Pasadena"},
})
UniqueVisit.objects.filter(account__in=self.list_of_accounts).delete()
UniqueVisit.objects.bulk_insert( records )
Queries and update – 2: examples
def ISODate(timestr):
    res = datetime.strptime(timestr, "%Y-%m-%dT%H:%M:%SZ")
    res = res.replace(tzinfo=timezone.utc)
    return res

# Datetime
qs = UniqueVisit.objects.filter(
    first_visit_date__lte=ISODate("2012-07-30T12:29:05Z"))
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'first_visit_date': {'$lte': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))

# Multiple conditions
qs = UniqueVisit.objects.filter(
    first_visit_date__lte=ISODate("2012-07-30T12:29:05Z"),
    time_on_site__gt=10,
    page_views__gt=2)
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'time_on_site': {'$gt': 10.0},
     'page_views': {'$gt': 2},
     'first_visit_date': {'$lte': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))
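To make the mapping in these assertions more concrete, the following is a minimal sketch of how Django-style keyword lookups could be turned into a pymongo-style spec. The function `lookups_to_spec` and the `OPERATORS` table are hypothetical names introduced for illustration; this is not moma-django's implementation, and unlike the slide it does no field-type coercion (such as turning 10 into 10.0 for a float field).

```python
from datetime import datetime, timezone

# Hypothetical mapping from Django lookup suffixes to MongoDB operators.
OPERATORS = {
    "lt": "$lt",
    "lte": "$lte",
    "gt": "$gt",
    "gte": "$gte",
    "ne": "$ne",
    "in": "$in",
}


def lookups_to_spec(**lookups):
    """Translate Django-style keyword lookups into a MongoDB query spec."""
    spec = {}
    for key, value in lookups.items():
        field, sep, suffix = key.rpartition("__")
        if sep and suffix in OPERATORS:
            # e.g. page_views__gt=2 -> {"page_views": {"$gt": 2}}
            spec.setdefault(field, {})[OPERATORS[suffix]] = value
        else:
            # No recognised suffix: treat the whole key as an equality test.
            spec[key] = value
    return spec


spec = lookups_to_spec(
    first_visit_date__lte=datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc),
    time_on_site__gt=10,
    page_views__gt=2,
)
print(spec)
```

Several conditions passed together end up as keys of one spec dictionary, which MongoDB treats as an implicit AND, matching the "Multiple conditions" example.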
Queries and update– 3: examples
# Different query optimizations
qs = UniqueVisit.objects.filter(
    Q(time_on_site=10) | Q(time_on_site=25) | Q(time_on_site=275))
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'time_on_site': {'$in': [10.0, 25.0, 275.0]}}))

# Multiple or Q expressions
qs = UniqueVisit.objects.filter(
    Q(time_on_site=10) | Q(time_on_site=25) | Q(time_on_site=275) | Q(source='bing'))
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'$or': [{'time_on_site': 10.0}, {'time_on_site': 25.0},
             {'time_on_site': 275.0}, {'source': 'bing'}]}))

# Negate Q
qs = UniqueVisit.objects.filter(
    ~Q(first_visit_date=ISODate("2012-07-30T12:29:05Z")))
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'first_visit_date': {'$ne': datetime(2012, 7, 30, 12, 29, 5, tzinfo=timezone.utc)}}))
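The optimization shown in the first two assertions (OR-ed equality tests on a single field collapse into `$in`, mixed fields fall back to `$or`) can be sketched in a few lines. The helper `or_terms_to_spec` is a hypothetical name and works on plain (field, value) pairs rather than real `Q` objects; it only illustrates the idea and is not taken from moma-django.

```python
def or_terms_to_spec(terms):
    """Combine OR-ed (field, value) equality terms into a MongoDB spec.

    If every term tests the same field, a single $in is enough;
    otherwise the terms are listed under $or.
    """
    if not terms:
        raise ValueError("at least one term is required")
    if len(terms) == 1:
        field, value = terms[0]
        return {field: value}
    fields = {field for field, _ in terms}
    if len(fields) == 1:
        field = terms[0][0]
        return {field: {"$in": [value for _, value in terms]}}
    return {"$or": [{field: value} for field, value in terms]}


same_field = or_terms_to_spec(
    [("time_on_site", 10.0), ("time_on_site", 25.0), ("time_on_site", 275.0)])
mixed_fields = or_terms_to_spec(
    [("time_on_site", 10.0), ("time_on_site", 25.0),
     ("time_on_site", 275.0), ("source", "bing")])
print(same_field)
print(mixed_fields)
```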
Queries – 4: extensions beyond standard Django
# Dot notation
qs = UniqueVisit.objects.filter(location__rg__exact="New York")
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'location.rg': 'New York'}))

# Check key existence
qs = UniqueVisit.objects.filter(demographics__age__exists="true")
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'demographics.age': {'$exists': 'true'}}))

# Variable type
qs = UniqueVisit.objects.filter(landing_page_path__type=int)
self.assertEqual(qs.query.spec, dict(
    # pymongo expression
    {'landing_page_path': {'$type': 16}}))
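As a sketch of how such extended lookups might be resolved, the function below splits a lookup key on double underscores, joins the path with dots, and maps the `exists` and `type` suffixes to MongoDB operators. The names `nested_lookup_to_spec` and `BSON_TYPE_CODES` are hypothetical, and the type table covers only a few BSON codes (16 is the 32-bit integer used above); the real moma-django logic may differ.

```python
# Hypothetical subset of BSON type codes keyed by Python type.
BSON_TYPE_CODES = {float: 1, str: 2, bool: 8, int: 16}


def nested_lookup_to_spec(key, value):
    """Translate one nested lookup such as location__rg__exact into a spec."""
    parts = key.split("__")
    suffix = parts[-1]
    path = ".".join(parts[:-1])
    if suffix == "exact":
        return {path: value}
    if suffix == "exists":
        return {path: {"$exists": value}}
    if suffix == "type":
        return {path: {"$type": BSON_TYPE_CODES[value]}}
    # No recognised suffix: the whole key is a dotted path.
    return {".".join(parts): value}


print(nested_lookup_to_spec("location__rg__exact", "New York"))
print(nested_lookup_to_spec("demographics__age__exists", "true"))
print(nested_lookup_to_spec("landing_page_path__type", int))
```

Field names containing single underscores, such as `landing_page_path`, are left intact because only the double underscore acts as a separator.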
Queries - by the structure of documents
>>> # How many documents in the DB?
>>> UniqueVisit.objects.all().count()
20
>>> # For how many documents in the DB do we have age information?
>>> UniqueVisit.objects.filter(demographics__age__exists="true").count()
7
>>> # For how many documents in the DB do we have gender information?
>>> UniqueVisit.objects.filter(demographics__gender__exists="true").count()
3
>>> # For how many documents in the DB do we have gender and age information?
>>> UniqueVisit.objects.filter(demographics__age__exists="true", demographics__gender__exists="true").count()
1
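What an existence test on a dotted path means can be shown without a database. The sketch below emulates it on plain Python dictionaries; `has_path` and `count_with_paths` are hypothetical helpers, and the four sample documents are made up for illustration (they are not the 20-document data set from the talk).

```python
def has_path(document, dotted_path):
    """Return True if every key along a dotted path exists in a nested dict."""
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    return True


def count_with_paths(documents, *dotted_paths):
    """Count documents in which all of the given dotted paths exist."""
    return sum(
        1 for doc in documents
        if all(has_path(doc, path) for path in dotted_paths)
    )


# Illustrative documents only.
visits = [
    {"source": "google", "demographics": {}},
    {"source": "bing", "demographics": {"age": 34}},
    {"source": "google", "demographics": {"age": 27, "gender": "f"}},
    {"source": "direct", "demographics": {"gender": "m"}},
]

print(count_with_paths(visits, "demographics.age"))     # 2
print(count_with_paths(visits, "demographics.gender"))  # 2
print(count_with_paths(visits, "demographics.age", "demographics.gender"))  # 1
```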
Manipulating documents payload
# Store an image: get the image from the "POST" upload form (snippet)
docfile = request.FILES['docfile']
question_id = form.cleaned_data['question_id']
docfile_name = docfile.name
docfile_name_changed = _replace_dots(docfile.name)
question = Question.objects.get(id=question_id)

# Store meta-data
question.docs.update({docfile_name_changed: docfile.content_type})
question.image.update({
    docfile_name_changed + '_url': '/static/display/s_' + docfile_name,
    docfile_name_changed + '_name': docfile_name,
    docfile_name_changed + '_content_type': docfile.content_type})

# Store the actual image binary block (small scale implementation)
file_read = docfile.file.read()  # Note – this is a naïve implementation!
file_data = base64.b64encode(file_read)
question.image.update({docfile_name_changed + '_data': file_data})
question.save()
# Model
class Question(MongoModel):
    user = models.ForeignKey(User)
    date = MongoDateTimeField(db_index=True)
    question = models.CharField(max_length=256)

    docs = DictionaryField(models.CharField())
    image = DictionaryField(models.TextField())
    audio = DictionaryField()
    other = DictionaryField()

    vote_ids = ValuesField(models.IntegerField())

    def __unicode__(self):
        return u'%s[%s %s]' % (self.question, self.date, self.user, )

    class Meta:
        unique_together = ['user', 'question',]
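The payload handling above can be tried outside Django with an ordinary dictionary standing in for the `image` DictionaryField. In this sketch `replace_dots`, `add_image` and `read_image` are hypothetical helpers; in particular, the slide does not show the body of `_replace_dots`, so the version here is only a guess at its purpose (dots in key names would clash with MongoDB's dot notation).

```python
import base64


def replace_dots(name, replacement="_"):
    """Swap dots in a file name so it can be used as a dictionary key."""
    return name.replace(".", replacement)


def add_image(image_field, filename, content_type, raw_bytes):
    """Store an image and its metadata under keys derived from the file name."""
    key = replace_dots(filename)
    image_field.update({
        key + "_name": filename,
        key + "_content_type": content_type,
        # Base64 keeps the binary payload storable as text.
        key + "_data": base64.b64encode(raw_bytes).decode("ascii"),
    })
    return key


def read_image(image_field, key):
    """Decode the stored payload back into bytes."""
    return base64.b64decode(image_field[key + "_data"])


image = {}  # stands in for question.image
key = add_image(image, "cat.png", "image/png", b"\x89PNG\r\n")
print(key, sorted(image))
print(read_image(image, key) == b"\x89PNG\r\n")
```

As the slide itself notes, embedding base64 data in the document is only reasonable at small scale, since base64 inflates the payload by roughly a third.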
Admin interface
So – what’s next?
Github: https://github.com/gadio/moma-django
If you want to contribute – please contact (forking is also an option)
Contact: gadi.oren.1 at gmail.com or gadi at Cloudoscope.com
Backup
South
Dealing with apps with mixed models
South needs to disregard the Mongo models
# Enabling South for the non conventional mongo model
add_introspection_rules(
    [
        (
            (MongoIdField, MongoDateTimeField, DictionaryField),
            [],
            {
                "max_length": ["max_length", {"default": None}],
            },
        ),
    ],
    ["^moma_django.fields.*"],
)
Unit testing
The model name is defined in settings.py
In a unit testing run, a new MongoDB schema is created
– MONGO_COLLECTION prefixed with "test_" (e.g. test_momaexample)
MONGO_HOST = 'localhost'
MONGO_PORT = 27017
MONGO_COLLECTION = 'momaexample'
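The naming rule for test runs can be written as a one-function sketch. `effective_collection_name` is a hypothetical helper, not part of moma-django, and the guard against adding the prefix twice is an assumption made here rather than something stated on the slide.

```python
TEST_PREFIX = "test_"


def effective_collection_name(configured_name, running_tests):
    """Return the name to use, prefixed with 'test_' during a test run."""
    if running_tests and not configured_name.startswith(TEST_PREFIX):
        return TEST_PREFIX + configured_name
    return configured_name


MONGO_COLLECTION = "momaexample"
print(effective_collection_name(MONGO_COLLECTION, running_tests=True))
print(effective_collection_name(MONGO_COLLECTION, running_tests=False))
```

Keeping test data under a separate name means a unit test run never touches the documents used by the running application.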
Moma-django on google…