What's new in Django community blogs?

Database Handling in Python

May 23 2016 [Archived Version] □ Published at tuts+

In the information age we are living in, the world exchanges an enormous amount of data. We are constantly creating, storing, and retrieving data. There has to be a way to handle all of that; it can't just be spread everywhere without any management, right? This is where the Database Management System (DBMS) comes in.

A DBMS is a software system that enables you to create, store, modify, retrieve, and otherwise handle data from a database. Such systems vary in size, ranging from small systems that simply run on your personal computer to larger ones running on mainframes.

Our focus in this tutorial is on Python rather than database design. Yes, Python is wonderfully able to interact with databases, and this is what I'm going to show you in this tutorial.

Let's get started!

Python Database API

As mentioned above, Python is able to interact with databases. But, how can it do that? Python uses what's called the Python Database API in order to interface with databases. This API allows us to program different database management systems (DBMS). For those different DBMS, however, the process followed on the code level is the same, which is as follows:

  1. Establish a connection to your database of choice.
  2. Create a cursor to communicate with the data.
  3. Manipulate the data using SQL (interact).
  4. Tell the connection to either apply the SQL manipulations to the data and make them permanent (commit), or tell it to abort those manipulations (rollback), thus returning the data to the state before the interactions occurred.
  5. Close the connection to the database.
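
The difference between commit and rollback in step 4 can be sketched as follows. This is a minimal sketch, assuming Python's built-in sqlite3 module (introduced in the next section) and an in-memory database; the variable names `after_rollback` and `after_commit` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # 1. connect (in-memory, nothing is written to disk)
curs = conn.cursor()                      # 2. create a cursor
curs.execute("create table employee(name, age)")
conn.commit()

# 3. manipulate the data, then 4. abort the change
curs.execute("insert into employee values ('Ali', 28)")
conn.rollback()
after_rollback = curs.execute("select count(*) from employee").fetchone()[0]

# 3. manipulate the data again, then 4. make the change permanent
curs.execute("insert into employee values ('Ali', 28)")
conn.commit()
after_commit = curs.execute("select count(*) from employee").fetchone()[0]

conn.close()                              # 5. close the connection
print(after_rollback, after_commit)
```

The insert that was rolled back leaves the table empty, while the committed one stays.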


SQLite is an open-source, full-featured, lightweight, SQL-based database management system (SQL queries can be run on SQLite tables). It is self-contained (requires little support from external libraries), serverless (the database engine does not run as a separate server; the database is stored locally), and zero-configuration (nothing to install or configure), and it stores its data in a single file.

The nice thing to know is that SQLite is used by large companies like Google, Apple, and Microsoft, which speaks to its reliability. In this tutorial, we are going to use SQLite to interact with the database, and more specifically will be working with the sqlite3 module in Python.

Python and SQLite

As mentioned above, working with databases involves five main steps. Let's see those steps in action.

1. Establish a Connection to Your Database of Choice

This step is achieved as follows:

conn = sqlite3.connect('company.db')

As mentioned in the sqlite3 documentation:

To use the module, you must first create a Connection object that represents the database.

In the above code, notice that the data will be stored in the file company.db.

2. Create a Cursor to Communicate With the Data

The next step in working with the database is creating a cursor, as follows:

curs = conn.cursor()

3. Manipulate the Data Using SQL

After connecting with the database and creating a cursor, we are now ready to work (interact) with data. In other words, we can now run SQL commands on the database company.db.

Let's say we want to create a new table employee in our database company. In this case, we need to run a SQL command. In order to do that, we will use the execute() method of the cursor object we created in the previous step. The Python statement will thus look as follows:

curs.execute('create table employee(name, age)')

This statement will run a SQL command that will create a table called employee, with two columns (fields) name and age.

We can now run a new SQL command that will insert data in the table, as follows:

curs.execute("insert into employee values ('Ali', 28)") 

You can also insert multiple values at once, as follows:

values = [('Brad',54), ('Ross', 34), ('Muhammad', 28), ('Bilal', 44)]

In this case, rather than using the method execute(), we will use the method executemany(), which runs the same insert statement once for each tuple in values:

curs.executemany('insert into employee values(?,?)', values)

4. Commit the Changes

In this step, we would like to apply (commit) the changes we have made in the previous step. This is simply done as follows:

conn.commit()

5. Close the Connection to the Database

After performing our manipulations and committing the changes, the last step will be to close the connection:

conn.close()

Let's put all the steps together in one script. The program will look as follows (notice that we have to import the sqlite3 module first):
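
A minimal sketch of such a script, assembled from the statements shown in the steps above. The function name `create_company_db` and the throwaway temporary directory are assumptions made so the example can be run repeatedly; to get company.db in your current directory, call `create_company_db('company.db')` instead.

```python
import os
import sqlite3
import tempfile


def create_company_db(path):
    """Run the five steps from this tutorial against the database file at `path`."""
    conn = sqlite3.connect(path)                          # 1. connect
    curs = conn.cursor()                                  # 2. create a cursor
    curs.execute('create table employee(name, age)')      # 3. manipulate data
    curs.execute("insert into employee values ('Ali', 28)")
    values = [('Brad', 54), ('Ross', 34), ('Muhammad', 28), ('Bilal', 44)]
    curs.executemany('insert into employee values(?,?)', values)
    conn.commit()                                         # 4. commit
    conn.close()                                          # 5. close


# Demo in a temporary directory so that re-running never hits an existing table.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, 'company.db')
    create_company_db(db_path)
    check = sqlite3.connect(db_path)
    row_count = check.execute('select count(*) from employee').fetchone()[0]
    check.close()

print(row_count)
```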

If you run the script, you should get a file called company.db in your current directory. Download this file as we will use it in the next step.

Let's Browse the Database

Having created a database, a table, and added some data, let's see what's inside company.db (the file you downloaded in the above section). For this, we are going to use a nice tool: DB Browser for SQLite. Go ahead and download the tool on your machine. Once you open the program, you should get a screen that looks as follows:

DB Browser for SQLite home screen

Open the database using the Open Database button at the top, in which case you should get the Database Structure, as follows:

DB Browser for SQLite Database Structure screen

Notice that we have the table employee listed, with two fields name and age.

In order to confirm that our code above worked and the data has been added to the table, click on the Browse Data tab. You should see something like the following:

DB Browser for SQLite Browse Data tab

So, as you can see, a database (company) and a table (employee) have been created, and data has been successfully added to the table.

This tutorial only scratched the surface to get you started in working with databases using Python. You can learn about more methods from the sqlite3 module, where you will be able to carry out different database operations such as updating and querying the database. Have fun!
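
As a starting point for updating and querying, here is a minimal sketch using the same employee table in an in-memory database. The new age for Ali and the `age > 30` filter are made-up values for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
curs = conn.cursor()
curs.execute("create table employee(name, age)")
curs.executemany(
    "insert into employee values(?,?)",
    [("Ali", 28), ("Brad", 54), ("Ross", 34), ("Muhammad", 28), ("Bilal", 44)],
)

# Update one row; the ? placeholders keep the values separate from the SQL text.
curs.execute("update employee set age = ? where name = ?", (29, "Ali"))
updated_rows = curs.rowcount
conn.commit()

# Query: fetchall() returns a list of tuples, one per matching row.
curs.execute("select name, age from employee where age > ? order by age", (30,))
older_than_30 = curs.fetchall()

conn.close()
print(updated_rows, older_than_30)
```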

Django meetup Amsterdam 18 May 2016

May 23 2016 [Archived Version] □ Published at Reinout van Rees' weblog under tags  django

Summary of the Django meetup organized at crunchr in Amsterdam, the Netherlands.

(I gave a talk on the django admin, which I of course don't have a summary of, yet, though my brother made a summary of an almost-identical talk I did the Friday before)

Reducing boilerplate with class-based views - Priy Werry

A view can be more than just a function. Views can also be class-based, and django ships with quite a lot of them. For example the TemplateView, which makes rendering a template very quick. The point: boilerplate reduction.

Django REST framework is a good example of class-based view usage. It really helps you to reduce the amount of boring boilerplate and concentrate on your actual code.

Examples of possible boilerplate code:

  • Parameter validation.
  • Pagination.
  • Ordering.
  • Serialisation.

They wanted to handle this a bit like django's middleware mechanism, but then view-specific. So they wrote a base class that performed most of the boilerplate steps. So the actual views could be fairly simple.

It also helps with unit testing: normally you'd have to test all the corner cases in all your views, now you only have to test your base class for that.

Custom base classes also often mean you have methods that you might re-define in subclasses to get extra functionality. In those cases make sure you call the parent class's original method (when needed).
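
The idea can be sketched in plain Python, without Django. This is not the speaker's code: `BaseListView`, `EmployeeListView` and all method names are hypothetical, and a real implementation would build on Django's or Django REST framework's view classes.

```python
class BaseListView:
    """Hypothetical base class doing the boilerplate once for all views."""

    page_size = 2
    allowed_ordering = ("name",)

    def get_items(self):
        raise NotImplementedError

    def validate(self, params):
        # Parameter validation
        ordering = params.get("ordering", self.allowed_ordering[0])
        if ordering not in self.allowed_ordering:
            raise ValueError("cannot order by %r" % ordering)
        page = int(params.get("page", 1))
        if page < 1:
            raise ValueError("page must be >= 1")
        return ordering, page

    def serialize(self, item):
        return dict(item)

    def handle(self, params):
        ordering, page = self.validate(params)
        items = sorted(self.get_items(), key=lambda item: item[ordering])  # ordering
        start = (page - 1) * self.page_size                                # pagination
        return [self.serialize(item) for item in items[start:start + self.page_size]]


class EmployeeListView(BaseListView):
    """The actual view only states what is specific to it."""

    allowed_ordering = ("name", "age")

    def get_items(self):
        return [
            {"name": "Brad", "age": 54},
            {"name": "Ali", "age": 28},
            {"name": "Ross", "age": 34},
        ]

    def serialize(self, item):
        data = super().serialize(item)  # call the parent's original method first
        data["label"] = "%s (%d)" % (item["name"], item["age"])
        return data


first_page = EmployeeListView().handle({"ordering": "age", "page": "1"})
second_page = EmployeeListView().handle({"ordering": "age", "page": "2"})
print([row["label"] for row in first_page])
```

The corner cases (bad ordering, bad page number) only need tests on the base class.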

Users of your views (especially when it is an API) get the advantage of having a more consistent API. It is automatically closer to the specification. The API should also be easier to read.

Meetups on steroids - Bob Aalsma

"Can I get a consultant for this specific subject?" Some managers find it difficult to allow this financially. A special course with a deep-dive is easier to allow.

He would like to be a kind of a broker between students and teachers to arrange it: "Meetups on steroids: pick the subject - any subject; pick the date - any date; pick the group - any group"

Security in Django - Joery van der Zwart

Joery comes out of the security world. He doesn't know django from the inside, but he knows a lot about it from the outside: he has tested a lot of django sites.

Security is as strong as its weakest link. People are often the weakest link. Django doesn't protect you if you explicitly circumvent its security mechanisms as a programmer.

Django actually protects you a lot!

A good thing is to look at the OWASP list of top 10 errors. (See also Florian Apolloner's talk at 'Django under the Hood' 2015, for instance).

  • SQL injection. Protection is integrated in django. But watch out when doing raw sql queries, because they are really raw and unprotected. If you work through the model layer, you're safe.
  • Authentication and sessions. Django's SessionSecurityMiddleware is quite good. He has some comments on authentication, though, so he advises doing that one yourself. (Note: the local core committer looked quite suspicious as this was the first he heard about it. Apparently there are a number of CVEs that are unfixed in Django. Joery will share the numbers.)
  • XSS injection. User-fillable fields that aren't escaped. Django by default... yes, protects you against this. Unless you use {% autoescape off %}, so don't do that.
  • Direct object reference. He doesn't agree with this point. So ignore it.
  • Security misconfiguration. Basically common sense. Don't have DEBUG = True on in your production site. Django's output is very detailed and thus very useful for anyone breaking into your site.
  • Sensitive data. Enable https. Django doesn't enforce it. But use https. Extra incentive: google lowers the page ranking for non-https sites...
  • Access control. It is very very very hard to get into Django this way. He says django is one of the few systems to fix it this rigidly!
  • CSRF. Django protects you. Unless you explicitly use @csrf_exempt...
  • Known vulnerabilities. Update django! Because there have been fixes in django. Older versions are thus broken.
  • Insecure forwards/redirects. Once you've enabled django's default middleware, you're secure.
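
The raw-SQL warning in the first bullet can be illustrated without Django. A minimal sketch, assuming the standard-library sqlite3 module and a made-up users table; the same principle (pass values as parameters, never paste them into the SQL string) applies to raw queries in Django.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table users (name text, is_admin integer)")
conn.executemany("insert into users values (?, ?)", [("alice", 1), ("bob", 0)])

# Input as an attacker might submit it in a form field.
malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes the query.
unsafe_sql = "select name from users where name = '%s'" % malicious
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is passed as a parameter and is only ever treated as a value.
safe_rows = conn.execute(
    "select name from users where name = ?", (malicious,)
).fetchall()

conn.close()
print(len(unsafe_rows), len(safe_rows))
```

The string-formatted query matches every user; the parameterized one matches none.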

So Django is quite secure, but you are not.

Look at django's security documentation. And look at https://www.ponycheckup.com. You can check your site with it. The good thing is that it is simple. It only checks django itself, though.

With some other tools (like nessus) you have to watch out for false positives, though. So if you don't know how to interpret the results, you'll be scared shitless.

A good one: Qualys SSLlabs https checker to get your ssl certificate completely right. (Note: see my blog post fixing ssl certificate chains for some extra background.)

"OWASP ZAP": open source tool that combines checkers and reduces the number of false positives.

The summary:

  • Good: django with middleware.
  • Good: django provides a good starting point.
  • Bad: experimenting. Be very sure you're doing it right. Look at documentation.
  • Bad: do it yourself. Most of the times.

Django girls Amsterdam

On 25 June there'll be a django girls workshop in Amsterdam. Everything's set, but they do still need coaches.

Deploying a Django Website on Heroku

May 21 2016 [Archived Version] □ Published at DjangoTricks under tags  administration advanced amazon web services deployment git

Photo by Frances Gunn

Once you have a working project, you have to host it somewhere. One of the most popular deployment platforms nowadays is Heroku. Heroku belongs to the Platform as a Service (PaaS) category of cloud computing services. Every Django project you host on Heroku runs inside a smart container in a fully managed runtime environment. Your project can scale horizontally (by adding more computing machines) and you pay for what you use, starting with a free tier. Moreover, you won't need many system administration skills to do the deployment: once you do the initial setup, further deployment is as simple as pushing your Git repository to a special heroku remote.

However, there are some gotchas to know before choosing Heroku for your Django project:

  • You have to use a PostgreSQL database with your project. MySQL is not an option.
  • You cannot store your static and media files on Heroku. Use Amazon S3 or some other storage for that.
  • There is no mail server associated with Heroku. You can use the third-party SendGrid plugin at additional cost, the Gmail SMTP server with its limits on the number of sent emails, or some other SMTP server.
  • The Django project must be version-controlled under Git.
  • Heroku works with Python 2.7. Python 3 is not yet supported.

Recently I deployed a small Django project on Heroku. To have a quick reference for the future, I summed up the process here.

1. Install Heroku Toolbelt

Sign up for a Heroku account. Then install Heroku tools for doing all the deployment work in the shell.

To connect your shell with Heroku, type:

$ heroku login

When asked, enter your Heroku account's email and password.

2. Prepare Pip Requirements

Activate your project's virtual environment and install Python packages required for Heroku:

(myproject_env)$ pip install django-toolbelt

This will install django, psycopg2, gunicorn, dj-database-url, static3, and dj-static to your virtual environment.

Install boto and Django Storages to be able to store static and media files on an S3 bucket:

(myproject_env)$ pip install boto
(myproject_env)$ pip install django-storages

Go to your project's directory and create the pip requirements that Heroku will use in the cloud for your project:

(myproject_env)$ pip freeze -l > requirements.txt

3. Create Heroku-specific Files

You will need two files to tell Heroku what Python version to use and how to start a webserver.

In your project's root directory create a file named runtime.txt with the following content:


Then at the same location create a file named Procfile with the following content:

web: gunicorn myproject.wsgi --log-file -

4. Configure the Settings

As mentioned in the "Web Development with Django Cookbook - Second Edition", we keep the development and production settings in separate files, both importing the common settings from a base file.

Basically we have myproject/conf/base.py with the settings common for all environments.

Then myproject/conf/dev.py contains the local database and dummy email configuration as follows:

# -*- coding: UTF-8 -*-
from __future__ import unicode_literals
from .base import *

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "HOST": "localhost",
        "NAME": "myproject",
        "PORT": "",
        "USER": "postgres",
    }
}

EMAIL_BACKEND = "django.core.mail.backends.console.EmailBackend"

Lastly for the production settings we need myproject/conf/prod.py with special database configuration, non-debug mode, and unrestrictive allowed hosts as follows:

# -*- coding: UTF-8 -*-
from __future__ import unicode_literals
from .base import *
import dj_database_url

DATABASES = {
    "default": dj_database_url.config()
}

ALLOWED_HOSTS = ["*"]

DEBUG = False

Now let's open myproject/settings.py and add the following content:

# -*- coding: UTF-8 -*-
from __future__ import unicode_literals
from .conf.dev import *

Finally, open the myproject/wsgi.py and change the location of the DJANGO_SETTINGS_MODULE there:

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.conf.prod")

5. Set Up Amazon S3 for Static and Media Files

Create an Amazon S3 bucket myproject.media at the AWS Console (web interface for Amazon Web Services). Go to the properties of the bucket, expand the "Permissions" section, click on the "add bucket policy" button and enter the following:

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "AllowPublicRead",
            "Effect": "Allow",
            "Principal": {
                "AWS": "*"
            },
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::myproject.media/*"
        }
    ]
}

This ensures that files on the S3 bucket will be accessible publicly without any API keys.

Go back to your Django project and add storages to the INSTALLED_APPS in myproject/conf/base.py:

INSTALLED_APPS = (
    # ...
    "storages",
)

Media files and static files will be stored on different paths under S3 bucket. To implement that, we need to create two Python classes under a new file myproject/s3utils.py as follows:

# -*- coding: UTF-8 -*-
from __future__ import unicode_literals
from storages.backends.s3boto import S3BotoStorage

class StaticS3BotoStorage(S3BotoStorage):
    """
    Storage for static files.
    """
    def __init__(self, *args, **kwargs):
        kwargs['location'] = 'static'
        super(StaticS3BotoStorage, self).__init__(*args, **kwargs)

class MediaS3BotoStorage(S3BotoStorage):
    """
    Storage for uploaded media files.
    """
    def __init__(self, *args, **kwargs):
        kwargs['location'] = 'media'
        super(MediaS3BotoStorage, self).__init__(*args, **kwargs)

Finally, let's edit the myproject/conf/base.py and add AWS settings:

AWS_S3_SECURE_URLS = False       # use http instead of https
AWS_QUERYSTRING_AUTH = False # don't add complex authentication-related query parameters for requests
AWS_S3_ACCESS_KEY_ID = "..." # Your S3 Access Key
AWS_S3_SECRET_ACCESS_KEY = "..." # Your S3 Secret
AWS_STORAGE_BUCKET_NAME = "myproject.media"
AWS_S3_HOST = "s3-eu-west-1.amazonaws.com" # Change to the endpoint of the region you chose when creating the bucket

STATICFILES_STORAGE = "myproject.s3utils.StaticS3BotoStorage"
DEFAULT_FILE_STORAGE = "myproject.s3utils.MediaS3BotoStorage"

# the next monkey patch is necessary to allow dots in the bucket names
# (note that it switches off HTTPS certificate verification for the whole process)
import ssl
if hasattr(ssl, '_create_unverified_context'):
    ssl._create_default_https_context = ssl._create_unverified_context

Collect static files to the S3 bucket:

(myproject_env)$ python manage.py collectstatic --noinput

6. Set Up Gmail to Send Emails

Open myproject/conf/prod.py and add the following settings:

EMAIL_HOST = "smtp.gmail.com"
EMAIL_HOST_USER = "[email protected]"
EMAIL_HOST_PASSWORD = "mygmailpassword"

7. Push to Heroku

Commit and push all the changes to your Git origin remote. Personally I prefer using SourceTree to do that, but you can also do that in the command line, PyCharm, or other software.

In your project directory type the following:

(myproject_env)$ heroku create my-unique-project

This will create a Git remote called "heroku", and a new Heroku project "my-unique-project" which can be later accessed at http://my-unique-project.herokuapp.com.

Push the changes to heroku remote:

(myproject_env)$ git push heroku master

8. Transfer Your Local Postgres Database To Heroku

Create local database dump:

(myproject_env)$ PGPASSWORD=mypassword pg_dump -Fc --no-acl --no-owner -h localhost -U myuser mydb > mydb.dump

Upload the database dump temporarily to some server, for example, S3 bucket: http://myproject.media.s3-eu-west-1.amazonaws.com/mydb.dump. Then import that dump into the Heroku database:

(myproject_env)$ heroku pg:backups restore 'http://myproject.media.s3-eu-west-1.amazonaws.com/mydb.dump' DATABASE_URL

Remove the database dump from S3 server.

9. Set Environment Variables

If your Git repository is not private, put your secret values in environment variables rather than in the Git repository directly.

(myproject_env)$ heroku config:set AWS_S3_ACCESS_KEY_ID=ABCDEFG123
(myproject_env)$ heroku config:set AWS_S3_SECRET_ACCESS_KEY=aBcDeFg123

To read out the environment variables you can type:

(myproject_env)$ heroku config

To read out the environment variables in the Python code open myproject/conf/base.py and type:

import os
AWS_S3_ACCESS_KEY_ID = os.environ.get("AWS_S3_ACCESS_KEY_ID", "")
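
If you would rather have a missing secret fail loudly than silently fall back to an empty string, a small helper is one option. This is a minimal sketch: `require_env` is a hypothetical helper, not part of Django or Heroku, and the `setdefault` calls only provide demo values so the snippet runs outside Heroku.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError("Missing required environment variable: %s" % name)


# Demo values only; on Heroku these come from `heroku config:set`.
os.environ.setdefault("AWS_S3_ACCESS_KEY_ID", "ABCDEFG123")
os.environ.setdefault("AWS_S3_SECRET_ACCESS_KEY", "aBcDeFg123")

AWS_S3_ACCESS_KEY_ID = require_env("AWS_S3_ACCESS_KEY_ID")
AWS_S3_SECRET_ACCESS_KEY = require_env("AWS_S3_SECRET_ACCESS_KEY")
```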

10. Set DNS Settings

Open your domain settings and set CNAME to "my-unique-project.herokuapp.com".

At last, you are done! Drop in the comments if I missed some part. For the new updates, see the next section.

*. Update Production

Push the changes to heroku remote:

(myproject_env)$ git push heroku master

If you have changed something in the static files, collect them again:

(myproject_env)$ python manage.py collectstatic --noinput

Collecting static files to the S3 bucket takes quite a long time, so I do not recommend doing that automatically every time you deploy to Heroku.

Further Reading

You can read more about Django on Heroku in the following resources:


May 21 2016 [Archived Version] □ Published at Latest Django packages added

Django 1.10 alpha 1 released

May 20 2016 [Archived Version] □ Published at The Django weblog

As part of the Django 1.10 release process, today we've released Django 1.10 alpha 1, a preview/testing package that represents the first stage in the 1.10 release cycle and an opportunity for you to try out the changes coming in Django 1.10.

Django 1.10 has a panoply of new features which you can read about in the in-development 1.10 release notes.

This alpha milestone marks a complete feature freeze. The current release schedule calls for a beta release in about a month and a release candidate about a month from then. We'll only be able to keep this schedule if we get early and frequent testing from the community. Updates on the release schedule are available on the django-developers mailing list.

As with all alpha and beta packages, this is not for production use. But if you'd like to take some of the new features for a spin, or to help find and fix bugs (which should be reported to the issue tracker), you can grab a copy of the alpha package from our downloads page or on PyPI.

The PGP key ID used for this release is Tim Graham: 1E8ABDC773EDE252.


May 18 2016 [Archived Version] □ Published at Latest Django packages added

How to migrate your existing Django project to Heroku

May 18 2016 [Archived Version] □ Published at Random notes from Zena » django under tags  django git heroku python sysadmin

Recently I had some fun with Heroku, the well known PaaS provider. I had a small personal Django project I use for invoicing that I ran locally with ./manage.py runserver when needed. That was a perfect candidate for the Heroku free plan because I need to access the app only occasionally. In this tutorial I assume you […]


May 17 2016 [Archived Version] □ Published at Latest Django packages added

Mark Lavin to Give Keynote at Python Nordeste

May 16 2016 [Archived Version] □ Published at Caktus Blog

Mark Lavin will be giving the keynote address at Python Nordeste this year. Python Nordeste is the largest gathering of the Northeast Python community, which takes place annually in cities of northeastern Brazil. This year’s conference will be held in Teresina, the capital of the Brazilian state of Piauí. Mark will be speaking from his...


May 16 2016 [Archived Version] □ Published at Latest Django packages added

fun with postgres

May 14 2016 [Archived Version] □ Published at from __future__ import braces

=# select case when u.is_superuser='t' then 'superuser' when u.is_superuser='f' then 'nsp' end as status from userprofile_user u;

=# select concat('user is superuser',case when u.is_superuser='t' then ' = True' end, case when u.is_superuser='f' then ' = False' end) as status from userprofile_user u;
 user is superuser = True
 user is superuser = False
 user is superuser = False

Find the count of rows satisfying two or more conditions
=# select count( case when u.is_superuser='t' then 1 end) as num_superusers, count(case when u.is_superuser='f' then 1 end) as num_non_superusers, count(*) as total from userprofile_user u;

 num_superusers | num_non_superusers | total 
              7 |                 36 |    43
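
The same conditional count can be tried out from Python. A minimal sketch, assuming SQLite (via the standard-library sqlite3 module) instead of PostgreSQL and three made-up rows; SQLite stores booleans as 0 and 1, so the comparison uses `= 1` rather than `='t'`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table userprofile_user (email text, is_superuser integer)")
conn.executemany(
    "insert into userprofile_user values (?, ?)",
    [("a@example.com", 1), ("b@example.com", 0), ("c@example.com", 0)],
)

# count() skips NULLs, and a CASE without ELSE yields NULL when no branch matches.
num_superusers, num_non_superusers, total = conn.execute(
    "select count(case when u.is_superuser = 1 then 1 end), "
    "count(case when u.is_superuser = 0 then 1 end), "
    "count(*) from userprofile_user u"
).fetchone()

conn.close()
print(num_superusers, num_non_superusers, total)
```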

In MySQL, use the built-in RAND function in conjunction with LIMIT and ORDER BY:
mysql> select ename,job from emp order by rand() limit 5;

In PostgreSQL, use the built-in RANDOM function in conjunction with LIMIT and ORDER BY:
=# select ename,job from emp order by random() limit 5;

Use the function COALESCE to substitute real values for nulls:
=# select coalesce(b.owner_id, 0) from basket_basket b;
note: default 0 here, as owner_id is integer

=# select id, coalesce(owner_id, 0) as owners from basket_basket order by owner_id asc;

 id  | owners 
  69 |     32
 164 |     28
 105 |      2
  90 |      2
  25 |      0

=# select id, coalesce(owner_id, 0) as owners from basket_basket order by random() desc;

Sorting by Substrings
=# select email from userprofile_user order by substring(email, 1,2);
note: orders by the first 2 characters

Find the number of superusers and non-superusers in userprofile_user
=# select is_superuser, count(*) from userprofile_user group by is_superuser;

 is_superuser | count 
 f            |    38
 t            |     7
(2 rows)

Using case expression in order by clause:

=# select u.email, u.salary, u.job from userprofile_user u order by case when job = 'SALESMAN' then u.job else u.salary end;

=# select u.email, u.id from userprofile_user u union select b.status, b.owner_id from basket_basket b;

UNION combines the result sets of two or more select statements.
constraints: same number and type of columns in both select statements (same number means e.g. 2 columns in each, same type means matching types in the same order)

union => distinct results
union all => duplicates included

SELECT City, Country FROM Customers
WHERE Country='Germany'
UNION
SELECT City, Country FROM Suppliers
WHERE Country='Germany';
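
The difference between union and union all can be checked from Python. A minimal sketch, assuming SQLite (standard-library sqlite3) rather than PostgreSQL, with made-up rows in Customers and Suppliers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Customers (City text, Country text)")
conn.execute("create table Suppliers (City text, Country text)")
conn.executemany(
    "insert into Customers values (?, ?)",
    [("Berlin", "Germany"), ("Aachen", "Germany"), ("Paris", "France")],
)
conn.executemany(
    "insert into Suppliers values (?, ?)",
    [("Berlin", "Germany"), ("Frankfurt", "Germany")],
)

query = (
    "select City, Country from Customers where Country = 'Germany' "
    "{op} "
    "select City, Country from Suppliers where Country = 'Germany' "
    "order by City"
)
distinct_rows = conn.execute(query.format(op="union")).fetchall()
all_rows = conn.execute(query.format(op="union all")).fetchall()

conn.close()
print(len(distinct_rows), len(all_rows))
```

Berlin appears in both tables, so union returns it once and union all returns it twice.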


May 14 2016 [Archived Version] □ Published at Latest Django packages added

Pygrunn keynote: the future of programming - Steven Pemberton

May 13 2016 [Archived Version] □ Published at Reinout van Rees' weblog under tags  pygrunn python

(One of my summaries of the one-day 2016 PyGrunn conference).

Steven Pemberton (https://en.wikipedia.org/wiki/Steven_Pemberton) is one of the developers of ABC, a predecessor of python.

He's a researcher at CWI in Amsterdam. CWI was the first non-military internet site in Europe, in 1988, when the whole of Europe was still connected to the USA with a 64kb link.

When designing ABC they were considered completely crazy because it was an interpreted language. Computers were slow at that time. But they knew about Moore's law. Computers would become much faster.

At that time computers were very, very expensive. Programmers were basically free. Now it is the other way. Computers are basically free and programmers are very expensive. So, at that time, in the 1950s, programming languages were designed around the needs of the computer, not the programmer.

Moore's law is still going strong. Despite many articles claiming its imminent demise. He heard the first one in 1977. Steven showed a graph of his own computers. It fits.

On modern laptops, the CPU is hardly doing anything most of the time. So why use programming languages optimized for giving the CPU a rest?

There's another cost. The more lines a program has, the more bugs there are in it. But it is not a linear relationship. More like lines ^ 1.5. So a program with 10x more lines probably has 30x more bugs.
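
The arithmetic behind that estimate, as a small sketch. The function name `relative_bugs` is made up; the exponent 1.5 is the one quoted in the talk.

```python
def relative_bugs(size_ratio, exponent=1.5):
    """How many times more bugs to expect when a program is size_ratio times longer."""
    return size_ratio ** exponent


ten_times_longer = relative_bugs(10)
print(round(ten_times_longer, 1))  # 31.6, i.e. roughly the "30x" from the talk
```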

Steven thinks the future of programming is in declarative programming instead of in procedural programming. Declarative code describes what you want to achieve and not how you want to achieve it. It is much shorter.

Procedural code would have specified everything in detail. He showed a code example of 1000 lines. And a declarative one of 15 lines. Wow.

He also showed an example with xforms, which is declarative. Projects that use it regularly report a factor of 10 in savings compared to more traditional methods. He mentioned a couple of examples.

Steven doesn't necessarily want us all to jump on Xforms. It might not fit with our use cases. But he does want us to understand that declarative languages are the way to go. The approach has been proven.

In response to a question he compared it to the difference between roman numerals and arabic numerals and the speed difference in using them.

(The sheets will be up on http://homepages.cwi.nl/~steven/Talks/2016/05-13-pygrunn/ later).

Pygrunn keynote: Morepath under the hood - Martijn Faassen

May 13 2016 [Archived Version] □ Published at Reinout van Rees' weblog under tags  django pygrunn python zope

(One of my summaries of the one-day 2016 PyGrunn conference).

Martijn Faassen is well-known from lxml, zope, grok. Europython, Zope foundation. And he's written Morepath, a python web framework.

Three subjects in this talk:

  • Morepath implementation details.
  • History of concepts in web frameworks
  • Creativity in software development.

Morepath implementation details. A framework with super powers ("it was the last to escape from the exploding planet Zope")

Traversal. In the 1990's you'd have filesystem traversal. example.com/addresses/faassen would map to a file /webroot/addresses/faassen.

In zope2 (1998) you had "traversal through an object tree". So root['addresses']['faassen'] in python. The advantage is that it is all python. The drawback is that every object needs to know how to render itself for the web. It is an example of creativity: how do we map filesystem traversal to objects?

In zope3 (2001) the goal was the zope2 object traversal, but with objects that don't need to know how to handle the web. A way of working called "component architecture" was invented to add traversal-capabilities to existing objects. It works, but as a developer you need to do quite some configuration and registration. Creativity: "separation of concerns" and "lookups in a registry"
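
A minimal sketch of what such traversal boils down to, using plain dictionaries as the object tree. The `traverse` function and the sample tree are hypothetical; this is not Zope's actual implementation.

```python
def traverse(root, path):
    """Walk an object tree by looking up each URL path segment in turn."""
    obj = root
    for segment in path.strip("/").split("/"):
        if segment:
            obj = obj[segment]  # the same as root['addresses']['faassen']
    return obj


root = {"addresses": {"faassen": "address page for faassen"}}
found = traverse(root, "/addresses/faassen")
print(found)
```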

Pyramid sits somewhere in between. And has some creativity on its own.

Another option is routing. You map a url explicitly to a function. A @route('/addresses/{name}') decorator to a function (or a django urls.py). The creativity is that it is simple.

Both traversal and routing have their advantages. So Morepath has both of them. Simple routing to get to the content object and then traversal from there to the view.

The creativity here is "dialectic". You have a "thesis" and an "antithesis" and end up with a "synthesis". So a creative mix between two ideas that seem to be opposites.

Apart from traversal/routing, there's also the registry. Zope's registry (component architecture) is very complicated. He's now got a replacement called "Reg" (http://reg.readthedocs.io/).

He ended up with this after creatively experimenting with it. Just experimenting, nothing serious. Rewriting everything from scratch.

(It turned out there already was something that worked a bit the same in the python standard library: @functools.singledispatch.)
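
For reference, a minimal sketch of functools.singledispatch from the standard library: the implementation is picked based on the type of the first argument. The `Address` and `Person` classes and the `view` function are made up for illustration and have nothing to do with Reg's own API.

```python
from functools import singledispatch


class Address:
    def __init__(self, street):
        self.street = street


class Person:
    def __init__(self, name):
        self.name = name


@singledispatch
def view(obj):
    # Fallback for types without a registered implementation.
    raise TypeError("no view registered for %s" % type(obj).__name__)


@view.register(Address)
def _(obj):
    return "address: " + obj.street


@view.register(Person)
def _(obj):
    return "person: " + obj.name


print(view(Address("Main Street 1")))
print(view(Person("Martijn")))
```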

He later extended it from single dispatch to multiple dispatch. The creativity here was the freedom to completely change the implementation as he was the only user of the library at that moment. Don't be afraid to break stuff. Everything has been invented before (so research). Also creative: it is just a function.

A recent spin-off: "dectate". (http://dectate.readthedocs.io/). A decorator-based configuration system for frameworks :-) Including subclassing to override configuration.

Some creativity here: it is all just subclassing. And split something off into a library for focus, testing and documentation. Split something off to gain these advantages.

Pygrunn: from code to config and back again - Jasper Spaans

May 13 2016 [Archived Version] □ Published at Reinout van Rees' weblog under tags  pygrunn python

(One of my summaries of the one-day 2016 PyGrunn conference).

Jasper works at Fox IT. One of the programs he works on is DetACT, a fraud detection tool for online banking. The technical summary would be something like "spamassassin and wireshark for internet traffic".

  • Wireshark-like: DetACT intercepts online bank traffic and feeds it to a rule engine that ought to detect fraud. The rule engine is the one that needs to be configured.
  • Spamassassin-like: rules with weights. If a transaction gets too many "points", it is marked as suspect. Just like spam detection in emails.

In the beginning of the tool, the rules were in the code itself. But as more and more rules and exceptions got added, maintaining it became a lot of work. And deploying takes a while as you need code review, automatic acceptance systems, customer approval, etc.

From code to config: they rewrote the rule engine from scratch to work based on a configuration. (Even though Joel Spolsky says totally rewriting your code is the single worst mistake you can make). They went 2x over budget. That's what you get when rewriting completely....

The initial test with hand-written json config files went OK, so they went to step two: make the configuration editable in a web interface. Including config syntax validation. Including mandatory runtime performance evaluation. The advantage: they could deploy new rules much faster than when the rules were inside the source code.
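A rough sketch of what a JSON-configured rule engine with syntax validation could look like. The config layout (the keys `name`, `field`, `op`, `value`, `weight`) and the operator names are my own assumptions. The talk did not show the actual format.

```python
import json
import operator

# Assumed operator names that the hypothetical config may use.
OPERATORS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "eq": operator.eq,
    "ne": operator.ne,
}
REQUIRED_KEYS = {"name", "field", "op", "value", "weight"}

CONFIG_TEXT = """
[
  {"name": "large_amount", "field": "amount", "op": "gt", "value": 10000, "weight": 3},
  {"name": "foreign_country", "field": "country", "op": "ne", "value": "NL", "weight": 2}
]
"""


def validate(rules):
    # Return a list of human-readable problems, empty if the config is fine.
    errors = []
    for index, rule in enumerate(rules):
        missing = REQUIRED_KEYS - set(rule)
        if missing:
            errors.append(f"rule {index}: missing keys {sorted(missing)}")
        elif rule["op"] not in OPERATORS:
            errors.append(f"rule {index}: unknown operator {rule['op']!r}")
    return errors


def load_rules(text):
    rules = json.loads(text)
    errors = validate(rules)
    if errors:
        raise ValueError("; ".join(errors))
    return rules


def score(rules, tx):
    # Interpret the config on every call: look up each operator and apply it.
    return sum(
        rule["weight"]
        for rule in rules
        if OPERATORS[rule["op"]](tx[rule["field"]], rule["value"])
    )


rules = load_rules(CONFIG_TEXT)
```

Note that `score` walks through the config for every single transaction. That kind of interpretation is flexible, but it adds lookups and bookkeeping around the actual comparisons.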

Then... they did a performance test at a customer.... It was 10x slower than the old code. They didn't have enough hardware to run it. (It needs to run on real hardware instead of in the cloud because the data is very sensitive).

They fired up the profiler and discovered that only 30% of the time was spent on the actual rules. The other 70% was bookkeeping and overhead.
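The talk did not say which profiler they used. As an illustration, this is how you could get such a breakdown with the standard library's `cProfile` and `pstats`. The three functions are hypothetical stand-ins for "bookkeeping" and "actual rules".

```python
import cProfile
import io
import pstats


def bookkeeping(tx):
    # Stand-in for overhead: lots of copying that does no rule work.
    return [dict(tx) for _ in range(2000)]


def evaluate_rules(tx):
    # Stand-in for the actual rule checks.
    return sum(1 for _ in range(500) if tx["amount"] > 10000)


def handle_transaction(tx):
    bookkeeping(tx)
    return evaluate_rules(tx)


profiler = cProfile.Profile()
profiler.enable()
for _ in range(50):
    handle_transaction({"amount": 20000})
profiler.disable()

# Collect the report in a string instead of writing it to stdout directly.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting on cumulative time shows which functions, including everything they call, eat up most of the run time.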

In the end they had the idea to generate Python code from the configuration. They tried it. The generated code is ugly, but it works and it is fast. A 3x improvement. Fine, but not a factor of 10, yet.
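A minimal sketch of the "generate Python code from the configuration" idea. The config layout and the generated `score` function are assumptions for illustration. The real generated code was not shown.

```python
# Hypothetical config: one dict per rule.
RULES = [
    {"field": "amount", "op": ">", "value": 10000, "weight": 3},
    {"field": "country", "op": "!=", "value": "NL", "weight": 2},
]

# Only these operators may end up in the generated source.
ALLOWED_OPS = {">", "<", "==", "!="}


def generate_source(rules):
    # Turn the config into the source text of a plain Python function.
    lines = ["def score(tx):", "    total = 0"]
    for rule in rules:
        if rule["op"] not in ALLOWED_OPS:
            raise ValueError(f"unknown operator {rule['op']!r}")
        condition = f"tx[{rule['field']!r}] {rule['op']} {rule['value']!r}"
        lines.append(f"    if {condition}:")
        lines.append(f"        total += {rule['weight']!r}")
    lines.append("    return total")
    return "\n".join(lines) + "\n"


def compile_rules(rules):
    # Execute the generated source once and hand back the resulting function.
    namespace = {}
    exec(generate_source(rules), namespace)
    return namespace["score"]


score = compile_rules(RULES)
```

The config is now only read once, at generation time. Every transaction afterwards runs straight through ordinary `if` statements, without per-rule lookups. Using `repr()` for the values and a whitelist for the operators keeps arbitrary config text out of the generated source.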

They tried converting the config to AST (Python's Abstract Syntax Tree) instead of to actual Python code. Every block was turned into an AST and then combined based on the config. This is then optimized (which you can do with an AST) before generating Python code again.
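To give an idea of the AST approach, here is a small sketch using the standard `ast` module (Python 3.9 or newer is assumed). The config layout is hypothetical and the only "optimization" is dropping rules with weight zero. What they actually optimized was not detailed in the talk.

```python
import ast

# Hypothetical config; the last rule has weight 0 and can be optimized away.
RULES = [
    {"field": "amount", "op": ">", "value": 10000, "weight": 3},
    {"field": "country", "op": "!=", "value": "NL", "weight": 2},
    {"field": "amount", "op": "<", "value": 0, "weight": 0},
]
OPS = {">": ast.Gt, "<": ast.Lt, "==": ast.Eq, "!=": ast.NotEq}


def rule_to_ast(rule):
    # Build the expression: <weight> if tx[<field>] <op> <value> else 0
    test = ast.Compare(
        left=ast.Subscript(
            value=ast.Name(id="tx", ctx=ast.Load()),
            slice=ast.Constant(value=rule["field"]),
            ctx=ast.Load(),
        ),
        ops=[OPS[rule["op"]]()],
        comparators=[ast.Constant(value=rule["value"])],
    )
    return ast.IfExp(
        test=test,
        body=ast.Constant(value=rule["weight"]),
        orelse=ast.Constant(value=0),
    )


def build_scorer(rules):
    # Optimization step: rules that cannot add points are left out.
    blocks = [rule_to_ast(rule) for rule in rules if rule["weight"] != 0]
    body = blocks[0] if blocks else ast.Constant(value=0)
    for block in blocks[1:]:
        body = ast.BinOp(left=body, op=ast.Add(), right=block)
    # Wrap the combined expression in "lambda tx: ..." and compile it.
    scorer = ast.Lambda(
        args=ast.arguments(
            posonlyargs=[],
            args=[ast.arg(arg="tx")],
            kwonlyargs=[],
            kw_defaults=[],
            defaults=[],
        ),
        body=body,
    )
    tree = ast.fix_missing_locations(ast.Expression(body=scorer))
    function = eval(compile(tree, "<rules>", "eval"))
    return function, ast.unparse(tree)


score, source = build_scorer(RULES)
```

`ast.unparse` turns the optimized tree back into readable Python source, which is handy for checking what actually gets executed.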

This was fast enough!

Some lessons learned:

  • Joel Spolsky is right. You should not rewrite your software completely. If you do it, do it in very small chunks.
  • Write readable and correct code first. Then benchmark and profile.
  • Have someone on your team who knows about compiler construction if you want to solve these kinds of problems.

django-planet aggregates posts from Django-related blogs. It is not affiliated with or endorsed by the Django Project.
