What's new in Django community blogs?


May 05 2017 [Archived Version] Published at Latest Django packages added

Django Debits

May 04 2017 [Archived Version] Published at Latest Django packages added

Accept regular and subscription payments on the Internet (currently supports PayPal). Advanced support for subscription payments.


May 04 2017 [Archived Version] Published at Latest Django packages added

ChatterBot is a machine learning, conversational dialog engine for creating chat bots.


May 03 2017 [Archived Version] Published at Latest Django packages added

ShipIt Day Recap Q2 2017

May 03 2017 [Archived Version] Published at Caktus Blog

Once per quarter, Caktus employees have the opportunity to take a day away from client work to focus on learning or refreshing skills, testing out ideas, or working on open source contributions. The Q2 2017 ShipIt Day work included building apps, updating open source projects, trying out new tools, and more. Keep reading for the...

Introducing the Natural Language Toolkit (NLTK)

May 03 2017 [Archived Version] Published at tuts+

Natural language processing (NLP) is the automatic or semi-automatic processing of human language. NLP is closely related to linguistics and has links to research in cognitive science, psychology, physiology, and mathematics. In the computer science domain in particular, NLP is related to compiler techniques, formal language theory, human-computer interaction, machine learning, and theorem proving. This Quora question shows different advantages of NLP.

In this tutorial I'm going to walk you through an interesting Python platform for NLP called the Natural Language Toolkit (NLTK). Before we see how to work with this platform, let me first tell you what NLTK is.

What Is NLTK?

The Natural Language Toolkit (NLTK) is a platform used for building programs for text analysis. The platform was originally released by Steven Bird and Edward Loper in conjunction with a computational linguistics course at the University of Pennsylvania in 2001. There is an accompanying book for the platform called Natural Language Processing with Python.

Installing NLTK

Let's now install NLTK to start experimenting with natural language processing. It will be fun!

Installing NLTK is very simple. I'm using Windows 10, so in my Command Prompt (MS-DOS) I type the following command:

    pip install nltk

If you are using Ubuntu or macOS, you run the command from the Terminal. More information about installing NLTK on different platforms can be found in the documentation.

If you are wondering what pip is, it is a package management system used to install and manage software packages written in Python. If you are using Python 2 >=2.7.9 or Python 3 >=3.4, you already have pip installed! To check your Python version, simply type the following in your command prompt:

    python --version

Let's go ahead and check if we have installed NLTK successfully. To do that, open up Python's IDLE and type the two lines shown in the figure below:
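The two lines from the figure amount to importing nltk and printing its version; a minimal sketch:

```python
import nltk

# If this prints a version number, NLTK is installed correctly.
print(nltk.__version__)
```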

Check if we have installed NLTK successfully

If you get the version of your NLTK returned, then congratulations, you have NLTK installed successfully!

In the step above, we installed NLTK from the Python Package Index (PyPI) using pip, locally into our virtual environment.

Notice that you might have a different version of NLTK depending on when you installed the platform, but that shouldn't cause a problem.

Working With NLTK

The first thing we need to do to work with NLTK is to download what's called the NLTK corpora. I'm going to download the whole collection. I know it is very large (10.9 GB), but we are only going to do it once. If you already know which corpora you need, you don't have to download everything.

In your Python's IDLE, type the following:

    >>> import nltk
    >>> nltk.download()

In this case, you will get a GUI from which you can specify the destination and what to download, as shown in the figure below:

A GUI from which you can specify the destination

I'm going to download everything at this point. Click the Download button at the bottom left of the window, and wait for a while until everything gets downloaded to your destination directory.

Before moving forward, you might be wondering what a corpus (singular of corpora) is. A corpus can be defined as follows:

Corpus, plural corpora; A collection of linguistic data, either compiled as written texts or as a transcription of recorded speech. The main purpose of a corpus is to verify a hypothesis about language - for example, to determine how the usage of a particular sound, word, or syntactic construction varies. Corpus linguistics deals with the principles and practice of using corpora in language study. A computer corpus is a large body of machine-readable texts.
( Crystal, David. 1992. An Encyclopedic Dictionary of Language and Languages. Oxford: Blackwell.)

A text corpus is thus simply any large body of text.

Stop Words

Sometimes we need to filter out useless data to make it more understandable to the computer. In natural language processing (NLP), such useless words are called stop words. They carry no useful meaning for us, so we would like to remove them.

NLTK provides us with some stop words to start with. To see those words, use the following script:

In which case you will get the following output:

The output from NLTK

What we did is print out a set (an unordered collection of items) of stop words for the English language.

How can we remove the stop words from our own text? The example below shows how we can perform this task:

The output of the above script is:

The resulting output of the script

Tokenization, as defined in Wikipedia, is:

The process of breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens.

So what the word_tokenize() function does is:

Tokenize a string to split off punctuation other than periods


Let's say we have the following text file (download the text file from Dropbox). We would like to look for (search) the word language. We can simply do this using the NLTK platform as follows:

In which case you will get the following output:

Searching for the word language

Notice that concordance() returns every occurrence of the word language, in addition to some context. Before that, as shown in the script above, we tokenize the file we read and then convert it into an nltk.Text object.

I just want to note that the first time I ran the program, I got the following error, which seems to be related to the encoding the console uses:

What I simply did to solve this issue is to run this command in my console before running the program: chcp 65001.

The Gutenberg Corpus

As mentioned in Wikipedia:

Project Gutenberg (PG) is a volunteer effort to digitize and archive cultural works, to "encourage the creation and distribution of eBooks". It was founded in 1971 by Michael S. Hart and is the oldest digital library. Most of the items in its collection are the full texts of public domain books. The project tries to make these as free as possible, in long-lasting, open formats that can be used on almost any computer. As of 3 October 2015, Project Gutenberg reached 50,000 items in its collection.

NLTK contains a small selection of texts from Project Gutenberg. To see the included files from Project Gutenberg, we do the following:

The output of the above script will be as follows:

The output of the above script

If we want to find the number of words for the text file bryant-stories.txt for instance, we can do the following:

The above script should return the following number of words: 55563


As we have seen in this tutorial, the NLTK platform provides us with a powerful tool for working with natural language processing (NLP). I have only scratched the surface in this tutorial. If you would like to go deeper into using NLTK for different NLP tasks, you can refer to NLTK's accompanying book: Natural Language Processing with Python.

HTTPS behind your reverse proxy

May 02 2017 [Archived Version] Published at Reinout van Rees' weblog under tags django python

We have a setup that looks (simplified) like this:


HTTP/HTTPS connections from browsers ("the green cloud") go to two reverse proxy servers on the outer border of our network. Almost everything is https.

Nginx then proxies the requests towards the actual webservers. Those webservers also have nginx on them, which proxies the request to the actual django site running on some port (8000, 5010, etc.).

Until recently, the https connection was only between the browser and the main proxies. Internally inside our own network, traffic was http-only. In a sense, that is OK, as you've got security and a firewall and so on. But... actually it is not OK. At least, not OK enough.

You cannot rely on a solid outer wall alone. You need defense in depth: network segmentation, restricted access. So ideally the traffic between the main proxies (in the outer "wall") and the webservers inside it should also be encrypted, for instance. Now, how to do this?

It turned out to be pretty easy, but figuring it out took some time. Likewise finding the right terminology to google with :-)

  • The main proxies (nginx) terminate the https connection. Most of the ssl certificates that we use are wildcard certificates. For example:

    server {
      listen 443;
      server_name sitename.example.org;

      ssl on;
      ssl_certificate /etc/ssl/certs/wildcard.example.org.pem;
      ssl_certificate_key /etc/ssl/private/wildcard.example.org.key;

      location / {
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-Proto https;
        proxy_redirect off;
        proxy_pass http://internal-server-name;
        proxy_http_version 1.1;
      }
    }
  • Using https instead of http towards the internal webserver is easy. Just use https instead of http :-) Change the proxy_pass line:

    proxy_pass https://internal-server-name;

    The google term here is re-encrypting, btw.

  • The internal webserver has to allow an https connection. This is where we initially made it too hard for ourselves. We copied the relevant wildcard certificate to the webserver and changed the site to use the certificate and to listen on 443, basically just like on the main proxy.

    A big drawback is that you need to copy the certificate all over the place. Not very secure. Not a good idea. And we generate/deploy the nginx config for the webserver from within our django project. So every django project would need to know the filesystem location and name of those certificates... Bah.

  • "What about not being so strict on the proxy? Cannot we tell nginx to omit a strict check on the certificate?" After a while I found the proxy_ssl_verify nginx setting. Bingo.

    Only, you need nginx 1.7.0 for it. The main proxies are still on ubuntu 14.04, which has an older nginx. But wait: the default is "off", which means that nginx doesn't bother checking certificates when proxying! A bit of experimenting showed that nginx really didn't mind which certificate was used on the webserver! Nice.

  • So any certificate is fine, really. I did my experimenting with ubuntu's default "snakeoil" self-signed certificate (/etc/ssl/certs/ssl-cert-snakeoil.pem). Install the ssl-cert package if it isn't there.

    On the webserver, the config thus looks like this:

    server {
        listen 443;
        # ^^^ Yes, we're running on https internally, too.
        server_name sitename.example.org;
        ssl on;
        ssl_certificate /etc/ssl/certs/ssl-cert-snakeoil.pem;
        ssl_certificate_key /etc/ssl/private/ssl-cert-snakeoil.key;
        # ... rest of the site config ...
    }

    An advantage: the django site's setup doesn't need to know about specific certificate names, it can just use the basic certificate that's always there on ubuntu.

  • Now what about that "snakeoil" certificate? Isn't it some dummy certificate that is the same on every ubuntu install? If it is always the same certificate, you can still sniff and decrypt the internal https traffic almost as easily as plain http traffic...

    No it isn't. I verified it by uninstalling/purging the ssl-cert package and then re-installing it: the certificate changes. The snakeoil certificate is generated fresh when installing the package. So every server has its own self-signed certificate.

    You can generate a fresh certificate easily, for instance when you copied a server from an existing virtual machine template:

    $ sudo make-ssl-cert generate-default-snakeoil --force-overwrite

    As long as the only goal is to encrypt the https traffic between the main proxy and an internal webserver, the certificate is of course fine.

Summary: nginx doesn't check the certificate when proxying. So terminating the ssl connection on a main nginx proxy and then re-encrypting it (https) to backend webservers which use the simple default snakeoil certificate is a simple workable solution. And a solution that is a big improvement over plain http traffic!

How to deploy a Django project in 15 minutes with Ansible

May 02 2017 [Archived Version] Published at Random notes from Zena » django under tags ansible devops django sysadmin ubuntu

In this tutorial I will assume that you are a Django developer and you have built and tested a project locally. It’s time to deploy the project on a public server to let users access your awesome application. So you need a VPS with an SSH access, then you will access the server, install and configure […]


May 01 2017 [Archived Version] Published at Latest Django packages added

Django package to easily render Excel spreadsheets

Django Advanced Filters

May 01 2017 [Archived Version] Published at Latest Django packages added

Add advanced filtering abilities to Django admin

Building a Custom Block Template Tag

May 01 2017 [Archived Version] Published at Caktus Blog

Building custom tags for Django templates has gotten much easier over the years, with decorators provided that do most of the work when building common, simple kinds of tags. One area that isn't covered is block tags, the kind of tags that have an opening and ending tag, with content inside that might also need...

Python Django Developer

May 01 2017 [Archived Version] Published at Djangojobs.Net latest jobs

About Domain7

Founded in 1997, by Shawn Neumann

Offices in Vancouver & Abbotsford

Approximately 40 team members

Serving clients in Canada, USA and the UK, across industries including higher education, non-profit, startups, retail, technology and more

Featured in FastCompany, Applied Arts, The Globe & Mail, Communication Arts, CNBC, BC Business, BC's Fastest Growing Companies

Pick your favourite term: We're alternately called a web agency, a software company, a user-experience design firm, a digital transformation agency, a digital consultancy, or simply (and hopefully), your partner.

What we are looking for

We are looking for a senior, full stack Python Django developer for a contract position. We want to welcome a developer motivated by solving interesting technical challenges and building new applications quickly who can jump in, learn how our team builds software, join a project team in planning a great product, and code it to life in an agile development cycle. The position requires a problem solving mindset, scalable and maintainable software development skills and experience working within a team.

Start date likely late May, 20-30 hours per week, 8-10 weeks duration. Successful completion of this contract may lead to either continued project work or alternatively full time employment.

Options for orchestrating periodic tasks.

Apr 30 2017 [Archived Version] Published at Irrational Exuberance

Reliably running tasks periodically is hard. Getting to the normal-case behavior of running "exactly once" requires either a singleton instance (introducing a single point of failure) or a leader election mechanism to determine which instance should be running (which is why having elections as a primitive via something like Chubby can be so powerful).

Before going too deep, a few definitions:

  • Scheduling is deciding when and whether a task should run.
  • Orchestration is deciding where and how a task should run.

Even once you have the ability to correctly schedule tasks, you still need a second mechanism to orchestrate them somewhere, and doing this effectively requires a fairly significant amount of coupling between the scheduler and orchestrator. For example, determining whether a task completed successfully is information held in the orchestrator, but the conditions under which a task should be restarted are potentially behavior you'd want determined by the orchestrator too, especially for long-running tasks (e.g. do you want front-of-line blocking behavior or not).

We've been chatting more about this problem space at work, and it's been a while since I've explored the options in this space, so I decided to look around a bit.

First, a few thoughts about the features we want:

  • Language agnostic - we'd like to have one framework which can run periodic tasks for all programming languages we use, not have to deploy one for each language.
  • Familiar deployment paradigm - getting deployment right (with code review, linting, rollbacks, etc) is hard, and we'd prefer to use a single deployment paradigm and mechanism for periodic and long-running processes if possible. This is important both from a leverage perspective (we can improve everyone's experience in one place), and also from a training and adoption perspective.
  • Reliable - these are business critical tasks, and the scheduling and orchestration components both need to be reliable and predictable.
  • Reusable - ideally we could use the same orchestrator for both our periodic tasks and long-running ones. This will reduce our maintenance overhead, allow us to gain operational expertise more quickly, and also leave the door open to bin-packing based fleet efficiency optimization further down the line.
  • No vendor lock-in - ideally we'd find a solution that doesn't require vendor lock-in, e.g. proprietary cloud solutions from AWS, GCP or Azure.

With those features in mind, I spent some time digging around for common solutions:

  1. AWS Lambda with Scheduled Events is a cloud solution that should solve most straightforward use cases, both from a scheduling and an orchestration perspective. It is not particularly flexible in either regard, but it does give you the same primitives as cron, and if you happen to be using their supported languages (Node.js, Java, C#, Python), then it might be sufficient. (You can also use a hybrid Amazon EC2 Container Service and AWS Lambda approach if you need more flexibility in your orchestration layer.)
  2. Google's Cloud Functions can be paired with App Engine Cron Service, coordinating over Google Cloud Pub/Sub, to get more or less the same scheduling behavior as AWS Lambda with Scheduled Events, albeit with more pieces to futz with. Google Cloud Functions are still a bit limiting in terms of only supporting the Node.js runtime today, but one imagines they'll add more support over time. (You can of course get creative and have jobs call running services, allowing you to break out of Cloud Functions' language restrictions.)
  3. Chronos is a scheduler running on top of Mesos, which handles both the scheduling and orchestration aspects for you, and gives a good degree of flexibility in both. Running Mesos is a bit heavy, but this certainly makes sense if you already have operational expertise with running Mesos.
  4. Kubernetes' Cron Jobs give you a solution similar to Chronos, except running on Kubernetes instead of Mesos, for organizations which already have it deployed.
  5. cron is still used pretty frequently as a scheduler, and if you run it in a prebaked AMI in an AutoScaling Group with a size of one instance, then you can rely on the ASG for "election" of a single instance. You do have a single point of failure, but it'll recover relatively quickly. What you don't have is any orchestration primitives, so you would still need to integrate this with a second system that handles the orchestration aspects (e.g. spinning up a container on Amazon EC2 Container Service or calling into an AWS Lambda). On the plus side, you can use your existing server imaging and deployment strategies.
  6. Python Celery is used by many Python shops for this kind of functionality, although it suffers from most of the same scheduling challenges as cron, and its orchestration is both fairly naive and restricted to Python (although I just found a Go implementation of Celery workers, which is a terrifying find). There are a bunch of other similar solutions in this category, both in the Python space and in other languages.
  7. Dkron is purely a scheduler, which aims to provide fault-tolerant scheduling, even if some nodes fail. (E.g. solving the leader-election and leader handoff problems for you, as opposed to building your own on top of Zookeeper, etc.) It also provides a nice UI, although depending on your security and compliance needs, it's possible that UI is a mixed blessing.
  8. Bistro is broadly in this space, but feels much more targeted at running jobs once for every resource in a fleet, as opposed to running a job once somewhere on some resource in a fleet.

Of those options, it feels like larger companies will likely end up with either a cloud-based solution (AWS Scheduled Lambdas or App Engine Cron), Chronos if they're already happy running Mesos, or Kubernetes' Cron Jobs if they're already happy running Kubernetes.

Python F-Strings Are Fun!

Apr 27 2017 [Archived Version] Published at pydanny under tags django python python3 twoscoops

In Python 3.6 we saw the adoption of Literal String Interpolation, or as they are known more commonly, f-strings. At first I was hesitant because... well... we've got multiple string tools already available:

one, two = 1, 2
_format = '{},{}'.format(one, two)
_percent = '%s,%s' % (one, two)
_concatenation = str(one) + ',' + str(two)
_join = ','.join((str(one),str(two)))
assert _format == _percent == _concatenation == _join

Adding f-strings to this mix didn't seem all that useful:

_fstring = f'{one},{two}'
assert _fstring == _format == _percent == _concatenation == _join

I was doubtful, but then I tried out f-strings on a non-trivial example. Now I'm hooked. Be it on local utility scripts or production code, I now instinctively gravitate toward their usage. In fact, f-strings are so useful that going back to earlier versions of Python now feels cumbersome.

The reason why I feel this way is that f-strings are concise but easy to understand. Thanks to intuitive expression evaluation I can compress more verbose commands into smaller lines of code that are more legible. Take a look:

_fstring = f'Total: {one + two}'  # Go f-string!
_format = 'Total: {}'.format(one + two)
_percent = 'Total: %s' % (one + two)
_concatenation = 'Total: ' + str(one + two)
assert _fstring == _format == _percent == _concatenation

The f-string example is four characters shorter than the closest alternative and is extremely easy to read. Indeed, put the f-string example in front of a non-programmer and they'll understand it quickly. The same won't apply to the alternatives; odds are they'll ask what .format(), str(), and the % mean.

F-Strings Are Addictive

The conciseness and power of the intuitive expression evaluation can't be overstated. On the surface f-strings seem like a small step forward for Python, but once I started using them I realized they were a huge step in codability for the language.

Now I'm hooked. I'm addicted to f-strings. When I step back to Python 3.5 or lower I feel like less of a Python coder. Yes, I have a problem with how much I lean on f-strings now, but I acknowledge my problem. I would go to therapy for it, but I believe I can manage the addiction for now.

Okay, enough joking, f-strings are awesome. Try them out.

A Utility Script Example

We just released Two Scoops of Django 1.11, which is written in LaTeX. Like most programming books we provide code examples in a repo for our readers. However, as we completely revised the code highlighting, we had to rewrite our code extractor from the ground up. In a flurry of cowboy coding, I did so in thirty minutes using Python 3.6 while leaning on f-strings:

"""Two Scoops of Django 1.11 Code Extractor"""
import os
import shutil
from glob import glob

    print('Removed old code directory')
except FileNotFoundError:
print('Created new code directory')

STAR = '*'


    '\\begin{python}': '.py',
    '\\begin{badpython}': '.py',
    '\\begin{django}': '.html',
    '\\begin{baddjango}': '.html',
    '\\begin{plaintext}': '.txt',
    '\\begin{badplaintext}': '.txt',
    '\\begin{sql}': '.sql',
    '\\begin{makefile}': '',
    '\\begin{json}': '.json',
    '\\begin{bash}': '.txt',
    '\\begin{xml}': '.html',

LANGUAGE_END = {x.replace('begin', 'end'):y for x,y in LANGUAGE_START.items()}

def is_example(line, SWITCH):
    for key in SWITCH:
        if line.strip().startswith(key):
            return SWITCH[key]
    return None

def makefilename(chapter_num, in_example):
    return f'code/chapter_{chapter_num}_example_{str(example_num).zfill(2)}{in_example}'

if __name__ == '__main__':

    in_example = False
    starting = False
    for path in glob('chapters/*.tex'):
            chapter_num = int(path[9:11])
            chapter_num = path[9:11]
        except ValueError:
            if not path.lower().startswith('appendix'):
        example_num = 1
        with open(path) as f:
            lines = (x for x in f.readlines())
        for line in lines:
            if starting:
                # Crazy long string interpolation that should probably
                # be broken up but remains because it's easy for me to read
                filename =  f'code/chapter_{chapter_num}_example_{str(example_num).zfill(2)}{in_example}'
                dafile = open(filename, 'w')
                if in_example in ('.py', '.html'):
            if not in_example:
                mime = None
                in_example = is_example(line, LANGUAGE_START)
                if in_example:
                    starting = True
            mime = is_example(line, LANGUAGE_END)
            starting = False
            if mime:
                in_example = False
                example_num += 1


Apr 27 2017 [Archived Version] Published at Latest Django packages added

A straightforward menu generator for Django

django-planet aggregates posts from Django-related blogs. It is not affiliated with or endorsed by the Django Project.
