Adventures in Django Deployment: What is a Web Server Anyway?

I recently had some hair-raising adventures in the land of website deployment. I’ve been completely rebuilding my website with a Django backend because I love Python. (This blog will eventually be hosted there, although WordPress does excel at the blog thing, so I might just get crazy and integrate Django AND WordPress. But I digress.) I had everything working all fine and dandy on my local computer and was ready to deploy my almost-identical-but-now-Django-backed placeholder site. I’ve long had Apache serving a few VirtualHosts and handling the SSL certificates on my personal DigitalOcean droplet, so my choice of web server was made for me. At least all that domain config stuff was done, so it would be a cinch to get Django up and running in place of my static site, I thought.

I thought wrong.

It started off so well. I installed Python and PostgreSQL on my server and configured my production settings file to match (and to pull the passwords from a separate ini file that lives on the server and stays out of git). I set up a bare repo and some git hooks to make pushing to production easy. (I know I should also set up a separate staging subdomain for testing, but I’ve been in a hurry to get something real up there since I’m starting the ol’ job hunt.) Everything seemed to be ready to push, so I turned to the Apache config.

This is where my troubles began. I had been running Django’s built-in development server (the runserver command) locally, but the docs were very clear that this server should not be used in production. They did not go into details as to why other than “security and scaling,” but I took their word for it. That’s okay, I thought; I have Apache set up already anyway. The docs also kept talking about this “WSGI” thing, but when I had tried to figure out how to configure that, it had mostly just confused me, and well, it was working on my machine.

Downtime was not really an issue, so I went ahead and pushed to remote. Git hooks worked! I set up my remote virtual environment and installed my requirements.txt. And… nothing. I couldn’t even get Django talking to PostgreSQL. After at least half a day banging my head against that, trying different settings tweaks and increasingly simplified debugging tests, I realized that either Django or the package I was using to import passwords (python-decouple) was angry that my long, secure password had a friggin’ percent sign in it. OKAY! After trying and failing to properly escape it, I did the easy thing and just changed the password. I was finally able to set up the production database! I was close now, surely!
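In hindsight, a likely culprit for the percent sign trouble: as far as I can tell, python-decouple reads ini files through Python's standard configparser, which treats % as the start of an interpolation reference unless it is doubled to %%. Here is a minimal sketch using only the standard library. The section name, key, and password are made up for illustration, and I have not verified that this is exactly what bit me.

```python
import configparser

# Hypothetical ini contents; the password is invented for illustration.
RAW_INI = """
[settings]
DB_PASSWORD = s3cret%pass
"""

# Same value with the percent sign doubled, which is configparser's escape.
ESCAPED_INI = """
[settings]
DB_PASSWORD = s3cret%%pass
"""


def read_password(ini_text, interpolate=True):
    """Parse ini text and return DB_PASSWORD, with or without interpolation."""
    if interpolate:
        parser = configparser.ConfigParser()  # BasicInterpolation by default
    else:
        parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get("settings", "DB_PASSWORD")


if __name__ == "__main__":
    print(read_password(ESCAPED_INI))                 # doubled %% reads back as one %
    print(read_password(RAW_INI, interpolate=False))  # raw value, interpolation off
    try:
        read_password(RAW_INI)
    except configparser.InterpolationSyntaxError as exc:
        print("lone % rejected:", exc)
```

If that is indeed the cause, doubling the percent sign in the ini file should do the trick, though changing the password works too.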

I spent the better part of a week figuring the rest out. Not only did I have to figure out the whole WSGI thing, but apparently there are NO tutorials out there on getting Apache, Gunicorn, and SSL all playing nice together. I had me a TIME. Through a lot of internet sleuthing and mostly trial and error, I finally got Django serving my website with Apache and Gunicorn and only over HTTPS. After I got it working, I was of course curious as to why and how, and spent a day doing the baseline research I should have done from the beginning. Let me tell you my learnings so you do not suffer my fate.

What is a Web Server, Anyway?

The bulk of my confusion can be boiled down to the fuzzy lines between what, exactly, is a web server, what is an application server, and where this thing called WSGI comes in.

A web server can be many things. A web server contains multitudes. But the one thing a web server must possess is an HTTP server. An HTTP server is the software that is capable of understanding HTTP requests and providing a response. The confusion arises because a web server can also refer to the other software that works with the HTTP server, such as modules or an application server, as well as the actual machine or hardware on which the software resides. The HTTP server may be a part of a larger “web server” package or it can be a stand-alone application.

Figure: Client-server diagram. Everything is a web server! (Web server is an imprecise term.)

OK, so that explains why Django can both be an application server and contain a web server. So why do we need this WSGI thing? Why do we need Apache or Nginx at all? The short answer is that these tried-and-true, dedicated HTTP servers are far more stable and secure than anything Django can dish out. They are already set up to handle multiple simultaneous requests, handle security and SSL, and locate and serve files and responses as is appropriate. Yes, Django could serve out all your HTTP responses, but that doesn’t mean Django should.

So what is WSGI?

WSGI, pronounced like “whiskey,” stands for Web Server Gateway Interface. As you might infer from the name, it is a specification (PEP 3333, for the curious) for the interface between web servers and Python applications. See, web servers like Apache cannot typically run Python on their own. Back in the day, choosing a Python framework restricted your choice of web server, and vice versa. For example, there was an Apache module called mod_python that folks used to get Python working on the web, but in addition to being Apache-specific, it was poorly maintained and turned out to be full of security leaks.

WSGI is the standard interface that folks came up with in order to normalize communication between differing web servers and differing Python frameworks or applications. A WSGI-compliant Python app will provide a standardized “callable object” that the web server uses to actually invoke the Python code in response to requests. In this way, deployment stuff is kept separated from application stuff, and different web servers and frameworks can be swapped with one another at ease.
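To make that “callable object” concrete, here is a minimal sketch of a WSGI app that uses nothing but the standard library. The call_app helper is a hypothetical stand-in I wrote to play the role of the server; a real server like Gunicorn does the same dance with far more care.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """The WSGI callable: takes the request environ and a start_response function."""
    body = b"Hello from a WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]  # an iterable of bytes


def call_app(app, path="/"):
    """Hypothetical helper: invoke a WSGI app the way a server would."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app(application))
```

Any server that knows how to call application(environ, start_response) can run this app, which is the whole point of the standard.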

Apache does have a module called mod_wsgi that can interact with Django directly, but I found its configuration confusing, and the Internet tells me that it’s got a lot of overhead. This is where a WSGI server like Gunicorn or uWSGI comes in. I know, I know, as if we needed yet another type of server to worry about in the stack. Fortunately, Gunicorn comes with significant performance and stability boosts, and though I struggled with getting the Apache config where it needed to be to talk to everything, Gunicorn itself required no configuration whatsoever.

Gunicorn and uWSGI sit between the web server (in this case Apache) and the application (in this case Django). Gunicorn listens on its own local port, separate from the web server’s, and spawns worker processes that invoke the application as needed (via that “callable object,” which in Django is the application object defined in the project’s wsgi.py file), spreading requests across those workers so that several can be handled at once. It acts as a web server to the application and as an application server to the web server. Again, Gunicorn could act as the web server itself here, but that would negate a lot of its performance benefits (its default workers expect a proxy in front of them to buffer slow clients) and probably wouldn’t be as secure.
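I ran Gunicorn with its defaults, but if you do want to spell things out, Gunicorn can read its settings from a plain Python file. The sketch below is an illustration rather than my actual setup: the values are assumptions that happen to match the port used in my Apache config, and the worker count follows the rule of thumb from the Gunicorn docs.

```python
# gunicorn.conf.py (illustrative; I did not need a config file myself)
import multiprocessing

# Listen only on the loopback interface; Apache proxies to this port.
bind = "127.0.0.1:8000"

# Rule of thumb from the Gunicorn docs: (2 x cores) + 1 worker processes.
workers = multiprocessing.cpu_count() * 2 + 1

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30

# "-" sends the access log to stdout.
accesslog = "-"

# Started with something like the following, where "myproject" is a
# hypothetical project name:
#   gunicorn -c gunicorn.conf.py myproject.wsgi:application
```

Binding to 127.0.0.1 rather than a public address means the outside world can only reach Gunicorn through Apache.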

So what is the HTTP server doing here? In addition to handling security, waiting for slow clients, and serving up any local static files, it acts as a proxy, forwarding valid HTTP/S requests from the open web ports to the local Gunicorn/WSGI port and back again. I also have some rewrite rules ensuring that all insecure HTTP requests get rewritten to HTTPS. Finally, Apache grants permissions to the WSGI and local static directories so that files can be served. Without further ado, here is the final Apache config:


<VirtualHost *:80>    
        ServerAdmin myemail@gmail.com
        ServerName artdyke.com
        ServerAlias www.artdyke.com
        DocumentRoot /path/to/site/root
        
        ...
        
        Alias /static /path/to/static/folder
        <Directory /path/to/static/folder>
                Require all granted
                Options -Indexes
        </Directory>

        <Directory /path/to/django/project/folder>
                <Files wsgi.py>
                        Require all granted
                </Files>
                Options -Indexes
        </Directory>

        ProxyPreserveHost on
        ProxyPass /static/ !
        ProxyPass /media/ !
        ProxyPass / http://localhost:8000/
        ProxyPassReverse / http://localhost:8000/

        RewriteEngine on
        RewriteCond %{SERVER_NAME} =artdyke.com [OR]
        RewriteCond %{SERVER_NAME} =www.artdyke.com
        RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>

And the companion SSL conf file:

      
<IfModule mod_ssl.c>
<VirtualHost *:443>
        
        ...
        
        RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
        RequestHeader set "X-Forwarded-SSL" expr=%{HTTPS}
        SSLProxyEngine on
        SSLCertificateFile /path/to/certfile.pem
        SSLCertificateKeyFile /path/to/privatekey.pem
        Include /etc/letsencrypt/options-ssl-apache.conf

        ProxyPreserveHost on
        ProxyPass /static/ !
        ProxyPass /media/ !
        ProxyPass / http://localhost:8000/
        ProxyPassReverse / http://localhost:8000/
</VirtualHost>
</IfModule>

Note the proxies all go to http and the same localhost port, even in the SSL conf. That threw me for a minute, but of course it makes sense; the HTTPS is being served externally to the client, not internally to or from the application server.

And in my Django production settings:

SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')
CSRF_COOKIE_SECURE = True
SESSION_COOKIE_SECURE = True
SECURE_SSL_REDIRECT = True

I’ve been too afraid to keep messing with a good thing to test which of these Request Header SSL settings are strictly necessary, so heads up that one or more might be superfluous.
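For what it’s worth, here is my understanding of what SECURE_PROXY_SSL_HEADER does with that X-Forwarded-Proto header, as a simplified sketch. This is not Django’s actual code, and the function name is made up; it only mimics the idea that Django trusts the named header, when present, over the plain-HTTP connection it actually sees from the proxy.

```python
# Same tuple as in the production settings above.
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")


def scheme_seen_by_app(meta, proxy_header=SECURE_PROXY_SSL_HEADER):
    """Simplified illustration of how the proxy header decides the scheme.

    `meta` stands in for request.META, where the X-Forwarded-Proto request
    header shows up under the key HTTP_X_FORWARDED_PROTO.
    """
    header_name, secure_value = proxy_header
    forwarded = meta.get(header_name)
    if forwarded is not None:
        return "https" if forwarded == secure_value else "http"
    # No proxy header: fall back to the scheme of the direct connection.
    return meta.get("wsgi.url_scheme", "http")


if __name__ == "__main__":
    # Apache terminated SSL and forwarded over plain HTTP to Gunicorn.
    via_proxy = {"HTTP_X_FORWARDED_PROTO": "https", "wsgi.url_scheme": "http"}
    print(scheme_seen_by_app(via_proxy))
    # Header missing: the app only sees the local plain-HTTP hop.
    print(scheme_seen_by_app({"wsgi.url_scheme": "http"}))
```

This is why the header matters alongside SECURE_SSL_REDIRECT: if Django believed every proxied request was plain HTTP, it would keep redirecting to HTTPS forever.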

And there you have it! Secure and robust deployment! Well, mostly. Serving static files from the local server is widely discouraged, and currently if my Gunicorn instance should fail for any reason my site would go down. Next up is to set up a process monitor like supervisor to make sure it all comes alive again if it dies or reboots.
