Django ORM Memory Leaks in Debug Mode

TL;DR: With debug on, Django keeps a record of every SQL query it runs, so any long-running task that issues many SQL operations leaks memory badly.

 

Solution: Turn debug off.

 

Story:

 

So I just spent a long time digging this up. We have a management command that runs a ton of SQL statements and queries/inserts all kinds of objects into the database. I was attempting to run it locally, but it kept getting slower and slower and its memory usage grew linearly. I naively started with a really simple profile using heapy. This told me that I had a whole lot of strings taking up a whole lot of memory. My first guess was that Python was doing something dumb, either not interning strings properly or not garbage collecting properly. I added a few intern() calls on what I thought might be candidates for duplication, and I explicitly ran the garbage collector throughout the script. It had zero effect.

 

So I had to dig deeper with heapy. I found a really good heapy tutorial, which I barely understood but could map onto my own memory situation well enough to dig into what was actually causing the issue. Finally I traced it down to a single giant dictionary associated with the following class: django.db.backends.postgresql_psycopg2.base.DatabaseWrapper. At this point I immediately googled “django orm memory leak”, which quickly led me to a few blog posts explaining that, in debug mode, Django stores every query it has run until you explicitly call db.reset_queries(). You don’t bump into this normally because it’s called automatically at the end of every HTTP request.
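To make the mechanism concrete, here is a minimal pure-Python sketch of the behavior described above. It is a toy model, not Django's code: QueryLoggingConnection and run_task are hypothetical names made up for illustration, and the only things borrowed from Django are the idea of a per-connection query log and a reset_queries() call that empties it.

```python
class QueryLoggingConnection:
    """Toy stand-in for a database wrapper. NOT Django's real class."""

    def __init__(self, debug):
        self.debug = debug
        self.queries = []  # grows without bound while debug is on

    def execute(self, sql):
        # A real wrapper would send the SQL to the database here.
        if self.debug:
            self.queries.append({"sql": sql, "time": "0.000"})

    def reset_queries(self):
        # Drop the log, which is what happens at the end of an HTTP request.
        self.queries = []


def run_task(conn, n_statements, reset_every=None):
    """Run n_statements fake inserts and return the peak log length seen."""
    peak = 0
    for i in range(n_statements):
        conn.execute(f"INSERT INTO item (id) VALUES ({i})")
        peak = max(peak, len(conn.queries))
        if reset_every and (i + 1) % reset_every == 0:
            conn.reset_queries()
    return peak


# Debug on, never reset: the log holds every statement ever run.
print(run_task(QueryLoggingConnection(debug=True), 10_000))       # 10000
# Debug off: nothing is recorded.
print(run_task(QueryLoggingConnection(debug=False), 10_000))      # 0
# Debug on, reset every 500 statements: the log stays bounded.
print(run_task(QueryLoggingConnection(debug=True), 10_000, 500))  # 500
```

A management command is the first case: nothing ever plays the role of the end of a request, so the log just keeps growing with the number of statements.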

 

Anyway, debug off, memory is now beautifully constant.
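If you can't turn debug off for a local run, the other route is to call reset_queries() yourself every so often. Below is a rough sketch of that idea, not something from our actual command: process_in_batches, handle, and batch_size are hypothetical names, and the only real Django API used is django.db.reset_queries. The reset parameter exists purely so the helper can be exercised without a configured Django project.

```python
def process_in_batches(items, handle, batch_size=1000, reset=None):
    """Call handle(item) for each item, clearing the query log every batch_size items.

    Hypothetical helper for illustration; returns the number of items processed.
    """
    if reset is None:
        # Real Django API, imported lazily so the helper can be tried without Django.
        from django.db import reset_queries as reset

    processed = 0
    for item in items:
        handle(item)  # assumed to run the ORM queries/inserts for one item
        processed += 1
        if processed % batch_size == 0:
            reset()  # drop the queries recorded so far
    reset()  # clear whatever the final partial batch logged
    return processed
```

Inside a management command's handle() method this might be called as process_in_batches(rows, import_row), where rows and import_row stand in for whatever your command iterates over and does per item.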
