Suppose we’re writing an app called Patient Care Dashboard to help hospitals visualize their data. The hospitals store their data in CSVs.*
*Let’s say that legal limitations prevent our app from storing any patient data in databases, so all data has to come in via these CSVs.
On the homepage of the app, providers can choose from several types of visualizations: one to view patient infection rates by department, one to view a schedule of friend and family visits, one to view a pie chart of patient post-surgery food requests, et cetera.
Once providers choose a visualization type, they get a modal where they can upload their CSV file, click a button, and see their gorgeous visualization.
We want to store a record of each of your visualization types in the app’s database so our homepage can fetch the list of their names and descriptions to display all the options on the homepage. We also want to keep a record of each time a user runs a visualization task was run along with the visualization type.
But each different visualization type requires different code to run on the CSV. First of all, each CSV has different data, so each visualization type has its own list of CSV column names to fetch. Second of all, each CSV has different chart code: some use calendars, others use pie charts, and so forth.
How can we store a representation of each visualization type in the database while running different code for each one?
We could make a model with a giant conditional.
Our Visualization model could contain a method like you see here on line 24:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from __future__ import unicode_literals | |
from django.conf import settings | |
from django.db import models | |
from django.utils import timezone | |
import numpy as np | |
from calendar import Calendar | |
from charts import HeatMap, PieChart | |
from exceptions import IncompleteDataException | |
class Visualization(models.Model): | |
name = models.CharField(primary_key=True, max_length=200) | |
description = models.TextField() | |
position = models.IntegerField(default=200) | |
def __repr__ (self): | |
return self.name | |
def __str__ (self): | |
return self.name | |
def from_data(name, csvfile): | |
if name == "infection_rates": | |
department_instances = [] | |
infection_type = [] | |
with open csvfile as file: | |
if file["department"] and file["type"]: | |
department_instances = np.array(file["department"]) | |
infection_type = np.array(file["type"]) | |
else: | |
return IncompleteDataException("Please include a department and a type column in your CSV.") | |
map = HeatMap(list(zip(department_instances, infection_type))) | |
return map.to_svg() | |
if name == "family_visits": | |
#…similar rigamarole | |
if name == "food_requests": | |
#…similar rigamarole |
That we then call in the view like we do here on line 22:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from __future__ import unicode_literals | |
from django.shortcuts import render | |
from models import Visualization | |
from exceptions import IncompleteDataException | |
from .forms import VisualizationForm | |
from django.contrib.auth.decorators import login_required | |
@login_required | |
def transform(request): | |
name = request.GET['script'] | |
if request.POST and request.FILES: | |
form = VisualizaionForm(request.POST, request.FILES) | |
if form.is_valid(): | |
input_file = request.FILES['file'] | |
image = None | |
try: | |
vizualization = Visualization.objects.get(pk=name) | |
image = vizualization.from_data(name, input_file) | |
except IncompleteDataException as e: | |
return render(request, 'visualizations/form.html', {'error': e.message()}) | |
return render(request, 'visualizations/results.html', {'image': image}) | |
As we add new visualizations, this giant conditional becomes even longer. Maybe that’s not so bad. But suppose we want to add complexity to the visualizations? Maybe we want them to also include custom error messages, shell out to third party services, or track changes in the data from visualization run to visualization run.
Even if multiple visualizations are supposed to have the same new functionality, they either each need their own copy of the implementation in their conditional block, or we DRY things up by writing blocks of code before the conditional—or worse, before only certain parts of the conditional—until we have several branches of code to peruse before we can make any changes.
This approach would be quite testable, but the test file would be enormous because each conditional block’s tests would have to live together in the test class. Alternatively, we would have to make several test files for the same class, which can lead to confusion as well.
What if, instead, we made each type of visualization responsible for its own behavior?
In Django, several different classes can inherit from the same Model class with called Proxy Models. Our regular old model might look something like this:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from django.db import models | |
class Visualization(models.Model): | |
name = models.CharField(primary_key=True, max_length=200) | |
description = models.TextField() | |
position = models.IntegerField(default=200) | |
def __repr__ (self): | |
return self.name | |
def __str__ (self): | |
return self.name | |
def from_data(csvfile): | |
raise NotImplementedError("You must implement from_data in your visualization subclass.") |
and we would subclass it for each particular type of visualization like so:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from visualizations.models import Visualization | |
class InfectionRateVisualization(Visualization): | |
class Meta: | |
proxy = True | |
def from_data(csvfile): | |
department_instances = [] | |
infection_type = [] | |
with open csvfile as file: | |
if file["department"] and file["type"]: | |
department_instances = np.array(file["department"]) | |
infection_type = np.array(file["type"]) | |
else: | |
return IncompleteDataException("Please include a department and a type column in your CSV.") | |
map = HeatMap(list(zip(department_instances, infection_type))) | |
return map.to_svg() |
In this version, each subclass of Visualization could have its own test file and its own circumscribed area for its unique functionality. Meanwhile, if we want to add behavior that should apply to all instances, we can add that behavior to the original Visualization
superclass—or, if we prefer, to an intermediary subclass that inherits from Visualization, while all the final visualizations inherit from the intermediary subclass.
A note on this: if you do it that way, every subclassed model down the tree must possess the in-class annotation `class Meta: proxy = True`.
That is, the final subclasses won’t work properly without this, even if the intermediary class from which they inherit has this annotation. I’m not sure why. If you know, I’d love to hear.
The Drawback: Dependency Injection
So how does Django know which of the subclasses to use in the views.py
file?
One option is to inject it with a dictionary (lines 9-13) in views.py
:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from __future__ import unicode_literals | |
from django.shortcuts import render | |
from models import Visualization | |
from exceptions import IncompleteDataException | |
from .forms import VisualizationForm | |
from django.contrib.auth.decorators import login_required | |
VISUALIZATIONS = { | |
"infection_rate_visualization": InfectionRateVisualization.objects.get(pk="infection_rate_visualization"), | |
"friend_family_visit_visualization": FriendFamilyVisitVisualization.objects.get(pk="friend_family_visit_visualization"), | |
"food_request_visualization": FoodRequestVisualization.objects.get(pk="food_request_visualization") | |
} | |
@login_required | |
def transform(request): | |
name = request.GET['script'] | |
if request.POST and request.FILES: | |
form = VisualizaionForm(request.POST, request.FILES) | |
if form.is_valid(): | |
input_file = request.FILES['file'] | |
image = None | |
try: | |
vizualization = VISUALIZATIONS[name] | |
image = vizualization.from_data(name, input_file) | |
except IncompleteDataException as e: | |
return render(request, 'visualizations/form.html', {'error': e.message()}) | |
return render(request, 'visualizations/results.html', {'image': image}) |
Alternatively, we could do it with a method. Let’s define that method (lines 9-15) and call it (line 29). Remember that a leading underscore on a method name in python denotes that this method is intended to be private to this file.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from __future__ import unicode_literals | |
from django.shortcuts import render | |
from models import Visualization | |
from exceptions import IncompleteDataException | |
from .forms import VisualizationForm | |
from django.contrib.auth.decorators import login_required | |
def _inject_visualization(name): | |
if name == "infection_rate_visualization": | |
return InfectionRateVisualization.objects.get(pk=name) | |
if name == "friend_family_visit_visualization": | |
return FriendFamilyVisitVisualization.objects.get(pk=name) | |
if name == "food_request_visualization": | |
return FoodRequestVisualization.objects.get(pk=name) | |
@login_required | |
def transform(request): | |
name = request.GET['script'] | |
if request.POST and request.FILES: | |
form = VisualizaionForm(request.POST, request.FILES) | |
if form.is_valid(): | |
input_file = request.FILES['file'] | |
image = None | |
try: | |
vizualization = _inject_visualization(pk=name) | |
image = vizualization.from_data(name, input_file) | |
except IncompleteDataException as e: | |
return render(request, 'visualizations/form.html', {'error': e.message()}) | |
return render(request, 'visualizations/results.html', {'image': image}) |
The advantage to this over a dictionary is that Django will only instantiate each model instance if that branch of the conditional gets executed, rather than eager-evaluating and instantiating them all. This approach can reduce your memory usage, especially if you have many bespoke implementations or very complicated bespoke implementations.
I think this approach looks clunky, and we also have to update it every time we add a visualization. I’m not a huge fan of this, and I’ve cast around for an alternative to this for a while now. It should be noted that one could do some fancy meta-programming à la Rubyism. Let’s add some code to do that (lines 9-22) and then change line 36 to see how that might look:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
from __future__ import unicode_literals | |
from django.shortcuts import render | |
from models import Visualization | |
from exceptions import IncompleteDataException, ConstantizeException | |
from .forms import VisualizationForm | |
from django.contrib.auth.decorators import login_required | |
import inspect, re | |
ClassRegex = re.compile("[a-zA-Z_][a-zA-Z0-9_]*") | |
def constantize(string): | |
if not ClassRegex.match(string): | |
raise ConstantizeException("`%s' contains illegal characters." % str(string)) | |
klass = eval(str(string)) | |
if not inspect.isclass(klass): raise ConstantizeException("`%s' does not refer to a class." % str(string)) | |
return klass | |
def snake_to_camel(word): | |
import re | |
return ''.join(x.capitalize() or '_' for x in word.split('_')) | |
@login_required | |
def transform(request): | |
name = request.GET['script'] | |
if request.POST and request.FILES: | |
form = VisualizaionForm(request.POST, request.FILES) | |
if form.is_valid(): | |
input_file = request.FILES['file'] | |
image = None | |
try: | |
vizualization = constantize(snake_to_camel(name)).objects.get(pk=name) | |
image = vizualization.from_data(name, input_file) | |
except IncompleteDataException as e: | |
return render(request, 'visualizations/form.html', {'error': e.message()}) | |
return render(request, 'visualizations/results.html', {'image': image}) |
Code credit here for the snake case conversion and to the below linked article for the constantize method.
This approach is specifically not pythonic and, more generally, not recommended for a number of performance , stability, and security reasons.
That said, if I am adding developers to my Patient Care Dashboard team, I would rather include documentation and context around “you have to add a line to this method right here” than ask them to wade through a giant conditional inside my models.py
.
What do you think?
Django’s Proxy Models generally get deployed when a Model has several categories that each require a separate implementation. This example takes that to the extreme and imagines a case for exactly one model instance in the database for each of our implementations. Does this seem like a reasonable application of the Proxy Model approach in Python to you? How would you solve the dependency injection issue? What sort of tradeoffs do you see in the different approaches we have discussed? Did you learn anything new for a project you’re working on?
If you found this example interesting, you might also like:
Streamlining your Automated Test Strategy with Risk Maps (RubyTapas listeners, this post was the initial rough draft for the screencast you saw—enjoy my sparkly hand drawn diagrams in this version!)
Designing Code for Testing: An Example Implemented in Python
Thank you, this is very helpful!
Thank you for this awesome article