Organizing Flask Models with Automatic Discovery
How to have as many SQLAlchemy models as you like without all the hassle.
Series: How I Use Flask
No One True Way™
After 5+ years of building Flask apps, it sometimes surprises me there hasn’t yet emerged a viable One True Way™ for organizing and loading SQLAlchemy models in more complex applications. Of course Flask is a microframework, and much of its appeal is in letting developers figure out the best way to organize their own apps using their own chosen principles. However, there are quite a lot of less-seasoned developers who want to use Flask and don’t know where to begin when it comes to sanely organizing, discovering, and importing models in their apps.
The relevant Flask docs on using SQLAlchemy in Flask are exceedingly light on details and guidance.
The top hit on StackOverflow offers a method of handling model organization, importing, and database setup that might work okay for one model, but runs into trouble when you have several, especially because you must import all of your app’s models before you can rely on the database being fully ready (and this isn’t highlighted in the Flask docs well enough for new developers to understand).
Let’s look then at how I handle organizing SQLAlchemy models in Flask with an eye toward model-building being a relatively zero-effort activity that just works in a sizeable Flask app.
SQLAlchemy database setup
One of the most important jobs of your Flask app factory is to init a database your app can use. I’ve already covered how I handle the create_app function in my Flask apps in another post, so review that if you would like to know more.
As mentioned in my post on the app factory pattern, I use a custom method called init_db to handle all my database setup.
Here’s what proj/db.py looks like:
# proj/db.py
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy_utils import force_auto_coercion

from proj.lib.loaders import load_models

__all__ = ('db', 'init_db')

# our global DB object (imported by models & views & everything else)
db = SQLAlchemy()

# support importing a functioning session query
query = db.session.query


def init_db(app=None, db=None):
    """Initializes the global database object used by the app."""
    if isinstance(app, Flask) and isinstance(db, SQLAlchemy):
        force_auto_coercion()
        load_models()
        db.init_app(app)
    else:
        raise ValueError('Cannot init DB without db and app objects.')
Let’s take a quick tour through what’s happening here.
Instantiate SQLAlchemy as your database object
This one’s easy. We simply instantiate an SQLAlchemy database instance and assign it to the variable db. This will be imported throughout our app any time we need to interact with the database as
from proj.db import db
I find it most sensible to import my database instance from a db module. It doesn’t have to live here, but it makes sense to me.
There are no doubt many times you’ll want to perform queries in some part of your app code. I always make a simple and importable db.session.query alias that can be used by all parts of my apps like so:
from proj.db import query
Don’t init a database when missing arguments
Sometimes I think I probably go a bit overboard with paranoid, hyper-defensive error protection. I’m trying to get better at that. Even so, I take a cautious approach here and refuse to initialize a database if init_db isn’t passed both a db and an app argument. The isinstance checks do double duty: they reject missing (None) arguments, and they confirm that app is, in fact, a Flask instance and that db is an SQLAlchemy instance. I don’t want anything calling this code with missing or incorrect arguments.
Always call force_auto_coercion
If you make use of any kinds of custom data type object wrappers in your Flask apps (think things like ChoiceType, ArrowType, URLType, PhoneNumber, among others, including your own custom data types), then you’re going to want to force automatic data type coercion for your models.
Note: What matters here is that you must call force_auto_coercion before you import/initialize your models. That’s why we do it in our init_db helper.
Automatically discover and import all app models
The next step is to discover and load all app models with the custom load_models function. We’re actually going to look into this in more detail below.
Initialize the database-app connection
Lastly, call db.init_app(app) to use Flask’s built-in mechanism for extending your Flask app with additional functionality (in this case, your database). Once db.init_app has done its work, you’ll have a working database connection that handles all your app-db communication needs.
Automatic model discovery
I’ll dig more into my db.Model base class usage in another post, so we’re just going to cover how I handle automatic model discovery & loading as part of the database setup portion of Flask’s app factory. All that not-so-magic stuff happens inside load_models. Let’s briefly consider what load_models ought to do.
Walk a given directory looking for Python modules
The first step of auto-discovering our models is that our model loader needs to be able to walk a package directory and find all Python modules.
A quick and dirty approach to accomplishing this is as follows:
# proj/lib/loaders.py
from os import sep, walk
from os.path import abspath, basename, dirname, join

# main project path & module name
# (one level up from proj/lib/ is the proj package directory itself)
PROJ_DIR = abspath(join(dirname(abspath(__file__)), '..'))
APP_MODULE = basename(PROJ_DIR)


def get_modules(module):
    """Yields all .py modules in given `module` directory that are not `__init__`.

    Usage:
        get_modules('models')

    Yields dot-notated module paths for discovery/import.

    Example:
        proj/models/foo.py > proj.models.foo
    """
    file_dir = abspath(join(PROJ_DIR, module))
    for root, dirnames, files in walk(file_dir):
        # convert the path below PROJ_DIR into a dotted prefix, e.g. proj.models
        mod_path = '{}{}'.format(APP_MODULE, root.split(PROJ_DIR)[1]).replace(sep, '.')
        for filename in files:
            if filename.endswith('.py') and not filename.startswith('__init__'):
                yield '.'.join([mod_path, filename[0:-3]])
I’ve made get_modules a generator that will yield dot-notated module paths as it finds them within a specified module directory. For most of my app needs, I don’t ever define usable models inside an __init__.py file. That’s the place I might collect sub-module imports or define abstract base classes and mixins that would be used by other classes.
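To make the generator's output concrete, here is a minimal, self-contained sketch that builds a throwaway package on disk and walks it. The package name demo_app, the file names, and the helpers iter_module_paths and build_demo_tree are all made up for illustration. The walking logic mirrors get_modules, but it takes the project directory as an argument instead of reading module-level constants.

```python
import os
import tempfile
from os.path import basename, join


def iter_module_paths(proj_dir, subdir):
    """Yield dot-notated paths for every non-__init__ .py file under subdir."""
    app_module = basename(proj_dir)
    for root, _dirnames, files in os.walk(join(proj_dir, subdir)):
        # Turn the path relative to the package into a dotted prefix.
        rel = os.path.relpath(root, proj_dir)
        mod_path = '.'.join([app_module] + rel.split(os.sep))
        for filename in sorted(files):
            if filename.endswith('.py') and not filename.startswith('__init__'):
                yield '.'.join([mod_path, filename[:-3]])


def build_demo_tree(base):
    """Create demo_app/models with two modules, a sub-package, and a text file."""
    proj_dir = join(base, 'demo_app')
    os.makedirs(join(proj_dir, 'models', 'billing'))
    for rel in ('models/__init__.py', 'models/user.py', 'models/notes.txt',
                'models/billing/__init__.py', 'models/billing/invoice.py'):
        with open(join(proj_dir, *rel.split('/')), 'w') as fh:
            fh.write('')
    return proj_dir


with tempfile.TemporaryDirectory() as tmp:
    found = sorted(iter_module_paths(build_demo_tree(tmp), 'models'))

print(found)
```

The __init__.py files and notes.txt are skipped, while the module nested in the billing sub-package is still picked up because the walk is recursive.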
Dynamically load class definitions that match a rule
Obviously we need to test if the module we’ve found by walking a directory has a Model defined that we want to load.
Now, we might pause a moment here to recognize we may want to perform this for something other than models, too—cough views cough—so it might make sense to make our dynamic loader accept the rule(s) as an argument.
Python makes this pretty easy to implement:
# proj/lib/loaders.py
from importlib import import_module


def dynamic_loader(module, compare):
    """Iterates over all .py files in `module` directory, finding all classes that
    match `compare` function.

    Other classes/objects in the module directory will be ignored.

    Returns unique list of matches found.
    """
    items = []
    for mod in get_modules(module):
        # use a distinct name so the `module` argument isn't shadowed
        imported = import_module(mod)
        if hasattr(imported, '__all__'):
            objs = [getattr(imported, obj) for obj in imported.__all__]
            items += [o for o in objs if compare(o) and o not in items]
    return items
In our dynamic_loader, we supply which app directory to walk as the module argument, and a comparator function as the compare argument. The body of the function is relatively straightforward: call get_modules from before to get all the Python modules in the module directory, then use importlib.import_module to load each module and iterate over all items in that module’s __all__ attribute. Now, this bit is something I’ve added quite specifically for my codebases. I’m pretty religious about always declaring __all__ in my Python modules, and I only want to inspect and compare the items that have been declared in this __all__ attribute.
Finally, dynamic_loader will append whatever it finds to the items list if it passes the compare function and hasn’t already been added.
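As a rough illustration of the __all__ rule, the sketch below writes a tiny package to a temporary directory, imports its modules by dotted name, and keeps only the __all__ members that pass a comparator. The package name shapes_pkg, its contents, and the helper name collect are invented for the example, and the comparator is simply inspect.isclass rather than a model check.

```python
import os
import sys
import tempfile
from importlib import import_module, invalidate_caches
from inspect import isclass

# Hypothetical module sources; only names listed in __all__ are inspected.
SOURCES = {
    'shapes_pkg/__init__.py': '',
    'shapes_pkg/round.py': (
        "__all__ = ('Circle', 'PI')\n"
        "PI = 3.14159\n"
        "class Circle: pass\n"
        "class _Hidden: pass\n"
    ),
    # This module declares no __all__, so it contributes nothing.
    'shapes_pkg/boxy.py': "class Square: pass\n",
}


def collect(module_names, compare):
    """Import each module and keep the unique __all__ members passing compare."""
    items = []
    for name in module_names:
        imported = import_module(name)
        for attr in getattr(imported, '__all__', ()):
            obj = getattr(imported, attr)
            if compare(obj) and obj not in items:
                items.append(obj)
    return items


with tempfile.TemporaryDirectory() as tmp:
    for rel, source in SOURCES.items():
        path = os.path.join(tmp, *rel.split('/'))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, 'w') as fh:
            fh.write(source)
    sys.path.insert(0, tmp)
    invalidate_caches()
    try:
        found = collect(['shapes_pkg.round', 'shapes_pkg.boxy'], isclass)
    finally:
        sys.path.remove(tmp)

names = [cls.__name__ for cls in found]
print(names)
```

Only Circle survives: PI is exported but is not a class, _Hidden is a class but is not exported, and Square lives in a module without __all__. That last case is the trade-off of this approach, since a model module that forgets to declare __all__ is silently ignored.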
Write a comparator function to identify models
The final piece in our dynamic, automatic loading of database models is correctly identifying whether or not an object found inside a module is actually a database model. This one is pretty straightforward: all we need to do here is verify that our item is in fact a class, and that it is a subclass of flask_sqlalchemy.Model.
# proj/lib/loaders.py
from inspect import isclass

from flask_sqlalchemy import Model


def is_model(item):
    """Determines if `item` is a `Model` subclass."""
    return isclass(item) and issubclass(item, Model)  # and not item.__ignore__()
Now, in my projects, I define my own base Model class that all app models subclass. This base class typically defines some standard database columns like primary keys, timestamps, and a few other things. Additionally, I may define other abstract base classes that I intend to be subclassed by other models for shared attributes that only apply to a certain set of similar models for a given context of the app. For this reason, my base Model class defines a class method named __ignore__, which includes some simple logic like asserting the class isn’t defined as __abstract__, or that the class name is not in a certain list of base model classes I do not want to be included in the auto model loading. You don’t need the last bit that’s commented out in the function above, but I’m leaving it there in case you find you have similar needs.
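I haven't shown my actual __ignore__ here, so treat the following as one possible shape rather than the real thing. It uses plain Python classes as stand-ins for db.Model so it runs without a database. The names BaseModel, TimestampedModel, User, and IGNORED_NAMES are all hypothetical.

```python
from inspect import isclass


class BaseModel:
    """Stand-in for a project-wide base model (hypothetical, not db.Model)."""

    __abstract__ = True
    # Names of base classes that should never be auto-loaded (illustrative).
    IGNORED_NAMES = ('BaseModel', 'TimestampedModel')

    @classmethod
    def __ignore__(cls):
        """Return True when the loader should skip this class."""
        # Look only at the class's own namespace so that concrete subclasses
        # do not inherit __abstract__ from an abstract parent.
        is_abstract = cls.__dict__.get('__abstract__', False)
        return is_abstract or cls.__name__ in cls.IGNORED_NAMES


class TimestampedModel(BaseModel):
    __abstract__ = True


class User(TimestampedModel):
    pass


def is_model(item):
    """Determine whether item is a concrete BaseModel subclass."""
    return isclass(item) and issubclass(item, BaseModel) and not item.__ignore__()


candidates = [BaseModel, TimestampedModel, User, 'not a class', dict]
loadable = [c.__name__ for c in candidates if is_model(c)]
print(loadable)
```

Both abstract bases are filtered out and only User remains. Note that the isclass and issubclass checks run first, so __ignore__ is never called on objects that don't define it.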
Putting it all together
Now that we’ve worked out what an automatic model loader should do, we can put it all together and create our load_models function that will be used by init_db.
# proj/lib/loaders.py
from importlib import import_module
from inspect import isclass
from os import sep, walk
from os.path import abspath, basename, dirname, join
from sys import modules

from flask_sqlalchemy import Model

__all__ = ('get_models', 'load_models')

# main project path & module name
# (one level up from proj/lib/ is the proj package directory itself)
PROJ_DIR = abspath(join(dirname(abspath(__file__)), '..'))
APP_MODULE = basename(PROJ_DIR)


def get_modules(module):
    """Yields all .py modules in given file_dir that are not __init__."""
    file_dir = abspath(join(PROJ_DIR, module))
    for root, dirnames, files in walk(file_dir):
        # convert the path below PROJ_DIR into a dotted prefix, e.g. proj.models
        mod_path = '{}{}'.format(APP_MODULE, root.split(PROJ_DIR)[1]).replace(sep, '.')
        for filename in files:
            if filename.endswith('.py') and not filename.startswith('__init__'):
                yield '.'.join([mod_path, filename[0:-3]])


def dynamic_loader(module, compare):
    """Iterates over all .py files in `module` directory, finding all classes that
    match `compare` function.

    Other classes/objects in the module directory will be ignored.

    Returns unique items found.
    """
    items = []
    for mod in get_modules(module):
        # use a distinct name so the `module` argument isn't shadowed
        imported = import_module(mod)
        if hasattr(imported, '__all__'):
            objs = [getattr(imported, obj) for obj in imported.__all__]
            items += [o for o in objs if compare(o) and o not in items]
    return items


def get_models():
    """Dynamic model finder."""
    return dynamic_loader('models', is_model)


def is_model(item):
    """Determines if `item` is a `db.Model`."""
    # __ignore__ is a custom classmethod on my base model class;
    # drop that clause if your models don't define it
    return isclass(item) and issubclass(item, Model) and not item.__ignore__()


def load_models():
    """Load application models for management script & app availability."""
    for model in get_models():
        setattr(modules[__name__], model.__name__, model)
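You’ve already seen how all of these pieces work by themselves. Our final task is adding the load_models function, which iterates over all the models found and attaches each one as an attribute of the loaders module itself (looked up through sys.modules). Importing each module along the way is what defines the model classes, so our database knows about them when it starts up.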
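If the setattr(modules[__name__], ...) line looks opaque, this standalone sketch shows the same move on a throwaway module. The module name demo_loaders and the two placeholder classes are invented for the example, and this load_models takes its target and models as arguments so it can run without a Flask app.

```python
import sys
import types

# A throwaway module standing in for proj.lib.loaders (name is hypothetical).
loaders = types.ModuleType('demo_loaders')
sys.modules['demo_loaders'] = loaders


class User:
    pass


class Invoice:
    pass


def load_models(target_name, models):
    """Attach each class to the module registered under target_name."""
    for model in models:
        setattr(sys.modules[target_name], model.__name__, model)


load_models('demo_loaders', [User, Invoice])

# The classes are now importable from the module like any other attribute.
from demo_loaders import Invoice as LoadedInvoice, User as LoadedUser

exported = sorted(n for n in vars(loaders) if not n.startswith('_'))
print(exported)
```

In the real loader the practical upshot is the same: after load_models runs, every discovered model can be reached from one place, which is handy for a management shell or script.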
Why the get_models function?
You can omit this part yourself if you’d like. I define the get_models function because I also have a couple custom manage.py commands for listing out all recognized models and database tables in my apps. I’ll probably cover those in a future post. They aren’t used often, but have proven helpful when I’ve been working on models and want to ensure my dynamic_loader is still successfully finding all the correct models in my apps. If you don’t want that extra function, your load_models would look like this:
# proj/lib/loaders.py
def load_models():
    """Load application models for management script & app availability."""
    for model in dynamic_loader('models', is_model):
        setattr(modules[__name__], model.__name__, model)
Up next
That about wraps up how I handle organizing, discovering, and dynamically loading as many models as my Flask apps may need. It brings Flask apps with multiple models into the hassle-free ease-of-use that is a hallmark of the heavier frameworks. I find this to be a significant win for Flask app development as adding models is as simple as defining a new model_name.py in your models directory.
Since we’ve been covering automatic discovery & loading of Python modules in a Flask app, I think it’s only natural that we turn our attention to doing the same thing for class-based views in Flask. Stay tuned.