Jamie Kuppens - Building Multi-Tenant Applications on App Engine

⚠️NOTICE: This article was written several years ago and is likely out of date.

With the rise of cloud computing and the proliferation of new users to the web, it has become a necessity for successful web applications to be able to serve thousands of users whilst being reliable, fast, and secure. People depend on web services for their day-to-day lives much more than they did during the internet’s infancy, and it’s expected that applications meet user expectations in order to stay relevant.

What is Multitenancy?

Multitenancy is a software architecture in which one application serves multiple tenants. Multi-Tenant applications can easily provision access to tenants with minimal effort from the service provider which makes it ideal for modern applications. Multitenancy is regarded as an important feature of cloud computing.

Tenants

A tenant can refer to either a user, a client, or an organization.

Tenant is a loosely defined term, but it’s typically used to describe either a user, a client, or an organization. For example an organization on GitHub with it’s own set of repositories can be regarded as a tenant, while a GitHub user may also be considered one. Another example can be of a SaaS application where the tenant is a client and users are owned by the tenant rather than being the tenant themselves. In paid applications it is typically the tenant who pays, which in it of itself is the true definition of a tenant.

Data Isolation

Each tenant may have private information that they don’t want exposed without giving explicit permission. How data is isolated depends on how data is actually stored. For example if the developers choose to use one database for all tenants, they would have to tread carefully when writing queries so data from other tenants is not accidentally leaked. This is known as the Shared Schema approach and is common when dealing with relational data.

Another approach to dealing with data is to create separate databases for each tenant to guarantee data does not leak, though relationships cannot be made between databases when isolated.

Each of these approaches have their use case and it’s not one or the other, in fact a mixture of both approaches can be used. For example if your application is a simple public facing web service where the tenants are users, then you may only need to isolate private user data that would never be shared into separate databases, but store user credentials and relational data in a shared database for quick access.

Availability

Since a multi-tenant application is just one application serving multiple tenants, if the application goes down all tenants will be denied service. This differs from a single-tenant approach where the application is provisioned individually for each tenant, in which case if the application stops working it affects only the one tenant.

Namespaces

For quite a while now, App Engine has included options with their various APIs that makes implementing Multi-Tenant applications easy. Multitenancy in App Engine revolves around the use of Namespace enabled APIs.

Namespaces allow you to separate data into their own isolated silos through a simple API which can be used with minimal effort. Namespaces are accessed through a namespace id which are created implicitly when accessed. For example, if you try to store notes in datastore with the namespace my_notes and it does not exist, App Engine will happily create it for you and place the data in with no questions asked. One thing to note is that when you use a namespace enabled API under app engine without specifying a namespace, resources are stored and retrieved from a default unnamed namespace.

Let’s use namespaces for storing private user data. For simplicity, we’re going to be using the Users API for authentication. To create namespaces for users, we must have a unique identifier to separate data from other users. The Users API exposes a user id which can be used in namespace names when querying data specific to that user. Since user ids are unique, we don’t ever have to worry about leaking information to the wrong user.

Note that this is not a tutorial about the basics of App Engine and I’ll gloss over alot here. First things first we need to allow authentication with the Users API. Here is a basic App Engine app that lets a user login and obtain a user id with user.user_id().

import webapp2

from google.appengine.api import users

class MainPage(webapp2.RequestHandler):
    def get(self):
        body = ''
        user = users.get_current_user()
        if user:
            nickname = user.nickname()
            logout_url = users.create_logout_url('/')
            body = 'Welcome, {}! (<a href="{}">Sign Out</a>), ID: {}'.format(
                nickname, logout_url, user.user_id())
        else:
            login_url = users.create_login_url('/')
            body = '<a href="{}">Sign in</a>'.format(login_url)

        self.response.headers['Content-Type'] = 'text/html'
        self.response.write('<html><body>{}</body></html>'.format(body))

app = webapp2.WSGIApplication([
    ('/', MainPage),
], debug=True)

Now that we have an id we can start using namespaces. The App Engine python library uses global state for keeping track of the current namespace used by namespace enabled APIs. I’ll create a counter to keep track of how many times the current user has loaded the page that only the current user can see.

from google.appengine.api import users
from google.appengine.api import namespace_manager
from google.appengine.ext import ndb

class MainPage(webapp2.RequestHandler):
    def get(self):
        ...

        if user:
            nickname = user.nickname()
            logout_url = users.create_logout_url('/')
            body = 'Welcome, {}! (<a href="{}">Sign Out</a>), ID: {}'.format(
                nickname, logout_url, user.user_id())

            # Save the current namespace so it can be reset. It's good practice
            # to reset the namespace to the previous state so code that uses no
            # namespaces does not write to the wrong place.
            previous_namespace = namespace_manager.get_namespace()
            try:
                # Create a namespace name with the user's id appended to it
                # before running the update counter function.
                namespace_manager.set_namespace('user_{}'.format(user.user_id()))
                count = update_counter('counter')

                body += '<br>Visit Count: {}'.format(count)
            finally:
                # Always restore the saved namespace. This will still run if the
                # counter update fails.
                namespace_manager.set_namespace(previous_namespace)

...

class VisitCounter(ndb.Model):
    previous_namespace = namespace_manager
    count = ndb.IntegerProperty()

@ndb.transactional
def update_counter(name='visits'):
    """Increment a counter entity with a given name."""

    counter = VisitCounter.get_by_id(name)
    if counter is None:
        counter = VisitCounter(id=name, count=0)
    counter.count += 1
    counter.put()

    return counter.count

Now the user visit count for the currently logged in user will be stored in datastore under a namespace specific to the user.

If you log into another user, then you will get a different count as other users namespace names are not shared. There are many more APIs that use namespaces such as Memcache, Task queue, and Search.

Summary

App Engine APIs make it easy to write scalable multi-tenant applications using namespaces. If you want to learn more about App Engine, I suggest you start with their python guide here. You can get the source code of this demo on GitHub.