War-room stories Pt I: Designing role based authorization for Micro-services using Elixir/Erlang OTP

Akul
Feb 17, 2017

There comes a point in every startup when, as a product architect, you wake up in a cold sweat one fine night because you’ve had that “what-if” dream. What if the whole thing topples? What if you have left out crucial use cases? Was it poor engineering, poor design, or a product that is the polar opposite of the requirements? After a lot of brainstorming you settle on one solution and try to stick to it, while keeping your ears open to improvisations. Rapid improvisations. Sometimes developers won’t agree and you need to go back to the drawing board, discussing network-level optimizations, scalability versus resource conflicts, or simply the wrong database in the wrong place.

And only then do you get to rethink the challenges of designing a product that will be battle-tested. In such cases, what I have learned from ‘back of the envelope’ design usually comes to the rescue. This is one such story.

[Part 1 of this blog sets the stage: the problem, the requirements and the approach taken. Part 2 concludes with the solution.]

Background

For as long as I can remember, resource ownership has been the crux of most SaaS applications, along with options for delegation, resource sharing, back-links and permission-based views.

And for just as long, I’ve seen many project management tools, to-do apps and ERP systems built to be as modular as possible. Yet they have all felt monolithic. If you are writing an app that has to be as free of vendor lock-in as possible, with a generic abstraction that you could hook up to an ERP and a project management system from different providers, there is a lot of API-level work to do.

Let me give you some background on our service model. I am working with a team developing an early-stage product management and trading systems platform. It is built on a polyglot stack of Python, Node.js and Erlang/Elixir. On an average day, it has 500 clients transacting 5 GB * 500 of data, which includes media such as videos, images, text and product catalog templates. When it scales, it will start off with add-on services for contracts handling, trading networks, tenders and auctions, handling at least a terabyte of data daily.

What if you designed an application for millions of consumers and wanted to integrate with two different sets of APIs that both managed *your* users’ data from a common pool of users’ resources, but you had to provide role-based permissions centrally in your app? And it has to be *modular*, because that’s how everyone seems to be going these days, as well as container-friendly and scalable? It has to span multiple datacenters and work in tandem, in sync, with high availability and high consistency, the usual CA and CP of CAP.

source: http://blog.scottlogic.com/

Faced with such a challenge and 6 hours to design and develop a unified role-based permission system, I came up with a segregation mechanism that confined each concern to one level and was compatible enough with our in-house architecture to avoid breaking changes.

Modular Micro-services

Consider a 10,000-foot view of an architecture such as this:

A function specific multi-layered modular architecture

All applications sit at, say, Tier A. A middleware at Tier B acts as an interface between the external internet and the internal network. Lastly, a storage layer, Tier C, is the main data store and is not locked in to any particular service.

Consider Tier A to be a group of modular app services. We could break a monolithic app down into task-specific services.

Tier B acts as an SNAT (Source Network Address Translation; think OpenStack Neutron) and as an API LUT (lookup table) middleware. It is responsible for letting apps cross-talk with each other, storing data from the apps in the storage layer, providing consistent security, verifying trusted apps within this private network and exposing public APIs to the external world. For now, let us concern ourselves with these two layers.

Zooming in a little

If I rotate Tiers A and B so that Tier A is on the left, this is what happens.

For communication, we use ZeroMQ (ZMQ) with a set of communication standards that define the services of each app A1, A2, A3 ... An. There is a queuing system between each app and Tier B to handle heavy traffic. There is a lot of scope for improvement here, but going into it would drag you away from the core topic.

Multiple Tier B middlewares M1, M2, M3 connected to a load balancer. All within one container.

Consider that the whole vendor lock-in free solution had to be abstracted away from Tier A, and that each application may have its own settings and its own purpose for fetching, or rather CRUD-ing, a user’s or entity’s data from a common entity management platform. That platform can therefore be brought into Tier B.

Keeping user privacy in mind, and given that we have TLS enabled and other security parameters in check, OAuth 2.0 seemed to be a good fit for our use case.

Enter OAuth 2.0

A great deal has been written about OAuth and how it has defined the way secure modular services can be achieved. I presume that readers who are not acquainted with it have at least skimmed through the OAuth spec.

Each app from A1 to An registers itself as a client, but the middleware is not a monolithic entity either. It is in turn broken down into two major platforms, each an umbrella service.

The first platform focuses purely on a realtime communication layer (henceforth RC). It performs API lookups against a lookup table (LUT) that lists the app-specific internal APIs registered for registered private IPs, and it lets apps cross-talk with each other by operating as a ROUTER-DEALER communication layer. The second platform is simply Resource Management by way of the OAuth specification.
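As a rough illustration of the lookup-table half of RC, here is a minimal Python sketch. It is not our implementation: the class name `ApiLookupTable`, the API names, the IPs and the endpoints are all hypothetical, and the ROUTER-DEALER transport itself is left out so that only the lookup and the private-IP trust check are shown.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ApiLookupTable:
    """Hypothetical RC lookup table: (app_id, api_name) -> internal endpoint."""

    # app_id -> private IP the app registered from (assumed schema)
    trusted_ips: Dict[str, str] = field(default_factory=dict)
    # (app_id, api_name) -> internal endpoint serving that API
    routes: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def register(self, app_id: str, private_ip: str, api_name: str, endpoint: str) -> None:
        self.trusted_ips[app_id] = private_ip
        self.routes[(app_id, api_name)] = endpoint

    def resolve(self, caller_id: str, caller_ip: str, target_app: str, api_name: str) -> str:
        # Only a caller whose source IP matches its registration may cross-talk.
        if self.trusted_ips.get(caller_id) != caller_ip:
            raise PermissionError(f"untrusted caller: {caller_id}@{caller_ip}")
        try:
            return self.routes[(target_app, api_name)]
        except KeyError:
            raise LookupError(f"no API {api_name!r} registered for {target_app}") from None


lut = ApiLookupTable()
lut.register("A1", "10.0.0.11", "catalog.list", "tcp://10.0.0.11:5555")
lut.register("A2", "10.0.0.12", "orders.create", "tcp://10.0.0.12:5555")

# A1 asks RC where A2's "orders.create" API lives.
endpoint = lut.resolve("A1", "10.0.0.11", "A2", "orders.create")
```

In this sketch an app calling from an unregistered IP gets a `PermissionError` before any routing happens, which mirrors the idea that RC only serves apps it already trusts.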

One need only think of all apps in the app layer as 3rd-party apps registering themselves with Facebook and acting on behalf of Facebook’s users once authorized. This is exactly how Resource Management isolates itself and trusts only RC. That trust can be established with asymmetric key validation or a signed certificate, and is a developer concern.

Resource Management itself is made up of a resource server and an authorization service, as per the OAuth 2.0 specification (Resource Management and the authorization service are henceforth called RM).

source: https://developer.salesforce.com/page/File:OAuthRoles.png

Once we got the hang of treating in-house apps as external apps, it was a no-brainer that this could also be a good path to commercialization: 3rd-party vendor apps can integrate with our OAuth service by registering themselves the same way the in-house apps do.

We have an OpenID implementation in the works that allows a large amount of metadata to be shared with other apps upon user sign-in, while the RC also acts as an LUT that stores the Bearer Access Tokens generated by RM per app. This makes the trusted RC a single point of reference: apps cross-talk via RC and access a user’s API resources on RM after authorization via RC.
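To make the token side of that lookup table concrete, here is a small Python sketch of how RC might cache the bearer tokens that RM issues. The class `BearerTokenTable`, the `(app_id, user_id)` key and the expiry handling are assumptions for illustration; the token string is a placeholder, not something RM actually produced.

```python
import time
from typing import Dict, Optional, Tuple


class BearerTokenTable:
    """Hypothetical RC-side table of bearer tokens, keyed by (app_id, user_id)."""

    def __init__(self) -> None:
        # (app_id, user_id) -> (token, expiry as a Unix timestamp)
        self._entries: Dict[Tuple[str, str], Tuple[str, float]] = {}

    def store(self, app_id: str, user_id: str, token: str, expires_at: float) -> None:
        self._entries[(app_id, user_id)] = (token, expires_at)

    def lookup(self, app_id: str, user_id: str, now: Optional[float] = None) -> Optional[str]:
        # `now` is injectable so the expiry logic can be exercised deterministically.
        now = time.time() if now is None else now
        entry = self._entries.get((app_id, user_id))
        if entry is None:
            return None
        token, expires_at = entry
        if now >= expires_at:
            # Expired: drop the entry so the app has to go back through RM.
            del self._entries[(app_id, user_id)]
            return None
        return token


tokens = BearerTokenTable()
tokens.store("A1", "user-42", "example-token-from-rm", expires_at=1000.0)
```

Keying by app as well as by user means a token granted to one app is never handed to another, even for the same user.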

Part A of Elementary Units & ES/CQRS: Permissions, Client Level Rights, User level Rights & Synchronizing databases

At this point, the whole architecture boils down to an RM dashboard.

Now it becomes easy to control users. It becomes easy for other apps to share a user’s resources based on the client-level access privileges that the user authorizes. It becomes easy for apps to create their own settings and permissions. But we have gone a step further and created a central repository of roles and permissions through a standard structure of Permission sets. With that, applications that depend on each other can share a common policy engine, which is simply an IFTTT-style (if this, then that) engine that automates tasks spanning two or more apps working in sync with each other.
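A toy version of such a policy engine could look like the Python sketch below. Everything here is an assumption made for illustration: the `app:resource:action` permission strings, the event name `A2.order_created` and the classes `PermissionSet`, `Rule` and `PolicyEngine` are hypothetical and only show the "if this, then that" shape gated by a Permission set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class PermissionSet:
    """A named bundle of permissions, written here as 'app:resource:action'."""
    name: str
    permissions: FrozenSet[str]


@dataclass(frozen=True)
class Rule:
    trigger: str                              # "if this": event emitted by one app
    required_permission: str                  # right needed for the follow-up task
    action: Callable[[Dict[str, str]], str]   # "then that": task in another app


class PolicyEngine:
    def __init__(self) -> None:
        self._rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        self._rules.append(rule)

    def handle(self, event: str, payload: Dict[str, str], granted: PermissionSet) -> List[str]:
        # Run every matching rule whose required permission the user holds.
        return [
            rule.action(payload)
            for rule in self._rules
            if rule.trigger == event and rule.required_permission in granted.permissions
        ]


engine = PolicyEngine()
engine.add_rule(Rule(
    trigger="A2.order_created",
    required_permission="A1:tasks:create",
    action=lambda payload: f"A1: create task for order {payload['order_id']}",
))

manager = PermissionSet("manager", frozenset({"A1:tasks:create", "A2:orders:read"}))
viewer = PermissionSet("viewer", frozenset({"A2:orders:read"}))
```

With these rules, an order created in A2 by a `manager` triggers a task in A1, while the same event from a `viewer` triggers nothing because the cross-app permission is missing.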

This is where we have scratched the surface and created a functional paradigm for role-based access rights. An administrative user with power privileges can grant universal privileges and app-based privileges to a particular user, or simply select one of the existing Profile templates with preconfigured roles per app. This kind of role management can follow either a “check access first” or a “fail first” rule when calling the APIs of any of the apps. In the former case, the apps are supplied with a dynamically created role-based access JSON for a user as they sign in; in the latter case, an API call is simply denied because the user doesn’t have the rights.
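The two rules can be contrasted with a short Python sketch. The role data, the JSON shape and the function names `access_document` and `call_api` are hypothetical; the actual structure of the role-based access JSON is not described in this part of the blog.

```python
import json
from typing import Dict, List

# Hypothetical per-user, per-app rights as they might be stored centrally.
ROLES: Dict[str, Dict[str, List[str]]] = {
    "alice": {"A1": ["catalog:read", "catalog:write"], "A2": ["orders:read"]},
}


def access_document(user_id: str) -> str:
    """'Check access first': a JSON document handed to the apps at sign-in."""
    return json.dumps({"user": user_id, "roles": ROLES.get(user_id, {})}, sort_keys=True)


def call_api(user_id: str, app_id: str, permission: str) -> str:
    """'Fail first': the call is attempted and denied if the right is missing."""
    if permission not in ROLES.get(user_id, {}).get(app_id, []):
        raise PermissionError(f"{user_id} lacks {permission} on {app_id}")
    return f"{app_id}:{permission} ok"


# Check access first: the app inspects the document before showing or calling anything.
doc = json.loads(access_document("alice"))
can_write_orders = "orders:write" in doc["roles"].get("A2", [])

# Fail first: the app just calls and handles the denial.
try:
    call_api("alice", "A2", "orders:write")
    denied = False
except PermissionError:
    denied = True
```

The trade-off in this sketch is that "check access first" lets an app hide what the user cannot do up front, whereas "fail first" keeps the apps simpler at the cost of a round trip that ends in a denial.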

Stay tuned for Part 2 of the blog, where we conclude with our system, which breaks all actions down into a series of hooks and atomic transactions based on Permissions.
