Wednesday, July 17, 2019

LIFESPAR cloud architecture principles to follow

With more and more organisations adopting cloud technologies for their applications, I've seen the tendency to just "lift and shift" architectures. Physical servers are replicated as virtual machines and the same software applications as before are run "on somebody else's computer".

But this approach doesn't leverage many of the benefits of software-as-a-service (SaaS) or the new cloud-only components available as platform-as-a-service (PaaS). It also means that architects creating cloud-native solutions need a different set of principles to those they used before.
Some of the best guidance I have seen in this area comes from Gartner, who use the acronym LIFESPAR to explain the principles to follow when designing cloud-native architectures:
  • Latency-aware
  • Instrumented
  • Failure-aware
  • Event-driven
  • Secure
  • Parallelizable
  • Automated
  • Resource-consumption-aware

So what is LIFESPAR and what does it mean to an architect?

Latency-aware:
Understand that every application needs to be designed & implemented knowing that it may not get an instant response to each request. This latency could be milliseconds, or it could be seconds, so ensure each solution elegantly deals with delays and is tested to prove it works under these conditions.
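One way to deal with latency elegantly is to put a deadline on every remote call and fall back to a safe default (such as a cached value) when the deadline is missed. Here is a minimal sketch of that idea; `slow_lookup` and the timings are illustrative stand-ins, not a real service call:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def call_with_deadline(fn, timeout_s, fallback):
    """Run fn, but return fallback if it exceeds the deadline."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn)
        try:
            return future.result(timeout=timeout_s)
        except TimeoutError:
            # The remote call is too slow; degrade gracefully.
            return fallback

def slow_lookup():
    time.sleep(0.5)  # simulate a slow downstream service
    return "live value"

# Exceeds the 100 ms deadline, so the cached value is returned instead.
print(call_with_deadline(slow_lookup, timeout_s=0.1, fallback="cached value"))
```

Testing the same code path with an artificially slow dependency, as above, is exactly the "prove it works under these conditions" step.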

Instrumented:
Every solution and every component must generate sufficient data about their usage and ideally send this information back to a central location, so that real-time & subsequent decisions about the architecture, cost, volumes, etc. can be made. In this way, an instrumented approach supports an elastic automatically scaling system (scaling both up AND down).
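At its simplest, instrumentation can be a decorator that records call counts and durations for each component. The sketch below keeps the figures in a local dict for illustration; in a real system they would be shipped to a central metrics store so scaling decisions can be made on them:

```python
import time
from collections import defaultdict

# Local stand-in for a central metrics store.
metrics = defaultdict(lambda: {"calls": 0, "total_s": 0.0})

def instrumented(fn):
    """Record call count and cumulative duration for fn."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            m = metrics[fn.__name__]
            m["calls"] += 1
            m["total_s"] += time.perf_counter() - start
    return wrapper

@instrumented
def handle_request(payload):
    return payload.upper()

handle_request("order-123")
handle_request("order-124")
print(metrics["handle_request"]["calls"])  # 2
```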

Failure-aware:
Remember that things fail (hardware, processes, etc.) and that software, created by humans, is rarely bug-free. So always design solutions that wait, or fail & recover, in the way you need them to. Failures must also be comprehensively tested - if necessary, by writing code to force failures (e.g. the Chaos Monkey). Also bear in mind that in some scenarios... latency and failure are the same thing.
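A common failure-aware pattern is retrying with exponential backoff, then surfacing the error only once all attempts are exhausted. A minimal sketch, with a deliberately flaky stand-in dependency to force the failures:

```python
import time

def retry(fn, attempts=3, base_delay_s=0.01):
    """Retry fn with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay_s * (2 ** attempt))

# A flaky dependency that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(retry(flaky))  # "ok", on the third attempt
```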

Event-driven:
Applications used to be developed with synchronous actions (as the performance, etc. of the target system was known). But in a decoupled cloud architecture, messages need to be sent as events. This also simplifies scaling and makes solutions more resilient to failure.
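The shape of an event-driven design can be sketched with an in-process queue: producers publish events and consumers process them later, with no synchronous call between the two. In the cloud the queue would be a managed message service, and producers and consumers would scale independently:

```python
import queue

# In-process stand-in for a managed message queue.
events = queue.Queue()

def publish(event_type, payload):
    """Producer: fire an event and carry on; no waiting for the consumer."""
    events.put({"type": event_type, "payload": payload})

def consume_all(handlers):
    """Consumer: drain the queue, dispatching each event to its handler."""
    processed = []
    while not events.empty():
        event = events.get()
        handlers[event["type"]](event["payload"])
        processed.append(event["type"])
    return processed

orders = []
publish("order_placed", {"id": 1})
publish("order_placed", {"id": 2})
print(consume_all({"order_placed": orders.append}))
```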

Secure:
Always assume your solution will be subject to some sort of malicious activity and try to prevent it. This means restricting events and users, minimising attack surfaces and following best-practice data-handling and security processes.
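Restricting input is usually best done with an allowlist: accept only the exact shapes you expect and reject everything else, rather than trying to blocklist known-bad input. The action names and pattern below are illustrative assumptions:

```python
import re

ALLOWED_ACTIONS = {"read", "list"}            # only the actions we expect
SAFE_NAME = re.compile(r"^[a-z0-9-]{1,32}$")  # only simple resource names

def validate_request(action, resource):
    """Return True only for requests matching the allowlist."""
    return action in ALLOWED_ACTIONS and bool(SAFE_NAME.match(resource))

print(validate_request("read", "invoice-42"))     # True
print(validate_request("delete", "invoice-42"))   # False: action not allowed
print(validate_request("read", "../etc/passwd"))  # False: unsafe name
```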

Parallelizable:
Many small systems are usually cheaper than one large one, even in the cloud. Therefore, find ways to scale-out your solution and its processing & messaging, rather than scaling-up.
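Scaling out means splitting the work into chunks that many small workers process independently, then combining the results. A sketch of the pattern, using a thread pool as a stand-in for a fleet of small cloud workers:

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    """Stand-in worker: each small node handles one slice of the data."""
    return sum(x * x for x in chunk)

def scale_out(data, workers=4):
    """Split the work across many small workers instead of one big one."""
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))

print(scale_out(list(range(10))))  # 285, the same answer as one big worker
```

The key design property is that the answer doesn't depend on how many workers you use, so you can add or remove small nodes freely as load changes.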

Automated:
Every cloud-based component, and your overall solution, should be able to be deployed, started, stopped & reset via scripts. Remember to test this from a command line, from the beginning of development through to your Operational Acceptance testing (and even as a means of testing Disaster Recovery processes).
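The scriptable lifecycle can be sketched as a single dispatch function that a command line or CI/CD pipeline can drive. The lifecycle hooks below are hypothetical placeholders; in a real solution each would call your cloud provider's API or CLI:

```python
# Hypothetical lifecycle hooks; all names here are illustrative only.
def deploy(): return "deployed"
def start():  return "started"
def stop():   return "stopped"
def reset():  return "reset"

COMMANDS = {"deploy": deploy, "start": start, "stop": stop, "reset": reset}

def run(command):
    """Dispatch one lifecycle command, so scripts can drive the whole flow."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command}")
    return COMMANDS[command]()

# The same entry point works from a shell script, a pipeline, or a DR drill.
print(run("deploy"), run("start"), run("stop"), run("reset"))
```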

Resource-consumption-aware:
So, you now have almost limitless processing and storage resources at your fingertips. But you also either have your credit card charged as you use the service or are going to get an invoice for what you use very soon... Therefore, always consider using the least amount of cloud resources possible. Simplify your solution, build & burn components & environments, automate components to start & stop only when they are needed and don't store more than you need (e.g. by sharing test data across solutions & environments).
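Automating "stop only when needed" can be as simple as a scheduled job that flags environments idle beyond a limit and shuts them down. A minimal sketch with illustrative last-used timestamps (the environment names and 24-hour limit are assumptions):

```python
from datetime import datetime, timedelta

# Illustrative records of when each environment was last used.
last_used = {
    "dev":  datetime.now() - timedelta(hours=30),
    "test": datetime.now() - timedelta(minutes=20),
}

def environments_to_stop(last_used, idle_limit=timedelta(hours=24)):
    """Flag environments idle longer than the limit so they can be stopped."""
    now = datetime.now()
    return sorted(name for name, used in last_used.items()
                  if now - used > idle_limit)

print(environments_to_stop(last_used))  # ['dev']
```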
