The case for full-stack observability in a modern world of distributed applications

The app-driven digital economy and the future of work, which had been slowly emerging over the past few years, experienced an adrenaline rush in March 2020. Prior to the pandemic, 50% of companies responding to the World Economic Forum expected software, automation, and AI to drive significant re-skilling of their workforce as well as workforce reductions. COVID-19 has dramatically accelerated and intensified this trend, deeply affecting software developers.

More and more business transactions, autonomous supply chain control loops, healthcare delivery, agricultural efficiency, education and entertainment are happening through modern distributed cloud-native applications.

The app is the new brand

The business agility and quality of digital experience provided by modern apps have led to the latest industry mantra: app experience is the new brand. This app experience demands a faster cadence of features and functions, constant availability, improved app performance, and paramount trust and security around the data managed by the app. AppDynamics’ Application Attention Index suggests that brands have a chance to offer a “total app experience.”

At the heart of delivering this app experience is the developer, who is now tasked with delivering these apps and features faster, with higher availability and better security than ever before. Developers now live in the land of plenty and the age of choice. They have a wide range of software APIs and services available to build applications, ranging from mobile APIs to public cloud APIs, SaaS APIs, edge computing APIs, and on-premises APIs that their internal development teams could provide. They must select software services that streamline application development while keeping customer data secure. Building modern applications powered by external, internet-centric cloud environments is very different from building on the monolithic, closed platform of a bare metal server or virtual machine.

In this modern, distributed application development environment, which runs on complex underlying network and internet infrastructure, being able to observe your applications end-to-end and top-down, across all APIs, software services, back-end subcomponents, and software and hardware infrastructure, is critical to delivering better customer experience, application availability, and performance. This visibility is also essential for reducing the mean time to resolution (MTTR) of failures and for monitoring KPIs that show how the business is doing and how it is affected, positively or negatively, by software and infrastructure changes. This is called full-stack observability.

Full-stack observability allows anyone (developer, SRE, product manager, customer success, or business manager) to answer “What happened?”, “Where did this happen?”, “Why did this happen?”, and “Can this happen in the future?”

It is helpful to illustrate this with a real-world example, in which full-stack, end-to-end observability helped reduce the MTTR and the business impact of a failure in a modern banking application.

Alice and her date with full-stack observability

Alice is a developer on the mobile banking application team at New Bank, Inc. Two months into the pandemic, her product manager asked her to develop a new feature for the New Bank mobile application: contactless cash withdrawal. A customer would use the feature to first locate the nearest ATM and get directions to it. The mobile app would then verify ATM proximity, customer credentials, and the amount to be withdrawn from the account. The customer is then simply prompted to collect the money at the ATM (yes, touch is involved at this point), without having to touch the ATM’s high-traffic screens or buttons.

The customer experience was pretty straightforward, but the developer experience was anything but. Alice had to start with the mobile APIs (say, iOS), because that’s where her customers interacted with the app. Her entire back end was in AWS, so she had to select her AWS services carefully, while customer data was accessed through Salesforce SaaS APIs. Her bank’s transactional back ends lived on-premises, on bare metal servers running a monolithic database whose APIs provided a global and account-level view of consistency, while the edge compute nodes at the bank’s ATM branches had a different set of APIs to manage geotagged cash consistency. There were other SaaS APIs to manage location, identity, compliance, and so on.

A month after the production rollout, the customer success team started receiving an increased number of calls about the contactless cash withdrawal feature, which was taking too long to dispense cash at various ATMs. At the same time, using a full-stack observability solution, the business metrics team observed an increase in transaction latency in the Digital Experience Monitoring (DEM) dashboard for the mobile banking application.

Alice and her fellow developers and SREs begin invoking code against the full-stack observability APIs, which uniformly query and correlate relevant events (metrics, logs, and traces) from a data platform that covers every API, application, service, and piece of infrastructure (hardware or software) described in the distributed development environment above. The full-stack observability UX allows each person (developer, SRE, product manager, business leader, or customer success) to focus only on the events that are relevant to them.
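To make the idea of correlating metrics, logs, and traces a little more concrete, here is a minimal Python sketch. It is not the API of any real observability product: the Event type, the component names (aws-withdrawal-service, onprem-db, ios-app), the timestamps, and the sample details are all invented for illustration. It simply groups events from different telemetry types by the component that emitted them within a time window, then lists the components that appear in more than one kind of telemetry.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    source: str   # hypothetical component name, e.g. "onprem-db"
    kind: str     # "metric", "log", or "trace"
    detail: str


def correlate(events, start, end):
    """Group every event inside [start, end] by the component that emitted it."""
    by_source = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        if start <= event.timestamp <= end:
            by_source[event.source].append(event)
    return dict(by_source)


def multi_signal_sources(correlated):
    """Return components that show up in at least two kinds of telemetry."""
    return sorted(
        source
        for source, evs in correlated.items()
        if len({e.kind for e in evs}) >= 2
    )


# Invented sample data standing in for what a data platform might return.
T0 = datetime(2020, 6, 1, 12, 0)
EVENTS = [
    Event(T0, "aws-withdrawal-service", "metric", "p95 latency 180 ms"),
    Event(T0 + timedelta(minutes=20), "aws-withdrawal-service", "trace",
          "slow span calling onprem-db"),
    Event(T0 + timedelta(minutes=25), "onprem-db", "log",
          "memory bank failure reported"),
    Event(T0 + timedelta(minutes=30), "onprem-db", "metric",
          "request queue depth rising"),
    Event(T0 + timedelta(hours=3), "ios-app", "log", "user logged in"),
]

window = correlate(EVENTS, T0, T0 + timedelta(hours=1))
suspects = multi_signal_sources(window)
```

With this sample data, the one-hour window keeps the events from the two back-end components and drops the later, unrelated mobile log line. A real platform would add far more (service topology, trace IDs, sampling), but the core step of lining up different telemetry types on a shared timeline and a shared component name is the same.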

After a few quick debug cycles, they noticed that the latency between a service in AWS US-East and their on-premises software stack had increased steadily over the past hour. Using any capable monitoring tool, one could easily jump to the conclusion that it might be a network issue. But using full-stack observability, they were able to discover that a few memory (RAM) banks on their on-premises database server had failed. This forced that database server to queue incoming requests, which increased the service layer latency between the AWS service and their on-premises software stack.
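The reasoning in this diagnosis can be sketched in a few lines of Python, under stated assumptions. The latency samples, the 0.5 ms-per-minute threshold, and the infrastructure events below are all made up for illustration, and the functions are hypothetical helpers rather than part of any observability tool. The sketch fits a least-squares line to latency samples to confirm that latency is steadily rising, and only then pulls the infrastructure events recorded in the same window as candidate causes.

```python
def latency_slope(samples):
    """Least-squares slope (ms per minute) over (minute, latency_ms) samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    den = sum((x - mean_x) ** 2 for x, _ in samples)
    return num / den


def candidate_causes(samples, infra_events, min_slope=0.5):
    """If latency is trending up, return infrastructure events in the sampled window.

    min_slope is an arbitrary illustrative threshold, not a recommended value.
    """
    if latency_slope(samples) < min_slope:
        return []
    first = min(x for x, _ in samples)
    last = max(x for x, _ in samples)
    return [event for event in infra_events if first <= event[0] <= last]


# Invented data: latency between the cloud service and the on-premises stack,
# sampled every 15 minutes over one hour, as (minute, latency_ms).
RISING = [(0, 40), (15, 55), (30, 70), (45, 85), (60, 100)]
FLAT = [(0, 40), (15, 41), (30, 40), (45, 39), (60, 40)]

# Invented infrastructure events as (minute, component, description).
INFRA_EVENTS = [
    (5, "onprem-db", "memory bank failure"),
    (200, "edge-node", "firmware update"),
]

causes = candidate_causes(RISING, INFRA_EVENTS)
```

In this toy data set, the rising series surfaces the memory failure on the database server, while a flat series returns nothing to investigate. The point is the one made above: a latency chart alone suggests a network problem, whereas putting hardware events on the same timeline points to the actual cause.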

If software ate the world…

Then, full-stack observability will ensure that the software is feature-rich, rapidly evolving, high-performing, reliable, and secure, and will ensure that consumers of that software get the best possible digital experience. This becomes especially true with modern distributed software built on a variety of APIs and infrastructure stacks, distributed across third-party vendors, and running over the Internet.
