Metrics and traces provide important context for understanding what an application is doing and finding areas for improvement. But sometimes lower-level issues emerge from how code interacts with runtime environments, or crashes and constraints result from problems with garbage collection and memory management. When standard observability practices don't provide enough insight, how do you know where a problem is coming from and what to optimize?
Microservices have become the industry standard for building new applications and most modern operations tooling is designed for service-oriented architectures. While refactoring a large monolithic application into smaller services is desirable, it’s not always feasible or practical. Braze’s core platform, a monolithic Ruby application, handles so many features that it would take months or even years to do this safely. So how do you operate a monolith in a microservices world? In this session, I’ll share how the team at Braze developed a microservices approach to our monolith that lets us better utilize modern tools, increase developer velocity, and improve reliability.
Season 02 Episode 04. Scott interviews Tim Koopmans, founder of Tricentis Flood, and discusses browser-level load testing. He also interviews Zach McCormick and Matt DiSipio of Braze, who demonstrate how they use Flood to load test their platform, which handles billions of transactions a day.
Deploying changes can be a big deal when working with high-frequency, highly-available distributed systems. How do you detect if something went wrong? From there, how do you measure the impact? How do you track performance changes across thousands of nodes? In this talk, learn what systems and best practices Braze implemented to monitor and understand the effects of deploying changes to a massive, always-changing distributed system where performance is critical. Whether you're deploying new features in code or adding a new class of infrastructure, no technology works out of the box at scale - a lesson we learned the hard way with one of our latest new features, Content Cards.
How do you determine whether your MongoDB Atlas cluster is over-provisioned, whether the new feature in your next application release will crush your cluster, or when to increase cluster size based upon planned usage growth? MongoDB Atlas provides over a hundred metrics enabling visibility into the inner workings of MongoDB performance, but how do you apply all this information to make capacity planning decisions? This presentation will enable you to effectively analyze your MongoDB performance to optimize your MongoDB Atlas spend and ensure smooth application operation into the future.