Amazon Web Services (AWS) has announced a new fully managed operations service that uses machine learning to help developers improve application availability. It does this by automatically detecting operational issues and recommending specific actions to address them, CEO Andy Jassy said during his re:Invent presentation on Tuesday.
The service, Amazon DevOps Guru, identifies abnormal application behavior, such as increased latency, error rates, or resource constraints that could cause service outages or interruptions, and then alerts developers with the details. It reports the resources involved, the timing of issues, and related events through Amazon Simple Notification Service (SNS) and integrations from partners such as Atlassian Opsgenie and PagerDuty, Jassy explained.
The goal is to help organizations quickly understand the potential impact and probable causes of a problem, with specific recommendations for addressing it. “Developers can use suggestions from Amazon DevOps Guru to reduce problem resolution time and improve application availability and reliability without the need for manual configuration or machine learning expertise,” AWS explains in a blog post.
“Application shutdowns caused by erroneous code or configuration changes, unbalanced container clusters, or depleted resources (e.g., CPU, memory, disk) inevitably lead to bad experiences for customers and loss of income.”
Like many of the services AWS offers to customers, DevOps Guru has been used internally, and Jassy describes it as the culmination of 20 years of operational expertise in building, scaling, and maintaining highly available applications for Amazon.com.
Speaking about the new service, Australia and New Zealand Chief Technology and Public Sector Transformation Director Simon Elisha said that, at the end of the day, DevOps is about doing a lot more with much less and acting more quickly. He added that DevOps Guru is the perfect tool to achieve this. “If you think about today’s systems, they generate more information than ever, more telemetry than ever, warnings, notifications, messages, etc., and that’s a good thing because you get a lot more information, but it can be very difficult to know when something is changing, when something is different, and a lot of what DevOps is all about is understanding the relationship between the changes you make in code and what goes into production,” he said. “The ability for anyone to see what’s going on in your environment without any manual configuration, without training models, without doing anything except a few clicks.”
Elisha said that thanks to the pandemic, teams are realizing they need to work more closely together, but that doesn’t necessarily mean being physically together. He pointed out that AWS Proton (more on this later), which was also announced on Tuesday, is a great example of bringing together “infrastructure people” and developers, but in a way that allows both to achieve the results they need.