Accelerating Ulta Beauty’s modernization with managed containerized microservices
Michael Alderson
Senior Cloud Architect, Ulta Beauty
Dave Bartoletti
Sr. Product Manager, Cloud Runtimes
Founding the digital store of the future
With a perfectly symmetrical cat-eye toward tomorrow, Ulta Beauty has been laying the foundation for its upcoming Digital Store. This transformational, digital touchpoint delivers a redesigned, highly personalized, and compelling e-commerce experience. To do this right, Ulta Beauty requires a modern, containerized platform, an agile infrastructure to support rapid application development, and managed services to reduce operational burden on an already busy team. With Google Cloud, Ulta Beauty is delivering an e-commerce platform that scales globally and efficiently to deliver an even more engaging, enjoyable, and accessible shopping experience for all.
In 2019, Ulta Beauty began moving away from its legacy e-commerce platform, which ran in the company’s own on-premises data centers. The monolithic platform had become increasingly difficult to update and upgrade, which slowed the delivery of new features and capabilities. To remove these bottlenecks, Ulta Beauty decided to refactor the platform into multiple distinct microservices, each performing a unique function, so development teams could create and test new features and release fixes faster. Microservices can scale independently, but because they are interconnected and visible to one another, they can introduce new system complexities at scale. It was critical for Ulta Beauty to deploy microservices quickly, without the distraction of integrating services with one another or managing the underlying infrastructure itself.
Ulta Beauty chose Google Cloud for its leadership and expertise with containers and Kubernetes and for its unified managed container services, Google Kubernetes Engine (GKE) and Anthos. Google Cloud’s broader managed services, knowledgeable technical support, and cost-efficient pricing were added perks.
A blushing partnership
Upon joining Ulta Beauty, Senior Cloud Architect Michael Alderson had no experience with Kubernetes or Google Cloud, but he understood containers. To begin this modernization journey, Michael sought a new way to create and manage infrastructure, putting it in developers’ hands quickly.
Recognizing the many ways productivity could be impacted, Michael needed to ensure his environment was ready with properly configured cloud ‘landing zones’ whenever developers needed them. His developers worked hard to build containers but lacked production-like dev or test environments. Michael said, “They couldn’t test containers with the new APIs needed for integration. With GKE and Anthos, developers have dev environments on Google Cloud whenever needed, making them ten times more productive.”
As a newly minted Cloud Architect, Michael familiarized himself with Google Cloud’s training pathways and certifications. With his experimental mindset, he got to work studying Kubernetes, containers, and the role of a service mesh to manage large fleets of containerized microservices at scale.
First up was putting GKE to work. Michael and team soon discovered a key benefit of ephemeral container environments, managed by Kubernetes: they could quickly try something, learn, and try again — without a huge upfront investment.
With Google Cloud and GKE, the Ulta Beauty IT team could now create, manage, optimize, and secure container platforms for developers in record time. “We probably built over a thousand clusters and burnt them down learning Kubernetes — this would have taken months or years to accomplish before.” GKE allowed them to stand up and tear down new environments for developers. “We replicated what five teams would have to do with a monumental effort to integrate vendors and services — and we did that as a single, cohesive unit. This technology accelerated our efforts beyond what we were able to achieve previously.” Michael added, “With ephemeral container development environments on GKE, spun up on demand, we can deploy a new feature to a microservice in about 10 minutes — globally — a fraction of the time required in the past. And importantly, the risk factor in any update is drastically reduced with microservices.”
Yet, Michael still needed to maintain control over his growing, increasingly complex web of interconnected e-commerce microservices.
Maximizing engineering resources with Anthos
Enter Anthos, Google Cloud’s managed platform for consistent, holistic management, observability, and security for distributed containerized apps wherever they are built or run. Anthos includes a managed service mesh, which dramatically streamlines service delivery, eases traffic management, secures communication between services, and speeds up troubleshooting with deep visibility into inter-service networking. A service mesh also streamlines inter-app communication, asserts rules over which services can talk to each other, and assures high availability of services in the event of a failure.
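To make the idea of "rules over which services can talk to each other" concrete, here is a minimal sketch of a default-deny allow-list, the general pattern a service mesh enforces between workloads. It is an illustration only, not Anthos Service Mesh or Istio code; the `MeshPolicy` class and the service names (`storefront`, `cart`, `checkout`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class MeshPolicy:
    """Default-deny policy: a call is permitted only if it was explicitly allowed."""

    # Each entry is a (source service, destination service) pair.
    allowed: Set[Tuple[str, str]] = field(default_factory=set)

    def allow(self, source: str, destination: str) -> None:
        """Record that `source` may call `destination`."""
        self.allowed.add((source, destination))

    def is_allowed(self, source: str, destination: str) -> bool:
        """Return True only for explicitly allowed pairs; everything else is denied."""
        return (source, destination) in self.allowed


# Hypothetical services, for illustration only.
policy = MeshPolicy()
policy.allow("storefront", "cart")
policy.allow("cart", "checkout")

print(policy.is_allowed("storefront", "cart"))      # an explicitly allowed call
print(policy.is_allowed("storefront", "checkout"))  # never allowed, so denied
```

In a real mesh these rules are declared as configuration and enforced in the network layer rather than in application code, which is why developers do not need to implement the checks themselves.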
Anthos Service Mesh is Google’s fully managed implementation of open-source Istio, which alone is powerful, but can be difficult to install, configure, and maintain. Anthos Service Mesh allows Michael and team to rely on Google Cloud to manage the Istio components, so they can focus on optimizing and troubleshooting Ulta Beauty’s apps. “The fact that we have metrics built in and can use those metrics to auto-scale and auto-heal, which is native with Anthos Service Mesh, fuels a better guest experience.”
“We can better quantify guest experiences because we see errors, reporting, and where we haven’t gotten it right yet. It enables our software development and release processes (DevOps) to mature, leading to better business choices and better guest experiences. Today, this is a competitive advantage for Ulta Beauty.” Michael added that each microservice is dynamically instrumented by Anthos, so developers don’t have to learn the entire system to debug one component or to track monitoring and log data.
Anthos Config Management, another managed service of Anthos, allows the team to spin up new environments from a standard template with predefined networking and security policies. Leveraging the declarative power of Kubernetes, Ulta Beauty can quickly deploy uniform configurations across its fleet of clusters, enforce consistent security guardrails, and share load balancing across clusters and regions. For Michael, Anthos made it possible to define and then “...establish environments and automatically keep them the same, from dev to prod, which was a necessity. We have 70-80 different integrations to test in our new ecommerce platform. Without Anthos, we’d need to spend a week every month sorting them across dev and test environments. Anthos keeps everything automatically up to date.”
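The "keep them the same, from dev to prod" behavior follows a declarative pattern: state the desired configuration once, compare each environment against it, and correct any drift. The sketch below illustrates that pattern in plain Python. It is not the Anthos Config Management API; the function names, cluster names, and settings (`network_policy`, `mtls`) are assumptions made for the example.

```python
from copy import deepcopy
from typing import Dict, List, Tuple

Config = Dict[str, str]


def find_drift(desired: Config, actual: Config) -> Config:
    """Return the desired settings that are missing or different in `actual`."""
    return {
        key: value
        for key, value in desired.items()
        if key not in actual or actual[key] != value
    }


def reconcile(
    desired: Config, clusters: Dict[str, Config]
) -> Tuple[Dict[str, Config], Dict[str, List[str]]]:
    """Apply the desired settings to every cluster.

    Returns the reconciled cluster states plus a report of which
    settings had drifted in each cluster. Inputs are not modified.
    """
    reconciled: Dict[str, Config] = {}
    report: Dict[str, List[str]] = {}
    for name, actual in clusters.items():
        drift = find_drift(desired, actual)
        new_state = deepcopy(actual)
        new_state.update(drift)
        reconciled[name] = new_state
        report[name] = sorted(drift)
    return reconciled, report


# Hypothetical template and environments, for illustration only.
desired = {"network_policy": "default-deny", "mtls": "strict"}
clusters = {
    "dev": {"network_policy": "default-deny", "mtls": "permissive"},
    "prod": {"network_policy": "default-deny", "mtls": "strict"},
}

reconciled, report = reconcile(desired, clusters)
print(report)
```

Here only the `dev` cluster has drifted, so the report lists `mtls` for `dev` and nothing for `prod`. A managed service runs this kind of comparison continuously, which is what removes the manual sorting work described in the quote.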
Because of Anthos Multi Cluster Ingress’ built-in traffic routing, Ulta Beauty can eliminate the need for certain third-party services to assist orchestration. When the company’s security team requested an upgrade to the existing firewall to harden the organization’s security posture, Michael showed them why it was unnecessary and how it would actually hamper performance. “Rather than significantly invest in a next-generation firewall we didn’t need, we can solve it within the mesh, as it should be. It should be part of the network, not a separate element that slows us down.” Ulta Beauty saved several hundred thousand dollars in additional licensing fees by using Anthos’ built-in traffic routing.
For Michael, the flexibility of GKE and Anthos to spin up servers means his developers can learn hands-on by testing code, tearing it down, and trying again. “It’s important to make mistakes and not be crucified, we all need the ability to learn by doing and that’s what Google Cloud allows.”
Anthos’ ease of use is accelerating Ulta Beauty’s e-commerce modernization and already improving the organization’s business workflows. The shift to Anthos cloud services, which are available anywhere, reduces risk and the severity of errors by offloading platform availability to Google Cloud. When Michael was forced to work remotely for over two months, his team ran without a problem and continued innovating.
Since implementing its new service mesh, Ulta Beauty’s modernization has been progressing rapidly, especially because the development team no longer needs to focus on security or platform operations; they’re free to concentrate on building engaging guest services and apps. Mean time to recover (MTTR) from errors is now measured in seconds versus hours, at least two orders of magnitude faster. Errors are now traced quickly to the clusters or pods affected, and can be remediated without impacting any other components – importantly, without impacting guest experiences. With faster response times and much shorter MTTR, Ulta Beauty’s guests enjoy a superior experience on its website and throughout the purchase process.
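As a quick check on the "two orders of magnitude" figure, the arithmetic can be sketched as follows. The specific durations are hypothetical, since the article states only that recovery went from hours to seconds: one hour is 3,600 seconds, so any recovery time of 36 seconds or less is at least 100 times faster.

```python
import math


def orders_of_magnitude(before_seconds: float, after_seconds: float) -> float:
    """Return how many powers of ten separate the two durations."""
    return math.log10(before_seconds / after_seconds)


# Hypothetical figures: a one-hour recovery before, a 30-second recovery after.
before = 1 * 3600
after = 30

speedup = before / after
print(f"{speedup:.0f}x faster")
print(f"{orders_of_magnitude(before, after):.2f} orders of magnitude")
```

With these assumed numbers the speedup is 120x, a little over two orders of magnitude. Longer original outages or faster recoveries would widen the gap further.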
Primer for the future
As Ulta Beauty looks ahead, it's taking stock of how far it has come in such a short time and will continue leveraging Anthos Service Mesh as a critical component. Over the next year, Michael and team will leverage Google Cloud’s multi-region capabilities to improve application availability and disaster recovery in the event of a cluster failure. Anthos and GKE seamlessly handle workload redistribution when infrastructure is unavailable. As the company continues innovating and deploying new services and experiences, Google will continue to manage and administer its environment as developers make the shopping experience as beautiful as its guests. Google Cloud is proud to partner with Ulta Beauty, supporting the company’s developers through thick, thin, and other eyebrow stages.