OpenXLA Project

OpenXLA

An open ecosystem of performant, portable, and extensible machine learning (ML) infrastructure components that simplify ML development by defragmenting the tools between frontend frameworks and hardware backends. Built by industry leaders in AI modeling, software, and hardware.

Upcoming Events

Community Meeting 2024/12/17 @9AM PT

Notes/Google Meet Link

OpenXLA Dev Lab November 2024

Our next Dev Lab is taking place on November 14th in Sunnyvale, California.

Details

XLA

About

XLA (Accelerated Linear Algebra) is an open source compiler for machine learning. The XLA compiler takes models from popular frameworks such as PyTorch, TensorFlow, and JAX, and optimizes the models for high-performance execution across different hardware platforms including GPUs, CPUs, and ML accelerators.

Source Code

Installation/Usage

XLA comes prebuilt for many ML frameworks. For information on how to use XLA in these cases, see the documentation and individual framework pages.

JAX, TensorFlow, PyTorch
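
As a minimal sketch of what using the prebuilt XLA can look like, assuming a working JAX installation: wrapping a function in `jax.jit` asks JAX to compile it with the XLA that ships with the framework. The function `predict` and the array shapes are hypothetical and chosen only for illustration.

```python
import jax
import jax.numpy as jnp


def predict(w, b, x):
    # A toy dense layer; the name and shapes are illustrative only.
    return jnp.tanh(x @ w + b)


# jax.jit traces the function and compiles the traced computation with XLA.
predict_xla = jax.jit(predict)

w = jnp.ones((4, 3))
b = jnp.zeros((3,))
x = jnp.ones((2, 4))

# The first call triggers compilation; later calls with the same
# shapes and dtypes reuse the compiled executable.
y = predict_xla(w, b, x)
print(y.shape)
```

TensorFlow and PyTorch expose XLA through their own entry points, so the framework pages remain the reference for those cases.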

Documentation

The XLA documentation covers a number of basic and advanced topics, such as how to integrate a new PJRT plugin, implement a new XLA backend, and optimize XLA program runtime.

Documentation

StableHLO

About

StableHLO is an operation set for high-level operations (HLO) in machine learning (ML) models. Essentially, it's a portability layer between different ML frameworks and ML compilers: ML frameworks that produce StableHLO programs are compatible with ML compilers that consume StableHLO programs.

Source Code

Documentation

The StableHLO documentation covers a number of topics, such as the specification of the StableHLO OpSet, and how to export StableHLO graphs from common ML frameworks.

Documentation
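
One way to see a StableHLO program, sketched here under the assumption of a recent JAX release: lower a jitted function without running it and print the result as text. The function `add_and_scale` is a hypothetical example, and the exact textual output varies between JAX versions.

```python
import jax
import jax.numpy as jnp


def add_and_scale(x, y):
    # Illustrative computation: an elementwise add followed by a multiply.
    return 2.0 * (x + y)


x = jnp.ones((2, 2), dtype=jnp.float32)
y = jnp.ones((2, 2), dtype=jnp.float32)

# Lower the function for these argument shapes without executing it.
lowered = jax.jit(add_and_scale).lower(x, y)

# Request the StableHLO dialect explicitly rather than relying on the default.
stablehlo_text = lowered.as_text(dialect="stablehlo")
print(stablehlo_text)
```

The printed module is MLIR text containing operations such as `stablehlo.add`, which is the form a StableHLO-consuming compiler would take as input.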

Shardy

About

Shardy is an MLIR-based tensor partitioning system for all dialects. Built through a collaboration between the GSPMD and PartIR teams, it incorporates the best of both systems and the shared experience of both teams and their users.

Source Code

Documentation

The Shardy documentation covers sharding concepts, an overview of the dialect, and getting-started tutorials for using Shardy from JAX or integrating Shardy into a custom MLIR pipeline.

Documentation

PJRT

About

PJRT is a hardware- and framework-independent interface for ML compilers and runtimes. It is currently included with the XLA distribution. See the XLA GitHub and documentation for more information on how to use and integrate PJRT.

Documentation

Source Code

Community

Announcements/Discussions

Join the openxla-discuss mailing list to get news about releases, events and other major updates. This is also our primary channel for design and development discussions.

Join group

Discord

Join the OpenXLA Discord to participate in chats about XLA and StableHLO topics.

Open Discord

Meetings

Meetings are held monthly via Google Meet on the 2nd or 3rd Tuesday at 9AM PT. Please see the meeting document or openxla-discuss for specific dates and topics.

Recent meeting notes/Google Meet links

Archived meeting notes

YouTube

Stay up to date on all the latest news and announcements from the OpenXLA community.

Open YouTube

Contributing

We welcome contributions from the community. Please see our contributing guidelines for more information.

Contributing guidelines

Industry partners

The OpenXLA project is developed collaboratively by leading ML hardware and software organizations.

Alibaba

“At Alibaba, OpenXLA is leveraged by Elastic GPU Service customers for training and serving of large PyTorch models. We've seen significant performance improvements for customers using OpenXLA, notably speed-ups of 72% for GPT2 and 88% for Swin Transformer on NVIDIA GPUs. We're proud to be a founding member of the OpenXLA Project and work with the open-source community to develop an advanced ML compiler that delivers superior performance and user experience for Alibaba Cloud customers.” - Yangqing Jia, VP, AI and Data Analytics, Alibaba

Amazon Web Services

“We're excited to be a founding member of the OpenXLA Project, which will democratize access to performant, scalable, and extensible AI infrastructure as well as further collaboration within the open source community to drive innovation. At AWS, our customers scale their generative AI applications on AWS Trainium and Inferentia and our Neuron SDK relies on XLA to optimize ML models for high performance and best in class performance per watt. With a robust OpenXLA ecosystem, developers can continue innovating and delivering great performance with a sustainable ML infrastructure, and know that their code is portable to use on their choice of hardware.” - Nafea Bshara, Vice President and Distinguished Engineer, AWS

AMD

“We are excited about the future direction of OpenXLA on the broad family of AMD devices (CPUs, GPUs, AIE) and are proud to be part of this community. We value projects with open governance, flexible and broad applicability, cutting edge features and top-notch performance and are looking forward to the continued collaboration to expand open source ecosystem for ML developers.” - Alan Lee, Corporate Vice President, Software Development, AMD

Anyscale

“Anyscale develops open and scalable technologies like Ray to help AI practitioners develop their applications faster and make them available to more users. Recently we partnered with the ALPA project to use OpenXLA to show high-performance model training for Large Language models at scale. We are glad to participate in OpenXLA and excited how this open source effort enables running AI workloads on a wider variety of hardware platforms efficiently, thereby lowering the barrier of entry, reducing costs and advancing the field of AI faster.” - Philipp Moritz, CTO, Anyscale

Apple

Apple Inc. designs, manufactures and markets smartphones, personal computers, tablets, wearables and accessories, and sells a variety of related services.

Arm

“The OpenXLA Project marks an important milestone on the path to simplifying ML software development. We are fully supportive of the OpenXLA mission and look forward to leveraging the OpenXLA stability and standardization across the Arm® Neoverse™ hardware and software roadmaps.” - Peter Greenhalgh, Vice President of Technology and Fellow, Arm.

Cerebras

“At Cerebras, we build AI accelerators that are designed to make training even the largest AI models quick and easy. Our systems and software meet users where they are -- enabling rapid development, scaling, and iteration using standard ML frameworks without change. OpenXLA helps extend our user reach and accelerated time to solution by providing the Cerebras Wafer-Scale Engine with a common interface to higher level ML frameworks. We are tremendously excited to see the OpenXLA ecosystem available for even broader community engagement, contribution, and use on GitHub.” - Andy Hock, VP and Head of Product, Cerebras Systems

Google

“Open-source software gives everyone the opportunity to help create breakthroughs in AI. At Google, we're collaborating on the OpenXLA Project to further our commitment to open source and foster adoption of AI tooling that raises the standard for ML performance, addresses incompatibilities between frameworks and hardware, and is reconfigurable to address developers' tailored use cases. We're excited to develop these tools with the OpenXLA community so that developers can drive advancements across many different layers of the AI stack.” - Jeff Dean, Senior Fellow and SVP, Google Research and AI

Graphcore

“Our IPU compiler pipeline has used XLA since it was made public. Thanks to XLA's platform independence and stability, it provides an ideal frontend for bringing up novel silicon. XLA's flexibility has allowed us to expose our IPU's novel hardware features and achieve state of the art performance with multiple frameworks. Millions of queries a day are served by systems running code compiled by XLA. We are excited by the direction of OpenXLA and hope to continue contributing to the open source project. We believe that it will form a core component in the future of AI/ML.” - David Norman, Director of Software Design, Graphcore

Hugging Face

“Making it easy to run any model efficiently on any hardware is a deep technical challenge, and an important goal for our mission to democratize good machine learning. At Hugging Face, we enabled XLA for TensorFlow text generation models and achieved speed-ups of ~100x. Moreover, we collaborate closely with engineering teams at Intel, AWS, Habana, Graphcore, AMD, Qualcomm and Google, building open source bridges between frameworks and each silicon, to offer out of the box efficiency to end users through our Optimum library. OpenXLA promises standardized building blocks upon which we can build much needed interoperability, and we can't wait to follow and contribute!” - Morgan Funtowicz, Head of Machine Learning Optimization, Hugging Face

Intel

“At Intel, we believe in open, democratized access to AI. Intel CPUs, GPUs, Habana Gaudi accelerators, and oneAPI-powered AI software including OpenVINO, drive ML workloads everywhere from exascale supercomputers to major cloud deployments. Together with other OpenXLA members, we seek to support standards-based, componentized ML compiler tools that drive innovation across multiple frameworks and hardware environments to accelerate world-changing science and research.” - Greg Lavender, Intel SVP, CTO & GM of Software & Advanced Technology Group

Contact

For direct questions, contact the maintainers - maintainers at openxla.org