
The best open-source AI models: All your free-to-use options explained

Here are the best open-source and free-to-use AI models for text, images, and audio, organized by type, application, and licensing considerations.

Written by Jason Perlow, Senior Contributing Writer Nov. 6, 2024 at 3:00 a.m. PT

Generative AI (Gen AI) has advanced significantly since its public launch two years ago. The technology has led to transformative applications that can create text, images, and other media with impressive accuracy and creativity.


Open-source generative models are valuable for developers, researchers, and organizations wanting to leverage cutting-edge AI technology without incurring high licensing fees or restrictive commercial policies. Let's find out more.

Open-source vs. proprietary models

Open-source AI models offer several advantages, including customization, transparency, and community-driven innovation. These models allow users to tailor them to specific needs and benefit from ongoing enhancements. Additionally, they typically come with licenses that permit both commercial and non-commercial use, which enhances their accessibility and adaptability across various applications.


However, open-source solutions are not always the best choice. In industries that demand strict regulatory compliance, data privacy, and specialized support, proprietary models often perform better. They provide stronger legal frameworks, dedicated customer support, and optimizations tailored to industry requirements. Closed-source solutions may also excel in highly specialized tasks, thanks to exclusive features designed for high performance and reliability.

When organizations require real-time updates, advanced security, or specialized functionalities, proprietary models can offer a more robust and secure solution, effectively balancing openness with the rigorous demands for quality and accountability.

The Open Source AI Definition

The Open Source Initiative (OSI) recently introduced the Open Source AI Definition (OSAID) to clarify what qualifies as genuinely open-source AI. To meet OSAID standards, a model must be fully transparent in its design and training data, enabling users to recreate, adapt, and use it freely.


However, some popular models, including Meta's LLaMA and Stability AI's Stable Diffusion, have licensing restrictions or lack transparency around training data, preventing full compliance with OSAID.

As part of the OSAID validation process, OSI assessed the following:

Compliant models: Pythia (Eleuther AI), OLMo (AI2), Amber and CrystalCoder (LLM360), and T5 (Google).
Potentially compliant models: Bloom (BigScience), Starcoder2 (BigCode), and Falcon (TII) could meet OSAID standards with minor adjustments to licensing terms or transparency.
Non-compliant models: LLaMA (Meta), Grok (X/Twitter), Phi (Microsoft), and Mixtral (Mistral) lack the necessary transparency or impose restrictive licensing terms.

The OSAID has sparked notable dissent among prominent open-source community members. Because it diverges from the traditional open-source definition used for software, its relevance and impact on open-source generative AI models have stirred intense debate across community forums, including the Open Source Definition's bulletin boards (an alternative organization to the OSI), developer mailing lists, and public platforms like LinkedIn.

LLaMA and other non-compliant architectures

The Meta LLaMA architecture exemplifies noncompliance with OSAID due to its restrictive research-only license and lack of full transparency about training data, limiting commercial use and reproducibility. Derived models, like Mistral's Mixtral and the Vicuna Team's MiniGPT-4, inherit these restrictions, propagating LLaMA's noncompliance across additional projects.


Beyond LLaMA-based models, other widely used architectures face similar issues. For example, Stable Diffusion by Stability AI employs the Creative ML OpenRAIL-M license, which includes ethical restrictions that deviate from OSAID's requirements for unrestricted use. Similarly, Grok by xAI combines proprietary elements with usage limitations, challenging its alignment with open-source ideals.

These examples underscore the difficulty of meeting OSAID's standards, as many AI developers balance open access with commercial and ethical considerations.

Implications for organizations: OSAID compliance vs. non-compliance

Choosing OSAID-compliant models gives organizations transparency, legal security, and full customizability, all of which are essential for responsible and flexible AI use. These compliant models adhere to ethical practices and benefit from strong community support, promoting collaborative development.

In contrast, non-compliant models may limit adaptability and rely more heavily on proprietary resources. For organizations that prioritize flexibility and alignment with open-source values, OSAID-compliant models are advantageous. However, non-compliant models can still be valuable when proprietary features are required.

Understanding licensing in open-source AI models

Open-source AI models are released under licenses that define usage, modification, and sharing conditions. While some licenses align with traditional open-source standards, others incorporate restrictions or ethical guidelines that prevent full OSAID compliance. Key licenses include:

Apache 2.0: A permissive license that allows free use, modification, and distribution, along with a patent grant. Apache 2.0 is OSI-approved and popular for open-source projects, providing flexibility and legal protection.
MIT: Another permissive license that only requires attribution for reuse. Like Apache 2.0, MIT is OSI-approved, widely adopted, and offers simplicity and minimal restrictions.
Creative ML OpenRAIL-M: A license designed for AI applications, allowing broad use but imposing ethical guidelines to prevent harmful use. OpenRAIL-M is not OSI-approved because it includes usage restrictions that conflict with the OSI's principles of unrestricted freedom. However, it is valued by developers aiming to prioritize ethical use in AI.
CC BY-SA: The Creative Commons Share-Alike license permits free use and requires derivative works to remain open source. While it encourages open collaboration, it's not OSI-approved and is more commonly used for content rather than code, as it lacks some flexibility for software applications.
CC BY-NC 4.0: A Creative Commons license that permits free use with attribution but restricts commercial applications. This license, used for certain model weights (like Meta's MusicGen and AudioGen), limits the models' usability in commercial environments and does not align with OSI's open-source standards.
Custom licenses: Many models on our list, such as AI21 Labs' Jamba and Alibaba's Qwen series, operate under proprietary or custom licenses. These models often impose specific conditions for use or modify traditional open-source terms to align with commercial goals, making them non-compliant with open-source principles.
Research-only licenses: Certain models, such as Meta's LLaMA and Codellama series, are available only under research-use terms. These licenses restrict use to academic or non-commercial purposes and prevent broad community-driven projects, as they do not meet OSI's open-source criteria.
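
As a rough illustration, the license categories above can be encoded as a small lookup table, so a project can check a model's license before adopting it. This is a simplified sketch, not legal advice: the `LICENSES` table and the `can_use` helper are hypothetical names, and the flags only restate the descriptions in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseInfo:
    osi_approved: bool    # OSI-approved, per the descriptions above
    commercial_use: bool  # whether commercial use is permitted
    note: str


# Illustrative, not exhaustive. Custom licenses are deliberately left out
# because their terms differ from model to model.
LICENSES = {
    "Apache 2.0": LicenseInfo(True, True, "permissive, includes a patent grant"),
    "MIT": LicenseInfo(True, True, "permissive, attribution only"),
    "Creative ML OpenRAIL-M": LicenseInfo(False, True, "use-based ethical restrictions"),
    "CC BY-SA": LicenseInfo(False, True, "share-alike, aimed at content"),
    "CC BY-NC 4.0": LicenseInfo(False, False, "non-commercial only"),
    "Research-only": LicenseInfo(False, False, "academic or non-commercial use"),
}


def can_use(license_name: str, commercial: bool) -> bool:
    """Return True if the license, as summarized above, permits the intended use.

    Unknown or custom licenses return False so they get a manual review.
    """
    info = LICENSES.get(license_name)
    if info is None:
        return False
    return info.commercial_use or not commercial
```

Returning `False` for anything not in the table is a conservative choice: a custom license is neither approved nor rejected here, only flagged for a human to read.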

Requirements for running open-source AI models

Running open-source Gen AI models requires specific hardware, software environments, and toolsets for model training, fine-tuning, and deployment tasks. High-performance models with billions of parameters benefit from powerful GPU setups like Nvidia's A100 or H100.
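
To make the hardware requirement concrete, a common back-of-the-envelope estimate multiplies the parameter count by the bytes stored per parameter. The sketch below covers model weights only; activations, caches, and optimizer state during training add to the total. The function names are hypothetical, and the 80 GB default reflects the larger-memory A100 and H100 variants.

```python
import math

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, dtype: str = "fp16") -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[dtype]


def gpus_needed(params_billions: float, dtype: str = "fp16", gpu_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined memory holds the weights."""
    return math.ceil(weight_memory_gb(params_billions, dtype) / gpu_gb)


print(weight_memory_gb(7))   # 7B model in fp16 -> 14.0
print(gpus_needed(176))      # 176B model in fp16 -> 5
```

By this estimate, a 7B model fits comfortably on a single high-end GPU, while a 176B model needs several even before any working memory is counted.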


Essential environments typically include Python and machine learning libraries like PyTorch or TensorFlow. Specialized toolsets, including Hugging Face's Transformers library and Nvidia's NeMo, simplify the processes of fine-tuning and deployment. Docker helps maintain consistent environments across different systems, while Ollama allows for the local execution of large language models on compatible systems.

The following table highlights essential toolsets, recommended hardware, and their specific functions for managing open-source AI models:

Toolset	Purpose	Requirements	Use
Python	Primary programming environment	N/A	Essential for scripting and configuring models
PyTorch	Model training and inference	GPU (e.g., Nvidia A100, H100)	Widely used library for deep learning models
TensorFlow	Model training and inference	GPU (e.g., Nvidia A100, H100)	Alternative deep learning library
Hugging Face Transformers	Model deployment and fine-tuning	GPU (preferred)	Library for accessing, fine-tuning, and deploying models
Nvidia NeMo	Multimodal model support and deployment	Nvidia GPUs	Optimized for Nvidia hardware and multimodal tasks
Docker	Environment consistency and deployment	Supports GPUs	Containerizes models for easy deployment
Ollama	Running large language models locally	macOS, Linux, Windows, supports GPUs	Platform to run LLMs locally on compatible systems
LangChain	Building applications with LLMs	Python 3.7+	Framework for composing and deploying LLM-powered applications
LlamaIndex	Connecting LLMs with external data sources	Python 3.7+	Framework for integrating LLMs with data sources

This setup establishes a robust framework for efficiently managing Gen AI models, from experimentation to production-ready deployment. Each toolset has its own strengths, enabling developers to tailor their environments to specific project needs.
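
As one example of how these tools are used in practice, the sketch below sends a prompt to a model served locally by Ollama, using only the Python standard library. It assumes an Ollama server is already running on its default port (11434) and that a model has been pulled beforehand; the model name `llama3.2` and the helper names are placeholders chosen for illustration.

```python
import json
import urllib.request

# Default local endpoint of an Ollama server (assumed to be running).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("llama3.2", "Summarize the Apache 2.0 license in one sentence.")` would return the model's reply as a string, provided the server is running and the model is available locally.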

Choosing the right model

Selecting the right Gen AI model depends on several factors, including licensing requirements, desired performance, and specific functionality. While larger models tend to deliver higher accuracy and flexibility, they require substantial computational resources. Smaller models, on the other hand, are more suitable for resource-constrained applications and devices.


It's important to note that most models listed here, even those with traditionally open-source licenses like Apache 2.0 or MIT, do not meet the Open Source AI Definition (OSAID). This gap is primarily due to restrictions around training data transparency and usage limitations, which OSAID emphasizes as essential for true open-source AI. However, certain models, such as Bloom and Falcon, show potential for compliance with minor adjustments to their licenses or transparency protocols and may achieve full compliance over time.

The tables below provide an organized overview of the leading open-source generative AI models, categorized by type, issuer, and functionality, to help you choose the best option for your needs, whether a fully transparent, community-driven model or a high-performance tool with specific features and licensing requirements.

Language models

Language models are crucial in text-based applications such as chatbots, content creation, translation, and summarization. They are fundamental to natural language processing (NLP) and continually improve their understanding of language structure and context.

Notable models include Meta's LLaMA, EleutherAI's GPT-NeoX, and Nvidia's NVLM 1.0 family, each known for its strengths in multilingual, large-scale, and multimodal tasks, respectively.

Issuer & Model	Parameter Sizes	License	Highlights
Google T5	Small to XXL	Apache 2.0	High-performance language model, OSAID Compliant
EleutherAI Pythia	Various	Apache 2.0	Interpretability-focused, OSAID Compliant
Allen Institute for AI (AI2) OLMo	Various	Apache 2.0	Open language research model, OSAID Compliant
BigScience BLOOM	176B	OpenRAIL-M	Multilingual, responsible AI, OSAID Potential
BigCode Starcoder2	Various	Apache 2.0	Code generation, OSAID Potential
TII Falcon	7B, 40B	Apache 2.0	Efficient and high-performance, OSAID Potential
AI21 Labs Jamba Series	Mini to Large	Custom	Language and chat generation
AI Singapore Sea-Lion	7B	Custom	Language and cultural representation
Alibaba Qwen Series	7B	Custom	Bilingual model (Chinese, English)
Databricks Dolly 2.0	12B	CC BY-SA 3.0	Open dataset, commercial use
EleutherAI GPT-J	6B	Apache 2.0	General-purpose language model
EleutherAI GPT-NeoX	20B	MIT	Large-scale text generation
Google Gemma 2	2B, 9B, 27B	Apache 2.0	Language and code generation
IBM Granite Series	3B, 8B	Apache 2.0	Summarization, classification, RAG
Meta LLaMA 3.2	1B to 405B	Research-only	Advanced NLP, multilingual
Microsoft Phi-3 Series	Mini to Medium	MIT	Reasoning, cost-effective
Mistral AI Mixtral 8x22B	8x22B	Apache 2.0	Sparse model, efficient reasoning
Mistral AI Mistral 7B	7B	Apache 2.0	Dense, multilingual text generation
Nvidia NVLM 1.0 Family	72B	CC BY-SA 3.0	High-performance multimodal LLM
Rakuten RakutenAI Series	7B	Custom	Multilingual chat, NLP
xAI Grok-1	314B	Apache 2.0	Large-scale language model
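
The selection criteria discussed earlier, license and model size, lend themselves to a simple filter. The sketch below copies a handful of rows from the table above; the `Model` record and the `shortlist` function are hypothetical, and where a model ships in several sizes only the largest listed size is recorded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    max_params_b: Optional[float]  # largest listed size in billions; None if "Various"
    license: str


# A few rows taken from the table above.
MODELS = [
    Model("EleutherAI Pythia", None, "Apache 2.0"),
    Model("BigScience BLOOM", 176, "OpenRAIL-M"),
    Model("TII Falcon", 40, "Apache 2.0"),
    Model("EleutherAI GPT-NeoX", 20, "MIT"),
    Model("Mistral AI Mistral 7B", 7, "Apache 2.0"),
    Model("xAI Grok-1", 314, "Apache 2.0"),
]


def shortlist(models, licenses, max_params_b):
    """Keep models with an accepted license whose known size fits the budget."""
    return [
        m.name
        for m in models
        if m.license in licenses
        and m.max_params_b is not None
        and m.max_params_b <= max_params_b
    ]


print(shortlist(MODELS, {"Apache 2.0", "MIT"}, max_params_b=40))
# ['TII Falcon', 'EleutherAI GPT-NeoX', 'Mistral AI Mistral 7B']
```

Models listed with "Various" sizes are skipped rather than guessed at, so they would need to be checked by hand.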

Image generation models

Image generation models create high-quality visuals or artwork from text prompts, which makes them invaluable for content creators, designers, and marketers.

Stability AI's Stable Diffusion is widely adopted due to its flexibility and output quality, while DeepFloyd's IF emphasizes generating realistic visuals with an understanding of language.

Issuer & Model	Parameter Sizes	License	Highlights
Stability AI Stable Diffusion 3.5	2.5B to 8B	OpenRAIL-M	High-quality image synthesis
DeepFloyd IF	400M to 4.3B	Custom	Realistic visuals with language comprehension
OpenAI DALL-E 3	Not disclosed	Custom	State-of-the-art text-to-image synthesis
Google Imagen	Not disclosed	Custom	High-fidelity image generation from text
Midjourney	Not disclosed	Custom	Artistic and stylized image generation
Adobe Firefly	Not disclosed	Custom	Integrated AI image generation within Adobe products

Vision models

Vision models analyze images and videos, supporting object detection, segmentation, and visual generation from text prompts.


These technologies benefit several industries, including healthcare, autonomous vehicles, and media.

Issuer & Model	Parameter Sizes	License	Highlights
Meta SAM 2.1	38.9M to 224.4M	Apache 2.0	Video editing, segmentation
NVIDIA Consistency	Not disclosed	Custom	Character consistency across video frames
NVIDIA VISTA-3D	Not disclosed	Custom	Medical imaging, anatomical segmentation
NVIDIA NV-DINOv2	Not disclosed	Non-commercial	Image embedding generation
Google DeepLab	Not disclosed	Apache 2.0	High-quality semantic image segmentation
Microsoft Florence	0.23B, 0.77B	MIT	General-purpose visual model for computer vision
OpenAI CLIP	400M	MIT	Text and image comprehension

Audio models

Audio models process and generate audio data, enabling speech recognition, text-to-speech synthesis, music composition, and audio enhancement.

Issuer & Model	Parameter Sizes	License	Highlights
Coqui.ai TTS	N/A	MPL 2.0	Text-to-speech synthesis, multi-language support
ESPnet ESPnet	N/A	Apache 2.0	End-to-end speech processing toolkit
Facebook AI wav2vec 2.0	Base (95M), Large (317M)	Apache 2.0	Self-supervised speech recognition
Hugging Face Transformers (Speech Models)	Various	Apache 2.0	Collection of ASR and TTS models
Magenta MusicVAE	N/A	Apache 2.0	Music generation and interpolation
Meta MusicGen	N/A	MIT / CC BY-NC 4.0	Music generation from text prompts
Meta AudioGen	N/A	MIT / CC BY-NC 4.0	Sound effect generation from text prompts
Meta EnCodec	N/A	MIT / CC BY-NC 4.0	High-quality audio compression
Mozilla DeepSpeech	N/A	MPL 2.0	End-to-end speech-to-text engine
NVIDIA NeMo (Speech Models)	Various	Apache 2.0	ASR and TTS models optimized for Nvidia GPUs
OpenAI Jukebox	N/A	MIT	Neural music generation with genre/artist conditioning
OpenAI Whisper	39M to 1.6B	MIT	Multilingual speech recognition and transcription
TensorFlow TFLite Speech Models	N/A	Apache 2.0	Speech recognition models optimized for mobile devices

Multimodal models

Multimodal models combine text, images, audio, and other data types to create content from various inputs.


These models are effective in applications requiring language, visual, and sensory understanding.

Issuer & Model	Parameter Sizes	License	Highlights
Allen Institute for AI (AI2) Molmo	1B, 70B	Apache 2.0	A multimodal AI model that processes text and visual inputs, OSAID-compliant
Meta ImageBind	N/A	Custom	Integrates six data types: text, images, audio, depth, thermal, and IMU.
Meta SeamlessM4T	N/A	Custom	Provides multilingual translation and transcription services.
Meta Spirit LM	N/A	Custom	Combines text and speech to produce natural-sounding outputs.
Microsoft Florence-2	0.23B, 0.77B	MIT	Handles computer vision and language tasks proficiently.
NVIDIA VILA	N/A	Custom	Processes vision-language tasks effectively.
OpenAI CLIP	400M	MIT	Excels in text and image comprehension.
Vicuna Team MiniGPT-4	13B	Apache 2.0	Capable of understanding both text and images.

Retrieval-augmented generation (RAG)

RAG models merge generative AI with information retrieval, allowing them to incorporate relevant data from extensive datasets into their responses.

Issuer & Model	Parameter Sizes	License	Highlights
BAAI BGE-M3	N/A	Custom	Dense and sparse retrieval optimization
IBM Granite 3.0 Series	3B, 8B	Apache 2.0	Advanced retrieval, summarization, RAG
Nvidia EmbedQA & ReRankQA	1B	Custom	Multilingual QA, GPU-accelerated retrieval
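
The retrieve-then-generate pattern behind RAG can be sketched in a few lines. The toy example below is an assumption-heavy simplification: it ranks documents by bag-of-words cosine similarity, whereas the retrieval models in the table use learned embeddings, and it stops at building the augmented prompt rather than calling a generator. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 1) -> str:
    """Prepend the retrieved context to the question for a generator model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


DOCS = [
    "Apache 2.0 is a permissive license that includes a patent grant.",
    "CC BY-NC 4.0 restricts commercial applications.",
    "Whisper is a multilingual speech recognition model.",
]
print(build_prompt("Which license includes a patent grant?", DOCS))
```

The resulting prompt would then be passed to a language model, which answers using the retrieved context instead of relying only on what it learned in training.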

Specialized models

Specialized models are optimized for specific fields, such as programming, scientific research, and healthcare, offering enhanced functionality tailored to their domains.

Issuer & Model	Parameter Sizes	License	Highlights
Meta Codellama Series	7B, 13B, 34B	Custom	Code generation, multilingual programming
IBM Granite (Specialized Models)	3B, 8B, 20B, 34B	Apache 2.0	Code generation, time series, geospatial
Mistral AI Mamba-Codestral	7B	Apache 2.0	Focused on coding and multilingual capabilities
Mistral AI Mathstral	7B	Apache 2.0	Specialized in mathematical reasoning

Guardrail models

Guardrail models ensure safe and responsible outputs by detecting and mitigating biases, inappropriate content, and harmful responses.

Issuer & Model	Parameter Sizes	License	Highlights
NVIDIA NeMo Guardrails	N/A	Apache 2.0	Open-source toolkit for adding programmable guardrails
Google ShieldGemma	2B, 9B, 27B	Custom	Safety classifier models built on Gemma 2
IBM Granite-Guardian	8B	Apache 2.0	Detects unethical or harmful content

Choose open-source models

The landscape of generative AI is evolving rapidly, with open-source models crucial for making advanced technology accessible to all. These models allow for customization and collaboration, breaking down barriers that have limited AI development to large corporations.


By choosing open-source Gen AI, developers can tailor solutions to their needs, contribute to a global community, and accelerate technological progress. The variety of available models -- from language and vision to safety-focused designs -- ensures options for almost any application.

Supporting open-source AI communities will be essential for promoting ethical and innovative AI developments, benefiting individual projects, and advancing technology responsibly.
