The best open-source AI models: All your free-to-use options explained

Here are the best open-source and free-to-use AI models for text, images, and audio, organized by type, application, and licensing considerations.
Written by Jason Perlow, Senior Contributing Writer

Generative AI (Gen AI) has advanced significantly since its public launch two years ago. The technology has led to transformative applications that can create text, images, and other media with impressive accuracy and creativity. 

Open-source generative models are valuable for developers, researchers, and organizations wanting to leverage cutting-edge AI technology without incurring high licensing fees or restrictive commercial policies. Let's find out more.

Open-source vs. proprietary models

Open-source AI models offer several advantages, including customization, transparency, and community-driven innovation. These models allow users to tailor them to specific needs and benefit from ongoing enhancements. Additionally, they typically come with licenses that permit both commercial and non-commercial use, which enhances their accessibility and adaptability across various applications.

However, open-source solutions are not always the best choice. In industries that demand strict regulatory compliance, data privacy, and specialized support, proprietary models often perform better. They provide stronger legal frameworks, dedicated customer support, and optimizations tailored to industry requirements. Closed-source solutions may also excel in highly specialized tasks, thanks to exclusive features designed for high performance and reliability.

When organizations require real-time updates, advanced security, or specialized functionality, proprietary models can offer a more robust and secure solution, trading some openness for stricter guarantees of quality and accountability.

The Open Source AI Definition

The Open Source Initiative (OSI) recently introduced the Open Source AI Definition (OSAID) to clarify what qualifies as genuinely open-source AI. To meet OSAID standards, a model must be fully transparent in its design and training data, enabling users to recreate, adapt, and use it freely. 

However, some popular models, including Meta's LLaMA and Stability AI's Stable Diffusion, have licensing restrictions or lack transparency around training data, preventing full compliance with OSAID.

As part of the OSAID validation process, OSI assessed the following:

  • Compliant models: Pythia (EleutherAI), OLMo (AI2), Amber and CrystalCoder (LLM360), and T5 (Google).
  • Potentially compliant models: Bloom (BigScience), Starcoder2 (BigCode), and Falcon (TII) could meet OSAID standards with minor adjustments to licensing terms or transparency.
  • Non-compliant models: LLaMA (Meta), Grok (X/Twitter), Phi (Microsoft), and Mixtral (Mistral) lack the necessary transparency or impose restrictive licensing terms.

The OSAID has sparked notable dissent among prominent open-source community members. Because it diverges from the traditional open-source definition used for software, its relevance and impact on open-source generative AI models have stirred intense debate across community forums, including the Open Source Definition's bulletin boards (an alternative organization to the OSI), developer mailing lists, and public platforms like LinkedIn.

LLaMA and other non-compliant architectures

The Meta LLaMA architecture exemplifies noncompliance with OSAID due to its restrictive research-only license and lack of full transparency about training data, limiting commercial use and reproducibility. Derived models, such as the Vicuna Team's MiniGPT-4, inherit these restrictions, propagating LLaMA's noncompliance across additional projects.

Beyond LLaMA-based models, other widely used architectures face similar issues. For example, Stable Diffusion by Stability AI employs the Creative ML OpenRAIL-M license, which includes ethical restrictions that deviate from OSAID's requirements for unrestricted use. Similarly, Grok by xAI combines proprietary elements with usage limitations, challenging its alignment with open-source ideals.

These examples underscore the difficulty of meeting OSAID's standards, as many AI developers balance open access with commercial and ethical considerations.

Implications for organizations: OSAID compliance vs. non-compliance

Choosing OSAID-compliant models gives organizations transparency, legal security, and full customizability, all of which are essential for responsible and flexible AI use. These compliant models adhere to ethical practices and benefit from strong community support, promoting collaborative development.

In contrast, non-compliant models may limit adaptability and rely more heavily on proprietary resources. For organizations that prioritize flexibility and alignment with open-source values, OSAID-compliant models are advantageous. However, non-compliant models can still be valuable when proprietary features are required.

Understanding licensing in open-source AI models

Open-source AI models are released under licenses that define usage, modification, and sharing conditions. While some licenses align with traditional open-source standards, others incorporate restrictions or ethical guidelines that prevent full OSAID compliance. Key licenses include:

  • Apache 2.0: A permissive license that allows free use, modification, and distribution, along with a patent grant. Apache 2.0 is OSI-approved and popular for open-source projects, providing flexibility and legal protection.
  • MIT: Another permissive license that only requires attribution for reuse. Like Apache 2.0, MIT is OSI-approved, widely adopted, and offers simplicity and minimal restrictions.
  • Creative ML OpenRAIL-M: A license designed for AI applications, allowing broad use but imposing ethical guidelines to prevent harmful use. OpenRAIL-M is not OSI-approved because it includes usage restrictions that conflict with the OSI's principles of unrestricted freedom. However, it is valued by developers aiming to prioritize ethical use in AI.
  • CC BY-SA: The Creative Commons Share-Alike license permits free use and requires derivative works to remain open source. While it encourages open collaboration, it's not OSI-approved and is more commonly used for content rather than code, as it lacks some flexibility for software applications.
  • CC BY-NC 4.0: A Creative Commons license that permits free use with attribution but restricts commercial applications. This license, used for certain model weights (like Meta's MusicGen and AudioGen), limits the models' usability in commercial environments and does not align with OSI's open-source standards.
  • Custom licenses: Many models on our list, such as IBM's Granite and Nvidia's NeMo, operate under proprietary or custom licenses. These models often impose specific conditions for use or modify traditional open-source terms to align with commercial goals, making them non-compliant with open-source principles.
  • Research-only licenses: Certain models, such as Meta's LLaMA and Codellama series, are available only under research-use terms. These licenses restrict use to academic or non-commercial purposes and prevent broad community-driven projects, as they do not meet OSI's open-source criteria.
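As a rough illustration, the license list above can be encoded as a small lookup table for screening models before a legal review. This is a minimal sketch, not legal advice: the `LicenseInfo` class and `usable_commercially` helper are hypothetical names, and the commercial-use flags for OpenRAIL-M and CC BY-SA reflect a general reading of those licenses rather than anything stated explicitly in the list. Custom licenses are deliberately left out because their terms vary by vendor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenseInfo:
    osi_approved: bool
    commercial_use: bool
    note: str


# Transcribed from the list above. The commercial_use values for
# OpenRAIL-M and CC BY-SA are assumptions (both generally allow it,
# subject to their own conditions).
LICENSES = {
    "Apache 2.0": LicenseInfo(True, True, "permissive, includes a patent grant"),
    "MIT": LicenseInfo(True, True, "permissive, attribution only"),
    "Creative ML OpenRAIL-M": LicenseInfo(False, True, "use-based ethical restrictions"),
    "CC BY-SA": LicenseInfo(False, True, "share-alike, aimed at content rather than code"),
    "CC BY-NC 4.0": LicenseInfo(False, False, "non-commercial only"),
    "Research-only": LicenseInfo(False, False, "academic or non-commercial use"),
}


def usable_commercially(license_name: str) -> Optional[bool]:
    """Return True/False for known licenses, None when the terms must be read manually."""
    info = LICENSES.get(license_name)
    return None if info is None else info.commercial_use


if __name__ == "__main__":
    for name in ("MIT", "CC BY-NC 4.0", "Custom"):
        print(name, "->", usable_commercially(name))
```

Returning `None` for unknown or custom licenses keeps the helper from guessing: anything outside the table still needs a human to read the actual license text.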

Requirements for running open-source AI models

Running open-source Gen AI models requires specific hardware, software environments, and toolsets for model training, fine-tuning, and deployment tasks. High-performance models with billions of parameters benefit from powerful GPU setups like Nvidia's A100 or H100. 
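To see why multi-billion-parameter models push toward data-center GPUs, a back-of-the-envelope estimate helps: the memory needed for the weights alone is roughly the parameter count multiplied by the bytes used per parameter. The sketch below assumes an 80 GB card (A100 and H100 are both sold in 80 GB variants) and ignores activations, the KV cache, and optimizer state, so real requirements are higher. The function names are illustrative.

```python
import math

# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for the weights only, in GB (10^9 bytes)."""
    # (params_billions * 1e9 params) * bytes / 1e9 bytes-per-GB
    return params_billions * BYTES_PER_PARAM[precision]


def gpus_needed(params_billions: float, precision: str = "fp16",
                gpu_memory_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined memory holds the weights.

    gpu_memory_gb defaults to 80, an assumption matching the 80 GB
    A100/H100 variants. Weights only: no activations or KV cache.
    """
    return math.ceil(weight_memory_gb(params_billions, precision) / gpu_memory_gb)


if __name__ == "__main__":
    for size in (7, 70, 176):
        print(f"{size}B fp16: {weight_memory_gb(size):.0f} GB, "
              f"at least {gpus_needed(size)} x 80 GB GPU(s)")
```

The same arithmetic shows why quantization matters for local use: at 4-bit precision, a 7B model's weights shrink from about 14 GB to about 3.5 GB.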

Essential environments typically include Python and machine learning libraries like PyTorch or TensorFlow. Specialized toolsets, including Hugging Face's Transformers library and Nvidia's NeMo, simplify the processes of fine-tuning and deployment. Docker helps maintain consistent environments across different systems, while Ollama allows for the local execution of large language models on compatible systems. 

The following chart highlights essential toolsets, recommended hardware, and their specific functions for managing open-source AI models:

Toolset | Purpose | Requirements | Use
Python | Primary programming environment | N/A | Essential for scripting and configuring models
PyTorch | Model training and inference | GPU (e.g., Nvidia A100, H100) | Widely used library for deep learning models
TensorFlow | Model training and inference | GPU (e.g., Nvidia A100, H100) | Alternative deep learning library
Hugging Face Transformers | Model deployment and fine-tuning | GPU (preferred) | Library for accessing, fine-tuning, and deploying models
Nvidia NeMo | Multimodal model support and deployment | Nvidia GPUs | Optimized for Nvidia hardware and multimodal tasks
Docker | Environment consistency and deployment | Supports GPUs | Containerizes models for easy deployment
Ollama | Running large language models locally | macOS, Linux, Windows, supports GPUs | Platform to run LLMs locally on compatible systems
LangChain | Building applications with LLMs | Python 3.7+ | Framework for composing and deploying LLM-powered applications
LlamaIndex | Connecting LLMs with external data sources | Python 3.7+ | Framework for integrating LLMs with data sources


This setup establishes a robust framework for efficiently managing Gen AI models, from experimentation to production-ready deployment. Each toolset has its own strengths, enabling developers to tailor their environments to specific project needs.

Choosing the right model

Selecting the right Gen AI model depends on several factors, including licensing requirements, desired performance, and specific functionality. While larger models tend to deliver higher accuracy and flexibility, they require substantial computational resources. Smaller models, on the other hand, are more suitable for resource-constrained applications and devices.
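The selection factors above can be expressed as a simple filter. The following sketch shortlists models by license and by the smallest listed parameter size, using a handful of rows from the language model table in this article. The `Model` class, the `shortlist` function, and the `osaid` labels ("potential", "non-compliant", "unlisted") are illustrative choices, not part of any library.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Model:
    name: str
    sizes_b: Tuple[float, ...]  # listed parameter sizes, in billions
    license: str
    osaid: str  # "potential", "non-compliant", or "unlisted" in this article


# A few rows copied from the language model table; not a complete catalog.
CATALOG = [
    Model("BigScience BLOOM", (176,), "OpenRAIL-M", "potential"),
    Model("TII Falcon", (7, 40), "Apache 2.0", "potential"),
    Model("EleutherAI GPT-J", (6,), "Apache 2.0", "unlisted"),
    Model("EleutherAI GPT-NeoX", (20,), "MIT", "unlisted"),
    Model("Mistral AI Mistral 7B", (7,), "Apache 2.0", "unlisted"),
    Model("xAI Grok-1", (314,), "Apache 2.0", "non-compliant"),
]


def shortlist(catalog: Iterable[Model], max_params_b: float,
              allowed_licenses: Set[str]) -> List[Model]:
    """Keep models with an allowed license and at least one size within budget."""
    return [
        m for m in catalog
        if m.license in allowed_licenses and min(m.sizes_b) <= max_params_b
    ]


if __name__ == "__main__":
    for m in shortlist(CATALOG, max_params_b=10, allowed_licenses={"Apache 2.0", "MIT"}):
        print(m.name, m.sizes_b, m.license, m.osaid)
```

In practice the size budget would come from the available hardware and the license set from an organization's legal policy; accuracy benchmarks would then rank whatever survives the filter.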

It's important to note that most models listed here, even those with traditionally open-source licenses like Apache 2.0 or MIT, do not meet the Open Source AI Definition (OSAID). This gap is primarily due to restrictions around training data transparency and usage limitations, which OSAID emphasizes as essential for true open-source AI. However, certain models, such as Bloom and Falcon, show potential for compliance with minor adjustments to their licenses or transparency protocols and may achieve full compliance over time.

The tables below provide an organized overview of the leading open-source generative AI models, categorized by type, issuer, and functionality, to help you choose the best option for your needs, whether a fully transparent, community-driven model or a high-performance tool with specific features and licensing requirements.

Language models

Language models are crucial in text-based applications such as chatbots, content creation, translation, and summarization. They are fundamental to natural language processing (NLP) and continually improve their understanding of language structure and context. 

Notable models include Meta's LLaMA, EleutherAI's GPT-NeoX, and Nvidia's NVLM 1.0 family, known respectively for their strengths in multilingual, large-scale, and multimodal tasks.

Issuer & Model | Parameter Sizes | License | Highlights
Google T5 | Small to XXL | Apache 2.0 | High-performance language model, OSAID Compliant
EleutherAI Pythia | Various | Apache 2.0 | Interpretability-focused, OSAID Compliant
Allen Institute for AI (AI2) OLMo | Various | Apache 2.0 | Open language research model, OSAID Compliant
BigScience BLOOM | 176B | OpenRAIL-M | Multilingual, responsible AI, OSAID Potential
BigCode Starcoder2 | Various | Apache 2.0 | Code generation, OSAID Potential
TII Falcon | 7B, 40B | Apache 2.0 | Efficient and high-performance, OSAID Potential
AI21 Labs Jamba Series | Mini to Large | Custom | Language and chat generation
AI Singapore Sea-Lion | 7B | Custom | Language and cultural representation
Alibaba Qwen Series | 7B | Custom | Bilingual model (Chinese, English)
Databricks Dolly 2.0 | 12B | CC BY-SA 3.0 | Open dataset, commercial use
EleutherAI GPT-J | 6B | Apache 2.0 | General-purpose language model
EleutherAI GPT-NeoX | 20B | MIT | Large-scale text generation
Google Gemma 2 | 2B, 9B, 27B | Apache 2.0 | Language and code generation
IBM Granite Series | 3B, 8B | Apache 2.0 | Summarization, classification, RAG
Meta LLaMA 3.2 | 1B to 405B | Research-only | Advanced NLP, multilingual
Microsoft Phi-3 Series | Mini to Medium | MIT | Reasoning, cost-effective
Mistral AI Mixtral 8x22B | 8x22B | Apache 2.0 | Sparse model, efficient reasoning
Mistral AI Mistral 7B | 7B | Apache 2.0 | Dense, multilingual text generation
Nvidia NVLM 1.0 Family | 72B | CC BY-SA 3.0 | High-performance multimodal LLM
Rakuten RakutenAI Series | 7B | Custom | Multilingual chat, NLP
xAI Grok-1 | 314B | Apache 2.0 | Large-scale language model


Image generation models

Image generation models create high-quality visuals or artwork from text prompts, which makes them invaluable for content creators, designers, and marketers. 

Stability AI's Stable Diffusion is widely adopted due to its flexibility and output quality, while DeepFloyd's IF emphasizes generating realistic visuals with an understanding of language.

Issuer & Model | Parameter Sizes | License | Highlights
Stability AI Stable Diffusion 3.5 | 2.5B to 8B | OpenRAIL-M | High-quality image synthesis
DeepFloyd IF | 400M to 4.3B | Custom | Realistic visuals with language comprehension
OpenAI DALL-E 3 | Not disclosed | Custom | State-of-the-art text-to-image synthesis
Google Imagen | Not disclosed | Custom | High-fidelity image generation from text
Midjourney | Not disclosed | Custom | Artistic and stylized image generation
Adobe Firefly | Not disclosed | Custom | Integrated AI image generation within Adobe products


Vision models

Vision models analyze images and videos, supporting object detection, segmentation, and visual generation from text prompts. 

These technologies benefit several industries, including healthcare, autonomous vehicles, and media.

Issuer & Model | Parameter Sizes | License | Highlights
Meta SAM 2.1 | 38.9M to 224.4M | Apache 2.0 | Video editing, segmentation
NVIDIA Consistency | Not disclosed | Custom | Character consistency across video frames
NVIDIA VISTA-3D | Not disclosed | Custom | Medical imaging, anatomical segmentation
NVIDIA NV-DINOv2 | Not disclosed | Non-commercial | Image embedding generation
Google DeepLab | Not disclosed | Apache 2.0 | High-quality semantic image segmentation
Microsoft Florence | 0.23B, 0.77B | MIT | General-purpose visual model for computer vision
OpenAI CLIP | 400M | MIT | Text and image comprehension


Audio models

Audio models process and generate audio data, enabling speech recognition, text-to-speech synthesis, music composition, and audio enhancement.

Issuer & Model | Sizes | License | Highlights
Coqui.ai TTS | N/A | MPL 2.0 | Text-to-speech synthesis, multi-language support
ESPnet ESPnet | N/A | Apache 2.0 | End-to-end speech processing toolkit
Facebook AI wav2vec 2.0 | Base (95M), Large (317M) | Apache 2.0 | Self-supervised speech recognition
Hugging Face Transformers (Speech Models) | Various | Apache 2.0 | Collection of ASR and TTS models
Magenta MusicVAE | N/A | Apache 2.0 | Music generation and interpolation
Meta MusicGen | N/A | MIT / CC BY-NC 4.0 | Music generation from text prompts
Meta AudioGen | N/A | MIT / CC BY-NC 4.0 | Sound effect generation from text prompts
Meta EnCodec | N/A | MIT / CC BY-NC 4.0 | High-quality audio compression
Mozilla DeepSpeech | N/A | MPL 2.0 | End-to-end speech-to-text engine
NVIDIA NeMo (Speech Models) | Various | Apache 2.0 | ASR and TTS models optimized for Nvidia GPUs
OpenAI Jukebox | N/A | MIT | Neural music generation with genre/artist conditioning
OpenAI Whisper | 39M to 1.6B | MIT | Multilingual speech recognition and transcription
TensorFlow TFLite Speech Models | N/A | Apache 2.0 | Speech recognition models optimized for mobile devices


Multimodal models

Multimodal models combine text, images, audio, and other data types to create content from various inputs. 

These models are effective in applications requiring language, visual, and sensory understanding.

Model Name | Parameter Sizes | License | Highlights
Allen Institute for AI (AI2) Molmo | 1B, 70B | Apache 2.0 | A multimodal AI model that processes text and visual inputs, OSAID-compliant
Meta ImageBind | N/A | Custom | Integrates six data types: text, images, audio, depth, thermal, and IMU.
Meta SeamlessM4T | N/A | Custom | Provides multilingual translation and transcription services.
Meta Spirit LM | N/A | Custom | Combines text and speech to produce natural-sounding outputs.
Microsoft Florence-2 | 0.23B, 0.77B | MIT | Handles computer vision and language tasks proficiently.
NVIDIA VILA | N/A | Custom | Processes vision-language tasks effectively.
OpenAI CLIP | 400M | MIT | Excels in text and image comprehension.
Vicuna Team MiniGPT-4 | 13B | Apache 2.0 | Capable of understanding both text and images.


Retrieval-augmented generation (RAG)

RAG models merge generative AI with information retrieval, allowing them to incorporate relevant data from extensive datasets into their responses.

Issuer & Model | Parameter Sizes | License | Highlights
BAAI BGE-M3 | N/A | Custom | Dense and sparse retrieval optimization
IBM Granite 3.0 Series | 3B, 8B | Apache 2.0 | Advanced retrieval, summarization, RAG
Nvidia EmbedQA & ReRankQA | 1B | Custom | Multilingual QA, GPU-accelerated retrieval
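To make the retrieve-then-generate idea concrete, here is a toy sketch of the retrieval half using only the Python standard library. It ranks documents by bag-of-words cosine similarity and pastes the best matches into a prompt. This is a deliberate simplification: the models in the table above use learned embeddings rather than word counts, and the function names and sample documents are invented for illustration. The resulting prompt would then be passed to whichever language model is in use.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Combine retrieved context and the question into one prompt string."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Sample documents, invented for the example.
DOCS = [
    "Apache 2.0 is a permissive license that includes a patent grant.",
    "Whisper is a multilingual speech recognition model.",
    "CC BY-NC 4.0 restricts commercial applications.",
]

if __name__ == "__main__":
    print(build_prompt("Which license includes a patent grant?", DOCS, k=1))
```

Swapping `cosine` over word counts for similarity over dense embeddings is the main step from this toy to a production retriever; the prompt-assembly step stays essentially the same.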


Specialized models

Specialized models are optimized for specific fields, such as programming, scientific research, and healthcare, offering enhanced functionality tailored to their domains.

Issuer & Model | Parameter Sizes | License | Highlights
Meta Codellama Series | 7B, 13B, 34B | Custom | Code generation, multilingual programming
IBM Granite (Specialized Models) | 3B, 8B, 20B, 34B | Apache 2.0 | Code generation, time series, geospatial
Mistral AI Mamba-Codestral | 7B | Apache 2.0 | Focused on coding and multilingual capabilities
Mistral AI Mathstral | 7B | Apache 2.0 | Specialized in mathematical reasoning


Guardrail models

Guardrail models ensure safe and responsible outputs by detecting and mitigating biases, inappropriate content, and harmful responses.

Issuer & Model | Parameter Sizes | License | Highlights
NVIDIA NeMo Guardrails | N/A | Apache 2.0 | Open-source toolkit for adding programmable guardrails
Google ShieldGemma | 2B, 9B, 27B | Custom | Safety classifier models built on Gemma 2
IBM Granite-Guardian | 8B | Apache 2.0 | Detects unethical or harmful content
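A common way to apply a guardrail model is to wrap the main model: screen the prompt, generate a response, then screen the response before returning it. The sketch below shows only that control flow. The `guarded_generate` function is a hypothetical name, and `toy_is_unsafe` and `toy_generate` are stand-ins for a real safety classifier and a real language model, so none of this reflects the API of the toolkits listed above.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."

# Placeholder term list for the toy classifier; a real guardrail model
# would score the text instead of matching keywords.
BLOCKED_TERMS = {"forbidden"}


def guarded_generate(prompt: str,
                     generate: Callable[[str], str],
                     is_unsafe: Callable[[str], bool],
                     refusal: str = REFUSAL) -> str:
    """Check the prompt, generate, then check the response."""
    if is_unsafe(prompt):
        return refusal
    response = generate(prompt)
    return refusal if is_unsafe(response) else response


def toy_is_unsafe(text: str) -> bool:
    """Stand-in classifier: flags text containing any blocked term."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def toy_generate(prompt: str) -> str:
    """Stand-in model: simply echoes the prompt."""
    return f"Echo: {prompt}"


if __name__ == "__main__":
    print(guarded_generate("hello", toy_generate, toy_is_unsafe))
    print(guarded_generate("something forbidden", toy_generate, toy_is_unsafe))
```

Passing the classifier and generator in as callables keeps the wrapper independent of any particular model, so a hosted classifier or a local one can be swapped in without changing the flow.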


Choose open-source models

The landscape of generative AI is evolving rapidly, with open-source models crucial for making advanced technology accessible to all. These models allow for customization and collaboration, breaking down barriers that have limited AI development to large corporations.

Developers can tailor solutions to their needs by choosing open-source Gen AI, contributing to a global community, and accelerating technological progress. The variety of available models -- from language and vision to safety-focused designs -- ensures options for almost any application.

Supporting open-source AI communities will be essential for promoting ethical and innovative AI developments, benefiting individual projects, and advancing technology responsibly.
