The most general and capable AI models we've ever built.
Our most flexible models yet
Each Gemini model is built for its own set of use cases, making a versatile model family that runs efficiently on everything from data centers to on-device deployments.
Project Astra
Project Astra explores the future of AI assistants. Building on our Gemini models, we’ve developed AI agents that can quickly process multimodal information, reason about the context you’re in, and respond to questions at a conversational pace, making interactions feel much more natural.
The demo shows two continuous takes: one with the prototype running on a Google Pixel phone and another on a prototype glasses device.
Natively multimodal
Gemini models are built from the ground up for multimodality, seamlessly combining and understanding text, code, images, audio, and video.
Longer context
1.5 Pro and 1.5 Flash both have a default context window of up to one million tokens, the longest context window of any large-scale foundation model. They achieve near-perfect recall on long-context retrieval tasks across modalities, which makes it possible to process long documents, thousands of lines of code, hours of audio and video, and more. For 1.5 Pro, developers and enterprise customers can also sign up to try a two-million-token context window.
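As a rough illustration of what a one-million-token window means in practice, the sketch below checks whether a set of documents is likely to fit. The four-characters-per-token ratio and the output reserve are assumptions made for this example, not figures from this page; exact counts depend on the model's tokenizer, so treat the result as a first estimate only.

```python
# Default context window stated in the text above.
CONTEXT_WINDOW_TOKENS = 1_000_000

# ASSUMPTION: a crude heuristic for English prose, not an official ratio.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` by ceiling division on its length."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts, reserve_for_output=8_192, window=CONTEXT_WINDOW_TOKENS):
    """Return True if the estimated prompt size plus an output reserve fits.

    `reserve_for_output` is a hypothetical allowance for the model's reply.
    """
    used = sum(estimate_tokens(t) for t in texts)
    return used + reserve_for_output <= window


if __name__ == "__main__":
    docs = ["a" * 400_000, "b" * 1_200_000]  # about 400k estimated tokens
    print(fits_in_context(docs))
```

For the two-million-token window mentioned for 1.5 Pro, the same check applies with `window=2_000_000`.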
Research
Relentless innovation
Our research team is continually exploring new ideas at the frontier of AI, building innovative products that show consistent progress on a range of benchmarks.
Capability | Benchmark | Description | Gemini 1.5 Flash-8B (Oct 2024) | Gemini 1.5 Flash (May 2024) | Gemini 1.5 Flash (Sep 2024) | Gemini 1.5 Pro (May 2024) | Gemini 1.5 Pro (Sep 2024) |
---|---|---|---|---|---|---|---|
General | MMLU-Pro | Enhanced version of popular MMLU dataset with questions across multiple subjects with higher difficulty tasks | 58.7% | 59.1% | 67.3% | 69.0% | 75.8% |
Code | Natural2Code | Code generation across Python, Java, C++, JS, Go. Held-out dataset, HumanEval-like, not leaked on the web | 75.5% | 77.2% | 79.8% | 82.6% | 85.4% |
Math | MATH | Challenging math problems (incl. algebra, geometry, pre-calculus, and others) | 58.7% | 54.9% | 77.9% | 67.7% | 86.5% |
Math | HiddenMath | Competition-level math problems. Held-out dataset, AIME/AMC-like, crafted by experts and not leaked on the web | 32.8% | 20.3% | 47.2% | 28.0% | 52.0% |
Reasoning | GPQA (diamond) | Challenging dataset of questions written by domain experts in biology, physics, and chemistry | 38.4% | 41.4% | 51.0% | 46.0% | 59.1% |
Multilingual | WMT23 | Language translation | 72.6 | 74.1 | 73.9 | 75.3 | 75.1 |
Long Context | MRCR (1M) | Diagnostic long-context understanding evaluation | 54.7% | 70.1% | 71.9% | 70.5% | 82.6% |
Image | MMMU | Multi-discipline college-level multimodal understanding and reasoning problems | 53.7% | 56.1% | 62.3% | 62.2% | 65.9% |
Image | Vibe-Eval (Reka) | Visual understanding in chat models with challenging everyday examples. Evaluated with a Gemini Flash model as a rater | 40.9% | 44.8% | 48.9% | 48.9% | 53.9% |
Image | MathVista | Mathematical reasoning in visual contexts | 54.7% | 58.4% | 65.8% | 63.9% | 68.1% |
Audio | FLEURS (55 lang) | Automatic speech recognition (based on word error rate, lower is better) | 13.6% | 9.8% | 9.6% | 6.5% | 6.7% |
Video | Video-MME | Video analysis across multiple domains | 66.2% | 74.7% | 76.1% | 77.9% | 78.6% |
Safety | XSTest | Measures how often models refuse to respond to safe/benign prompts. The score represents how frequently models correctly fulfill requests | 92.6% | 86.9% | 97.0% | 88.4% | 98.8% |
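To make the May-to-September movement in the benchmark figures easier to read, the following sketch tallies the change per benchmark for 1.5 Flash and 1.5 Pro. The scores are copied from the benchmark figures above for a subset of rows; the function names are illustrative only. Note that not every row moves upward: the WMT23 scores dip slightly for both models.

```python
# (May 2024, Sep 2024) scores copied from the benchmark figures above.
# Higher is better for every benchmark included here.
SCORES = {
    "MMLU-Pro": {"Flash": (59.1, 67.3), "Pro": (69.0, 75.8)},
    "Natural2Code": {"Flash": (77.2, 79.8), "Pro": (82.6, 85.4)},
    "MATH": {"Flash": (54.9, 77.9), "Pro": (67.7, 86.5)},
    "HiddenMath": {"Flash": (20.3, 47.2), "Pro": (28.0, 52.0)},
    "GPQA (diamond)": {"Flash": (41.4, 51.0), "Pro": (46.0, 59.1)},
    "WMT23": {"Flash": (74.1, 73.9), "Pro": (75.3, 75.1)},
}


def gains(scores):
    """Return {benchmark: {model: Sep score minus May score}}, rounded to 0.1."""
    return {
        bench: {model: round(sep - may, 1) for model, (may, sep) in models.items()}
        for bench, models in scores.items()
    }


def largest_gain(scores, model):
    """Return the benchmark with the biggest May-to-Sep gain for `model`."""
    deltas = gains(scores)
    return max(deltas, key=lambda bench: deltas[bench][model])


if __name__ == "__main__":
    for bench, delta in gains(SCORES).items():
        print(f"{bench:15s} Flash {delta['Flash']:+.1f}  Pro {delta['Pro']:+.1f}")
    print("Largest Flash gain:", largest_gain(SCORES, "Flash"))
```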
Technical reports
For developers
Build with Gemini
Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI.
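As a minimal sketch of what an integration might look like, the example below builds a text-generation request for the Gemini API's REST interface using only the Python standard library. The endpoint path, the `v1beta` version, the `gemini-1.5-flash` model id, and the response shape are assumptions based on the public API at the time of writing rather than anything stated on this page, so check them against the current API reference before relying on them. The official SDKs are likely a better fit for production use.

```python
import json
import urllib.request

# ASSUMPTION: base URL and API version; verify against the current API reference.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt, api_key, model="gemini-1.5-flash"):
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_BASE}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def generate(prompt, api_key, model="gemini-1.5-flash"):
    """Send the request and return the first candidate's text.

    Requires network access and a valid API key (for example, one created
    in Google AI Studio). The response layout is an assumption.
    """
    with urllib.request.urlopen(build_request(prompt, api_key, model)) as resp:
        data = json.load(resp)
    return data["candidates"][0]["content"]["parts"][0]["text"]
```

Separating `build_request` from `generate` keeps the payload inspectable without making a network call, which is convenient for testing.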
Try the models
Get started
Example prompts for the Gemini API in Google AI Studio.
Responsibility at the core
Our models undergo extensive ethics and safety tests, including adversarial testing for bias and toxicity.
Hands-on
Serving billions of Google users
Gemini models are embedded in a range of Google experiences.
What's new
- Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more (Technologies)
  We’re releasing two updated production-ready Gemini models
- Gemini breaks new ground: a faster model, longer context and AI agents (Technologies)
  We’re introducing a series of updates across the Gemini family of models, including the new 1.5 Flash, our lightweight model for speed and efficiency, and Project Astra, our vision for the future...
- Our next-generation model: Gemini 1.5 (Technologies)
  The model delivers dramatically enhanced performance, with a breakthrough in long-context understanding across modalities.
- The next chapter of our Gemini era (Technologies)
  We're bringing Gemini to more Google products