Google DeepMind · General LLM

Gemini

Google's natively multimodal AI model family designed to understand and reason across text, images, audio, video, and code.

Overview

Gemini is Google DeepMind's most capable AI model family, built from the ground up to be natively multimodal. Unlike models that bolt together separate vision and language components, Gemini was trained jointly on text, images, audio, and video data. Gemini Ultra achieves state-of-the-art performance on numerous benchmarks, while Gemini Pro and Flash variants offer excellent performance-to-cost ratios. The model powers Google's AI features across Search, Workspace, and the Gemini chatbot.

Models

Ultra, Pro, Flash (1.5 family)

Context Window

Up to 1M tokens (1.5 Pro)

Modality

Text, image, audio, video (native)

Architecture

Natively multimodal transformer

API Availability

Google AI Studio, Vertex AI

Capabilities

Native multimodal reasoning across text, image, audio, and video

Advanced mathematical and scientific reasoning

Code generation and understanding

Long-context processing up to 1M tokens (Gemini 1.5 Pro)

Real-time conversational AI

Grounded responses with Google Search integration

Use Cases

Building multimodal AI applications processing diverse media types

Analyzing video content with natural language queries

Processing extremely long documents with the 1M token context

Integrating AI capabilities into Google Workspace workflows

Pros

  • +Natively multimodal with best-in-class video understanding
  • +1M token context window is the largest commercially available
  • +Competitive pricing especially for Flash variants
  • +Deep integration with Google ecosystem and services

Cons

  • -Closed-source with Google Cloud vendor lock-in considerations
  • -Availability and features can vary by region
  • -Ultra model access is more restricted than competitors
  • -Google's data practices may concern privacy-sensitive organizations

Pricing

Gemini 1.5 Pro: $1.25/1M input, $5/1M output (under 128K). Gemini 1.5 Flash: $0.075/1M input, $0.30/1M output. Free tier available.

Related Models