
BloombergGPT

A 50-billion parameter language model purpose-built for finance, trained on Bloomberg's proprietary financial data alongside general-purpose datasets.

Overview

BloombergGPT is a large language model built specifically for the financial domain. It was trained on a corpus of roughly 708 billion tokens, combining Bloomberg's proprietary financial data archive, FinPile (363 billion tokens), with 345 billion tokens of general-purpose text. The model outperforms existing open models of similar size on financial NLP benchmarks while remaining competitive on general-purpose language tasks, demonstrating the value of domain-specific training at scale.
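
As a rough sketch of how such a mixed corpus might be sampled during training, the snippet below interleaves batches from a financial stream and a general stream at a fixed ratio. The ~51/49 weighting reflects FinPile's share of the combined corpus; the batch readers are hypothetical placeholders, not Bloomberg's actual pipeline.

```python
import random

# Hypothetical batch readers; stand-ins for real dataset loaders.
def finpile_batches():
    """Yield batches of tokens drawn from the financial corpus (FinPile)."""
    while True:
        yield "<FinPile batch>"

def general_batches():
    """Yield batches of tokens drawn from the general-purpose corpus."""
    while True:
        yield "<general batch>"

def mixed_stream(fin_weight=0.51, seed=0):
    """Interleave the two corpora at a fixed sampling ratio.

    BloombergGPT's corpus was roughly half financial (FinPile, 363B
    tokens) and half general-purpose text (345B tokens), which is
    where the ~0.51 weight comes from.
    """
    rng = random.Random(seed)
    fin, gen = finpile_batches(), general_batches()
    while True:
        source = fin if rng.random() < fin_weight else gen
        yield next(source)

if __name__ == "__main__":
    stream = mixed_stream()
    for _ in range(5):
        print(next(stream))
```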

Parameters

50B

Training Data

~708B tokens (363B FinPile + 345B general corpus)

Architecture

Decoder-only transformer

FinPile Size

363B tokens of financial data

Training Duration

53 days on 512 A100 GPUs
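
The headline parameter count can be sanity-checked from the architecture hyperparameters published in the BloombergGPT paper (70 layers, hidden size 7,680, vocabulary of 131,072) using the standard 12·L·d² estimate for a decoder-only transformer; biases and layer norms are ignored, so this is an approximation rather than the exact count.

```python
# Back-of-the-envelope parameter count for BloombergGPT, using the
# hyperparameters reported in the paper (Wu et al., 2023).
n_layers = 70         # transformer blocks
d_model = 7680        # hidden size
vocab_size = 131_072  # Unigram tokenizer vocabulary

# Per block: ~4*d^2 for attention (Q, K, V, output projections)
# plus ~8*d^2 for a 4x-expansion MLP; biases/layer norms omitted.
per_layer = 12 * d_model**2
blocks = n_layers * per_layer      # ~49.5B
embeddings = vocab_size * d_model  # ~1.0B

total = blocks + embeddings
print(f"{total / 1e9:.1f}B parameters")  # ~50.6B, close to the published figure
```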

Capabilities

Financial sentiment analysis

Financial named entity recognition

News headline classification

Financial question answering

Financial document summarization
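
BloombergGPT has no public API, so these capabilities cannot be called directly; in the paper they are exercised through few-shot prompting. The sketch below shows what such a prompt looks like for the sentiment task, with `generate` as a hypothetical stand-in for whatever completion interface hosts the model, and invented headlines as the in-context examples.

```python
# Few-shot sentiment prompt in the style used to evaluate financial
# sentiment benchmarks. `generate` is a hypothetical completion
# function; BloombergGPT itself has no public endpoint.
FEW_SHOT_PROMPT = """\
Classify the sentiment of each headline as positive, negative, or neutral.

Headline: Acme Corp raises full-year guidance after record quarter.
Sentiment: positive

Headline: Regulators open probe into Foobar Bank's lending practices.
Sentiment: negative

Headline: {headline}
Sentiment:"""

def classify_sentiment(headline: str, generate) -> str:
    """Fill the headline into the prompt and return the model's label."""
    completion = generate(FEW_SHOT_PROMPT.format(headline=headline))
    # Take the first word of the completion as the predicted label.
    return completion.strip().split()[0].lower()
```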

Use Cases

Analyzing market sentiment from news articles and social media

Extracting financial entities from earnings reports and filings

Classifying financial news for real-time trading signals

Automating financial research report summarization
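
For extraction-style use cases such as pulling entities out of filings, a common pattern is to prompt for structured output and parse it downstream. A minimal sketch, again assuming a hypothetical `generate` completion function rather than any real Bloomberg interface:

```python
import json

# Doubled braces escape the literal JSON example inside str.format().
NER_PROMPT = """\
Extract the companies and ticker symbols mentioned in the passage.
Respond with a JSON list of {{"company": ..., "ticker": ...}} objects.

Passage: {passage}
Entities:"""

def extract_entities(passage: str, generate) -> list[dict]:
    """Prompt for JSON-formatted entities and parse the completion.

    Falls back to an empty list if the output is not valid JSON; a
    production system would add retries and schema validation.
    """
    completion = generate(NER_PROMPT.format(passage=passage))
    try:
        return json.loads(completion)
    except json.JSONDecodeError:
        return []
```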

Pros

  • Largest financial-domain LLM at the time of its release
  • Trained on Bloomberg's extensive proprietary financial data archive
  • Strong performance across the financial NLP benchmarks evaluated in the paper
  • Maintains competitive general-purpose language capabilities

Cons

  • Not publicly accessible or open-source
  • Cannot be deployed or fine-tuned by external organizations
  • Extremely expensive to train and replicate
  • Model weights and training data are proprietary

Pricing

Not publicly available. Access is limited to Bloomberg internal use and select research partnerships. Bloomberg Terminal subscribers may see integrated features.
