Evaluating GPT-4.5 for Financial Reasoning and Greek Financial Tasks

Introduction

GPT-4.5 is the latest upgrade in OpenAI’s GPT family, promising improved reasoning and better general performance across domains. To understand how well this general-purpose powerhouse handles domain-specific financial challenges, we evaluated GPT-4.5 on two of TheFinAI’s open leaderboards, one covering financial reasoning and the other Greek financial tasks.

The results show where GPT-4.5 is strong and where it falls short in the nuanced world of finance, and in particular how it fares against domain-specific models.

Financial Reasoning: Strong Math, but Troubled Reasoning

Analysis

What This Tells Us

👉 GPT-4.5’s general reasoning is strong, but it struggles with the specialized reasoning and formatting rules found in financial documents, particularly long-form documents and structured numeric data.

Greek Financial Tasks: Strong QA, but Outperformed by Specialized Models

Analysis

What This Tells Us

👉 GPT-4.5’s general understanding and factual reasoning are top-tier, but it lacks the fine-grained pattern recognition and financial format awareness needed for high accuracy in Greek financial documents.
👉 Specialized models like plutus-8B, trained specifically on Greek financial texts, retain a substantial edge on numeric entity recognition and overall average performance.