Claude 3.5 Sonnet: Elevating AI Coding and Tool Use
Claude 3.5 Sonnet has received a significant upgrade, delivering across-the-board improvements over its predecessor. The largest gains are in coding, a domain where it already led the field. The model performs strongly on industry benchmarks:
SWE-bench Verified: Claude 3.5 Sonnet improved from 33.4% to 49.0%, outperforming all publicly available models, including specialized systems designed for agentic coding.
TAU-bench: On this agentic tool-use benchmark, the model improved to 69.2% in the retail domain and 46.0% in the more challenging airline domain.
These advancements come without any increase in cost or decrease in speed compared to the previous version, making Claude 3.5 Sonnet a powerful and efficient tool for developers and organizations.
Early Adopters Praise Claude 3.5 Sonnet:
GitLab tested the model for DevSecOps tasks and noted stronger reasoning abilities with no added latency.
Cognition, which uses the model for autonomous AI evaluations, reported substantial improvements in coding, planning, and problem-solving.
The Browser Company found Claude 3.5 Sonnet outperformed every model they had previously tested for automating web-based workflows.
Claude 3.5 Haiku: Speed Meets State-of-the-Art Performance
Introducing Claude 3.5 Haiku, the next generation of our fastest model. It surpasses its predecessor, Claude 3 Haiku, across all skill sets and even matches the performance of our prior largest model, Claude 3 Opus, on many intelligence benchmarks, all at the same cost and similar speed.
Notable achievements include:
SWE-bench Verified: Scoring 40.6%, it outperforms many agents using state-of-the-art models, including the original Claude 3.5 Sonnet and GPT-4o.
With low latency, improved instruction following, and enhanced tool use, Claude 3.5 Haiku is ideal for user-facing products, specialized sub-agent tasks, and generating personalized experiences from large data volumes.
Availability: Claude 3.5 Haiku will be available later this month via our first-party API, Amazon Bedrock, and Google Cloud’s Vertex AI. Initially, it will be a text-only model, with image input capabilities to follow.
Where to try it out
The new version of Claude 3.5 Sonnet is already available on Survo Chat.