Focus on agility, ethical integrity and value delivery amid rapid AI evolution.
By Mike Fang | February 12, 2025
As the large language model (LLM) market rapidly evolves, with disruptors like DeepSeek the latest to intensify competition and upend cost assumptions, users still need to evaluate LLMs on business impact: assessing them for scalability, efficiency and long-term value.
Evaluating LLMs solely on cost can lead to misalignment with long-term business objectives. Several critical factors should guide your assessments.
In mid-2024, when DeepSeek, ByteDance and others first announced steep reductions in their pricing, Gartner predicted that by 2027 the average price of GenAI APIs would fall to less than 1% of its mid-2024 level, while those APIs maintained the same quality, throughput and latency.
However, we maintain that this decline in AI inference costs (the costs of using a trained AI model to generate output) has little immediate impact on enterprises with on-premises GenAI solutions, largely because of limited deployment options, early phases of adoption and existing cost structures.
For cloud-based AI, API cost is just one factor in the total cost of ownership (TCO), which also includes:
AI software and tools
AI-ready data and governance
Infrastructure and cloud computing costs
AI services and skilled labor
Security and regulatory compliance measures
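As a rough illustration, the components above can be rolled up into a single annual TCO figure, which also shows how small a share API charges can be. The component names and dollar amounts below are hypothetical placeholders, not benchmarks:

```python
# Illustrative TCO roll-up for a cloud-based LLM deployment.
# All cost components and dollar figures are hypothetical, not Gartner data.

def annual_tco(components: dict[str, float]) -> float:
    """Sum annual cost components into a single TCO figure."""
    return sum(components.values())

costs = {
    "api_usage": 120_000,              # per-token API charges
    "software_and_tools": 40_000,      # AI software and tooling
    "data_and_governance": 60_000,     # AI-ready data and governance
    "infrastructure": 80_000,          # cloud compute and storage
    "services_and_labor": 200_000,     # AI services and skilled staff
    "security_and_compliance": 50_000, # security and regulatory measures
}

total = annual_tco(costs)
api_share = costs["api_usage"] / total
print(f"TCO: ${total:,.0f}; API share: {api_share:.0%}")
```

Even with made-up numbers, the point holds: a steep drop in API pricing moves only one line of the budget, so total cost of ownership falls far less than the headline price cut suggests.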
Recommendations for evaluating LLMs given volatile costs:
Prioritize AI investments based on value delivery, risk and total cost structures.
Incorporate security, governance and regulatory compliance into AI cost planning.
Assess model effectiveness beyond cost, considering quality, throughput and latency.
As GenAI API costs decline, organizations should reassess AI deployment strategies, balancing cloud-based and on-premises models. Cloud AI offers scalability, agility and integration with existing AI ecosystems, while on-premises solutions may be preferable for regulatory compliance, security and specialized infrastructure needs.
Recommendations for evaluating use of cloud-based LLMs:
Align AI deployment with business priorities, balancing cloud and on-premises trade-offs.
Evaluate cloud adoption for AI use cases while ensuring compliance with data security policies.
Consider hybrid models that leverage both cloud and on-premises infrastructure for flexibility.
Assess LLMs across three key criteria: model type, performance and cost-efficiency.
Model type
General-purpose LLMs: Versatile models (e.g., GPT-4 Turbo) for content generation, summarization and conversational AI
Domain-specific LLMs: Tailored for industry-specific applications (e.g., finance, healthcare) with specialized capabilities
Performance metrics
Combine industry benchmarks with custom evaluation metrics, including:
Accuracy and groundedness — fact-based responses and precision
Relevance and recall — alignment with business needs
Safety and bias detection — identifying and mitigating risks in outputs
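One simple way to combine benchmark and custom metrics is a weighted scorecard. The metric names, scores and weights below are illustrative assumptions, not an industry standard:

```python
# Hypothetical weighted scorecard for one candidate LLM.
# Metric scores (0-1) and weights are illustrative assumptions.

def weighted_score(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of 0-1 metric scores."""
    total_weight = sum(weights.values())
    return sum(metrics[m] * weights[m] for m in metrics) / total_weight

candidate = {
    "accuracy": 0.90,      # fact-based correctness
    "groundedness": 0.85,  # responses supported by sources
    "relevance": 0.80,     # alignment with the business task
    "safety": 0.95,        # low rate of harmful or biased outputs
}
weights = {"accuracy": 0.3, "groundedness": 0.3, "relevance": 0.2, "safety": 0.2}

score = weighted_score(candidate, weights)
print(f"Composite score: {score:.3f}")
```

The weights are where business priorities enter: a regulated-industry deployment might weight safety and groundedness far more heavily than relevance.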
Cost considerations
Beyond API pricing, account for:
Fine-tuning and model adaptation costs
AI governance, security and compliance expenses
Talent and infrastructure investment
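Performance and total cost can then be combined into a rough cost-efficiency ranking. The model names, scores and annual costs below are hypothetical:

```python
# Illustrative cost-efficiency comparison: quality score per unit of total cost.
# Model names, composite scores and annual costs are hypothetical.

models = {
    "model_a": {"score": 0.88, "annual_cost": 500_000},
    "model_b": {"score": 0.80, "annual_cost": 250_000},
}

def efficiency(m: dict[str, float]) -> float:
    """Composite score delivered per $100k of annual total cost."""
    return m["score"] / m["annual_cost"] * 100_000

ranked = sorted(models, key=lambda name: efficiency(models[name]), reverse=True)
print(ranked)
```

Here the slightly weaker model wins on value per dollar, which is the kind of trade-off a cost-only or quality-only comparison would miss.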
DeepSeek is one of several Chinese developers claiming to deliver LLM-based AI APIs at a fraction of the cost charged by U.S. developers, without sacrificing performance. Other low-price Chinese LLM APIs include those from ByteDance, Alibaba, Baidu and Tencent. These models upend the cost assumptions of established AI providers but raise concerns about content filtering that varies across religious and cultural topics.
AI inference costs refer to the expenses associated with running trained AI models in production, covering compute power, energy consumption and infrastructure overhead. These costs impact scalability, efficiency and cloud expenses, making cost-optimized AI deployment critical for long-term success.
The future of AI is dynamic and expansive. Trends such as domain-specific models, synthetic data, and AI-driven automation are reshaping industries. Our Top Technology Trends 2025 report offers a detailed view of these emerging patterns and their implications.