The 2025 playbook for AI enterprise success, from reps to evals




2025 is shaping up to be a pivotal year for enterprise AI. The past year brought rapid innovation, and this year will bring more of the same. That makes it more important than ever to revisit your AI strategy to stay competitive and create value for your customers. From scaling AI agents to controlling costs, here are the five critical areas enterprises should prioritize in their AI strategy this year.

1. Agents: the next generation of automation

AI agents are no longer theoretical. In 2025, they are essential tools for enterprises looking to streamline operations and improve customer interactions. Unlike traditional software, agents powered by large language models (LLMs) can make advanced decisions, manage complex multi-step tasks, and quickly integrate with tools and APIs.
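The multi-step, tool-using loop described above can be sketched in a few lines. This is a minimal illustration, not any vendor's API: `call_llm` is a stubbed stand-in for a real model call, and the tool names and arguments are hypothetical.

```python
# Tools the agent may call. A real deployment would register live APIs here.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def call_llm(messages):
    """Stand-in for a real LLM API call. A production agent would send
    `messages` to a model that returns either a tool call or a final answer."""
    last = messages[-1]["content"]
    if "weather" in last and "observation" not in last:
        return {"tool": "get_weather", "args": {"city": "Paris"}}
    return {"answer": "It looks sunny in Paris today."}

def run_agent(user_request, max_steps=5):
    """Loop: ask the model, dispatch any tool it requests, feed the result
    back, and stop when the model produces a final answer."""
    messages = [{"role": "user", "content": user_request}]
    for _ in range(max_steps):
        decision = call_llm(messages)
        if "answer" in decision:                 # model is done
            return decision["answer"]
        tool = TOOLS[decision["tool"]]           # dispatch the tool call
        observation = tool(**decision["args"])
        messages.append({"role": "tool",
                         "content": f"observation: {observation}"})
    return "Step limit reached."

print(run_agent("What is the weather in Paris?"))
```

The step cap matters in practice: it keeps a confused agent from looping on tool calls indefinitely.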

At the beginning of 2024, agents were not ready for prime time, making frustrating mistakes such as hallucinating URLs. They began to improve as the large frontier models themselves improved.

“Let me put it this way,” said Sam Witteveen, co-founder of Red Dragon, a company that develops agents for companies, who recently reviewed the 48 agents he built last year. “Interestingly, the ones we built at the beginning of the year, a lot of those worked much better at the end of the year just because the models got better.” Witteveen shared this in the video podcast we filmed to discuss these five big trends in detail.

Models are getting better and hallucinating less, and they are also being trained to perform agentic tasks. Another approach model providers are exploring is using an LLM as a judge: as models become cheaper (something we will unpack below), companies can run three or more models on a task and use a judge to pick the best output before moving forward.
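The LLM-as-judge pattern above can be sketched as "best of n": generate candidates from several models, score each with a judge, keep the winner. The model and judge functions here are stubs standing in for real API calls; a real judge would be a strong LLM scoring candidates against a rubric.

```python
# Stub "models": in practice these would be API calls to different LLMs.
def model_a(prompt): return "Paris is the capital of France."
def model_b(prompt): return "The capital is Paris, France's largest city."
def model_c(prompt): return "I am not sure."

def judge(prompt, candidate):
    """Stand-in judge. A real system would ask a strong LLM to rate the
    candidate (say, 1-10) against the prompt and a rubric."""
    score = 5 if "Paris" in candidate else 0   # rewards the key fact
    score += min(len(candidate) // 20, 3)      # mildly rewards detail
    return score

def best_of_n(prompt, models):
    """Run every model on the prompt and return the highest-scoring answer."""
    candidates = [m(prompt) for m in models]
    return max(candidates, key=lambda c: judge(prompt, c))

answer = best_of_n("What is the capital of France?", [model_a, model_b, model_c])
```

The economics only work because candidate generation has gotten cheap; the judge call is the fixed overhead you pay for quality.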

Another part of the secret sauce? Retrieval-augmented generation (RAG), which allows agents to store and reuse knowledge efficiently, is getting better. Imagine a travel-agent bot that not only plans trips but books flights and hotels in real time based on updated preferences and budgets.
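Stripped to its core, RAG is retrieve-then-prompt. The sketch below uses naive word overlap in place of the embedding search a real pipeline would use, and the travel documents are invented for illustration.

```python
# Toy knowledge base; a real system would use a vector store with embeddings.
DOCS = [
    "Flight AF101 departs Boston for Paris daily at 18:30.",
    "Hotel Lumiere in Paris offers rooms from $180 per night.",
    "Visa rules: US citizens can visit France for 90 days without a visa.",
]

def retrieve(query, docs, k=2):
    """Rank documents by word overlap with the query (embedding similarity
    in a real RAG pipeline) and return the top k."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query):
    """Ground the model: inject retrieved context ahead of the question."""
    context = "\n".join(retrieve(query, DOCS))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("Which hotel in Paris fits a $200 budget?")
```

The "answer using only this context" framing is what curbs hallucination: the model is steered toward facts you supplied rather than whatever it memorized.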

Takeaway: Businesses need to identify use cases where agents can deliver high ROI, whether in customer service, sales, or internal workflows. Tool use and advanced reasoning capabilities will define the winners in this field.

2. Evals: the foundation of reliable AI

Evaluations, or “evals,” are the backbone of any robust AI deployment. This is the process of choosing which LLM, among the hundreds now available, to use for your task. It matters for accuracy, but also for aligning AI outputs with enterprise goals. A good eval ensures that a chatbot understands tone, a recommendation system provides relevant options, and a predictive model avoids costly mistakes.

For example, a company's evaluation for a customer support chatbot might include metrics for average resolution time, response accuracy, and customer satisfaction scores.
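An eval like the one just described is ultimately a small scoring harness over logged sessions. The field names and thresholds below are illustrative assumptions, not a standard schema.

```python
# Logged chatbot sessions; the field names here are illustrative.
sessions = [
    {"resolution_min": 4.0, "correct": True,  "csat": 5},
    {"resolution_min": 9.5, "correct": True,  "csat": 4},
    {"resolution_min": 2.5, "correct": False, "csat": 2},
]

def evaluate(sessions):
    """Compute the three metrics the eval tracks: average resolution time,
    response accuracy, and customer-satisfaction score."""
    n = len(sessions)
    return {
        "avg_resolution_min": sum(s["resolution_min"] for s in sessions) / n,
        "accuracy": sum(s["correct"] for s in sessions) / n,
        "avg_csat": sum(s["csat"] for s in sessions) / n,
    }

report = evaluate(sessions)
# Gate deployment on thresholds aligned with business goals (values assumed).
passed = report["accuracy"] >= 0.6 and report["avg_csat"] >= 3.5
```

Running the same harness against each candidate LLM turns model selection from a gut call into a comparison of numbers.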

Many companies have been investing heavily in massaging inputs and outputs to match company expectations and workflows, but this can take a lot of time and resources. As the models themselves improve, many companies are saving that effort by relying more on the models to do the job, which makes choosing the right one even more important.

And this process forces clear communication and better decisions. When you get “much more aware of how you evaluate the outcome of something and what you really want, that doesn't just make you better at LLMs and AI, it actually makes you better with people,” Witteveen said. “When you can clearly tell a person: This is what I want, this is what I want it to look like, this is what I'm going to expect in it. When you get very specific about that, suddenly people perform much better.”

Witteveen noted that company managers and other developers tell him: “Oh, you know, I've gotten a lot better at leading my team just from getting good at prompt engineering, or just getting good at, you know, writing the right evals for models.”

By writing clear evals, businesses force themselves to clarify goals, a win for both people and machines.

Takeaway: Creating high-quality evals is essential. Start with clear criteria: response accuracy, resolution time, and alignment with business goals. This ensures that your AI not only performs but aligns with your brand values.

3. Cost efficiency: scaling AI without breaking the bank

AI is becoming cheaper, but strategic use is still important. Improvements at every level of the LLM stack are driving significant cost reductions. Intense competition among LLM providers, and from open-source alternatives, leads to regular price cuts.

At the same time, post-training software techniques make LLMs more efficient.

Competition from new hardware vendors, such as Groq's LPUs, along with continued advances by legacy GPU supplier Nvidia, is significantly reducing inference costs, making AI accessible for more use cases.

The real gains come from optimizing how models are put to work in applications, at inference time, rather than at training time, when models are first built from data. Techniques such as model distillation, along with hardware innovations, mean that companies can achieve more with less. It's no longer a question of whether you can afford AI (you can run most projects far more cheaply this year than even six months ago) but of how to scale it.
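The cost-effectiveness analysis the takeaway calls for often starts as simple per-token arithmetic. The prices, model names, and traffic numbers below are hypothetical; substitute your provider's real rates.

```python
# Hypothetical prices per 1M tokens: (input $, output $). Not real rates.
PRICE = {
    "large-model":     (5.00, 15.00),
    "distilled-model": (0.15, 0.60),
}

def monthly_cost(model, requests, in_tok, out_tok):
    """Dollars per month for a workload of `requests` calls, each with
    `in_tok` input and `out_tok` output tokens."""
    p_in, p_out = PRICE[model]
    return requests * (in_tok * p_in + out_tok * p_out) / 1_000_000

# Assumed workload: 1M requests/month, 1,000 input and 300 output tokens each.
big   = monthly_cost("large-model",     1_000_000, 1000, 300)
small = monthly_cost("distilled-model", 1_000_000, 1000, 300)
print(f"large: ${big:,.0f}  distilled: ${small:,.0f}  "
      f"savings: {1 - small/big:.0%}")
```

Under these assumed numbers the distilled model cuts the bill by well over 90 percent, which is why the distillation-plus-eval combination matters: the eval tells you whether the cheaper model is still good enough.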

Takeaway: Conduct a cost-effectiveness analysis for your AI projects. Compare hardware options and explore methods such as model distillation to cut costs without compromising performance.

4. Memory personalization: tailoring AI to your users

Personalization is no longer optional – it's expected. In 2025, memory-enabled AI systems make this a reality. By remembering user preferences and past interactions, AI can deliver more tailored and efficient experiences.

Memory personalization is not widely or openly discussed, because users often feel uneasy about AI applications storing personal information to improve service. There are privacy concerns, and an ick factor when a model spews answers showing it knows a lot about you: for example, how many kids you have, what you do for a living, and what your personal tastes are. OpenAI, for one, stores ChatGPT user information in its memory system, which can be turned off and deleted, although it is on by default.

While businesses using OpenAI's and other providers' models cannot access that same stored information, they can build their own memory systems using RAG, ensuring that data is both secure and useful. Enterprises must tread carefully, however, balancing personalization with privacy.
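An enterprise-side memory system can be as simple as a per-user fact store injected into the prompt, with deletion built in for the opt-out path. This is a minimal sketch with invented user data; production systems would add encryption, retention policies, and retrieval over the stored facts.

```python
# Per-user memory keyed by user id, held on the enterprise's own side
# rather than in the model provider's memory feature.
memory = {}

def remember(user_id, fact):
    """Record one preference or fact about a user."""
    memory.setdefault(user_id, []).append(fact)

def forget(user_id):
    """Delete everything about a user, supporting opt-out and privacy."""
    memory.pop(user_id, None)

def personalized_prompt(user_id, question):
    """Inject stored preferences ahead of the question, RAG-style."""
    facts = memory.get(user_id, [])
    context = "\n".join(f"- {f}" for f in facts) or "- (no stored preferences)"
    return f"Known user preferences:\n{context}\n\nQuestion: {question}"

remember("u42", "prefers aisle seats")
remember("u42", "budget under $200/night")
prompt = personalized_prompt("u42", "Book me a Paris trip.")
```

Making `forget` a first-class operation, not an afterthought, is what turns memory from a privacy liability into an opt-in feature users can trust.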

Takeaway: Develop a clear strategy for memory personalization. Opt-in systems and transparent policies can build trust while delivering value.

5. Inference and test-time compute: a new frontier of efficiency and reasoning

Inference is where AI meets the real world. In 2025, the focus is on making this process faster, cheaper, and more powerful. Chain-of-thought reasoning, where models break tasks down into logical steps, is changing how enterprises approach complex problems. AI can now effectively handle tasks that require deeper reasoning, such as strategic planning.
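One concrete way to spend more compute at test time is self-consistency: sample several chain-of-thought runs and majority-vote the final answer. The sketch below stubs the model with a function that occasionally "slips" on the last step, which is exactly the failure mode voting absorbs; the arithmetic problem and slip rate are invented for illustration.

```python
import random
from collections import Counter

def sample_chain(question, rng):
    """Stand-in for sampling one chain-of-thought from an LLM at nonzero
    temperature. Most chains reach the right total; a few slip."""
    steps = ["12 units at $30 = $360", "plus $40 shipping"]
    answer = "$400" if rng.random() > 0.2 else "$390"   # occasional slip
    return steps, answer

def self_consistency(question, n=15, seed=0):
    """Spend more test-time compute: sample n chains, majority-vote the
    final answers, and return the most common one."""
    rng = random.Random(seed)
    votes = Counter(sample_chain(question, rng)[1] for _ in range(n))
    return votes.most_common(1)[0][0]

final = self_consistency("What is the total order cost?")
```

This is the trade the section describes in miniature: n model calls instead of one, bought in exchange for fewer reasoning errors.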

For example, OpenAI's o3-mini model is expected to be released later this month, followed by the full o3 model. These models introduce advanced reasoning capabilities that break complex problems into manageable chunks, thereby reducing hallucinations and improving decision-making accuracy. The reasoning improvements show up in areas such as math, coding, and science, where more deliberate thinking helps; in other areas, such as language synthesis, the gains may be limited.

However, these improvements also come with increased compute demands, and therefore higher operating costs. The o3-mini is intended as a compromise offering that contains costs while maintaining high performance.

Takeaway: Identify the workflows that benefit from advanced reasoning techniques. Applying chain-of-thought reasoning steps specific to your own company, and choosing optimized models, will give you an edge here.

Conclusion: Turning vision into action

AI in 2025 isn't just about adopting new tools; it's about making strategic choices. Whether it's deploying agents, refining evals, or scaling cost-effectively, the path to success lies in thoughtful implementation. Businesses should embrace these trends with a clear, focused strategy.

For more on these trends, check out the full video podcast between Sam Witteveen and myself here:


