Early days for AI: Only 25% of enterprises have deployed, and few are seeing rewards




It is expected that 2025 will be the year AI gets real, delivering distinct, tangible benefits to the enterprise.

But according to a new State of AI Development report from AI development platform Vellum, we're not quite there yet: Only 25% of enterprises have put AI into production, and just a quarter of those have yet to see a measurable impact.

This seems to indicate that many initiatives have yet to identify feasible use cases for AI, keeping them (at least for now) in a pre-production holding pattern.

“This confirms that it’s still very early days, despite all the hype and discussion that has been going on,” Akash Sharma, CEO of Vellum, told VentureBeat. “There’s a lot of noise in the industry, new models and model providers coming out, new RAG techniques; we just wanted to get a ground-truth view of how companies are actually taking AI to production.”

Enterprises need to identify specific use cases to see success

Vellum surveyed more than 1,250 AI developers and builders to get a real sense of what’s happening in the AI trenches.

According to the report, most companies not yet in production are at various stages of their AI journeys: building strategies and proofs of concept (PoCs) (53%), beta testing (14%) and, at the earliest stage, talking to users and gathering requirements (7.9%).

In large part, enterprises are focused on building document parsing and analysis tools and customer service chatbots, according to Vellum. But they are also interested in applications including natural language analytics, content generation, recommendation systems, code generation, automation and search.

So far, developers report competitive advantage (31.6%), cost and time savings (27.1%) and higher user adoption rates (12.6%) as the biggest impacts they’ve seen. Interestingly, however, 24.2% have yet to see a tangible impact from their investments.

Sharma emphasized the importance of prioritizing use cases from the very beginning. “We've heard anecdotally from people that they just want to use AI for the sake of using AI,” he said. “There's an experimental budget associated with that.”

While this may make Wall Street and investors happy, it doesn’t mean AI is adding anything, he said; the real question is, “How do we find the right use cases?” “Usually, once companies are able to identify these use cases, bring them into production and see a clear ROI, they gain more momentum and get past the hype. That leads to more internal knowledge and more investment.”

OpenAI is still at the top, but there will be a variety of models in the future

When it comes to the models in use, OpenAI holds the lead (no surprise there), particularly with GPT-4o and GPT-4o mini. But Sharma pointed out that 2024 offered more choice, whether directly from model makers or through platforms such as Azure or AWS Bedrock. And providers hosting open-source models such as Llama 3.2 70B (including Groq, Fireworks AI and Together AI) are also gaining traction.

“Open-source models are getting better,” Sharma said. “OpenAI’s closed-source competitors are catching up in terms of quality.”

In the end, however, enterprises are not going to settle on a single model and leave it at that; they will increasingly adopt multi-model systems, he said.

“People will choose the best model for each task they have,” Sharma said. “When building an agent, you may have several prompts, and for each individual prompt the developer wants the best quality, the lowest cost and the lowest latency, and maybe that doesn’t come from OpenAI.”
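For illustration, per-prompt routing can be as simple as a lookup table mapping each task to a provider and model. This sketch is ours, not Vellum's: the routing table, model names, endpoint URLs and environment variables are assumptions, and the only real API used is the OpenAI Python SDK, whose client many hosted providers also accept via compatible endpoints.

```python
# Illustrative per-task model routing (hypothetical providers and models).
import os
from openai import OpenAI  # pip install openai

# Hypothetical routing table: pick whichever model best balances
# quality, cost and latency for each task type.
ROUTES = {
    "summarize": {"base_url": "https://api.groq.com/openai/v1",  # assumed OpenAI-compatible endpoint
                  "key_env": "GROQ_API_KEY", "model": "llama-3.1-70b-versatile"},
    "code":      {"base_url": None,  # None falls back to the default OpenAI endpoint
                  "key_env": "OPENAI_API_KEY", "model": "gpt-4o"},
}

def complete(task: str, prompt: str) -> str:
    route = ROUTES[task]
    client = OpenAI(base_url=route["base_url"], api_key=os.environ[route["key_env"]])
    resp = client.chat.completions.create(
        model=route["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(complete("summarize", "Summarize: AI adoption is still early."))
```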

Likewise, AI going forward will undoubtedly be multimodal, with Vellum seeing increased adoption of models that can handle multiple modalities. Text is the main use case, followed by files (PDFs or Word documents), images, audio and video.

Also, retrieval-augmented generation (RAG) is a go-to approach when it comes to retrieving information, and more than half of developers are using vector databases to simplify search. Popular open-source and proprietary options include Pinecone, MongoDB, Qdrant, Elasticsearch, pgvector, Weaviate and Chroma.
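As a rough sketch of the pattern (ours, not taken from the report), the retrieval half of a RAG pipeline using Chroma, one of the vector databases named above, might look like this; the collection name and documents are made up:

```python
# Minimal RAG retrieval sketch with Chroma; the generation step is omitted.
import chromadb

client = chromadb.Client()  # in-memory instance; use PersistentClient for disk
docs = client.create_collection("support_articles")  # hypothetical collection

# Index a few documents; Chroma embeds them with its default embedding model.
docs.add(
    ids=["kb-1", "kb-2"],
    documents=[
        "To reset your password, open Settings > Security.",
        "Invoices can be downloaded from the Billing page.",
    ],
)

# Retrieve the most relevant passage for a user question, then pass it
# to an LLM as grounding context.
hits = docs.query(query_texts=["How do I reset my password?"], n_results=1)
context = hits["documents"][0][0]
print(context)
```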

Everyone is involved (not just engineering)

Interestingly, AI is moving beyond just IT and becoming democratized across enterprises (in the spirit of the old “it takes a village”). Vellum found that while engineering was most involved in AI projects (82.3%), leadership and executives (60.8%), subject-matter experts (57.5%), product teams (55.4%) and design departments (38.2%) are involved alongside them.

This is largely due to the ease of use of AI (as well as the general excitement around it), Sharma noted.

“This is the first time we’re seeing software being developed in a very cross-functional way, especially since prompts can be written in natural language,” he said. “Traditional software tends to be more exacting. This is far more approachable, which will bring more people into the development fold.”

However, enterprises still face significant challenges, especially around hallucinations and prompt management; model speed and performance; data access and security; and gaining support from key stakeholders.

At the same time, while more non-technical users are getting involved, there is still a lack of true technical knowledge, Sharma said. “The way to connect the different moving parts is still a skill that many developers don't have today,” he said. “So that's a common challenge.”

However, many challenges can be overcome with tooling: platforms and services that help developers evaluate complex AI systems, Sharma said. Developers can build tooling in-house or adopt third-party platforms or frameworks; however, Vellum found that nearly 18% of developers define prompts and orchestration logic without any tooling at all.

“Lack of technical knowledge becomes easier to overcome when you have the right tools to guide you through the development journey,” Sharma said. In addition to Vellum, frameworks and platforms used by survey participants include LangChain, LlamaIndex, Langfuse, CrewAI and Voiceflow.

Ongoing evaluation and monitoring are essential

Another way to overcome common issues (including hallucinations) is to run evaluations: using specific metrics to determine the correctness of a given answer. “However, (developers) are not running evals as regularly as they should be,” said Sharma.

Especially when it comes to advanced agent systems, enterprises need rigorous evaluation processes, he said. AI agents have a high level of uncertainty, said Sharma, as they call external systems and perform autonomous actions.

“People are trying to build very advanced systems, agent systems, and that requires a large number of test cases and some sort of automated testing framework to make sure it performs reliably in production,” said Sharma.
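The core of such an automated testing framework can be very small. This bare-bones sketch is illustrative rather than Vellum's methodology: the test cases and the stand-in `generate` function are invented, and the containment check is the simplest possible scoring rule.

```python
# A minimal automated eval loop: score model outputs against expected
# answers across a suite of test cases.
from typing import Callable

# Stand-in for a real model call; swap in an actual LLM client here.
def generate(prompt: str) -> str:
    return "Paris is the capital of France."

TEST_CASES = [
    {"prompt": "What is the capital of France?", "must_contain": "Paris"},
    {"prompt": "What is the capital of Japan?", "must_contain": "Tokyo"},
]

def run_evals(model: Callable[[str], str]) -> float:
    passed = 0
    for case in TEST_CASES:
        output = model(case["prompt"])
        # Simple containment check; real suites layer on LLM-as-judge
        # scoring, A/B tests and regression tracking.
        if case["must_contain"].lower() in output.lower():
            passed += 1
    return passed / len(TEST_CASES)

print(f"pass rate: {run_evals(generate):.0%}")
```

The point of automating this step is sample size: a suite like this can run hundreds of cases on every prompt or model change, where manual review covers only a handful.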

While some developers are taking advantage of automated evaluation tools, A/B testing and open source evaluation frameworks, Vellum found that more than three quarters still do manual testing and reviews.

“Manual testing just takes time, right? And the sample size in manual testing is usually much lower than what automated testing can achieve,” Sharma said. “There can be a challenge just in terms of methodology: how to run automated evaluations at scale.”

In the end, he emphasized the importance of adopting a variety of systems that work together, from the cloud to application programming interfaces (APIs). “Consider treating AI as just a tool in the toolkit and not the magic solution for everything,” he said.


