OpenAI launches deep research agent for multi-step research tasks

It’s available only to paid users, and queries are limited for now to 100 per month.

Hot on the heels of its launch of the o3-mini model last week, OpenAI on Sunday announced another component for ChatGPT that allows the generative AI (genAI) tool to do more in-depth research.

Described as “an agent that uses reasoning to synthesize large amounts of online information and complete multi-step research tasks for you,” deep research is now available to users on ChatGPT’s $200-per-month Pro plan; Plus and Team plans are next in the queue to receive access, with Enterprise users following.

The agent is powered by a version of the upcoming OpenAI o3 model optimized for web browsing and data analysis, the company said, allowing it to search the internet, then interpret and analyze what it finds — whether it’s text, images, or PDFs — and pivot in reaction to what it’s found.

“Deep research is built for people who do intensive knowledge work in areas like finance, science, policy, and engineering and need thorough, precise, and reliable research,” OpenAI said in a blog post announcing the new capability. “…Every output is fully documented, with clear citations and a summary of its thinking, making it easy to reference and verify the information. It is particularly effective at finding niche, non-intuitive information that would require browsing numerous websites.”

OpenAI promises “significantly higher” rate limits than the current 100 queries per month once it releases a faster, more efficient version of deep research powered by a smaller model. It did not say when that would be available.

To use the tool, a ChatGPT user selects “deep research” in the message composer and then enters a detailed query. OpenAI says the user can add context by attaching spreadsheets or documents. As the query runs, a sidebar shows the steps deep research is taking, with the result displayed as a report in the chat. The blog provides examples of reports generated by deep research alongside those from GPT-4o to illustrate the way deep research combines information from multiple sources into a coherent whole.

Unlike a basic ChatGPT query, deep research takes its time generating that report (OpenAI says it takes from five to 30 minutes) and is very compute-intensive. The company also admits that there are limitations to its accuracy.

“Deep research unlocks significant new capabilities, but it’s still early and has limitations,” the company said. “It can sometimes hallucinate facts in responses or make incorrect inferences, though at a notably lower rate than existing ChatGPT models, according to internal evaluations. It may struggle with distinguishing authoritative information from rumors, and currently shows weakness in confidence calibration, often failing to convey uncertainty accurately. At launch, there may be minor formatting errors in reports and citations, and tasks may take longer to kick off.”

Analysts were impressed, though with varying degrees of enthusiasm.

“I think what we are starting to see is a maturation of generative AI in the real world,” said Jason Andersen, vice president and principal analyst at Moor Insights & Strategy. “Initially, generative AI was trying to get some degree of momentum with the common user with diverse needs and questions. As internet users, we expect everything to be fast (purchases, searches, emails, etc). That became a design point for AI, since it was assumed that if a prompt just sat there for minutes, users would not use it. So, speed vs. depth ended up being a trade-off to get users on board.

“Interestingly enough, data scientists make this type of trade-off every day, but the typical user just expects a certain type of response,” he said. “But now we are starting to see the value in asking different things about AI. For instance, I use AI for market research versus content generation, so would I be willing to trade speed and content generation for a better research product? In my case, the answer is yes. But for an artist or designer or someone making blog posts, the answer could very well be no.”

“OpenAI’s deep research offering is compelling,” said Jeremy Roberts, senior research director at Info-Tech Research Group. “It’s a direct attempt to address the most common concerns about ChatGPT as it exists today: depth and reliability. By offering a product that is specifically designed to cite its sources and share its thinking, OpenAI addresses the criticism that its bot is unreliable and not suitable for real work. The examples they give are highly specific and technical and suggest that OpenAI is making headway in automating these specialized tasks to a greater degree than was possible with ChatGPT.

“The implications for professions are enormous,” Roberts said. “Many high-paying careers have paths that begin with basic research tasks. This can be true in finance and consulting. With a service like deep research, it may be difficult for people new to the labor market to build the necessary competencies and create a void as employers pivot away from trainee hires. This was always an issue with LLMs, but it seems like it might be accelerating. The labor market will have to respond.”

Deep research also heralds a change in model structure, Andersen noted. “The newer models are starting to be segmented, optimized, and tuned for different types of tasks,” he said. “The company that has been leading the charge here has been Anthropic, which has specific models for developing code and deeper research tasks. So, it’s interesting to see OpenAI do this as well. 

“I think this will continue to be a trend, as we are also seeing other vendors head in this direction.”

Lynn Greiner

Lynn Greiner has been interpreting tech for businesses for over 20 years and has worked in the industry as well as writing about it, giving her a unique perspective into the issues companies face. She has both IT credentials and a business degree.

Lynn was most recently Editor in Chief of IT World Canada. Earlier in her career, Lynn held IT leadership roles at Ipsos and The NPD Group Canada. Her work has appeared in The Globe and Mail, Financial Post, InformIT, and Channel Daily News, among other publications.

She won a 2014 Excellence in Science & Technology Reporting Award sponsored by National Public Relations for her work raising the public profile of science and technology and contributing to the building of a science and technology culture in Canada.
