

“A good multimodal model has been the holy grail of many big tech labs for the past couple of years,” says Thomas Wolf, cofounder of Hugging Face, the AI startup behind the open-source large language model BLOOM. In theory, combining text and images could allow multimodal models to understand the world better. “It might be able to tackle traditional weak points of language models, like spatial reasoning,” says Wolf. It is not yet clear if that’s true for GPT-4.
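
The article doesn’t detail GPT-4’s interface, but as a rough sketch of what a combined text-and-image prompt looks like in practice, here is a minimal example using OpenAI’s Python client. The model name, image URL, and question are illustrative assumptions, not details from OpenAI’s announcement.

```python
# Hypothetical multimodal prompt: text plus an image in one request.
# Model name and image URL are placeholders for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumed multimodal-capable model
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Which object in this photo is closest to the camera?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/scene.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```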

OpenAI’s new model appears to be better at some basic reasoning than ChatGPT, solving simple puzzles such as summarizing blocks of text in words that start with the same letter. In my demo during the call, I was shown GPT-4 summarizing the announcement blurb from OpenAI’s website using words that begin with g: “GPT-4, groundbreaking generational growth, gains greater grades. Guardrails, guidance, and gains garnered. Gigantic, groundbreaking, and globally gifted.” In another demo, GPT-4 took in a document about taxes and answered questions about it, citing reasons for its responses.

According to OpenAI, GPT-4 performs better than ChatGPT (which is based on GPT-3.5, a version of the firm’s previous technology) because it is a larger model with more parameters (the values in a neural network that get tweaked during training). This follows an important trend that the company discovered with its previous models. GPT-3 outperformed GPT-2 because it was more than 100 times larger, with 175 billion parameters to GPT-2’s 1.5 billion. “That fundamental formula has not really changed much for years,” says Jakub Pachocki, one of GPT-4’s developers. “But it’s still like building a spaceship, where you need to get all these little components right and make sure none of it breaks.”
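
As a back-of-the-envelope check on those figures, most of a transformer’s weights sit in its attention and feed-forward matrices, which scale as roughly 12 × layers × width². Plugging in the published GPT-2 XL and GPT-3 configurations recovers the numbers above; the approximation is a standard rule of thumb, not something from the article.

```python
# Rough transformer parameter count: ~12 * n_layers * d_model^2
# (4*d^2 for the attention projections + 8*d^2 for the MLP, per layer).
def approx_params(n_layers: int, d_model: int) -> int:
    return 12 * n_layers * d_model**2

gpt2_xl = approx_params(n_layers=48, d_model=1600)  # published GPT-2 XL config
gpt3 = approx_params(n_layers=96, d_model=12288)    # published GPT-3 config

print(f"GPT-2 XL: ~{gpt2_xl / 1e9:.1f}B parameters")  # ~1.5B
print(f"GPT-3:    ~{gpt3 / 1e9:.0f}B parameters")     # ~174B
print(f"Scale-up: ~{gpt3 / gpt2_xl:.0f}x")            # ~118x
```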

But OpenAI has chosen not to reveal how large GPT-4 is. In a departure from its previous releases, the company is giving away nothing about how GPT-4 was built: not the data, the amount of computing power, or the training techniques. “OpenAI is now a fully closed company with scientific communication akin to press releases for products,” says Wolf.

OpenAI says it spent six months making GPT-4 safer and more accurate. According to the company, GPT-4 is 82% less likely than GPT-3.5 to respond to requests for content that OpenAI does not allow, and 60% less likely to make stuff up. OpenAI says it achieved these results using the same approach it took with ChatGPT, using reinforcement learning from human feedback. This involves asking human raters to score different responses from the model and using those scores to improve future output.
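
In outline, that feedback loop works by training a small reward model to score responses the way raters did, then using that learned score as the training signal for the generator (typically via a policy-gradient method such as PPO, which is omitted here). The sketch below shows only the reward-model step; the toy embeddings, data, and model sizes are illustrative assumptions, not OpenAI’s implementation.

```python
# Toy reward-model step from the RLHF recipe: learn to score responses so
# that human-preferred answers outrank rejected ones. Illustrative only;
# the response "embeddings" are random stand-ins for a real encoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
EMBED_DIM = 64

# Pretend each rater comparison yields embeddings of two candidate responses.
preferred = torch.randn(32, EMBED_DIM)  # responses raters scored higher
rejected = torch.randn(32, EMBED_DIM)   # responses raters scored lower

reward_model = nn.Sequential(
    nn.Linear(EMBED_DIM, 128), nn.ReLU(), nn.Linear(128, 1)
)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

for step in range(200):
    # Pairwise ranking loss: push preferred rewards above rejected ones.
    loss = -F.logsigmoid(reward_model(preferred) - reward_model(rejected)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained scores then stand in for human judgment when fine-tuning the
# generator to favor outputs that people rate highly.
print(f"final ranking loss: {loss.item():.3f}")
```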
