Open Source Or OpenAI: What's The Best Path To Advanced AI? - Decrypt

Republished By Plato


Can scrappy, decentralized, open-source artificial intelligence models compete with well-funded proprietary ones like OpenAI's powerful GPT-4? The frequently asked question fueled a lively debate on Twitter after a former Google AI researcher picked a side.

Arnaud Benard, co-founder of Galileo AI, threw down the gauntlet, saying, "If you think open-source models will beat GPT-4 this year, you're wrong." He cited OpenAI's talent and resources, as well as GPT-4's strength as a full product rather than just an LLM, and asserted that open-source projects might struggle to shift from challengers to AI champions.

if you think open-source models will beat gpt-4 this year, you're wrong.

i worked at top ai research labs (google ai) and built open-source libraries with > 5M monthly downloads.

gpt-4 is one year old and so far, no model matches it, here's why:

1. talent - openai recruited…

— Arnaud Benard (@arnaudai) January 1, 2024

Not surprisingly, Benard's tweet sparked mixed reactions, ranging from vociferous support to fierce disagreement.

Ryan Casey, a popular AI enthusiast who writes the newsletter "Beyond The Yellow Woods," offered a more optimistic take on open-source AI's potential. By his calculations, "Open source will match or beat [private models] this year," he said. "If there's demand for it, there will be innovation."

On the other hand, AI strategist Jeremi Traguna noted that "OpenAI's models keep moving,” adding that “open source models will have a hard time keeping up to speed to hit a moving target at the time the target is in the position to be hit." In other words, while open-source models might be catching up with GPT-3.5 in the era of GPT-4, there may be a GPT-5 by the time we have generalist LLMs that are comparable to GPT-4.5 Turbo.

Tech analyst Jon Howells believes that resources are not the only factor separating open-source from closed-source LLMs.

"Mistral has huge funding, a great team, and has recently put out a GPT-3.5-beating open-source model," he wrote. "They or a similar outfit will put out a GPT-4 level open-source model by the end of this year."

Mistral AI, a French startup, has gained recognition after releasing its Mixtral LLM, which offers improved performance over GPT-3.5 in many use cases.

In a threaded discussion, Nous Research cofounder "Teknium" made an important, if philosophical, point. "Every capability increase in OS (Open Source) is a permanent thing that can never be taken away from the world that can be used reliably forevermore," he said. In other words, once an advance lands in open-source AI, no company can restrict access to it.

I introduced together and mistral, neither party is mad about it fwiw. But this post, he had said that no OS model will beat GPT-4, but GPT-4 will be old news - probably this year - it didnt take us long to beat 3.5, and Mistral CEO said he plans to release a GPT4 level Open…

— Teknium (e/λ) (@Teknium1) January 1, 2024

Open or closed? An unending debate

The open-source versus closed-source debate is reminiscent of the early operating system battles between Windows and Linux. Santiago Pino of ML School wrote that proprietary AI models may win over general consumers like Windows did, but that open-source software provides customization and control that can be extremely useful for corporate users.

Pino highlighted how many companies start experimenting with ChatGPT but then migrate to open-source models, which they can fine-tune and customize for their specific needs and data compliance requirements. Open-source solutions avoid vendor lock-in and provide transparency, he said.

"Closed, proprietary models might win individuals, but most companies don't want to send their data to Microsoft or Google. They want control. Open-source models are the answer," he said in a tweet days before Benard's thread went viral.

Open-source models will destroy ChatGPT and Gemini.

The story of open-source Large Language Models is the story of Linux. Windows and Mac won consumers, but Linux became the Internet's operating system.

The same will happen with ChatGPT, Gemini, and open-source models. Closed,… pic.twitter.com/fdmS1VNtqf

— Santiago (@svpino) December 22, 2023

Sciumo Inc., a software development company, shared this view in the debate over Benard's tweet, emphasizing the niche potential of open-source models: "(Open-source models) will compete where it matters: domain-specific problems with domain-specific data and expertise that (OpenAI) does not have."

Furkan Gözükara, a computer engineer known for his YouTube channel SECourses, is also among those with a more nuanced stance. Speaking with Decrypt, he agreed with Benard, saying that "only at specific tasks Open Source LLMs will pass OpenAI."

Gözükara gives the example of a company that "trains LLM on (its) own documents." Yes, OpenAI offers the ability to customize GPTs based on specific instructions and documents, but handing sensitive data to third parties is always a concern. That concern was recently validated when it was revealed that personalized GPTs gave away sensitive data to third-party users.

Yann LeCun, Meta's chief AI scientist and a fierce open-source defender, has repeatedly stated that "open-source AI foundation models will wipe out closed and proprietary AI models." Google, another AI giant, also recognizes the threat posed by open-source AI: "Open-source models are faster, more customizable, more private, and pound-for-pound more capable," said a leaked Google memo in 2023.

It remains to be seen whether open-source models will match or surpass GPT-4 and future iterations this year. However, the perspectives from experts on both sides reveal an intriguing tension. Closed-source models may have an edge in resources and rapid iteration, but open-source tools are evolving rapidly, offering permanent capabilities and customizability. For now, the AI community can watch the competition unfold and enjoy the benefits of using the best technology available.

Edited by Ryan Ozawa.