In the last four months, news about artificial intelligence (AI) has been inescapable, like reels on Instagram. When OpenAI launched ChatGPT in November 2022, it seemed to set off a race among tech companies to build something bigger and better. Microsoft powered its forgotten search engine, Bing, with AI, while Google rolled out Bard. But the rushed launches embarrassed both tech giants last month with high-profile failures. Now the quiet one, Mark Zuckerberg’s Meta, has announced a big launch of its own AI model, LLaMA.
This is not Meta’s first attempt to assert its place in the AI space. In August 2022, it announced the release of its chatbot, BlenderBot 3, which was designed “to improve its conversational skills and safety through feedback from people who chat with it.”
Meta has also dabbled in the academic space before: in November 2022, it released Galactica, a large language model developed to assist scientists. It was promoted as a tool that “can summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.” However, Meta took it down after the demo faced strong criticism for sharing incorrect information within hours of going live.
LLaMA is Meta’s latest venture into both the AI and academic spaces. With the AI launches of other tech giants failing to impress, only time will tell whether LLaMA can get it right.
Unlike ChatGPT, Bard, or Bing, Meta’s new AI is not a conversational chatbot. It is a tool aimed at helping researchers by “democratizing access in this important, fast-changing field.”
The Large Language Model Meta AI (LLaMA) is “a state-of-the-art foundational large language model,” according to Meta’s blog post.
With chatbots failing and making up information, Meta seems determined not to follow suit: LLaMA focuses on helping “experts tease out the problems of AI language models, from bias and toxicity to their tendency to simply make up information,” as reported by The Verge.
LLaMA is “a collection of foundation language models” that comes in four sizes: 7B, 13B, 33B, and 65B parameters, according to the research paper. Meta claims that LLaMA-13B outperforms OpenAI’s popular GPT-3 model “on most benchmarks,” and that LLaMA-65B is “competitive with the best models,” such as DeepMind’s Chinchilla-70B and Google’s PaLM-540B.
How does it work?
Large language models are natural language processing systems with “more than 100 billion parameters” that can “generate creative text, solve basic math problems, answer reading comprehension questions, and more,” according to Meta’s blog.
The model takes “a sequence of words as an input and predicts a next word to recursively generate text.” It was trained on text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets.
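The “predict a next word, recursively” idea can be sketched with a toy example. Real models like LLaMA learn billions of parameters; here, purely for illustration, a hypothetical bigram table built from a tiny corpus stands in for the model’s predictions:

```python
# Toy illustration of autoregressive text generation: pick the most likely
# next word given the last one, append it, and repeat. The bigram counts
# below are a crude stand-in for a trained language model.
from collections import defaultdict, Counter

corpus = "the model reads text and the model predicts the next word".split()

# Count which word follows which in the corpus.
next_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def generate(prompt: str, length: int = 5) -> str:
    words = prompt.split()
    for _ in range(length):
        candidates = next_counts.get(words[-1])
        if not candidates:
            break  # no known continuation for this word
        # Greedily append the most frequent continuation.
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))
```

Real systems replace the bigram table with a neural network conditioned on the entire preceding sequence, and sample from a probability distribution rather than always taking the top word, but the recursive loop is the same.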
Meta acknowledges that researchers’ access to large language models remains limited because of the resources required to train and run them. This, in turn, has limited researchers’ ability to understand how these models work and has created barriers to improving their robustness and addressing known issues, such as “bias, toxicity, and the potential for generating misinformation.”
Smaller models such as LLaMA 7B and LLaMA 13B are trained on more tokens — pieces of text — making them “easier to retrain and fine-tune for specific potential product use cases.”
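Fine-tuning means continuing to train an already-trained model on task-specific data, which takes far fewer steps than training from scratch. The one-parameter model below is purely illustrative, not related to LLaMA’s architecture, but it shows the pattern:

```python
# Toy picture of "retrain and fine-tune": learn weights on broad data,
# then run a handful of extra gradient steps on task-specific data.
# The single-parameter linear model y ≈ w * x is illustrative only.

def train(w, data, lr=0.01, steps=200):
    """Gradient descent on mean squared error for y ≈ w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

pretrain_data = [(x, 2.0 * x) for x in range(1, 6)]  # "general" task: y = 2x
finetune_data = [(x, 2.2 * x) for x in range(1, 6)]  # related "specific" task

w = train(0.0, pretrain_data)          # pretraining from scratch (many steps)
w = train(w, finetune_data, steps=20)  # cheap fine-tune from a good start
```

Because the pretrained weight starts close to the fine-tuning target, 20 steps suffice where pretraining needed 200 — and the smaller the model, the cheaper each of those steps is.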
Furthermore, LLaMA is designed for diverse use cases, unlike fine-tuned models built for a specific task. The research paper also provides evaluations on benchmarks “evaluating model biases and toxicity” to highlight the model’s limitations and support further research.
Who has access to it?
LLaMA is being released under a noncommercial licence focused on research use. Access will be granted on “a case-by-case basis” to academic researchers; people affiliated with organisations in government, civil society, and academia; and industry research laboratories. Those interested must fill out an application. This extra wall of protection, an approval-only method, could help Meta guard its model against misuse, a pertinent problem in the AI space.