
Microsoft unveils Phi-3-mini, its smallest AI model yet: How it compares to bigger models

Microsoft claims that its latest small language model has outperformed several AI models of comparable size, as well as bigger ones. It also said that India’s ITC is leveraging the new Phi-3-mini.

Microsoft Phi-3-mini (representational). Microsoft’s Phi-3-mini expands the selection of high-quality language models available to customers, offering more practical choices as they build generative AI applications. (Via Microsoft)

A few days after Meta unveiled its Llama 3 Large Language Model (LLM), Microsoft on Tuesday (April 23) unveiled the latest version of its ‘lightweight’ AI model, Phi-3-mini. Microsoft has described Phi-3 as a family of open AI models that are the most capable and cost-effective small language models (SLMs) available.

What exactly are language models, and how does an SLM differ from an LLM? Are there any benefits of employing an SLM for developing AI applications? We explain.

What is Phi-3-mini?

Phi-3-mini is believed to be the first of three small models that Microsoft plans to release. It has reportedly outperformed models of the same size, and the next size up, across a variety of benchmarks in areas such as language, reasoning, coding, and maths.


Essentially, language models are the backbone of AI applications like ChatGPT, Claude, and Gemini. These models are trained on existing data to solve common language problems such as text classification, question answering, text generation, and document summarisation.

The ‘Large’ in LLMs refers to two things: the enormous size of the training data and the parameter count. In machine learning, where machines learn from data rather than being explicitly instructed, parameters are the memories and knowledge that a model picks up during training. They define the model’s skill at solving a specific problem.


What’s new in Microsoft’s Phi-3-mini?

The latest model from Microsoft expands the selection of high-quality language models available to customers, offering more practical choices as they build generative AI applications. Phi-3-mini, a 3.8-billion-parameter language model, is available on AI development platforms such as Microsoft Azure AI Studio, HuggingFace, and Ollama.
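For a sense of what building with it looks like, here is a minimal sketch of loading the model from HuggingFace with the transformers library and generating a short reply. The model id "microsoft/Phi-3-mini-4k-instruct" and the generation settings are illustrative assumptions, not an official recipe.

# A minimal sketch, assuming the HuggingFace model id
# "microsoft/Phi-3-mini-4k-instruct"; settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain in one sentence what a small language model is."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))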

The amount of conversation that an AI model can read and write at any given time is called the context window, and it is measured in units called tokens. According to Microsoft, Phi-3-mini is available in two variants: one with a 4K-token context length and another with a 128K-token context length.
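As a rough illustration of what counting tokens means, the sketch below uses the model’s tokenizer (the model id is again an assumption) to check how many tokens a piece of text occupies against the 4K or 128K limit.

# A minimal sketch of counting tokens with the Phi-3-mini tokenizer;
# the model id is an assumption for illustration.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

document = "Phi-3-mini is a small language model from Microsoft. " * 200
num_tokens = len(tokenizer.encode(document))
print(num_tokens)  # must fit within the 4K-token (or 128K-token) context window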


With longer context windows, models are more capable of taking in and reasoning over large text content such as documents, web pages, code, and more.

Phi-3-mini is the first model in its class to support a context window of up to 128K tokens, with little impact on quality. The model is instruction-tuned, which means that it is trained to follow the different types of instructions given by users. This also means that the model is ‘ready to use out-of-the-box’.
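To illustrate what ‘instruction-tuned’ and ‘out-of-the-box’ mean in practice, here is a minimal sketch of sending a chat-style instruction to the model via transformers. The model id for the 128K-context variant and the settings are assumptions for illustration.

# A minimal sketch of prompting the instruction-tuned model with a
# chat-style message; model id and settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-128k-instruct"  # assumed id of the 128K-context variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Summarise the following report in three bullet points: ..."},
]

# apply_chat_template wraps the message in the prompt format the model was
# instruction-tuned on, so it can follow the request without further training.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=120)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))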

Microsoft says that in the coming weeks it will add new models to the Phi-3 family to offer customers more flexibility. Phi-3-small (7B) and Phi-3-medium will be available in the Azure AI model catalogue and other model libraries shortly.

How is Phi-3-mini different from LLMs?

Phi-3-mini is an SLM. Simply put, SLMs are more streamlined versions of large language models. Compared to LLMs, smaller AI models are cheaper to develop and operate, and they perform better on smaller devices such as laptops and smartphones.


According to Microsoft, SLMs are great for “resource-constrained environments including on-device and offline inference scenarios.” The company claims such models are well suited to scenarios where fast response times are critical, say for chatbots or virtual assistants. Moreover, they are ideal for cost-constrained use cases, particularly with simpler tasks.

While LLMs are trained on massive general data, SLMs stand out with their specialisation. Through fine-tuning, SLMs can be customised for specific tasks and achieve accuracy and efficiency in doing them. Most SLMs undergo targeted training, demanding considerably less computing power and energy compared to LLMs.
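As a rough sketch of what such task-specific customisation can look like, the example below attaches LoRA adapters to a small model using the Hugging Face peft library. The model id, target module names and hyperparameters are assumptions for illustration, not Microsoft’s training recipe.

# A minimal sketch of parameter-efficient fine-tuning (LoRA) for a small
# language model; model id, target modules and hyperparameters are
# illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

lora_config = LoraConfig(
    r=16,                                   # rank of the low-rank adapter matrices
    lora_alpha=32,                          # scaling factor for the adapter update
    target_modules=["qkv_proj", "o_proj"],  # assumed attention projection names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Only the small adapter weights are trained; the base model stays frozen,
# which is why customising an SLM needs far less compute than training an LLM.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# The adapted model would then be trained on a task-specific dataset,
# for example with the transformers Trainer or trl's SFTTrainer.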

SLMs also differ when it comes to inference speed and latency: their compact size allows for quicker processing, and their lower cost makes them appealing to smaller organisations and research groups.

How good are the Phi-3 models?

Phi-2 was introduced in December 2023 and reportedly matched the performance of models like Meta’s Llama 2. Microsoft claims that Phi-3-mini is better than its predecessors and can respond like a model 10 times its size.


Based on the performance results shared by Microsoft, Phi-3 models significantly outperformed several models of the same size or even larger ones, including Gemma 7B and Mistral 7B, in key areas.

Microsoft claims that Phi-3-mini demonstrates strong reasoning and logic capabilities. In its blog, Microsoft also said, “ITC, a leading business conglomerate based in India, is leveraging Phi-3 as part of their continued collaboration with Microsoft on the copilot for Krishi Mitra, a farmer-facing app that reaches over a million farmers.”

Bijin Jose, an Assistant Editor at Indian Express Online in New Delhi, is a technology journalist with a portfolio spanning various prestigious publications. Starting as a citizen journalist with The Times of India in 2013, he transitioned through roles at India Today Digital and The Economic Times, before finding his niche at The Indian Express. With a BA in English from Maharaja Sayajirao University, Vadodara, and an MA in English Literature, Bijin's expertise extends from crime reporting to cultural features. With a keen interest in closely covering developments in artificial intelligence, Bijin provides nuanced perspectives on its implications for society and beyond.

First uploaded on: 25-04-2024 at 15:55 IST