
Microsoft’s Phi-3.5 AI Model Surpasses AI Giants Gemini and GPT-4!

  • Editor
  • August 24, 2024 (Updated)

Key Takeaways:

  • Microsoft’s new Phi-3.5 series marks a significant leap in AI technology, outperforming models from Google, Meta, and OpenAI on several benchmarks.
  • The Phi-3.5 series includes three models optimized for different tasks, all released under the permissive, open-source MIT license, encouraging widespread adoption and innovation.
  • These models are designed to be highly efficient, making them suitable for both large-scale cloud environments and resource-constrained applications such as IoT devices.
  • Microsoft’s focus on smaller, more efficient models challenges the traditional emphasis on larger AI models, potentially setting a new standard in AI development.

Microsoft isn’t resting on the laurels of its AI partnership with OpenAI.

Instead, the company, often referred to as Redmond after the location of its Washington state headquarters, has released three new models in its evolving Phi series of language and multimodal AI.

The new Phi-3.5 models include the 3.82 billion parameter Phi-3.5-mini-instruct, the 41.9 billion parameter Phi-3.5-MoE-instruct, and the 4.15 billion parameter Phi-3.5-vision-instruct.


Each of these models is designed for specific tasks such as basic and fast reasoning, more powerful reasoning, and vision-related tasks like image and video analysis.

These models are available for developers to download, use, and fine-tune on Hugging Face under a Microsoft-branded MIT License, allowing for unrestricted commercial usage and modification.
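
For developers who want to try the models, a minimal loading sketch in Python might look like the following; the microsoft/Phi-3.5-mini-instruct repository id and the trust_remote_code flag are assumptions based on how earlier Phi checkpoints were published, so check the model card for exact usage.

```python
# Minimal sketch: downloading and loading Phi-3.5-mini-instruct.
# The repo id and trust_remote_code flag follow the pattern of earlier
# Phi releases and may differ from the official model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights keep memory modest
    device_map="auto",
    trust_remote_code=True,      # Phi checkpoints ship custom modeling code
)
print(f"Loaded {model_id} with {model.num_parameters() / 1e9:.2f}B parameters")
```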


All three models boast near state-of-the-art performance on several third-party benchmarks, surpassing models from other AI providers, including Google’s Gemini 1.5 Flash, Meta’s Llama 3.1, and, in some cases, even OpenAI’s GPT-4o.


This performance and the permissive open license have garnered praise for Microsoft across social networks, particularly on X.

The Phi-3.5 Mini Instruct model is a lightweight AI model with 3.8 billion parameters, optimized for instruction adherence and supporting a 128k token context length.


This model is ideal for scenarios that demand strong reasoning capabilities in memory- or compute-constrained environments, such as code generation, mathematical problem-solving, and logic-based reasoning.


Despite its compact size, the Phi-3.5 Mini Instruct model demonstrates competitive performance in multilingual and multi-turn conversational tasks, reflecting significant improvements from its predecessors.
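
As a rough sketch of that instruction-following, multi-turn behaviour, the snippet below runs a short chat exchange through the transformers text-generation pipeline; it assumes a recent transformers release that accepts chat-style message lists, plus the same assumed repository id as above.

```python
# Minimal multi-turn chat sketch with Phi-3.5-mini-instruct.
# Assumes a recent transformers version whose text-generation pipeline
# accepts chat-style message lists; the repo id is an assumption.
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="microsoft/Phi-3.5-mini-instruct",  # assumed Hugging Face repo id
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "user", "content": "Solve 2x + 6 = 14 and explain each step."},
]
reply = chat(messages, max_new_tokens=256)[0]["generated_text"][-1]["content"]
print(reply)

# Feed the answer back in and ask a follow-up to exercise multi-turn behaviour.
messages += [
    {"role": "assistant", "content": reply},
    {"role": "user", "content": "Now solve it again with 9 instead of 6."},
]
print(chat(messages, max_new_tokens=256)[0]["generated_text"][-1]["content"])
```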

The Phi-3.5 MoE (Mixture of Experts) model is the first of its kind from Microsoft, combining multiple specialized expert sub-models into a single model, with each expert handling a different kind of task.

The model’s architecture contains roughly 42 billion parameters in total and supports a 128k token context length, providing scalable AI performance for demanding applications.


However, according to the Hugging Face documentation, only about 6.6 billion of those parameters are active for any given token.

The MoE model’s unique architecture allows it to maintain efficiency while handling complex AI tasks across multiple languages.
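
The gap between total and active parameters follows directly from how a mixture-of-experts layer works: a small gating network routes each token to a handful of experts, so only those experts’ weights take part in the forward pass. The PyTorch sketch below is a generic top-2-of-16 router with made-up layer sizes, shown for illustration only; it is not Microsoft’s implementation.

```python
# Toy mixture-of-experts layer: top-2 routing over 16 experts.
# Illustrative only -- not Microsoft's Phi-3.5-MoE implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # router picks experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.gate(x)                          # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # choose 2 experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):  # only chosen experts run
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = ToyMoE()
total = sum(p.numel() for p in layer.experts.parameters())
active = total * layer.top_k // len(layer.experts)     # 2 of 16 experts per token
print(f"expert params: {total:,} total, ~{active:,} active per token")
```

Phi-3.5-MoE applies the same idea at a much larger scale: all 42 billion parameters sit in memory, but only around 6.6 billion are exercised for any given token.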

It impressively beats GPT-4o mini on the 5-shot MMLU (Massive Multitask Language Understanding) across subjects such as STEM, the humanities, and social sciences, at varying levels of expertise.


Completing the trio is the Phi-3.5 Vision Instruct model, which integrates both text and image processing capabilities.

This multimodal model is particularly suited for general image understanding, optical character recognition, chart and table comprehension, and video summarization.

Like the other models in the Phi-3.5 series, Vision Instruct supports a 128k token context length, enabling it to manage complex, multi-frame visual tasks.


Microsoft highlights that this model was trained with synthetic and filtered publicly available datasets, focusing on high-quality, reasoning-dense data.
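
A rough sketch of how the vision model might be prompted for chart comprehension is shown below; the repository id, the <|image_1|> placeholder, and the processor calls follow the pattern Microsoft published for earlier Phi-3 vision checkpoints and may differ from the final model card.

```python
# Sketch: chart Q&A with Phi-3.5-vision-instruct.
# Follows the usage pattern of earlier Phi-3 vision model cards;
# the placeholder token and processor arguments are assumptions.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"  # assumed Hugging Face repo id
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("quarterly_sales_chart.png")  # any local image for testing
messages = [{"role": "user",
             "content": "<|image_1|>\nSummarize the trend shown in this chart."}]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(prompt, [image], return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=200)
new_tokens = output[:, inputs["input_ids"].shape[1]:]  # drop the prompt tokens
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```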

Training these models required massive computational resources.

The Phi-3.5 Mini Instruct model was trained on 3.4 trillion tokens using 512 H100-80G GPUs over ten days, while the Vision Instruct model was trained on 500 billion tokens using 256 A100-80G GPUs over six days.


The Phi-3.5 MoE model, which features a mixture of experts architecture, was trained on 4.9 trillion tokens with 512 H100-80G GPUs over 23 days.
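
Dividing the reported token counts by the reported GPU-days gives a rough feel for how the three training runs compare; the back-of-the-envelope figures below use only the numbers above and ignore real-world factors such as restarts and data pipeline overhead.

```python
# Back-of-the-envelope training throughput from the reported figures.
runs = {
    "Phi-3.5-mini-instruct":   (3.4e12, 512, 10),  # tokens, GPUs (H100-80G), days
    "Phi-3.5-vision-instruct": (0.5e12, 256, 6),   # tokens, GPUs (A100-80G), days
    "Phi-3.5-MoE-instruct":    (4.9e12, 512, 23),  # tokens, GPUs (H100-80G), days
}
for name, (tokens, gpus, days) in runs.items():
    per_gpu_day = tokens / (gpus * days)
    print(f"{name}: ~{per_gpu_day / 1e9:.2f}B tokens per GPU-day")
```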

Microsoft’s commitment to the open-source community is evident as all three Phi-3.5 models are available under the MIT license.

This license allows developers to freely use, modify, merge, publish, distribute, sublicense, or sell copies of the software.


The license also includes a disclaimer that the software is provided “as is” without warranties. Microsoft and other copyright holders are not liable for any claims, damages, or other liabilities arising from the software’s use.

The release of the Phi-3.5 series represents a huge step forward in developing multilingual and multimodal AI.


By offering these models under an open-source license, Microsoft empowers developers to integrate cutting-edge AI capabilities into their applications, fostering innovation across both commercial and research domains.

For more news and trends, visit AI News on our website.
