| Rank (Borda) | Model | Zero-shot | Active Params (B) | Total Params (B) | Embedding Dim | Max Tokens |
|---|---|---|---|---|---|---|
| 1 | harrier-oss-v1-27b | 78% | 25.6 | 27.0 | 5376 | 131072 |
| 2 | KaLM-Embedding-Gemma3-12B-2511 | 73% | 10.8 | 11.8 | 3840 | 32768 |
| 3 | llama-embed-nemotron-8b | 99% | 7.0 | 7.5 | 4096 | 32768 |
| 4 | Qwen3-Embedding-8B | 99% | 6.9 | 7.6 | 4096 | 32768 |
| 5 | gemini-embedding-001 | 99% | | | 3072 | 2048 |
| 6 | Qwen3-Embedding-4B | 99% | 3.6 | 4.0 | 2560 | 32768 |
| 7 | Octen-Embedding-8B | 99% | 6.9 | 7.6 | 4096 | 32768 |
| 8 | F2LLM-v2-14B | 88% | 13.2 | 14.0 | 5120 | 40960 |
| 9 | F2LLM-v2-8B | 88% | 6.9 | 7.6 | 4096 | 40960 |
| 10 | harrier-oss-v1-0.6b | 78% | 0.440 | 0.596 | 1024 | |

(Blank cells: values not reported.)
In addition to the large 27-billion-parameter model, there are two smaller variants (0.6B and 270M) for less powerful hardware. All models are available on Hugging Face under the MIT license. The team plans to integrate the technology into Bing and into new grounding services for AI agents.
Embedding models are responsible for searching, retrieving, and organizing information so that AI systems can deliver accurate answers. According to Microsoft, they are becoming increasingly important in the age of AI agents, since such agents must independently search for information, update context across multiple steps, and retain memory.
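
To make that retrieval role concrete, the sketch below shows the typical pattern: embed a document corpus and a query with the same model, then rank documents by cosine similarity. This is a minimal illustration, assuming the checkpoint can be loaded through the `sentence-transformers` library; the model ID used here is a placeholder, not a confirmed Hugging Face repository path.

```python
# Minimal semantic-retrieval sketch. Assumes a sentence-transformers-
# compatible checkpoint; the model ID below is a placeholder, not a
# confirmed repository path.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("microsoft/harrier-oss-v1-0.6b")  # hypothetical ID

corpus = [
    "Embedding models map text to dense vectors.",
    "Borda count aggregates rankings across benchmark tasks.",
    "AI agents retain memory and update context across multiple steps.",
]
query = "How do agents keep context over many steps?"

# With normalized embeddings, cosine similarity reduces to a dot product.
doc_vecs = model.encode(corpus, normalize_embeddings=True)
query_vec = model.encode([query], normalize_embeddings=True)[0]

scores = doc_vecs @ query_vec
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {corpus[idx]}")
```

The same pattern scales up by precomputing and indexing the document vectors (for example in a vector database) so that only the query needs to be embedded at request time.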