| Rank (Borda) | Model | Zero-shot | Active Params (B) | Total Params (B) | Embedding Dim | Max Tokens |
|---|---|---|---|---|---|---|
| 1 | harrier-oss-v1-27b | 78% | 25.6 | 27.0 | 5376 | 131072 |
| 2 | KaLM-Embedding-Gemma3-12B-2511 | 73% | 10.8 | 11.8 | 3840 | 32768 |
| 3 | llama-embed-nemotron-8b | 99% | 7.0 | 7.5 | 4096 | 32768 |
| 4 | Qwen3-Embedding-8B | 99% | 6.9 | 7.6 | 4096 | 32768 |
| 5 | gemini-embedding-001 | 99% | | | 3072 | 2048 |
| 6 | Qwen3-Embedding-4B | 99% | 3.6 | 4.0 | 2560 | 32768 |
| 7 | Octen-Embedding-8B | 99% | 6.9 | 7.6 | 4096 | 32768 |
| 8 | F2LLM-v2-14B | 88% | 13.2 | 14.0 | 5120 | 40960 |
| 9 | F2LLM-v2-8B | 88% | 6.9 | 7.6 | 4096 | 40960 |
| 10 | harrier-oss-v1-0.6b | 78% | 0.440 | 0.596 | 1024 | |

(Blank cells: values not reported.)
In addition to the large 27-billion-parameter model, there are two smaller variants (0.6B and 270M) for less powerful hardware. All models are available on Hugging Face under the MIT license. The team plans to integrate the technology into Bing and into new grounding services for AI agents.
Embedding models are responsible for searching, retrieving, and organizing information so that AI systems can deliver accurate answers. According to Microsoft, they are becoming increasingly important in the age of AI agents, since such agents must independently search for information, update context across multiple steps, and retain memory.
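
To make that retrieval role concrete, the sketch below shows the typical pattern: embed a document corpus and a query with the same model, then rank documents by cosine similarity. This is a minimal illustration, assuming the checkpoint can be loaded through the `sentence-transformers` library; the model ID used here is a placeholder, not a confirmed Hugging Face repository path.

```python
# Minimal semantic-retrieval sketch. Assumes a sentence-transformers-
# compatible checkpoint; the model ID below is a placeholder, not a
# confirmed repository path.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("microsoft/harrier-oss-v1-0.6b")  # hypothetical ID

corpus = [
    "Embedding models map text to dense vectors.",
    "Borda count aggregates rankings across benchmark tasks.",
    "AI agents retain memory and update context across multiple steps.",
]
query = "How do agents keep context over many steps?"

# With normalized embeddings, cosine similarity reduces to a dot product.
doc_vecs = model.encode(corpus, normalize_embeddings=True)
query_vec = model.encode([query], normalize_embeddings=True)[0]

scores = doc_vecs @ query_vec
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {corpus[idx]}")
```

The same pattern scales up by precomputing and indexing the document vectors (for example in a vector database) so that only the query needs to be embedded at request time.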