ide-code-retrieval-qwen3-0.6b
A SentenceTransformer model fine-tuned from Qwen/Qwen3-Embedding-0.6B for IDE code retrieval -- mapping natural-language commit queries to relevant source code documents via dense vector similarity.
Note: This is an intermediate checkpoint at step 9,000 of 9,150 (98.4% of the training run). Training loss is still decreasing, so a later checkpoint may perform better.
Model Description
This model encodes both short natural-language queries (commit messages, search queries) and longer code documents into a shared embedding space. Retrieval is performed by computing cosine similarity between the query embedding and candidate code embeddings.
- Base model: Qwen/Qwen3-Embedding-0.6B (0.6B parameters)
- Max sequence length: 1024 tokens
- Output dimensionality: 1024 (normalized)
- Similarity function: Cosine similarity
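Because the outputs are unit-length, cosine similarity coincides with a plain dot product, which makes the embeddings easy to drop into any vector index. A minimal sketch illustrating the properties listed above (the query and code strings are placeholders):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b")

query_emb = model.encode("fix race condition in cache eviction")
doc_emb = model.encode("def evict(self):\n    with self._lock:\n        ...")

print(query_emb.shape)            # (1024,) -- output dimensionality
print(np.linalg.norm(query_emb))  # ~1.0 -- embeddings are L2-normalized
print(query_emb @ doc_emb)        # dot product == cosine similarity for unit vectors
```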
Training Details
Dataset
- Source: aysinghal/code-retrieval-training-dataset
- Total pairs: 2,465,694
- Train split: 2,342,409 pairs (95%)
- Eval split: 123,285 pairs (5%)
- Text strategy: truncate (max 4096 chars)
- Negatives: Explicit hard negatives from the dataset
- Pre-tokenized: Yes (token IDs stored on disk for zero-overhead data loading)
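The 95/5 split can be reproduced with the `datasets` library along these lines; the seed and the expected column layout are illustrative assumptions rather than details taken from the training code:

```python
from datasets import load_dataset

# Load the full pair dataset from the Hub (2,465,694 rows)
ds = load_dataset("aysinghal/code-retrieval-training-dataset", split="train")

# Recreate a 95/5 train/eval split; the seed here is an assumption
splits = ds.train_test_split(test_size=0.05, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]

# Inspect the actual column layout (expected: query / positive / hard negative)
print(ds.column_names)
print(len(train_ds), len(eval_ds))
```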
Loss Function
MultipleNegativesRankingLoss (InfoNCE) with explicit hard negatives. Each training example consists of an anchor (query), a positive (relevant code), and a hard negative (similar but irrelevant code). In-batch negatives provide additional contrast.
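In sentence-transformers this corresponds to `MultipleNegativesRankingLoss` over (anchor, positive, negative) columns; a minimal sketch of the setup, keeping the library defaults for scale and similarity:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")

# With (anchor, positive, negative) training columns, the third column is
# scored as an explicit hard negative, and all other positives and negatives
# in the batch act as in-batch negatives for each anchor.
loss = MultipleNegativesRankingLoss(model)
```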
Hyperparameters
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-Embedding-0.6B |
| Learning rate | 2e-05 |
| LR schedule | Linear with warmup |
| Warmup ratio | 0.1 |
| Epochs | 3 |
| Effective batch size | 256 |
| Per-GPU batch size | 64 |
| Gradient accumulation | 2 |
| Max sequence length | 1024 tokens |
| Precision | BFloat16 |
| Gradient checkpointing | True |
| torch.compile | Enabled (max-autotune) |
| Seed | 42 |
| Eval strategy | Every 915 steps |
| Early stopping patience | 3 |
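Assuming the standard `SentenceTransformerTrainer` setup, the table maps onto training arguments roughly as below; `output_dir` and the save settings are illustrative assumptions:

```python
from transformers import EarlyStoppingCallback
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="ide-code-retrieval-qwen3-0.6b",  # assumption
    num_train_epochs=3,
    per_device_train_batch_size=64,   # x2 GPUs x2 accumulation = 256 effective
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    bf16=True,
    gradient_checkpointing=True,
    torch_compile=True,
    torch_compile_mode="max-autotune",
    seed=42,
    eval_strategy="steps",
    eval_steps=915,
    save_strategy="steps",
    save_steps=915,
    load_best_model_at_end=True,           # required for early stopping
    metric_for_best_model="eval_loss",
)

# Early stopping with patience 3 is passed to the trainer as a callback
early_stopping = EarlyStoppingCallback(early_stopping_patience=3)
```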
Hardware
- GPUs: 2x NVIDIA L40S
- Total training steps: 9,150
Training Progress (at checkpoint step 9,000)
- Training loss: 2.8684 (step 50) → 0.4907 (step 9,000)
- Best eval loss: 0.1070 (step 7,320)
- Progress: 9,000 / 9,150 steps (98.4%)
Evaluation Results
| Step | Epoch | Eval Loss |
|---|---|---|
| 915 | 0.10 | 0.4585 |
| 1,830 | 0.20 | 0.2753 |
| 2,745 | 0.30 | 0.1975 |
| 3,660 | 0.40 | 0.1387 |
| 4,575 | 0.50 | 0.1226 |
| 5,490 | 0.60 | 0.1146 |
| 6,405 | 0.70 | 0.1108 |
| 7,320 | 0.80 | 0.1070 |
| 8,235 | 0.90 | 0.1072 |
Full training loss history
| Step | Epoch | Loss | Learning Rate |
|---|---|---|---|
| 50 | 0.0055 | 2.8684 | 3.57e-07 |
| 100 | 0.0109 | 2.8360 | 7.21e-07 |
| 150 | 0.0164 | 2.8020 | 1.09e-06 |
| 200 | 0.0219 | 2.6263 | 1.45e-06 |
| 250 | 0.0273 | 2.3496 | 1.81e-06 |
| 300 | 0.0328 | 2.0695 | 2.18e-06 |
| 350 | 0.0383 | 1.8826 | 2.54e-06 |
| 400 | 0.0437 | 1.7977 | 2.91e-06 |
| 450 | 0.0492 | 1.7163 | 3.27e-06 |
| 500 | 0.0546 | 1.6614 | 3.64e-06 |
| 550 | 0.0601 | 1.6364 | 4.00e-06 |
| 600 | 0.0656 | 1.5695 | 4.36e-06 |
| 650 | 0.0710 | 1.5363 | 4.73e-06 |
| 700 | 0.0765 | 1.4884 | 5.09e-06 |
| 750 | 0.0820 | 1.4712 | 5.46e-06 |
| 800 | 0.0874 | 1.4452 | 5.82e-06 |
| 850 | 0.0929 | 1.4163 | 6.19e-06 |
| 900 | 0.0984 | 1.3578 | 6.55e-06 |
| 950 | 0.1038 | 1.3438 | 6.91e-06 |
| 1,000 | 0.1093 | 1.3334 | 7.28e-06 |
| 1,050 | 0.1148 | 1.3241 | 7.64e-06 |
| 1,100 | 0.1202 | 1.2751 | 8.01e-06 |
| 1,150 | 0.1257 | 1.2748 | 8.37e-06 |
| 1,200 | 0.1311 | 1.2224 | 8.74e-06 |
| 1,250 | 0.1366 | 1.2214 | 9.10e-06 |
| 1,300 | 0.1421 | 1.1783 | 9.46e-06 |
| 1,350 | 0.1475 | 1.1692 | 9.83e-06 |
| 1,400 | 0.1530 | 1.1478 | 1.02e-05 |
| 1,450 | 0.1585 | 1.1216 | 1.06e-05 |
| 1,500 | 0.1639 | 1.1142 | 1.09e-05 |
| 1,550 | 0.1694 | 1.0802 | 1.13e-05 |
| 1,600 | 0.1749 | 1.0743 | 1.17e-05 |
| 1,650 | 0.1803 | 1.0295 | 1.20e-05 |
| 1,700 | 0.1858 | 1.0274 | 1.24e-05 |
| 1,750 | 0.1913 | 0.9875 | 1.27e-05 |
| 1,800 | 0.1967 | 0.9791 | 1.31e-05 |
| 1,850 | 0.2022 | 0.9815 | 1.35e-05 |
| 1,900 | 0.2077 | 0.9656 | 1.38e-05 |
| 1,950 | 0.2131 | 0.9283 | 1.42e-05 |
| 2,000 | 0.2186 | 0.9117 | 1.46e-05 |
| 2,050 | 0.2240 | 0.9123 | 1.49e-05 |
| 2,100 | 0.2295 | 0.8986 | 1.53e-05 |
| 2,150 | 0.2350 | 0.8732 | 1.57e-05 |
| 2,200 | 0.2404 | 0.8718 | 1.60e-05 |
| 2,250 | 0.2459 | 0.8469 | 1.64e-05 |
| 2,300 | 0.2514 | 0.8374 | 1.68e-05 |
| 2,350 | 0.2568 | 0.8361 | 1.71e-05 |
| 2,400 | 0.2623 | 0.8316 | 1.75e-05 |
| 2,450 | 0.2678 | 0.8188 | 1.78e-05 |
| 2,500 | 0.2732 | 0.7977 | 1.82e-05 |
| 2,550 | 0.2787 | 0.8070 | 1.86e-05 |
| 2,600 | 0.2842 | 0.7797 | 1.89e-05 |
| 2,650 | 0.2896 | 0.7719 | 1.93e-05 |
| 2,700 | 0.2951 | 0.7671 | 1.97e-05 |
| 2,750 | 0.3005 | 0.7576 | 2.00e-05 |
| 2,800 | 0.3060 | 0.7494 | 2.00e-05 |
| 2,850 | 0.3115 | 0.7556 | 1.99e-05 |
| 2,900 | 0.3169 | 0.7157 | 1.99e-05 |
| 2,950 | 0.3224 | 0.7220 | 1.98e-05 |
| 3,000 | 0.3279 | 0.7031 | 1.98e-05 |
| 3,050 | 0.3333 | 0.7055 | 1.98e-05 |
| 3,100 | 0.3388 | 0.6866 | 1.97e-05 |
| 3,150 | 0.3443 | 0.6902 | 1.97e-05 |
| 3,200 | 0.3497 | 0.6629 | 1.96e-05 |
| 3,250 | 0.3552 | 0.6732 | 1.96e-05 |
| 3,300 | 0.3607 | 0.6521 | 1.96e-05 |
| 3,350 | 0.3661 | 0.6610 | 1.95e-05 |
| 3,400 | 0.3716 | 0.6449 | 1.95e-05 |
| 3,450 | 0.3770 | 0.6513 | 1.94e-05 |
| 3,500 | 0.3825 | 0.6369 | 1.94e-05 |
| 3,550 | 0.3880 | 0.6217 | 1.36e-05 |
| 3,600 | 0.3934 | 0.6107 | 1.35e-05 |
| 3,650 | 0.3989 | 0.6007 | 1.34e-05 |
| 3,700 | 0.4044 | 0.5762 | 1.32e-05 |
| 3,750 | 0.4098 | 0.5963 | 1.31e-05 |
| 3,800 | 0.4153 | 0.5910 | 1.30e-05 |
| 3,850 | 0.4208 | 0.5824 | 1.29e-05 |
| 3,900 | 0.4262 | 0.5863 | 1.28e-05 |
| 3,950 | 0.4317 | 0.5747 | 1.26e-05 |
| 4,000 | 0.4372 | 0.5840 | 1.25e-05 |
| 4,050 | 0.4426 | 0.5748 | 1.24e-05 |
| 4,100 | 0.4481 | 0.5587 | 1.23e-05 |
| 4,150 | 0.4536 | 0.5717 | 1.21e-05 |
| 4,200 | 0.4590 | 0.5375 | 1.20e-05 |
| 4,250 | 0.4645 | 0.5635 | 1.19e-05 |
| 4,300 | 0.4699 | 0.5629 | 1.18e-05 |
| 4,350 | 0.4754 | 0.5532 | 1.17e-05 |
| 4,400 | 0.4809 | 0.5480 | 1.15e-05 |
| 4,450 | 0.4863 | 0.5486 | 1.14e-05 |
| 4,500 | 0.4918 | 0.5468 | 1.13e-05 |
| 4,550 | 0.4973 | 0.5431 | 1.12e-05 |
| 4,600 | 0.5027 | 0.5455 | 1.11e-05 |
| 4,650 | 0.5082 | 0.5425 | 1.09e-05 |
| 4,700 | 0.5137 | 0.5493 | 1.08e-05 |
| 4,750 | 0.5191 | 0.5483 | 1.07e-05 |
| 4,800 | 0.5246 | 0.5237 | 1.06e-05 |
| 4,850 | 0.5301 | 0.5295 | 1.04e-05 |
| 4,900 | 0.5355 | 0.5371 | 1.03e-05 |
| 4,950 | 0.5410 | 0.5308 | 1.02e-05 |
| 5,000 | 0.5464 | 0.5392 | 1.01e-05 |
| 5,050 | 0.5519 | 0.5292 | 9.96e-06 |
| 5,100 | 0.5574 | 0.5300 | 9.84e-06 |
| 5,150 | 0.5628 | 0.5303 | 9.72e-06 |
| 5,200 | 0.5683 | 0.5096 | 9.60e-06 |
| 5,250 | 0.5738 | 0.5086 | 9.47e-06 |
| 5,300 | 0.5792 | 0.5150 | 9.35e-06 |
| 5,350 | 0.5847 | 0.5186 | 9.23e-06 |
| 5,400 | 0.5902 | 0.5129 | 9.11e-06 |
| 5,450 | 0.5956 | 0.5251 | 8.99e-06 |
| 5,500 | 0.6011 | 0.5167 | 8.87e-06 |
| 5,550 | 0.6066 | 0.5118 | 8.75e-06 |
| 5,600 | 0.6120 | 0.5036 | 8.62e-06 |
| 5,650 | 0.6175 | 0.5167 | 8.50e-06 |
| 5,700 | 0.6230 | 0.5212 | 8.38e-06 |
| 5,750 | 0.6284 | 0.5063 | 8.26e-06 |
| 5,800 | 0.6339 | 0.5089 | 8.14e-06 |
| 5,850 | 0.6393 | 0.5056 | 8.02e-06 |
| 5,900 | 0.6448 | 0.5052 | 7.90e-06 |
| 5,950 | 0.6503 | 0.5163 | 7.77e-06 |
| 6,000 | 0.6557 | 0.5154 | 7.65e-06 |
| 6,050 | 0.6612 | 0.4991 | 7.53e-06 |
| 6,100 | 0.6667 | 0.4972 | 7.41e-06 |
| 6,150 | 0.6721 | 0.5147 | 7.29e-06 |
| 6,200 | 0.6776 | 0.5022 | 7.17e-06 |
| 6,250 | 0.6831 | 0.5173 | 7.05e-06 |
| 6,300 | 0.6885 | 0.5076 | 6.92e-06 |
| 6,350 | 0.6940 | 0.4984 | 6.80e-06 |
| 6,400 | 0.6995 | 0.5018 | 6.68e-06 |
| 6,450 | 0.7049 | 0.5076 | 6.56e-06 |
| 6,500 | 0.7104 | 0.5092 | 6.44e-06 |
| 6,550 | 0.7158 | 0.4867 | 6.32e-06 |
| 6,600 | 0.7213 | 0.5019 | 6.20e-06 |
| 6,650 | 0.7268 | 0.5179 | 6.07e-06 |
| 6,700 | 0.7322 | 0.4979 | 5.95e-06 |
| 6,750 | 0.7377 | 0.5018 | 5.83e-06 |
| 6,800 | 0.7432 | 0.4907 | 5.71e-06 |
| 6,850 | 0.7486 | 0.5104 | 5.59e-06 |
| 6,900 | 0.7541 | 0.4884 | 5.47e-06 |
| 6,950 | 0.7596 | 0.5070 | 5.35e-06 |
| 7,000 | 0.7650 | 0.5014 | 5.22e-06 |
| 7,050 | 0.7705 | 0.5011 | 5.10e-06 |
| 7,100 | 0.7760 | 0.5027 | 4.98e-06 |
| 7,150 | 0.7814 | 0.5064 | 4.86e-06 |
| 7,200 | 0.7869 | 0.4900 | 4.74e-06 |
| 7,250 | 0.7923 | 0.4901 | 4.62e-06 |
| 7,300 | 0.7978 | 0.4983 | 4.50e-06 |
| 7,350 | 0.8033 | 0.4901 | 4.37e-06 |
| 7,400 | 0.8087 | 0.4965 | 4.25e-06 |
| 7,450 | 0.8142 | 0.4934 | 4.13e-06 |
| 7,500 | 0.8197 | 0.4970 | 4.01e-06 |
| 7,550 | 0.8251 | 0.4779 | 3.89e-06 |
| 7,600 | 0.8306 | 0.4889 | 3.77e-06 |
| 7,650 | 0.8361 | 0.4954 | 3.65e-06 |
| 7,700 | 0.8415 | 0.5173 | 3.52e-06 |
| 7,750 | 0.8470 | 0.4986 | 3.40e-06 |
| 7,800 | 0.8525 | 0.4947 | 3.28e-06 |
| 7,850 | 0.8579 | 0.4931 | 3.16e-06 |
| 7,900 | 0.8634 | 0.4834 | 3.04e-06 |
| 7,950 | 0.8689 | 0.4964 | 2.92e-06 |
| 8,000 | 0.8743 | 0.4890 | 2.80e-06 |
| 8,050 | 0.8798 | 0.4932 | 2.67e-06 |
| 8,100 | 0.8852 | 0.4927 | 2.55e-06 |
| 8,150 | 0.8907 | 0.4825 | 2.43e-06 |
| 8,200 | 0.8962 | 0.4897 | 2.31e-06 |
| 8,250 | 0.9016 | 0.5013 | 2.19e-06 |
| 8,300 | 0.9071 | 0.5026 | 2.07e-06 |
| 8,350 | 0.9126 | 0.4972 | 1.95e-06 |
| 8,400 | 0.9180 | 0.4993 | 1.82e-06 |
| 8,450 | 0.9235 | 0.4819 | 1.70e-06 |
| 8,500 | 0.9290 | 0.4948 | 1.58e-06 |
| 8,550 | 0.9344 | 0.5116 | 1.46e-06 |
| 8,600 | 0.9399 | 0.4863 | 1.34e-06 |
| 8,650 | 0.9454 | 0.4797 | 1.22e-06 |
| 8,700 | 0.9508 | 0.4883 | 1.10e-06 |
| 8,750 | 0.9563 | 0.4964 | 9.74e-07 |
| 8,800 | 0.9617 | 0.4955 | 8.52e-07 |
| 8,850 | 0.9672 | 0.4796 | 7.31e-07 |
| 8,900 | 0.9727 | 0.4875 | 6.10e-07 |
| 8,950 | 0.9781 | 0.4928 | 4.88e-07 |
| 9,000 | 0.9836 | 0.4907 | 3.67e-07 |
Usage
Loading the Model
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b")
```
Computing Embeddings
```python
from sentence_transformers.util import cos_sim

# Natural-language queries (e.g., commit messages)
queries = [
    "fix null pointer exception in user authentication",
    "add retry logic to API client",
]

# Candidate code documents
code_docs = [
    "def authenticate(user):\n    if user is None:\n        raise ValueError...",
    "class APIClient:\n    def request(self, url, retries=3):\n        ...",
]

query_embeddings = model.encode(queries)
code_embeddings = model.encode(code_docs)

# Pairwise cosine similarities: rows = queries, columns = code documents
similarities = cos_sim(query_embeddings, code_embeddings)
print(similarities)
```
Intended Use
- Primary use case: Retrieving relevant code files/functions given a natural-language query (commit message, bug description, feature request)
- Search pipeline: Encode a corpus of code documents offline, then at query time encode the query and find nearest neighbors via cosine similarity (see the sketch below)
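A sketch of that pipeline using `util.semantic_search`; the corpus strings are placeholders:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b")

# Offline: embed the code corpus once and cache the embeddings
corpus = [
    "def authenticate(user):\n    ...",
    "class APIClient:\n    def request(self, url, retries=3):\n        ...",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Query time: embed the query and rank the corpus by cosine similarity
query_embedding = model.encode("add retry logic to API client", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)

for hit in hits[0]:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']][:50]!r}")
```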
Limitations
- This checkpoint was taken at 98.4% of the training run (step 9,000 of 9,150). The loss curve was still decreasing slowly, so the final checkpoint may perform slightly better.
- Trained on a specific code retrieval dataset; may not generalize to all programming languages or query styles without further fine-tuning.
- Max context is 1024 tokens -- very long files are truncated.
Citation
If you use this model, please cite the base model:
```bibtex
@article{qwen3embedding,
  title={Qwen3-Embedding},
  author={Qwen Team},
  year={2025}
}
```