ide-code-retrieval-qwen3-0.6b

A SentenceTransformer model fine-tuned from Qwen/Qwen3-Embedding-0.6B for IDE code retrieval -- mapping natural-language commit queries to relevant source code documents via dense vector similarity.

Note: This is an intermediate checkpoint at step 9,000 of 9,150 (98.4% of the planned 3 epochs). Training loss was still decreasing at this point, so a later checkpoint may perform better.

Model Description

This model encodes both short natural-language queries (commit messages, search queries) and longer code documents into a shared embedding space. Retrieval is performed by computing cosine similarity between the query embedding and candidate code embeddings.

  • Base model: Qwen/Qwen3-Embedding-0.6B (0.6B parameters)
  • Max sequence length: 1024 tokens
  • Output dimensionality: 1024 (normalized)
  • Similarity function: Cosine similarity

Training Details

Dataset

  • Source: aysinghal/code-retrieval-training-dataset
  • Total pairs: 2,465,694
  • Train split: 2,342,409 pairs (95%)
  • Eval split: 123,285 pairs (5%)
  • Text strategy: truncate (max 4096 chars)
  • Negatives: Explicit hard negatives from the dataset
  • Pre-tokenized: Yes (token IDs stored on disk for zero-overhead data loading)

Loss Function

MultipleNegativesRankingLoss (InfoNCE) with explicit hard negatives. Each training example consists of an anchor (query), a positive (relevant code), and a hard negative (similar but irrelevant code). In-batch negatives provide additional contrast.
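The objective can be sketched in plain numpy. The similarity scale of 20 below mirrors the sentence-transformers default for MultipleNegativesRankingLoss; the card does not state the scale actually used, so treat it as an assumption:

```python
import numpy as np

def info_nce_loss(anchors, positives, hard_negatives, scale=20.0):
    """InfoNCE with in-batch negatives plus one explicit hard negative each.

    All inputs are (B, D) L2-normalized embeddings. For anchor i, the
    candidate set is all B positives (the other rows act as in-batch
    negatives) plus all B hard negatives; only positives[i] is correct.
    """
    candidates = np.vstack([positives, hard_negatives])   # (2B, D)
    scores = scale * anchors @ candidates.T               # (B, 2B) cosine scores
    log_denominator = np.log(np.exp(scores).sum(axis=1))  # softmax normalizer
    correct = scores[np.arange(len(anchors)), np.arange(len(anchors))]
    return float(np.mean(log_denominator - correct))      # cross-entropy
```

When an anchor matches its positive exactly and all other candidates are orthogonal, the loss approaches zero; when the anchor is equally similar to everything, it approaches log(2B).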

Hyperparameters

Parameter Value
Base model Qwen/Qwen3-Embedding-0.6B
Learning rate 2e-05
LR schedule Linear with warmup
Warmup ratio 0.1
Epochs 3
Effective batch size 256
Per-GPU batch size 64
Gradient accumulation 2
Max sequence length 1024 tokens
Precision BFloat16
Gradient checkpointing True
torch.compile Enabled (max-autotune)
Seed 42
Eval strategy Every 915 steps
Early stopping patience 3
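These settings map roughly onto sentence-transformers training arguments. This is a hedged sketch for orientation, not the actual training script (which is not published with this card):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="ide-code-retrieval-qwen3-0.6b",
    num_train_epochs=3,
    per_device_train_batch_size=64,   # x 2 GPUs x 2 accumulation = 256 effective
    gradient_accumulation_steps=2,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    lr_scheduler_type="linear",
    bf16=True,
    gradient_checkpointing=True,
    eval_strategy="steps",
    eval_steps=915,
    seed=42,
)
```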

Hardware

  • GPUs: 2x NVIDIA L40S
  • Total training steps: 9,150 (3 epochs)

Training Progress (at checkpoint step 9,000)

  • Training loss: 2.8684 (step 50) → 0.4907 (step 9,000)
  • Best eval loss: 0.1070 (step 7,320)
  • Progress: 9,000 / 9,150 steps (98.4%)

Evaluation Results

Step Epoch Eval Loss
915 0.10 0.4585
1,830 0.20 0.2753
2,745 0.30 0.1975
3,660 0.40 0.1387
4,575 0.50 0.1226
5,490 0.60 0.1146
6,405 0.70 0.1108
7,320 0.80 0.1070
8,235 0.90 0.1072
Full training loss history
Step Epoch Loss Learning Rate
50 0.0055 2.8684 3.57e-07
100 0.0109 2.8360 7.21e-07
150 0.0164 2.8020 1.09e-06
200 0.0219 2.6263 1.45e-06
250 0.0273 2.3496 1.81e-06
300 0.0328 2.0695 2.18e-06
350 0.0383 1.8826 2.54e-06
400 0.0437 1.7977 2.91e-06
450 0.0492 1.7163 3.27e-06
500 0.0546 1.6614 3.64e-06
550 0.0601 1.6364 4.00e-06
600 0.0656 1.5695 4.36e-06
650 0.0710 1.5363 4.73e-06
700 0.0765 1.4884 5.09e-06
750 0.0820 1.4712 5.46e-06
800 0.0874 1.4452 5.82e-06
850 0.0929 1.4163 6.19e-06
900 0.0984 1.3578 6.55e-06
950 0.1038 1.3438 6.91e-06
1,000 0.1093 1.3334 7.28e-06
1,050 0.1148 1.3241 7.64e-06
1,100 0.1202 1.2751 8.01e-06
1,150 0.1257 1.2748 8.37e-06
1,200 0.1311 1.2224 8.74e-06
1,250 0.1366 1.2214 9.10e-06
1,300 0.1421 1.1783 9.46e-06
1,350 0.1475 1.1692 9.83e-06
1,400 0.1530 1.1478 1.02e-05
1,450 0.1585 1.1216 1.06e-05
1,500 0.1639 1.1142 1.09e-05
1,550 0.1694 1.0802 1.13e-05
1,600 0.1749 1.0743 1.17e-05
1,650 0.1803 1.0295 1.20e-05
1,700 0.1858 1.0274 1.24e-05
1,750 0.1913 0.9875 1.27e-05
1,800 0.1967 0.9791 1.31e-05
1,850 0.2022 0.9815 1.35e-05
1,900 0.2077 0.9656 1.38e-05
1,950 0.2131 0.9283 1.42e-05
2,000 0.2186 0.9117 1.46e-05
2,050 0.2240 0.9123 1.49e-05
2,100 0.2295 0.8986 1.53e-05
2,150 0.2350 0.8732 1.57e-05
2,200 0.2404 0.8718 1.60e-05
2,250 0.2459 0.8469 1.64e-05
2,300 0.2514 0.8374 1.68e-05
2,350 0.2568 0.8361 1.71e-05
2,400 0.2623 0.8316 1.75e-05
2,450 0.2678 0.8188 1.78e-05
2,500 0.2732 0.7977 1.82e-05
2,550 0.2787 0.8070 1.86e-05
2,600 0.2842 0.7797 1.89e-05
2,650 0.2896 0.7719 1.93e-05
2,700 0.2951 0.7671 1.97e-05
2,750 0.3005 0.7576 2.00e-05
2,800 0.3060 0.7494 2.00e-05
2,850 0.3115 0.7556 1.99e-05
2,900 0.3169 0.7157 1.99e-05
2,950 0.3224 0.7220 1.98e-05
3,000 0.3279 0.7031 1.98e-05
3,050 0.3333 0.7055 1.98e-05
3,100 0.3388 0.6866 1.97e-05
3,150 0.3443 0.6902 1.97e-05
3,200 0.3497 0.6629 1.96e-05
3,250 0.3552 0.6732 1.96e-05
3,300 0.3607 0.6521 1.96e-05
3,350 0.3661 0.6610 1.95e-05
3,400 0.3716 0.6449 1.95e-05
3,450 0.3770 0.6513 1.94e-05
3,500 0.3825 0.6369 1.94e-05
3,550 0.3880 0.6217 1.36e-05
3,600 0.3934 0.6107 1.35e-05
3,650 0.3989 0.6007 1.34e-05
3,700 0.4044 0.5762 1.32e-05
3,750 0.4098 0.5963 1.31e-05
3,800 0.4153 0.5910 1.30e-05
3,850 0.4208 0.5824 1.29e-05
3,900 0.4262 0.5863 1.28e-05
3,950 0.4317 0.5747 1.26e-05
4,000 0.4372 0.5840 1.25e-05
4,050 0.4426 0.5748 1.24e-05
4,100 0.4481 0.5587 1.23e-05
4,150 0.4536 0.5717 1.21e-05
4,200 0.4590 0.5375 1.20e-05
4,250 0.4645 0.5635 1.19e-05
4,300 0.4699 0.5629 1.18e-05
4,350 0.4754 0.5532 1.17e-05
4,400 0.4809 0.5480 1.15e-05
4,450 0.4863 0.5486 1.14e-05
4,500 0.4918 0.5468 1.13e-05
4,550 0.4973 0.5431 1.12e-05
4,600 0.5027 0.5455 1.11e-05
4,650 0.5082 0.5425 1.09e-05
4,700 0.5137 0.5493 1.08e-05
4,750 0.5191 0.5483 1.07e-05
4,800 0.5246 0.5237 1.06e-05
4,850 0.5301 0.5295 1.04e-05
4,900 0.5355 0.5371 1.03e-05
4,950 0.5410 0.5308 1.02e-05
5,000 0.5464 0.5392 1.01e-05
5,050 0.5519 0.5292 9.96e-06
5,100 0.5574 0.5300 9.84e-06
5,150 0.5628 0.5303 9.72e-06
5,200 0.5683 0.5096 9.60e-06
5,250 0.5738 0.5086 9.47e-06
5,300 0.5792 0.5150 9.35e-06
5,350 0.5847 0.5186 9.23e-06
5,400 0.5902 0.5129 9.11e-06
5,450 0.5956 0.5251 8.99e-06
5,500 0.6011 0.5167 8.87e-06
5,550 0.6066 0.5118 8.75e-06
5,600 0.6120 0.5036 8.62e-06
5,650 0.6175 0.5167 8.50e-06
5,700 0.6230 0.5212 8.38e-06
5,750 0.6284 0.5063 8.26e-06
5,800 0.6339 0.5089 8.14e-06
5,850 0.6393 0.5056 8.02e-06
5,900 0.6448 0.5052 7.90e-06
5,950 0.6503 0.5163 7.77e-06
6,000 0.6557 0.5154 7.65e-06
6,050 0.6612 0.4991 7.53e-06
6,100 0.6667 0.4972 7.41e-06
6,150 0.6721 0.5147 7.29e-06
6,200 0.6776 0.5022 7.17e-06
6,250 0.6831 0.5173 7.05e-06
6,300 0.6885 0.5076 6.92e-06
6,350 0.6940 0.4984 6.80e-06
6,400 0.6995 0.5018 6.68e-06
6,450 0.7049 0.5076 6.56e-06
6,500 0.7104 0.5092 6.44e-06
6,550 0.7158 0.4867 6.32e-06
6,600 0.7213 0.5019 6.20e-06
6,650 0.7268 0.5179 6.07e-06
6,700 0.7322 0.4979 5.95e-06
6,750 0.7377 0.5018 5.83e-06
6,800 0.7432 0.4907 5.71e-06
6,850 0.7486 0.5104 5.59e-06
6,900 0.7541 0.4884 5.47e-06
6,950 0.7596 0.5070 5.35e-06
7,000 0.7650 0.5014 5.22e-06
7,050 0.7705 0.5011 5.10e-06
7,100 0.7760 0.5027 4.98e-06
7,150 0.7814 0.5064 4.86e-06
7,200 0.7869 0.4900 4.74e-06
7,250 0.7923 0.4901 4.62e-06
7,300 0.7978 0.4983 4.50e-06
7,350 0.8033 0.4901 4.37e-06
7,400 0.8087 0.4965 4.25e-06
7,450 0.8142 0.4934 4.13e-06
7,500 0.8197 0.4970 4.01e-06
7,550 0.8251 0.4779 3.89e-06
7,600 0.8306 0.4889 3.77e-06
7,650 0.8361 0.4954 3.65e-06
7,700 0.8415 0.5173 3.52e-06
7,750 0.8470 0.4986 3.40e-06
7,800 0.8525 0.4947 3.28e-06
7,850 0.8579 0.4931 3.16e-06
7,900 0.8634 0.4834 3.04e-06
7,950 0.8689 0.4964 2.92e-06
8,000 0.8743 0.4890 2.80e-06
8,050 0.8798 0.4932 2.67e-06
8,100 0.8852 0.4927 2.55e-06
8,150 0.8907 0.4825 2.43e-06
8,200 0.8962 0.4897 2.31e-06
8,250 0.9016 0.5013 2.19e-06
8,300 0.9071 0.5026 2.07e-06
8,350 0.9126 0.4972 1.95e-06
8,400 0.9180 0.4993 1.82e-06
8,450 0.9235 0.4819 1.70e-06
8,500 0.9290 0.4948 1.58e-06
8,550 0.9344 0.5116 1.46e-06
8,600 0.9399 0.4863 1.34e-06
8,650 0.9454 0.4797 1.22e-06
8,700 0.9508 0.4883 1.10e-06
8,750 0.9563 0.4964 9.74e-07
8,800 0.9617 0.4955 8.52e-07
8,850 0.9672 0.4796 7.31e-07
8,900 0.9727 0.4875 6.10e-07
8,950 0.9781 0.4928 4.88e-07
9,000 0.9836 0.4907 3.67e-07

Usage

Loading the Model

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("aysinghal/ide-code-retrieval-qwen3-0.6b")

Computing Embeddings

queries = [
    "fix null pointer exception in user authentication",
    "add retry logic to API client",
]
code_docs = [
    "def authenticate(user):\n    if user is None:\n        raise ValueError...",
    "class APIClient:\n    def request(self, url, retries=3):\n        ...",
]

query_embeddings = model.encode(queries)
code_embeddings = model.encode(code_docs)

# Compute cosine similarities
from sentence_transformers.util import cos_sim
similarities = cos_sim(query_embeddings, code_embeddings)
print(similarities)

Intended Use

  • Primary use case: Retrieving relevant code files/functions given a natural-language query (commit message, bug description, feature request)
  • Search pipeline: Encode a corpus of code documents offline, then at query time encode the query and find nearest neighbors via cosine similarity
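The offline-index step above can be sketched with plain numpy over precomputed embeddings. The toy vectors below stand in for model.encode output, which is L2-normalized for this model, so a dot product equals cosine similarity:

```python
import numpy as np

def top_k_cosine(query_emb, corpus_emb, k=5):
    """Return (indices, scores) of the k corpus rows most similar to the query.

    Assumes both inputs are L2-normalized, so dot product == cosine similarity.
    """
    scores = corpus_emb @ query_emb        # (N,)
    idx = np.argsort(-scores)[:k]
    return idx, scores[idx]

# Toy corpus of 4 unit vectors; in practice these come from
# model.encode(code_docs), computed once offline and stored.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(4, 8))
corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

# A query that is a slightly perturbed copy of document 2.
query = corpus[2] + 0.05 * rng.normal(size=8)
query /= np.linalg.norm(query)

idx, scores = top_k_cosine(query, corpus, k=2)
```

For large corpora the argsort would be replaced by an approximate-nearest-neighbor index, but the scoring is the same dot product.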

Limitations

  • This is a near-final checkpoint (step 9,000 of 9,150, 98.4% through training). Training loss was still decreasing, but eval loss had largely plateaued (best 0.1070 at step 7,320), so a later checkpoint may perform only marginally better.
  • Trained on a specific code retrieval dataset; may not generalize to all programming languages or query styles without further fine-tuning.
  • Max context is 1024 tokens -- very long files are truncated.
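For files beyond the context window, one common workaround is to split documents into overlapping chunks before encoding and index each chunk separately. The character-based splitter below is an illustrative assumption; token-aware splitting with the model's tokenizer would be more precise:

```python
def chunk_text(text, max_chars=4096, overlap=256):
    """Split a long document into overlapping character windows."""
    if len(text) <= max_chars:
        return [text]
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break
        start += max_chars - overlap   # step forward, keeping some overlap
    return chunks
```

At query time, a document's score is typically taken as the maximum similarity over its chunks.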

Citation

If you use this model, please cite the base model:

@misc{qwen3embedding,
  title={Qwen3-Embedding},
  author={Qwen Team},
  year={2025}
}