Paper: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
This is a sentence-transformers model finetuned from intfloat/multilingual-e5-small. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Full model architecture:

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
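For reference, these three modules can be reproduced by hand with plain transformers: run the BERT encoder, mean-pool the token embeddings under the attention mask, then L2-normalize. The sketch below makes that pipeline explicit; the input sentence is a placeholder, and loading the encoder weights straight from this repository with AutoModel is an assumption (sentence-transformers repositories normally store them at the root).

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

# Assumption: the underlying BERT weights load directly from this repo.
tokenizer = AutoTokenizer.from_pretrained("agentlans/multilingual-e5-small-aligned")
encoder = AutoModel.from_pretrained("agentlans/multilingual-e5-small-aligned")

batch = tokenizer(["A placeholder sentence."], padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state  # (batch, seq_len, 384)

# (1) Pooling: mean over real tokens only, masking out padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# (2) Normalize: unit-length vectors, so dot product equals cosine similarity.
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 384])
```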
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("agentlans/multilingual-e5-small-aligned")

# Run inference
sentences = [
    'What do they think it is that prevents the products of human ingenuity from being themselves, fruits of the tree of life, and hence, in some sense, obeying evolutionary rules?',
    'Կարծում եք ի՞նչն է խանգարում, որ մարդկային հնարամտության արդյունքները իրենք էլ լինեն կյանքի ծառի պտուղներ և այդպիսով ինչ-որ իմաստով ենթարկվեն էվոլուցիայի կանոններին:',
    '(Smiech) No dobre, idem do Ameriky.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
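The same embeddings cover the semantic-search use case mentioned above: embed a corpus once, then rank it against a query embedding. Below is a minimal sketch using sentence_transformers.util; the corpus and query are invented for illustration:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("agentlans/multilingual-e5-small-aligned")

# Toy multilingual corpus, invented for this example.
corpus = [
    "The cat sits on the mat.",
    "Die Katze sitzt auf der Matte.",  # German: "The cat sits on the mat."
    "Stock prices fell sharply today.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode("A cat resting on a rug", convert_to_tensor=True)

# Since the embeddings are unit-normalized, cosine similarity ranks matches.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))
```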
Columns: sentence_0 and sentence_1

|         | sentence_0 | sentence_1 |
|---------|------------|------------|
| type    | string     | string     |
| details |            |            |

Samples:
| sentence_0 | sentence_1 |
|---|---|
| I like English best of all subjects. | Tykkään englannista eniten kaikista aineista. |
| We shall offer negotiations. Quite right. | - Oferecer-nos-emos para negociar. |
| It was soon learned that Zelaya had been taken to Costa Rica, where he continued to call himself as the legal head of state. | Al snel werd bekend dat Zelaya naar Costa Rica was overgebracht, waar hij zich nog steeds het officiële staatshoofd noemde. |
Loss: MultipleNegativesRankingLoss with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
```
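With these parameters, the loss amounts to in-batch negative sampling: each sentence_0 should match its own sentence_1, and every other sentence_1 in the batch acts as a negative. As a rough sketch (not the library's implementation), the computation reduces to a cross-entropy over cosine similarities scaled by 20.0:

```python
import torch
import torch.nn.functional as F

def mnr_loss(anchors: torch.Tensor, positives: torch.Tensor,
             scale: float = 20.0) -> torch.Tensor:
    """Sketch of MultipleNegativesRankingLoss with cos_sim and scale=20.0.

    anchors, positives: (batch, dim) embeddings of parallel sentence pairs;
    the positives of other pairs in the batch serve as in-batch negatives.
    """
    # (batch, batch) matrix of scaled pairwise cosine similarities.
    scores = F.cosine_similarity(anchors.unsqueeze(1),
                                 positives.unsqueeze(0), dim=-1) * scale
    # The i-th anchor should rank the i-th positive highest.
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)
```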
Non-default hyperparameters:

- num_train_epochs: 1
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

Training logs:

| Epoch | Step | Training Loss |
|---|---|---|
| 0.0046 | 500 | 0.0378 |
| 0.0092 | 1000 | 0.0047 |
| 0.0138 | 1500 | 0.006 |
| 0.0185 | 2000 | 0.0045 |
| 0.0231 | 2500 | 0.0027 |
| 0.0277 | 3000 | 0.005 |
| 0.0323 | 3500 | 0.0045 |
| 0.0369 | 4000 | 0.005 |
| 0.0415 | 4500 | 0.0066 |
| 0.0461 | 5000 | 0.0029 |
| 0.0507 | 5500 | 0.0041 |
| 0.0554 | 6000 | 0.0064 |
| 0.0600 | 6500 | 0.0044 |
| 0.0646 | 7000 | 0.0039 |
| 0.0692 | 7500 | 0.0025 |
| 0.0738 | 8000 | 0.0026 |
| 0.0784 | 8500 | 0.0036 |
| 0.0830 | 9000 | 0.0027 |
| 0.0877 | 9500 | 0.0015 |
| 0.0923 | 10000 | 0.003 |
| 0.0969 | 10500 | 0.0013 |
| 0.1015 | 11000 | 0.002 |
| 0.1061 | 11500 | 0.0038 |
| 0.1107 | 12000 | 0.0017 |
| 0.1153 | 12500 | 0.0029 |
| 0.1199 | 13000 | 0.0032 |
| 0.1246 | 13500 | 0.0036 |
| 0.1292 | 14000 | 0.004 |
| 0.1338 | 14500 | 0.0036 |
| 0.1384 | 15000 | 0.0025 |
| 0.1430 | 15500 | 0.0022 |
| 0.1476 | 16000 | 0.0017 |
| 0.1522 | 16500 | 0.0019 |
| 0.1569 | 17000 | 0.0022 |
| 0.1615 | 17500 | 0.0028 |
| 0.1661 | 18000 | 0.0033 |
| 0.1707 | 18500 | 0.0025 |
| 0.1753 | 19000 | 0.0014 |
| 0.1799 | 19500 | 0.0033 |
| 0.1845 | 20000 | 0.0023 |
| 0.1891 | 20500 | 0.0023 |
| 0.1938 | 21000 | 0.0009 |
| 0.1984 | 21500 | 0.0043 |
| 0.2030 | 22000 | 0.0021 |
| 0.2076 | 22500 | 0.0025 |
| 0.2122 | 23000 | 0.0017 |
| 0.2168 | 23500 | 0.0024 |
| 0.2214 | 24000 | 0.0021 |
| 0.2261 | 24500 | 0.0023 |
| 0.2307 | 25000 | 0.0014 |
| 0.2353 | 25500 | 0.0027 |
| 0.2399 | 26000 | 0.0025 |
| 0.2445 | 26500 | 0.0022 |
| 0.2491 | 27000 | 0.0022 |
| 0.2537 | 27500 | 0.0024 |
| 0.2583 | 28000 | 0.0035 |
| 0.2630 | 28500 | 0.0032 |
| 0.2676 | 29000 | 0.0048 |
| 0.2722 | 29500 | 0.0008 |
| 0.2768 | 30000 | 0.0027 |
| 0.2814 | 30500 | 0.004 |
| 0.2860 | 31000 | 0.0013 |
| 0.2906 | 31500 | 0.002 |
| 0.2953 | 32000 | 0.0016 |
| 0.2999 | 32500 | 0.0027 |
| 0.3045 | 33000 | 0.0014 |
| 0.3091 | 33500 | 0.0022 |
| 0.3137 | 34000 | 0.0017 |
| 0.3183 | 34500 | 0.0022 |
| 0.3229 | 35000 | 0.0026 |
| 0.3275 | 35500 | 0.003 |
| 0.3322 | 36000 | 0.0022 |
| 0.3368 | 36500 | 0.0022 |
| 0.3414 | 37000 | 0.0018 |
| 0.3460 | 37500 | 0.0028 |
| 0.3506 | 38000 | 0.0018 |
| 0.3552 | 38500 | 0.0037 |
| 0.3598 | 39000 | 0.003 |
| 0.3645 | 39500 | 0.002 |
| 0.3691 | 40000 | 0.001 |
| 0.3737 | 40500 | 0.0015 |
| 0.3783 | 41000 | 0.0023 |
| 0.3829 | 41500 | 0.0017 |
| 0.3875 | 42000 | 0.0034 |
| 0.3921 | 42500 | 0.0016 |
| 0.3967 | 43000 | 0.0019 |
| 0.4014 | 43500 | 0.0015 |
| 0.4060 | 44000 | 0.0026 |
| 0.4106 | 44500 | 0.0012 |
| 0.4152 | 45000 | 0.0014 |
| 0.4198 | 45500 | 0.0027 |
| 0.4244 | 46000 | 0.0016 |
| 0.4290 | 46500 | 0.0027 |
| 0.4337 | 47000 | 0.0033 |
| 0.4383 | 47500 | 0.0023 |
| 0.4429 | 48000 | 0.0024 |
| 0.4475 | 48500 | 0.0019 |
| 0.4521 | 49000 | 0.0017 |
| 0.4567 | 49500 | 0.004 |
| 0.4613 | 50000 | 0.0036 |
| 0.4659 | 50500 | 0.001 |
| 0.4706 | 51000 | 0.0016 |
| 0.4752 | 51500 | 0.0024 |
| 0.4798 | 52000 | 0.0009 |
| 0.4844 | 52500 | 0.0011 |
| 0.4890 | 53000 | 0.0018 |
| 0.4936 | 53500 | 0.0012 |
| 0.4982 | 54000 | 0.0012 |
| 0.5029 | 54500 | 0.0014 |
| 0.5075 | 55000 | 0.0025 |
| 0.5121 | 55500 | 0.0016 |
| 0.5167 | 56000 | 0.0015 |
| 0.5213 | 56500 | 0.002 |
| 0.5259 | 57000 | 0.0008 |
| 0.5305 | 57500 | 0.0017 |
| 0.5351 | 58000 | 0.0015 |
| 0.5398 | 58500 | 0.0009 |
| 0.5444 | 59000 | 0.0019 |
| 0.5490 | 59500 | 0.0014 |
| 0.5536 | 60000 | 0.0028 |
| 0.5582 | 60500 | 0.0014 |
| 0.5628 | 61000 | 0.0032 |
| 0.5674 | 61500 | 0.0013 |
| 0.5721 | 62000 | 0.002 |
| 0.5767 | 62500 | 0.0018 |
| 0.5813 | 63000 | 0.0015 |
| 0.5859 | 63500 | 0.0008 |
| 0.5905 | 64000 | 0.0021 |
| 0.5951 | 64500 | 0.0008 |
| 0.5997 | 65000 | 0.002 |
| 0.6043 | 65500 | 0.0023 |
| 0.6090 | 66000 | 0.0022 |
| 0.6136 | 66500 | 0.0013 |
| 0.6182 | 67000 | 0.0011 |
| 0.6228 | 67500 | 0.0014 |
| 0.6274 | 68000 | 0.0027 |
| 0.6320 | 68500 | 0.002 |
| 0.6366 | 69000 | 0.0013 |
| 0.6413 | 69500 | 0.0026 |
| 0.6459 | 70000 | 0.0014 |
| 0.6505 | 70500 | 0.0017 |
| 0.6551 | 71000 | 0.0023 |
| 0.6597 | 71500 | 0.0025 |
| 0.6643 | 72000 | 0.0013 |
| 0.6689 | 72500 | 0.0008 |
| 0.6735 | 73000 | 0.0017 |
| 0.6782 | 73500 | 0.0022 |
| 0.6828 | 74000 | 0.0021 |
| 0.6874 | 74500 | 0.0008 |
| 0.6920 | 75000 | 0.0007 |
| 0.6966 | 75500 | 0.0038 |
| 0.7012 | 76000 | 0.0011 |
| 0.7058 | 76500 | 0.0016 |
| 0.7105 | 77000 | 0.0013 |
| 0.7151 | 77500 | 0.0042 |
| 0.7197 | 78000 | 0.0009 |
| 0.7243 | 78500 | 0.0004 |
| 0.7289 | 79000 | 0.0006 |
| 0.7335 | 79500 | 0.0007 |
| 0.7381 | 80000 | 0.0014 |
| 0.7428 | 80500 | 0.002 |
| 0.7474 | 81000 | 0.0017 |
| 0.7520 | 81500 | 0.0014 |
| 0.7566 | 82000 | 0.0015 |
| 0.7612 | 82500 | 0.0013 |
| 0.7658 | 83000 | 0.001 |
| 0.7704 | 83500 | 0.0019 |
| 0.7750 | 84000 | 0.0009 |
| 0.7797 | 84500 | 0.0021 |
| 0.7843 | 85000 | 0.0015 |
| 0.7889 | 85500 | 0.001 |
| 0.7935 | 86000 | 0.0008 |
| 0.7981 | 86500 | 0.0039 |
| 0.8027 | 87000 | 0.0018 |
| 0.8073 | 87500 | 0.0009 |
| 0.8120 | 88000 | 0.0018 |
| 0.8166 | 88500 | 0.0008 |
| 0.8212 | 89000 | 0.0007 |
| 0.8258 | 89500 | 0.0009 |
| 0.8304 | 90000 | 0.002 |
| 0.8350 | 90500 | 0.001 |
| 0.8396 | 91000 | 0.0007 |
| 0.8442 | 91500 | 0.0008 |
| 0.8489 | 92000 | 0.0021 |
| 0.8535 | 92500 | 0.0013 |
| 0.8581 | 93000 | 0.0009 |
| 0.8627 | 93500 | 0.002 |
| 0.8673 | 94000 | 0.0012 |
| 0.8719 | 94500 | 0.0034 |
| 0.8765 | 95000 | 0.0027 |
| 0.8812 | 95500 | 0.0006 |
| 0.8858 | 96000 | 0.002 |
| 0.8904 | 96500 | 0.0005 |
| 0.8950 | 97000 | 0.0009 |
| 0.8996 | 97500 | 0.0007 |
| 0.9042 | 98000 | 0.0015 |
| 0.9088 | 98500 | 0.0006 |
| 0.9134 | 99000 | 0.0004 |
| 0.9181 | 99500 | 0.0006 |
| 0.9227 | 100000 | 0.0031 |
| 0.9273 | 100500 | 0.0013 |
| 0.9319 | 101000 | 0.0024 |
| 0.9365 | 101500 | 0.0006 |
| 0.9411 | 102000 | 0.0017 |
| 0.9457 | 102500 | 0.0007 |
| 0.9504 | 103000 | 0.0012 |
| 0.9550 | 103500 | 0.0011 |
| 0.9596 | 104000 | 0.0007 |
| 0.9642 | 104500 | 0.0004 |
| 0.9688 | 105000 | 0.0021 |
| 0.9734 | 105500 | 0.0027 |
| 0.9780 | 106000 | 0.0016 |
| 0.9826 | 106500 | 0.0022 |
| 0.9873 | 107000 | 0.0017 |
| 0.9919 | 107500 | 0.0009 |
| 0.9965 | 108000 | 0.0008 |
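Putting the pieces together, the documented setup (the intfloat/multilingual-e5-small base model, MultipleNegativesRankingLoss, and the non-default hyperparameters above) could be reproduced roughly as follows. This is a hedged sketch: the two training pairs are placeholders copied from the sample table, not the actual dataset, and the output directory name is hypothetical.

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("intfloat/multilingual-e5-small")

# Placeholder parallel pairs; the real run used a far larger dataset.
train_dataset = Dataset.from_dict({
    "sentence_0": ["I like English best of all subjects.",
                   "We shall offer negotiations. Quite right."],
    "sentence_1": ["Tykkään englannista eniten kaikista aineista.",
                   "- Oferecer-nos-emos para negociar."],
})

args = SentenceTransformerTrainingArguments(
    output_dir="multilingual-e5-small-aligned",  # hypothetical local path
    num_train_epochs=1,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MultipleNegativesRankingLoss(model),  # scale=20.0, cos_sim by default
)
trainer.train()
```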
BibTeX for Sentence Transformers:

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

BibTeX for MultipleNegativesRankingLoss:

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```