SentenceTransformer based on FacebookAI/roberta-large

This is a sentence-transformers model fine-tuned from FacebookAI/roberta-large on the all-nli dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: FacebookAI/roberta-large
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli
  • Language: en

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'RobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
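
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence embedding is the average of the token embeddings produced by the Transformer module. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy vectors are hypothetical, and the masking of padding tokens is an assumption about how mean pooling is usually done.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy example: three tokens of dimension 2, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model each token vector has 1024 dimensions, which is why the pooled output is also 1024-dimensional.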

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sobamchan/roberta-large-mrl-768-512-256-128-64")
# Run inference
sentences = [
    'A construction worker peeking out of a manhole while his coworker sits on the sidewalk smiling.',
    'A worker is looking out of a manhole.',
    'The workers are both inside the manhole.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6579, 0.3481],
#         [0.6579, 1.0000, 0.5411],
#         [0.3481, 0.5411, 1.0000]])
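
Because the model was trained with MatryoshkaLoss at dimensions 768, 512, 256, 128, and 64, embeddings can plausibly be shortened to one of those sizes before comparison. The sketch below illustrates the idea in pure Python on toy vectors: keep the leading values, re-normalize, and compare with cosine similarity (the similarity function listed for this model). The helper names are hypothetical. Sentence Transformers also offers a truncate_dim argument when loading a model; check the library documentation for your version before relying on it.

```python
import math
from typing import List, Sequence


def truncate_and_normalize(embedding: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` values and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors standing in for 1024-dimensional embeddings.
u = [3.0, 4.0, 0.5, -0.5]
v = [4.0, 3.0, -0.5, 0.5]
u2 = truncate_and_normalize(u, 2)
v2 = truncate_and_normalize(v, 2)
print(u2)  # [0.6, 0.8]
print(round(cosine_similarity(u2, v2), 4))  # 0.96
```

Smaller dimensions reduce storage and search cost, usually at some cost in accuracy; this card does not report scores for the truncated sizes.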

Evaluation

Metrics

Semantic Similarity

Metric           sts-dev  sts-test
pearson_cosine   0.7745   0.7441
spearman_cosine  0.7806   0.7525
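
The two metrics compare the model's cosine similarity scores with human similarity labels: pearson_cosine measures linear correlation, while spearman_cosine is the Pearson correlation of the ranks. The following is a small pure-Python sketch of both, with hypothetical scores and labels; it is not the evaluator used to produce the numbers above.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values: model cosine scores and human similarity labels.
predicted = [0.10, 0.35, 0.40, 0.90]
gold = [0.0, 2.0, 1.5, 5.0]
print(round(spearman(predicted, gold), 4))  # 0.8
print(round(pearson(predicted, gold), 4))
```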

Training Details

Training Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 557,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    anchor    string  7 tokens  10.38 tokens  45 tokens
    positive  string  6 tokens  12.8 tokens   39 tokens
    negative  string  6 tokens  13.4 tokens   50 tokens
  • Samples:
    • anchor: A person on a horse jumps over a broken down airplane.
      positive: A person is outdoors, on a horse.
      negative: A person is at a diner, ordering an omelette.
    • anchor: Children smiling and waving at camera
      positive: There are children present
      negative: The kids are frowning
    • anchor: A boy is jumping on skateboard in the middle of a red bridge.
      positive: The boy does a skateboarding trick.
      negative: The boy skates down the sidewalk.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
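
The parameters above describe a wrapper loss: the inner MultipleNegativesRankingLoss is evaluated on embeddings truncated to each listed dimension, and the results are combined using the listed weights. The sketch below is a pure-Python illustration under stated assumptions, not the library's code: the similarity scale of 20 is not given in this card, the candidate set is assumed to be every positive and negative in the batch, and the toy example uses 4-dimensional vectors with hypothetical dimensions [4, 2] in place of [768, 512, 256, 128, 64].

```python
import math
from typing import Sequence

Vector = Sequence[float]

SCALE = 20.0  # assumption: similarity scale, not stated in the card


def cos_sim(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ranking_loss(anchors, positives, negatives, dim: int) -> float:
    """In-batch ranking loss on embeddings truncated to `dim` values."""
    candidates = [c[:dim] for c in list(positives) + list(negatives)]
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [SCALE * cos_sim(anchor[:dim], c) for c in candidates]
        # Cross-entropy with the anchor's own positive (index i) as the target.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, negatives, dims, weights) -> float:
    """Weighted sum of the ranking loss over each truncation size."""
    return sum(
        weight * ranking_loss(anchors, positives, negatives, dim)
        for dim, weight in zip(dims, weights)
    )


# Toy batch of two (anchor, positive, negative) triplets.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.3, 0.0], [0.1, 0.8, 0.0, 0.2]]
negatives = [[-1.0, 0.2, 0.0, 0.1], [0.3, -0.9, 0.2, 0.0]]
loss = matryoshka_loss(anchors, positives, negatives, dims=[4, 2], weights=[1, 1])
print(loss)
```

With equal weights of 1, every truncation size contributes equally, which pushes the leading dimensions of the embedding to carry most of the useful signal.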
    

Evaluation Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 6,584 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    anchor    string  6 tokens  18.02 tokens  66 tokens
    positive  string  5 tokens  9.81 tokens   29 tokens
    negative  string  5 tokens  10.37 tokens  29 tokens
  • Samples:
    • anchor: Two women are embracing while holding to go packages.
      positive: Two woman are holding packages.
      negative: The men are fighting outside a deli.
    • anchor: Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.
      positive: Two kids in numbered jerseys wash their hands.
      negative: Two kids in jackets walk to school.
    • anchor: A man selling donuts to a customer during a world exhibition event held in the city of Angeles
      positive: A man selling donuts to a customer.
      negative: A woman drinks her coffee in a small cafe.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512,
            256,
            128,
            64
        ],
        "matryoshka_weights": [
            1,
            1,
            1,
            1,
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • warmup_ratio: 0.1

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
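
The hyperparameters above imply a specific step budget and learning-rate curve. The sketch below works through the arithmetic from the listed values (557,850 training samples, batch size 32, 15 epochs, learning_rate 5e-05, warmup_ratio 0.1, lr_scheduler_type linear). It assumes a single device, which is consistent with the training log, where step 500 corresponds to epoch 0.0287. Rounding the warmup length up is also an assumption, and linear_lr is a hypothetical helper rather than the trainer's scheduler.

```python
import math

TRAIN_SAMPLES = 557_850   # training set size from the card
BATCH_SIZE = 32           # per_device_train_batch_size
EPOCHS = 15               # num_train_epochs
PEAK_LR = 5e-05           # learning_rate
WARMUP_RATIO = 0.1        # warmup_ratio

# Assumption: one device and gradient_accumulation_steps of 1,
# so one optimizer step consumes exactly one batch.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)  # 17433 261495 26150
print(linear_lr(500))
```

Under these assumptions the learning rate peaks about one and a half epochs into training and then decays for the remaining epochs.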

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.5730 -
0.0287 500 10.2711 3.3325 0.8185 -
0.0574 1000 4.1395 1.8744 0.8648 -
0.0860 1500 3.2579 1.5506 0.8684 -
0.1147 2000 2.9091 1.3770 0.8712 -
0.1434 2500 2.6568 1.3085 0.8713 -
0.1721 3000 2.4983 1.2535 0.8814 -
0.2008 3500 2.3645 1.1715 0.8724 -
0.2294 4000 2.2675 1.1691 0.8724 -
0.2581 4500 2.2197 1.1681 0.8793 -
0.2868 5000 2.1473 1.1175 0.8780 -
0.3155 5500 2.0275 1.0656 0.8779 -
0.3442 6000 2.0927 1.0879 0.8791 -
0.3729 6500 2.1037 1.0664 0.8702 -
0.4015 7000 2.0268 1.0565 0.8714 -
0.4302 7500 1.9499 1.1268 0.8660 -
0.4589 8000 1.8692 1.0884 0.8712 -
0.4876 8500 1.9749 1.1014 0.8681 -
0.5163 9000 1.9716 1.0833 0.8555 -
0.5449 9500 1.9051 1.1391 0.8690 -
0.5736 10000 1.8493 1.1056 0.8708 -
0.6023 10500 2.0184 1.0989 0.8686 -
0.6310 11000 1.7868 1.1040 0.8620 -
0.6597 11500 1.8045 1.0677 0.8643 -
0.6883 12000 1.7946 1.0693 0.8624 -
0.7170 12500 1.8075 1.1100 0.8651 -
0.7457 13000 1.7923 1.1331 0.8577 -
0.7744 13500 1.7866 1.1204 0.8552 -
0.8031 14000 1.7495 1.1104 0.8542 -
0.8318 14500 1.7729 1.1599 0.8647 -
0.8604 15000 1.7413 1.0973 0.8587 -
0.8891 15500 1.7937 1.1443 0.8572 -
0.9178 16000 1.7489 1.1553 0.8566 -
0.9465 16500 1.7695 1.1249 0.8518 -
0.9752 17000 1.6964 1.1616 0.8564 -
1.0038 17500 1.7009 1.2108 0.8419 -
1.0325 18000 1.496 1.1526 0.8572 -
1.0612 18500 1.5363 1.2081 0.8393 -
1.0899 19000 1.5324 1.2091 0.8421 -
1.1186 19500 1.532 1.2015 0.8478 -
1.1472 20000 1.6217 1.1716 0.8537 -
1.1759 20500 1.6181 1.2310 0.8553 -
1.2046 21000 1.6387 1.2562 0.8513 -
1.2333 21500 1.738 1.2336 0.8432 -
1.2620 22000 1.6215 1.2923 0.8507 -
1.2907 22500 1.6923 1.2786 0.8431 -
1.3193 23000 1.6976 1.2908 0.8410 -
1.3480 23500 1.7603 1.3793 0.8438 -
1.3767 24000 1.7386 1.3162 0.8399 -
1.4054 24500 1.6719 1.3176 0.8375 -
1.4341 25000 1.6746 1.4208 0.8358 -
1.4627 25500 1.7317 1.4188 0.8323 -
1.4914 26000 1.7185 1.3929 0.8395 -
1.5201 26500 1.6974 1.4667 0.8359 -
1.5488 27000 1.7197 1.4712 0.8325 -
1.5775 27500 1.8755 1.4054 0.8395 -
1.6061 28000 1.7886 1.4135 0.8417 -
1.6348 28500 1.8037 1.4044 0.8387 -
1.6635 29000 1.7862 1.4599 0.8344 -
1.6922 29500 1.7436 1.3778 0.8413 -
1.7209 30000 1.7524 1.3536 0.8354 -
1.7496 30500 1.7209 1.4169 0.8272 -
1.7782 31000 2.1612 13.8218 0.5311 -
1.8069 31500 2.1815 1.5036 0.8357 -
1.8356 32000 1.7241 1.4602 0.8326 -
1.8643 32500 1.6982 1.4521 0.8369 -
1.8930 33000 1.7243 1.4545 0.8428 -
1.9216 33500 1.7885 1.6161 0.8385 -
1.9503 34000 1.8334 1.5186 0.8347 -
1.9790 34500 1.8216 1.4084 0.8409 -
2.0077 35000 1.6731 1.4777 0.8364 -
2.0364 35500 1.4519 1.5688 0.8307 -
2.0650 36000 1.6391 1.4630 0.8377 -
2.0937 36500 1.5565 1.5380 0.8343 -
2.1224 37000 1.5275 1.4737 0.8244 -
2.1511 37500 1.4889 1.5225 0.8269 -
2.1798 38000 1.5439 1.4909 0.8378 -
2.2085 38500 1.4539 1.4877 0.8348 -
2.2371 39000 1.4442 1.4533 0.8321 -
2.2658 39500 1.5136 1.4661 0.8285 -
2.2945 40000 1.439 1.4510 0.8277 -
2.3232 40500 1.4663 1.4299 0.8380 -
2.3519 41000 1.4443 1.4788 0.8320 -
2.3805 41500 1.5818 1.4900 0.8261 -
2.4092 42000 1.4851 1.4582 0.8354 -
2.4379 42500 1.4569 1.4838 0.8309 -
2.4666 43000 1.4194 1.5216 0.8163 -
2.4953 43500 1.4702 1.4703 0.8196 -
2.5239 44000 1.5365 1.4784 0.8214 -
2.5526 44500 1.5114 1.4578 0.8217 -
2.5813 45000 1.5356 1.4729 0.8115 -
2.6100 45500 1.4716 1.4299 0.8309 -
2.6387 46000 1.4557 1.4769 0.8227 -
2.6674 46500 1.4415 1.5078 0.8177 -
2.6960 47000 1.4528 1.4463 0.8257 -
2.7247 47500 1.4631 1.5136 0.8232 -
2.7534 48000 1.5284 1.4869 0.8206 -
2.7821 48500 1.442 1.4322 0.8273 -
2.8108 49000 1.4305 1.4383 0.8292 -
2.8394 49500 1.4571 1.4309 0.8201 -
2.8681 50000 1.4359 1.4952 0.8178 -
2.8968 50500 1.4558 1.4832 0.8233 -
2.9255 51000 1.4631 1.5744 0.8227 -
2.9542 51500 1.4368 1.4477 0.8279 -
2.9828 52000 1.4329 1.8528 0.8233 -
3.0115 52500 1.4 1.5122 0.8238 -
3.0402 53000 1.2241 1.4225 0.8280 -
3.0689 53500 1.2561 1.5065 0.8204 -
3.0976 54000 1.3201 1.6042 0.8190 -
3.1263 54500 1.311 1.5385 0.8235 -
3.1549 55000 1.2471 1.4826 0.8256 -
3.1836 55500 1.2535 1.4369 0.8259 -
3.2123 56000 1.2698 1.6402 0.8045 -
3.2410 56500 1.2355 1.4863 0.8220 -
3.2697 57000 1.2081 1.4576 0.8195 -
3.2983 57500 1.1963 1.4918 0.8119 -
3.3270 58000 1.2593 1.4623 0.8246 -
3.3557 58500 1.282 1.4623 0.8090 -
3.3844 59000 1.2635 1.5247 0.8166 -
3.4131 59500 1.2815 1.5402 0.8202 -
3.4417 60000 1.1852 1.5276 0.8279 -
3.4704 60500 1.2092 1.3838 0.8203 -
3.4991 61000 1.216 1.4860 0.8148 -
3.5278 61500 1.2038 1.5535 0.8265 -
3.5565 62000 1.2619 1.4893 0.8213 -
3.5852 62500 1.2023 1.5940 0.8192 -
3.6138 63000 1.2061 1.5166 0.8166 -
3.6425 63500 1.3908 1.5104 0.8243 -
3.6712 64000 1.2893 1.8377 0.8200 -
3.6999 64500 1.2521 1.6505 0.8215 -
3.7286 65000 1.2866 1.5029 0.8145 -
3.7572 65500 1.4913 1.5370 0.8217 -
3.7859 66000 1.3785 1.5048 0.8168 -
3.8146 66500 1.3013 1.6031 0.8086 -
3.8433 67000 1.3738 1.6297 0.8115 -
3.8720 67500 1.2946 1.5696 0.8228 -
3.9006 68000 1.3555 1.5255 0.8089 -
3.9293 68500 1.2593 1.5023 0.8200 -
3.9580 69000 1.2875 1.5658 0.8173 -
3.9867 69500 1.2582 1.4963 0.8145 -
4.0154 70000 1.1927 1.5811 0.8084 -
4.0441 70500 1.103 1.7786 0.8005 -
4.0727 71000 1.1006 1.5336 0.8172 -
4.1014 71500 1.0872 1.6159 0.8271 -
4.1301 72000 1.1406 1.5320 0.8102 -
4.1588 72500 1.1652 1.5708 0.8213 -
4.1875 73000 1.1123 1.6201 0.8140 -
4.2161 73500 1.0834 1.5985 0.8213 -
4.2448 74000 1.0813 1.5889 0.8106 -
4.2735 74500 1.0598 1.5266 0.8152 -
4.3022 75000 1.0794 1.5154 0.8234 -
4.3309 75500 1.1016 1.6363 0.8189 -
4.3595 76000 1.1203 1.5820 0.8245 -
4.3882 76500 1.1166 1.5379 0.8197 -
4.4169 77000 1.1056 1.5454 0.8109 -
4.4456 77500 1.0499 1.4709 0.8128 -
4.4743 78000 1.0752 1.5489 0.8111 -
4.5030 78500 1.1039 1.5323 0.8238 -
4.5316 79000 1.0726 1.4388 0.8175 -
4.5603 79500 1.0873 1.5391 0.8165 -
4.5890 80000 1.1028 1.4887 0.8118 -
4.6177 80500 1.122 1.4914 0.8160 -
4.6464 81000 1.0842 1.5051 0.8167 -
4.6750 81500 1.0631 1.5653 0.8132 -
4.7037 82000 1.0724 1.5228 0.8120 -
4.7324 82500 1.0515 1.5087 0.8110 -
4.7611 83000 1.0537 1.5241 0.8120 -
4.7898 83500 1.0941 1.5083 0.8153 -
4.8184 84000 1.0812 1.5091 0.8001 -
4.8471 84500 1.0707 1.4898 0.8109 -
4.8758 85000 1.0467 1.4924 0.8184 -
4.9045 85500 1.0737 1.4708 0.8133 -
4.9332 86000 1.047 1.5657 0.8165 -
4.9619 86500 1.0175 1.5067 0.8086 -
4.9905 87000 1.0771 1.4804 0.8080 -
5.0192 87500 0.9551 1.5010 0.8059 -
5.0479 88000 0.9157 1.4884 0.8077 -
5.0766 88500 0.8326 1.4966 0.8090 -
5.1053 89000 0.8485 1.5071 0.8102 -
5.1339 89500 0.8998 1.5126 0.8049 -
5.1626 90000 0.9012 1.4982 0.8144 -
5.1913 90500 0.9354 1.4888 0.8066 -
5.2200 91000 0.9198 1.5137 0.8022 -
5.2487 91500 0.9074 1.4852 0.7977 -
5.2773 92000 0.9429 1.5120 0.8027 -
5.3060 92500 0.8891 1.5665 0.8002 -
5.3347 93000 1.054 1.5305 0.7982 -
5.3634 93500 0.9508 1.4608 0.8079 -
5.3921 94000 0.9275 1.5325 0.8086 -
5.4208 94500 0.9123 1.5041 0.8056 -
5.4494 95000 0.9666 1.5362 0.8005 -
5.4781 95500 0.9468 1.4727 0.7992 -
5.5068 96000 0.9501 1.4381 0.8078 -
5.5355 96500 0.9527 1.5504 0.8014 -
5.5642 97000 0.8989 1.4986 0.8082 -
5.5928 97500 0.9034 1.5549 0.8021 -
5.6215 98000 0.8865 1.6116 0.8084 -
5.6502 98500 1.5304 1.8928 0.7981 -
5.6789 99000 0.9919 1.4798 0.8050 -
5.7076 99500 0.9651 1.5517 0.8031 -
5.7362 100000 0.9372 1.5297 0.8010 -
5.7649 100500 0.9263 1.5323 0.8049 -
5.7936 101000 0.9242 1.5694 0.8080 -
5.8223 101500 1.0484 1.4544 0.8042 -
5.8510 102000 0.9351 1.5167 0.8114 -
5.8797 102500 0.9757 1.4759 0.7895 -
5.9083 103000 1.0185 1.4510 0.8017 -
5.9370 103500 0.8798 1.5102 0.8097 -
5.9657 104000 0.9518 1.4321 0.7969 -
5.9944 104500 0.978 1.5526 0.7946 -
6.0231 105000 0.818 1.5133 0.7992 -
6.0517 105500 0.7463 1.4665 0.8081 -
6.0804 106000 0.7313 1.5566 0.8032 -
6.1091 106500 0.7363 1.5826 0.8086 -
6.1378 107000 0.7184 1.5082 0.8091 -
6.1665 107500 0.7543 1.5200 0.8047 -
6.1951 108000 0.7492 1.4958 0.8050 -
6.2238 108500 0.7868 1.5206 0.8073 -
6.2525 109000 0.7714 1.5073 0.7990 -
6.2812 109500 0.7931 1.5886 0.8007 -
6.3099 110000 0.7768 1.4684 0.7970 -
6.3386 110500 0.7972 1.4479 0.8049 -
6.3672 111000 0.7286 1.4873 0.8109 -
6.3959 111500 0.7462 1.5758 0.8115 -
6.4246 112000 0.737 1.4757 0.8024 -
6.4533 112500 0.7437 1.4728 0.7962 -
6.4820 113000 0.7644 1.4875 0.7970 -
6.5106 113500 0.7563 1.5626 0.7925 -
6.5393 114000 0.7704 1.4859 0.8042 -
6.5680 114500 0.7455 1.5227 0.8017 -
6.5967 115000 0.7916 1.5085 0.8057 -
6.6254 115500 0.7804 1.4348 0.8029 -
6.6540 116000 0.797 1.4953 0.8025 -
6.6827 116500 0.7731 1.4998 0.8045 -
6.7114 117000 0.7324 1.5095 0.7936 -
6.7401 117500 0.7371 1.5010 0.8037 -
6.7688 118000 0.7596 1.5101 0.8008 -
6.7975 118500 0.7763 1.5442 0.8030 -
6.8261 119000 0.7941 1.4985 0.7879 -
6.8548 119500 0.7408 1.5652 0.7827 -
6.8835 120000 0.7568 1.5540 0.7862 -
6.9122 120500 0.7537 1.5316 0.7979 -
6.9409 121000 0.7741 1.5125 0.7983 -
6.9695 121500 0.7369 1.5109 0.7948 -
6.9982 122000 0.7617 1.4832 0.7926 -
7.0269 122500 0.652 1.4793 0.7923 -
7.0556 123000 0.6824 1.5213 0.7927 -
7.0843 123500 0.6285 1.5025 0.7954 -
7.1129 124000 0.6691 1.5328 0.7946 -
7.1416 124500 0.6422 1.6047 0.7942 -
7.1703 125000 0.6618 1.5424 0.7916 -
7.1990 125500 0.6601 1.5324 0.7961 -
7.2277 126000 0.67 1.5564 0.7914 -
7.2564 126500 0.6223 1.5353 0.7952 -
7.2850 127000 0.6344 1.5982 0.7944 -
7.3137 127500 0.6362 1.5258 0.8059 -
7.3424 128000 0.6254 1.5656 0.7936 -
7.3711 128500 0.6672 1.5142 0.7951 -
7.3998 129000 0.6411 1.6154 0.7938 -
7.4284 129500 0.6307 1.4875 0.7995 -
7.4571 130000 0.6369 1.5370 0.7943 -
7.4858 130500 0.6474 1.5084 0.7862 -
7.5145 131000 0.6475 1.4667 0.7995 -
7.5432 131500 0.6361 1.5099 0.7919 -
7.5718 132000 0.6047 1.5205 0.7861 -
7.6005 132500 0.6435 1.4809 0.7939 -
7.6292 133000 0.6277 1.4976 0.7934 -
7.6579 133500 0.6307 1.5139 0.7998 -
7.6866 134000 0.7114 1.5235 0.7971 -
7.7153 134500 0.728 1.4953 0.7977 -
7.7439 135000 0.6374 1.4649 0.7986 -
7.7726 135500 0.6409 1.4866 0.7964 -
7.8013 136000 0.6364 1.4959 0.7905 -
7.8300 136500 0.6309 1.4888 0.7968 -
7.8587 137000 0.6181 1.5176 0.7916 -
7.8873 137500 0.6002 1.5199 0.7885 -
7.9160 138000 0.6326 1.4986 0.7915 -
7.9447 138500 0.6285 1.5224 0.7886 -
7.9734 139000 0.6019 1.4330 0.7936 -
8.0021 139500 0.6534 1.5217 0.7820 -
8.0307 140000 0.4934 1.4483 0.7987 -
8.0594 140500 0.5176 1.4290 0.7938 -
8.0881 141000 0.5203 1.4782 0.7987 -
8.1168 141500 0.5333 1.5755 0.7809 -
8.1455 142000 0.5262 1.4008 0.7894 -
8.1742 142500 0.488 1.4576 0.7896 -
8.2028 143000 0.4995 1.4221 0.7850 -
8.2315 143500 0.5438 1.4670 0.7884 -
8.2602 144000 0.5358 1.5256 0.7909 -
8.2889 144500 0.5379 1.4966 0.7977 -
8.3176 145000 0.5281 1.4566 0.7987 -
8.3462 145500 0.5059 1.4206 0.7982 -
8.3749 146000 0.4993 1.4853 0.7994 -
8.4036 146500 0.5245 1.4286 0.8003 -
8.4323 147000 0.5277 1.4129 0.7958 -
8.4610 147500 0.5263 1.4098 0.8035 -
8.4896 148000 0.5452 1.5002 0.7957 -
8.5183 148500 0.5334 1.4246 0.8046 -
8.5470 149000 0.5358 1.4566 0.7905 -
8.5757 149500 0.55 1.4229 0.7950 -
8.6044 150000 0.5362 1.4068 0.7901 -
8.6331 150500 0.5321 1.3924 0.7931 -
8.6617 151000 0.5459 1.4455 0.7896 -
8.6904 151500 0.5243 1.4604 0.7969 -
8.7191 152000 0.5067 1.4185 0.7902 -
8.7478 152500 0.5085 1.4642 0.7919 -
8.7765 153000 0.4881 1.5082 0.7899 -
8.8051 153500 0.5203 1.5120 0.7919 -
8.8338 154000 0.5077 1.5209 0.7880 -
8.8625 154500 0.5083 1.4354 0.7938 -
8.8912 155000 0.5085 1.4376 0.7867 -
8.9199 155500 0.4906 1.4027 0.7922 -
8.9485 156000 0.5449 1.5246 0.7859 -
8.9772 156500 0.551 1.4236 0.7955 -
9.0059 157000 0.4974 1.4605 0.7932 -
9.0346 157500 0.4159 1.4128 0.7868 -
9.0633 158000 0.4002 1.3823 0.7921 -
9.0920 158500 0.3814 1.4415 0.7854 -
9.1206 159000 0.3694 1.4016 0.7885 -
9.1493 159500 0.4349 1.3871 0.7975 -
9.1780 160000 0.4258 1.3926 0.7919 -
9.2067 160500 0.4262 1.4316 0.7858 -
9.2354 161000 0.428 1.4839 0.7870 -
9.2640 161500 0.4186 1.4305 0.7917 -
9.2927 162000 0.4306 1.4962 0.7936 -
9.3214 162500 0.3977 1.4436 0.7971 -
9.3501 163000 0.3994 1.4661 0.7910 -
9.3788 163500 0.4227 1.5016 0.7952 -
9.4074 164000 0.4076 1.4551 0.7983 -
9.4361 164500 0.4095 1.4699 0.7889 -
9.4648 165000 0.4054 1.4351 0.7933 -
9.4935 165500 0.455 1.4284 0.7927 -
9.5222 166000 0.42 1.4683 0.7909 -
9.5509 166500 0.4418 1.4145 0.7965 -
9.5795 167000 0.4179 1.4001 0.7947 -
9.6082 167500 0.4094 1.4471 0.7875 -
9.6369 168000 0.4176 1.4185 0.7909 -
9.6656 168500 0.3991 1.3788 0.7894 -
9.6943 169000 0.4019 1.3530 0.7942 -
9.7229 169500 0.4141 1.4194 0.7848 -
9.7516 170000 0.4037 1.4556 0.7842 -
9.7803 170500 0.4298 1.3902 0.7847 -
9.8090 171000 0.4257 1.4237 0.7879 -
9.8377 171500 0.3989 1.4484 0.7916 -
9.8663 172000 0.4164 1.4447 0.7922 -
9.8950 172500 0.4184 1.4129 0.7876 -
9.9237 173000 0.3927 1.4382 0.7961 -
9.9524 173500 0.4355 1.4626 0.7885 -
9.9811 174000 0.4317 1.4384 0.7838 -
10.0098 174500 0.3654 1.4397 0.7868 -
10.0384 175000 0.3319 1.4772 0.7933 -
10.0671 175500 0.3218 1.4553 0.7862 -
10.0958 176000 0.3553 1.4422 0.7880 -
10.1245 176500 0.3572 1.4375 0.7865 -
10.1532 177000 0.3611 1.4634 0.7915 -
10.1818 177500 0.3511 1.4557 0.7875 -
10.2105 178000 0.3475 1.4579 0.7864 -
10.2392 178500 0.3456 1.5179 0.7839 -
10.2679 179000 0.3511 1.4576 0.7844 -
10.2966 179500 0.3336 1.4965 0.7879 -
10.3252 180000 0.3647 1.4619 0.7827 -
10.3539 180500 0.3451 1.4585 0.7871 -
10.3826 181000 0.3585 1.4688 0.7847 -
10.4113 181500 0.3432 1.4332 0.7922 -
10.4400 182000 0.3789 1.4236 0.7920 -
10.4687 182500 0.3313 1.3813 0.7794 -
10.4973 183000 0.3356 1.4369 0.7881 -
10.5260 183500 0.3187 1.4230 0.7889 -
10.5547 184000 0.3255 1.4207 0.7923 -
10.5834 184500 0.3252 1.4159 0.7924 -
10.6121 185000 0.3389 1.4502 0.7879 -
10.6407 185500 0.3407 1.4985 0.7945 -
10.6694 186000 0.3349 1.4637 0.7928 -
10.6981 186500 0.3459 1.4799 0.7922 -
10.7268 187000 0.3352 1.4447 0.7911 -
10.7555 187500 0.3188 1.4034 0.7908 -
10.7841 188000 0.3354 1.4559 0.7917 -
10.8128 188500 0.3087 1.4330 0.7901 -
10.8415 189000 0.3573 1.4262 0.7884 -
10.8702 189500 0.337 1.4397 0.7861 -
10.8989 190000 0.3284 1.4719 0.7875 -
10.9276 190500 0.3452 1.4200 0.7864 -
10.9562 191000 0.3407 1.4429 0.7909 -
10.9849 191500 0.3514 1.4511 0.7935 -
11.0136 192000 0.2932 1.4430 0.7905 -
11.0423 192500 0.2593 1.4349 0.7891 -
11.0710 193000 0.294 1.4235 0.7839 -
11.0996 193500 0.2742 1.4248 0.7853 -
11.1283 194000 0.2969 1.4282 0.7837 -
11.1570 194500 0.2549 1.4321 0.7812 -
11.1857 195000 0.3029 1.4152 0.7821 -
11.2144 195500 0.2948 1.4362 0.7815 -
11.2430 196000 0.2888 1.4211 0.7850 -
11.2717 196500 0.286 1.4898 0.7857 -
11.3004 197000 0.2875 1.4354 0.7878 -
11.3291 197500 0.2876 1.4378 0.7900 -
11.3578 198000 0.3074 1.4171 0.7861 -
11.3865 198500 0.2934 1.4582 0.7856 -
11.4151 199000 0.3017 1.4243 0.7853 -
11.4438 199500 0.2987 1.4444 0.7855 -
11.4725 200000 0.2801 1.4089 0.7869 -
11.5012 200500 0.2891 1.4545 0.7839 -
11.5299 201000 0.275 1.4979 0.7790 -
11.5585 201500 0.3127 1.3802 0.7888 -
11.5872 202000 0.2953 1.4369 0.7850 -
11.6159 202500 0.284 1.4590 0.7869 -
11.6446 203000 0.259 1.4573 0.7823 -
11.6733 203500 0.2787 1.4293 0.7840 -
11.7019 204000 0.2791 1.4671 0.7814 -
11.7306 204500 0.2942 1.4423 0.7885 -
11.7593 205000 0.2788 1.4622 0.7874 -
11.7880 205500 0.2614 1.4742 0.7846 -
11.8167 206000 0.2809 1.4380 0.7799 -
11.8454 206500 0.2933 1.4385 0.7859 -
11.8740 207000 0.2623 1.4415 0.7833 -
11.9027 207500 0.2494 1.4490 0.7823 -
11.9314 208000 0.2904 1.4445 0.7827 -
11.9601 208500 0.2737 1.4070 0.7764 -
11.9888 209000 0.262 1.4465 0.7811 -
12.0174 209500 0.2418 1.4411 0.7794 -
12.0461 210000 0.2315 1.4468 0.7847 -
12.0748 210500 0.2614 1.4399 0.7867 -
12.1035 211000 0.2256 1.4226 0.7837 -
12.1322 211500 0.2487 1.4494 0.7851 -
12.1608 212000 0.2273 1.4707 0.7821 -
12.1895 212500 0.2408 1.4935 0.7838 -
12.2182 213000 0.2299 1.4478 0.7842 -
12.2469 213500 0.2207 1.4369 0.7854 -
12.2756 214000 0.2234 1.4400 0.7833 -
12.3043 214500 0.2356 1.4631 0.7851 -
12.3329 215000 0.2256 1.4520 0.7827 -
12.3616 215500 0.2349 1.4680 0.7821 -
12.3903 216000 0.2253 1.4628 0.7852 -
12.4190 216500 0.2275 1.4912 0.7851 -
12.4477 217000 0.2395 1.4483 0.7848 -
12.4763 217500 0.2419 1.4556 0.7850 -
12.5050 218000 0.2391 1.4339 0.7841 -
12.5337 218500 0.2493 1.4502 0.7834 -
12.5624 219000 0.2197 1.4511 0.7815 -
12.5911 219500 0.2255 1.4263 0.7816 -
12.6197 220000 0.2428 1.4315 0.7784 -
12.6484 220500 0.2252 1.4596 0.7812 -
12.6771 221000 0.2408 1.4551 0.7852 -
12.7058 221500 0.2491 1.4858 0.7743 -
12.7345 222000 0.2378 1.4834 0.7832 -
12.7632 222500 0.227 1.4535 0.7828 -
12.7918 223000 0.2403 1.4725 0.7811 -
12.8205 223500 0.2211 1.4726 0.7789 -
12.8492 224000 0.2296 1.4557 0.7793 -
12.8779 224500 0.2289 1.4179 0.7819 -
12.9066 225000 0.2342 1.4385 0.7797 -
12.9352 225500 0.2378 1.4202 0.7771 -
12.9639 226000 0.2146 1.4290 0.7817 -
12.9926 226500 0.2375 1.4171 0.7776 -
13.0213 227000 0.1939 1.4448 0.7786 -
13.0500 227500 0.1943 1.4572 0.7795 -
13.0786 228000 0.2 1.4622 0.7833 -
13.1073 228500 0.2084 1.4683 0.7815 -
13.1360 229000 0.1927 1.4814 0.7777 -
13.1647 229500 0.2167 1.4574 0.7806 -
13.1934 230000 0.2003 1.4806 0.7810 -
13.2221 230500 0.2175 1.4794 0.7805 -
13.2507 231000 0.2051 1.4450 0.7820 -
13.2794 231500 0.2003 1.4658 0.7833 -
13.3081 232000 0.2025 1.4491 0.7841 -
13.3368 232500 0.2132 1.4462 0.7809 -
13.3655 233000 0.2028 1.4458 0.7817 -
13.3941 233500 0.2056 1.4331 0.7814 -
13.4228 234000 0.1834 1.4571 0.7790 -
13.4515 234500 0.2007 1.4393 0.7809 -
13.4802 235000 0.1882 1.4566 0.7813 -
13.5089 235500 0.1941 1.4503 0.7807 -
13.5375 236000 0.1993 1.4622 0.7782 -
13.5662 236500 0.1994 1.4631 0.7783 -
13.5949 237000 0.206 1.4430 0.7776 -
13.6236 237500 0.1969 1.4665 0.7810 -
13.6523 238000 0.2053 1.4888 0.7754 -
13.6809 238500 0.2034 1.4593 0.7761 -
13.7096 239000 0.1983 1.4838 0.7776 -
13.7383 239500 0.1945 1.4714 0.7773 -
13.7670 240000 0.2055 1.4640 0.7779 -
13.7957 240500 0.2024 1.4754 0.7787 -
13.8244 241000 0.1959 1.4552 0.7766 -
13.8530 241500 0.187 1.4456 0.7753 -
13.8817 242000 0.1906 1.4514 0.7739 -
13.9104 242500 0.1928 1.4691 0.7771 -
13.9391 243000 0.2021 1.4537 0.7779 -
13.9678 243500 0.1855 1.4683 0.7816 -
13.9964 244000 0.1997 1.4667 0.7802 -
14.0251 244500 0.1714 1.4906 0.7799 -
14.0538 245000 0.1878 1.4786 0.7811 -
14.0825 245500 0.1796 1.4974 0.7794 -
14.1112 246000 0.1826 1.4833 0.7796 -
14.1398 246500 0.1731 1.4995 0.7788 -
14.1685 247000 0.167 1.4896 0.7795 -
14.1972 247500 0.1871 1.4724 0.7797 -
14.2259 248000 0.1934 1.4777 0.7812 -
14.2546 248500 0.1764 1.4755 0.7822 -
14.2833 249000 0.1866 1.4718 0.7812 -
14.3119 249500 0.2047 1.4668 0.7817 -
14.3406 250000 0.1643 1.4811 0.7817 -
14.3693 250500 0.1715 1.4833 0.7790 -
14.3980 251000 0.1757 1.4786 0.7803 -
14.4267 251500 0.1844 1.4803 0.7807 -
14.4553 252000 0.1721 1.4953 0.7808 -
14.4840 252500 0.1549 1.4872 0.7810 -
14.5127 253000 0.1599 1.4582 0.7824 -
14.5414 253500 0.1691 1.4735 0.7813 -
14.5701 254000 0.1737 1.4741 0.7814 -
14.5987 254500 0.1612 1.4754 0.7810 -
14.6274 255000 0.1773 1.4656 0.7821 -
14.6561 255500 0.1758 1.4690 0.7814 -
14.6848 256000 0.1791 1.4730 0.7814 -
14.7135 256500 0.1848 1.4745 0.7810 -
14.7422 257000 0.1665 1.4855 0.7808 -
14.7708 257500 0.1827 1.4692 0.7809 -
14.7995 258000 0.1725 1.4647 0.7812 -
14.8282 258500 0.1535 1.4680 0.7812 -
14.8569 259000 0.1645 1.4720 0.7810 -
14.8856 259500 0.1704 1.4748 0.7807 -
14.9142 260000 0.1747 1.4699 0.7806 -
14.9429 260500 0.1893 1.4670 0.7807 -
14.9716 261000 0.1754 1.4679 0.7806 -
-1 -1 - - - 0.7525
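
The log rows are whitespace-separated, with "-" marking a value that was not recorded at that step. As a rough sketch of how the table above could be read programmatically, the snippet below parses a handful of rows copied from it and picks the one with the highest sts-dev_spearman_cosine among them; LogRow and parse_row are hypothetical helper names.

```python
from typing import NamedTuple, Optional


class LogRow(NamedTuple):
    epoch: float
    step: int
    train_loss: Optional[float]
    val_loss: Optional[float]
    sts_dev: Optional[float]


def parse_row(line: str) -> LogRow:
    """Parse one log row; '-' means the value was not recorded."""
    fields = line.split()

    def number(text: str) -> Optional[float]:
        return None if text == "-" else float(text)

    return LogRow(
        epoch=float(fields[0]),
        step=int(fields[1]),
        train_loss=number(fields[2]),
        val_loss=number(fields[3]),
        sts_dev=number(fields[4]),
    )


# A few rows copied from the table above.
rows = [parse_row(line) for line in [
    "-1 -1 - - 0.5730 -",
    "0.0287 500 10.2711 3.3325 0.8185 -",
    "0.1721 3000 2.4983 1.2535 0.8814 -",
    "1.7782 31000 2.1612 13.8218 0.5311 -",
    "14.9716 261000 0.1754 1.4679 0.7806 -",
]]

# Ignore the pre-training row (step -1) and rows without an sts-dev score.
scored = [row for row in rows if row.step > 0 and row.sts_dev is not None]
best = max(scored, key=lambda row: row.sts_dev)
print(best.step, best.sts_dev)  # 3000 0.8814
```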

Framework Versions

  • Python: 3.13.0
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Model size: 0.4B params (Safetensors, tensor type F32)
Model ID: sobamchan/roberta-large-mrl-768-512-256-128-64