Model save
README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.2374
 
 ## Model description
 
@@ -41,7 +41,7 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 200
 - mixed_precision_training: Native AMP
 
 ### Training results
@@ -148,6 +148,106 @@ The following hyperparameters were used during training:
 | 0.422 | 98.0 | 105546 | 0.3423 |
 | 0.4214 | 99.0 | 106623 | 0.3421 |
 | 0.4226 | 100.0 | 107700 | 0.3420 |
+| 0.4283 | 101.0 | 108777 | 0.3452 |
+| 0.4295 | 102.0 | 109854 | 0.3435 |
+| 0.4289 | 103.0 | 110931 | 0.3413 |
+| 0.4239 | 104.0 | 112008 | 0.3399 |
+| 0.4205 | 105.0 | 113085 | 0.3377 |
+| 0.4198 | 106.0 | 114162 | 0.3344 |
+| 0.4163 | 107.0 | 115239 | 0.3338 |
+| 0.4157 | 108.0 | 116316 | 0.3303 |
+| 0.4119 | 109.0 | 117393 | 0.3275 |
+| 0.4116 | 110.0 | 118470 | 0.3239 |
+| 0.4085 | 111.0 | 119547 | 0.3215 |
+| 0.4055 | 112.0 | 120624 | 0.3192 |
+| 0.4056 | 113.0 | 121701 | 0.3184 |
+| 0.4031 | 114.0 | 122778 | 0.3181 |
+| 0.4 | 115.0 | 123855 | 0.3144 |
+| 0.3978 | 116.0 | 124932 | 0.3120 |
+| 0.3949 | 117.0 | 126009 | 0.3108 |
+| 0.3941 | 118.0 | 127086 | 0.3074 |
+| 0.3908 | 119.0 | 128163 | 0.3075 |
+| 0.3888 | 120.0 | 129240 | 0.3038 |
+| 0.3889 | 121.0 | 130317 | 0.3013 |
+| 0.3875 | 122.0 | 131394 | 0.2988 |
+| 0.3849 | 123.0 | 132471 | 0.2983 |
+| 0.3821 | 124.0 | 133548 | 0.2960 |
+| 0.3805 | 125.0 | 134625 | 0.2965 |
+| 0.38 | 126.0 | 135702 | 0.2960 |
+| 0.3789 | 127.0 | 136779 | 0.2923 |
+| 0.3768 | 128.0 | 137856 | 0.2921 |
+| 0.3749 | 129.0 | 138933 | 0.2891 |
+| 0.3715 | 130.0 | 140010 | 0.2873 |
+| 0.3716 | 131.0 | 141087 | 0.2862 |
+| 0.367 | 132.0 | 142164 | 0.2852 |
+| 0.3674 | 133.0 | 143241 | 0.2822 |
+| 0.3667 | 134.0 | 144318 | 0.2825 |
+| 0.3656 | 135.0 | 145395 | 0.2801 |
+| 0.3645 | 136.0 | 146472 | 0.2777 |
+| 0.3643 | 137.0 | 147549 | 0.2786 |
+| 0.3615 | 138.0 | 148626 | 0.2750 |
+| 0.36 | 139.0 | 149703 | 0.2747 |
+| 0.3605 | 140.0 | 150780 | 0.2737 |
+| 0.3558 | 141.0 | 151857 | 0.2726 |
+| 0.357 | 142.0 | 152934 | 0.2697 |
+| 0.3553 | 143.0 | 154011 | 0.2699 |
+| 0.3557 | 144.0 | 155088 | 0.2693 |
+| 0.3537 | 145.0 | 156165 | 0.2680 |
+| 0.352 | 146.0 | 157242 | 0.2665 |
+| 0.3499 | 147.0 | 158319 | 0.2666 |
+| 0.3511 | 148.0 | 159396 | 0.2637 |
+| 0.3483 | 149.0 | 160473 | 0.2636 |
+| 0.3479 | 150.0 | 161550 | 0.2621 |
+| 0.3466 | 151.0 | 162627 | 0.2600 |
+| 0.3448 | 152.0 | 163704 | 0.2610 |
+| 0.345 | 153.0 | 164781 | 0.2594 |
+| 0.3439 | 154.0 | 165858 | 0.2595 |
+| 0.3411 | 155.0 | 166935 | 0.2579 |
+| 0.3414 | 156.0 | 168012 | 0.2583 |
+| 0.3408 | 157.0 | 169089 | 0.2563 |
+| 0.3393 | 158.0 | 170166 | 0.2550 |
+| 0.34 | 159.0 | 171243 | 0.2559 |
+| 0.3382 | 160.0 | 172320 | 0.2534 |
+| 0.3379 | 161.0 | 173397 | 0.2520 |
+| 0.3337 | 162.0 | 174474 | 0.2519 |
+| 0.3356 | 163.0 | 175551 | 0.2523 |
+| 0.333 | 164.0 | 176628 | 0.2502 |
+| 0.3325 | 165.0 | 177705 | 0.2505 |
+| 0.3326 | 166.0 | 178782 | 0.2497 |
+| 0.3328 | 167.0 | 179859 | 0.2482 |
+| 0.3305 | 168.0 | 180936 | 0.2495 |
+| 0.3312 | 169.0 | 182013 | 0.2472 |
+| 0.3302 | 170.0 | 183090 | 0.2471 |
+| 0.3286 | 171.0 | 184167 | 0.2457 |
+| 0.3288 | 172.0 | 185244 | 0.2456 |
+| 0.3284 | 173.0 | 186321 | 0.2463 |
+| 0.3285 | 174.0 | 187398 | 0.2440 |
+| 0.327 | 175.0 | 188475 | 0.2434 |
+| 0.3264 | 176.0 | 189552 | 0.2430 |
+| 0.3264 | 177.0 | 190629 | 0.2435 |
+| 0.3248 | 178.0 | 191706 | 0.2422 |
+| 0.3233 | 179.0 | 192783 | 0.2419 |
+| 0.324 | 180.0 | 193860 | 0.2420 |
+| 0.3247 | 181.0 | 194937 | 0.2410 |
+| 0.3235 | 182.0 | 196014 | 0.2405 |
+| 0.3235 | 183.0 | 197091 | 0.2405 |
+| 0.3219 | 184.0 | 198168 | 0.2402 |
+| 0.3219 | 185.0 | 199245 | 0.2397 |
+| 0.3212 | 186.0 | 200322 | 0.2404 |
+| 0.3215 | 187.0 | 201399 | 0.2387 |
+| 0.3203 | 188.0 | 202476 | 0.2393 |
+| 0.3207 | 189.0 | 203553 | 0.2392 |
+| 0.3205 | 190.0 | 204630 | 0.2387 |
+| 0.3188 | 191.0 | 205707 | 0.2382 |
+| 0.3182 | 192.0 | 206784 | 0.2382 |
+| 0.3209 | 193.0 | 207861 | 0.2383 |
+| 0.3199 | 194.0 | 208938 | 0.2379 |
+| 0.3191 | 195.0 | 210015 | 0.2376 |
+| 0.3174 | 196.0 | 211092 | 0.2377 |
+| 0.3158 | 197.0 | 212169 | 0.2376 |
+| 0.3188 | 198.0 | 213246 | 0.2378 |
+| 0.3181 | 199.0 | 214323 | 0.2373 |
+| 0.3181 | 200.0 | 215400 | 0.2374 |
 
 
 ### Framework versions
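For readers checking the added rows: the Step column advances by a fixed number of optimizer steps per epoch, and every added row is consistent with that. A minimal sanity check, a sketch using only values taken from the table above (the dataset and batch size that produce this figure are not stated in this diff):

```python
# Step values at a few epochs, copied from the training-results table.
steps_at_epoch = {100: 107_700, 150: 161_550, 200: 215_400}

# Steps per epoch recovered from any two rows (constant by construction).
steps_per_epoch = (steps_at_epoch[200] - steps_at_epoch[100]) // (200 - 100)
print(steps_per_epoch)  # 1077

# Every sampled row agrees: step == epoch * steps_per_epoch.
for epoch, step in steps_at_epoch.items():
    assert step == epoch * steps_per_epoch
```

This also confirms that the run was resumed rather than restarted: the new rows continue the same 1,077-steps-per-epoch schedule from epoch 100 through epoch 200.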