Antoine Chaffin

NohTow

https://antoine.chaffin.fr

AI & ML interests

NLP, Computer Vision, Multimodal classification

Recent Activity

upvoted an article 1 day ago

TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval

published a dataset 3 days ago

lightonai/CodeSearchNet

updated a dataset 8 days ago

lightonai/embeddings-pre-training

upvoted an article 1 day ago

Article

TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval

5 days ago • 17

upvoted an article about 2 months ago

Article

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

Oct 23 • 62

upvoted 2 papers about 2 months ago

Simple Projection Variants Improve ColBERT Performance

Paper • 2510.12327 • Published Oct 14 • 5

Fantastic (small) Retrievers and How to Train Them: mxbai-edge-colbert-v0 Tech Report

Paper • 2510.14880 • Published Oct 16 • 17

upvoted an article 3 months ago

Article

Welcome EmbeddingGemma, Google's new efficient embedding model

Sep 4 • 263

upvoted an article 5 months ago

Article

Introducing ColQwen-Omni: Retrieve in every modality

Jul 17 • 75

upvoted a paper 5 months ago

Seq vs Seq: An Open Suite of Paired Encoders and Decoders

Paper • 2507.11412 • Published Jul 15 • 29

upvoted an article 5 months ago

Article

Ettin Suite: SoTA Paired Encoders and Decoders

Jul 16 • 77

upvoted a collection 5 months ago

PyLate 🐕

4 items • Updated Jul 2 • 3

upvoted a collection 6 months ago

BioClinical ModernBERT

This project was a collaboration between members of the Dana-Farber Cancer Institute, LightOn, MIT, OpenEvidence and Microsoft. • 3 items • Updated Sep 9 • 11

upvoted an article 9 months ago

Article

Introducing EuroBERT: A High-Performance Multilingual Encoder Model

Mar 10 • 146

upvoted a paper 10 months ago

Rank1: Test-Time Compute for Reranking in Information Retrieval

Paper • 2502.18418 • Published Feb 25 • 28

upvoted a collection 11 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 549

upvoted an article 11 months ago

Article

Finally, a Replacement for BERT: Introducing ModernBERT

Dec 19, 2024 • 709

upvoted a collection 11 months ago

ModernGLiNER

GLiNER models based on modern encoder architectures • 2 items • Updated Dec 24, 2024 • 7

upvoted a paper 12 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158

upvoted a collection 12 months ago

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 151

upvoted a paper about 1 year ago

Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

Paper • 2409.14683 • Published Sep 23, 2024 • 12

upvoted a paper over 2 years ago

Three Bricks to Consolidate Watermarks for Large Language Models

Paper • 2308.00113 • Published Jul 26, 2023 • 14