The Synthetic Data Playbook: Generating Trillions of the Finest Tokens. Explore synthetic data experiments in a bookshelf view.
PII & De-Identification Collection: Models for extracting PII entities and de-identifying clinical text, with support for HIPAA and GDPR compliance. • 278 items • Updated 1 day ago
OpenMed/OpenMed-PII-BioClinicalModern-Large-395M-v1 Token Classification • 0.4B • Updated Jan 13 • 18.5k • 9
AstroBench Collection: Datasets to evaluate LLMs/SLMs in astronautics and space mission engineering • 1 item • Updated Jan 5