Rom

wrom · wr0om

AI & ML interests

LLM Security

Recent Activity

upvoted a paper 8 days ago

Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

authored a paper 10 days ago

Step-Wise Refusal Dynamics in Autoregressive and Diffusion Language Models

upvoted a paper 10 days ago

Step-Wise Refusal Dynamics in Autoregressive and Diffusion Language Models

Organizations

Papers 1

arxiv:2602.02600

Spaces 2

silenced_biases

Sbb

Models 0

None public yet

Datasets 3

wrom/silenced_biases

Updated Jan 8 • 4

wrom/HebrewBible_HapaxLegomenon

Viewer • Updated Sep 4, 2025 • 249 • 4 • 1

wrom/Language-Vision-Hallucinations

Viewer • Updated Nov 1, 2024 • 350 • 27 • 2