Salesforce/GiftEvalParquet
LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering
MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion