- Structured Denoising Diffusion Models in Discrete State-Spaces
  Paper • 2107.03006 • Published • 1
- Simplified and Generalized Masked Diffusion for Discrete Data
  Paper • 2406.04329 • Published • 8
- Simple and Effective Masked Diffusion Language Models
  Paper • 2406.07524 • Published • 12
- Large Language Diffusion Models
  Paper • 2502.09992 • Published • 123

Collections including paper arxiv:2503.09573
- OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
  Paper • 2506.07977 • Published • 41
- Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
  Paper • 2506.07986 • Published • 19
- STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
  Paper • 2506.06276 • Published • 26
- Aligning Latent Spaces with Flow Priors
  Paper • 2506.05240 • Published • 27

- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 74
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
  Paper • 2505.15045 • Published • 54
- Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding
  Paper • 2505.16990 • Published • 22
- D-AR: Diffusion via Autoregressive Models
  Paper • 2505.23660 • Published • 34

- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 74
- Sadeed: Advancing Arabic Diacritization Through Small Language Model
  Paper • 2504.21635 • Published • 59
- SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines
  Paper • 2509.21320 • Published • 101
- Seedream 4.0: Toward Next-generation Multimodal Image Generation
  Paper • 2509.20427 • Published • 80

- kuleshov-group/bd3lm-owt-block_size16
  Text Generation • 0.2B • Updated • 191 • 16
- kuleshov-group/bd3lm-owt-block_size4
  Text Generation • 0.2B • Updated • 1.47k • 3
- kuleshov-group/bd3lm-owt-block_size8
  Text Generation • 0.2B • Updated • 170 • 1
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 74

- Large Language Diffusion Models
  Paper • 2502.09992 • Published • 123
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 74
- MMaDA: Multimodal Large Diffusion Language Models
  Paper • 2505.15809 • Published • 97
- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
  Paper • 2505.15045 • Published • 54

- Making Multimodal Generation Easier: When Diffusion Models Meet LLMs
  Paper • 2310.08949 • Published • 1
- Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
  Paper • 2503.09573 • Published • 74
- JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
  Paper • 2308.04729 • Published • 32
- PerceiverS: A Multi-Scale Perceiver with Effective Segmentation for Long-Term Expressive Symbolic Music Generation
  Paper • 2411.08307 • Published • 7

- RuCCoD: Towards Automated ICD Coding in Russian
  Paper • 2502.21263 • Published • 133
- Unified Reward Model for Multimodal Understanding and Generation
  Paper • 2503.05236 • Published • 122
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
  Paper • 2503.05179 • Published • 46
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27

- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
  Paper • 2405.04434 • Published • 24
- Titans: Learning to Memorize at Test Time
  Paper • 2501.00663 • Published • 29
- Transformer^2: Self-adaptive LLMs
  Paper • 2501.06252 • Published • 54
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
  Paper • 2502.11089 • Published • 166