Article Highlight: SyntheticGen, Controllable Diffusion for Long-Tail Remote Sensing
🛰️ Why is remote-sensing segmentation still hard—even with strong models?
Because the issue is not only the model… it’s the data.
In real-world datasets like LoveDA, class distributions are highly imbalanced, and the problem is compounded by Urban/Rural domain shifts, where visual characteristics and class frequencies differ significantly. This leads to poor learning for minority classes and weak generalization.
⚖️ The Idea: Make Data Controllable
Instead of treating data augmentation as a random process, SyntheticGen turns it into a controllable pipeline.
👉 What if you could:
Specify which classes you want more of?
Control how much of each class appears?
Generate data that respects domain (Urban/Rural) characteristics?
That’s exactly what SyntheticGen enables.
🧠 How It Works
SyntheticGen introduces a structured generation process:
Layout Generation (Stage A)
A ratio-conditioned discrete diffusion model generates semantic layouts that match user-defined class distributions.
Image Synthesis (Stage B)
A ControlNet-guided Stable Diffusion pipeline converts layouts into realistic remote-sensing imagery.
💡 This separation between semantic control and visual realism is key—it allows both precision and high-quality generation.
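The ratio-conditioning idea behind Stage A can be illustrated with a minimal sketch (the function names and tolerance here are illustrative, not from the paper): given a generated semantic layout, compute per-class pixel fractions and check them against a user-specified target distribution.

```python
from collections import Counter

def class_ratios(layout, num_classes):
    """Per-class pixel fractions of a 2-D semantic layout (lists of class ids)."""
    counts = Counter(c for row in layout for c in row)
    total = sum(counts.values())
    return [counts.get(k, 0) / total for k in range(num_classes)]

def matches_target(layout, target, num_classes, tol=0.05):
    """True if every class ratio is within `tol` of its requested target ratio."""
    ratios = class_ratios(layout, num_classes)
    return all(abs(r - t) <= tol for r, t in zip(ratios, target))

# Toy 4x4 layout with classes {0: background, 1: building, 2: water}
layout = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
]
print(class_ratios(layout, 3))                        # [0.5, 0.25, 0.25]
print(matches_target(layout, [0.5, 0.25, 0.25], 3))   # True
```

In the actual pipeline this check would be replaced by conditioning the discrete diffusion model on the target ratio vector, so layouts are steered toward the requested class mix rather than filtered after the fact.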
Why It Matters
Tackles long-tail imbalance directly at the data level
Improves minority-class segmentation performance
Enhances cross-domain generalization (Urban ↔ Rural)
Moves toward data-centric AI, where we design training data—not just models
Recent research shows that diffusion-based synthetic data can significantly improve performance in long-tailed settings by generating high-value samples for rare or difficult cases.
SyntheticGen takes this further by making the process explicitly controllable, not just generative.
📄 Paper
https://arxiv.org/abs/2602.04749
💻 Code & Synthetic Data
https://github.com/Buddhi19/SyntheticGen