LTX2.3-ICEdit-Insight

Research-oriented model release for task-aware video restoration and editing under the LTX-2.3 framework.

Project links: GitHub project | Valiant Cat on Hugging Face

This repository contains:

  • ltx-2.3-edit-insight-dev-fp8.safetensors
    • the all-in-one Insight checkpoint
    • includes transformer + video VAE + audio VAE + text projection + vocoder
  • ltx2.3-video-upscale-v2.safetensors
    • IC-LoRA for video super-resolution and detail recovery
  • ltx2.3-ic-watermarkeRM.safetensors
    • IC-LoRA for video watermark removal and occlusion restoration

These weights are intended to be used with the project's run_pipeline.py workflow. The recommended default is single-stage inference, where the IC-LoRA guidance remains active through the full-resolution denoising pass.

Research Positioning

ltx-2.3-edit-insight-dev-fp8.safetensors is not presented as a bare deployment checkpoint. It is the unified base model release for the Insight branch of this project: a task-aware spatiotemporal editing backbone that consolidates the diffusion transformer, video VAE, audio VAE, text projection module, and vocoder into a single reproducible artifact.

From a research perspective, the checkpoint is intended to support controlled video restoration and editing under a shared latent diffusion formulation. The paired IC-LoRA adapters specialize the backbone toward structure-preserving super-resolution and watermark-aware content recovery, while the unified checkpoint packaging keeps the full generative stack aligned for repeatable experiments and downstream ablations.

Overview

This package is built for the Insight version of the project's LTX-2.3 editing pipeline. Instead of shipping only task adapters, it also includes the corresponding Insight base checkpoint so the workflow can be reproduced with the exact model assets used by the project.

Recommended usage:

  • run the companion run_pipeline.py
  • keep single-stage inference enabled by default
  • load one task LoRA at a time depending on the editing goal

🧠 Training

This model was trained and optimized by the AI Laboratory of Chongqing Valiant Cat Technology Co., Ltd.

Visit vvicat.com for business collaborations or research partnerships.

🧩 Integration with ComfyUI

This model works with the modified ComfyUI workflows provided by the project.

For ComfyUI-based editing, load the base model in the UNet-side model path required by the workflow, then attach the task-specific IC-LoRA for the corresponding edit objective.

Files

File                                        Purpose
ltx-2.3-edit-insight-dev-fp8.safetensors    All-in-one Insight base checkpoint
ltx2.3-video-upscale-v2.safetensors         Super-resolution / detail enhancement IC-LoRA
ltx2.3-ic-watermarkeRM.safetensors          Watermark removal / occlusion restoration IC-LoRA

Showcase

(Showcase images: super-resolution preview 1 and preview 2)

Usage With This Project

Run all commands from the project root.

Super-resolution

PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
python run_pipeline.py \
  --mode upscale \
  --video ./inputs/input.mp4 \
  --prompt "Convert the video to ultra-high definition quality, rebuilding high-frequency details while eliminating artifacts." \
  --output ./outputs/output_upscale.mp4 \
  --height 1184 --width 704 --num-frames 97 \
  --fps 24.0 --seed 42 \
  --sigma-profile workflow \
  --model-checkpoint ./models/checkpoints/ltx-2.3-edit-insight-dev-fp8.safetensors \
  --lora ./models/loras/ltx2.3-train/ltx2.3-video-upscale-v2.safetensors

Watermark removal

PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
python run_pipeline.py \
  --mode watermark_rm \
  --video ./inputs/input.mp4 \
  --prompt "Remove short-video platform watermarks and related occlusions from the video, restoring a clean, clear, and natural original image." \
  --output ./outputs/output_watermark_rm.mp4 \
  --height 1184 --width 704 --num-frames 97 \
  --fps 24.0 --seed 1546 \
  --sigma-profile workflow \
  --model-checkpoint ./models/checkpoints/ltx-2.3-edit-insight-dev-fp8.safetensors \
  --lora ./models/loras/ltx2.3-train/ltx2.3-ic-watermarkeRM.safetensors

Notes

  • Single-stage inference is the default recommendation.
  • In two-stage mode, the second-stage refinement does not preserve the IC-LoRA constraint, which can increase content drift relative to the input video.
  • Frame count must satisfy 8k + 1 for some integer k (e.g., 97 = 8 × 12 + 1, as in the commands above).
  • Single-stage output height and width should be multiples of 32.
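The two constraints above can be checked before launching a run. The following is a minimal sketch of such a pre-flight check; `check_video_args` is a hypothetical helper, not part of `run_pipeline.py`:

```python
def check_video_args(num_frames: int, height: int, width: int) -> None:
    """Validate CLI arguments against the model's input constraints."""
    # Frame count must be of the form 8k + 1 (e.g. 97 = 8 * 12 + 1).
    if num_frames % 8 != 1:
        raise ValueError(f"num_frames must satisfy 8k + 1, got {num_frames}")
    # Single-stage output height and width should be multiples of 32.
    for name, value in (("height", height), ("width", width)):
        if value % 32 != 0:
            raise ValueError(f"{name} must be a multiple of 32, got {value}")

# The example settings used in this README satisfy both constraints.
check_video_args(num_frames=97, height=1184, width=704)
```

The 8k + 1 form matches the README's example of 97 frames; passing, say, 96 frames would fail the first check.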

License

This repository is released under the Apache 2.0 license.
