PixArt-Σ LoRA Fine-tuned for Anime Image Generation
This model is a LoRA fine-tune of PixArt-alpha/PixArt-Sigma-XL-2-1024-MS, trained on the lambdalabs/naruto-blip-captions dataset to generate anime-style images.
Model Details
- Base Model: PixArt-alpha/PixArt-Sigma-XL-2-1024-MS
- Training Method: LoRA (Low-Rank Adaptation)
- Domain: Anime
- Dataset: lambdalabs/naruto-blip-captions
- LoRA Rank: 16
- LoRA Alpha: 32
- Task: Text-to-Image Generation
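To make the rank and alpha values above concrete, the following sketch computes the scaling factor and the per-layer adapter size they imply. It assumes the common LoRA convention in which the low-rank update is scaled by alpha / rank; the training script is not part of this card, so that convention is an assumption. The 1152 x 1152 layer shape is also a hypothetical value chosen for illustration, not a figure from this card.

```python
# Values taken from the model details above.
LORA_RANK = 16
LORA_ALPHA = 32

# Assumption for illustration only: a square 1152 x 1152 projection layer.
HIDDEN_DIM = 1152


def lora_scaling(alpha: int, rank: int) -> float:
    """Factor applied to the low-rank update B @ A before it is added to W."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
adapter_params = lora_param_count(HIDDEN_DIM, HIDDEN_DIM, LORA_RANK)
full_params = HIDDEN_DIM * HIDDEN_DIM

print(f"scaling factor: {scaling}")
print(f"adapter parameters per layer: {adapter_params}")
print(f"full weight parameters per layer: {full_params}")
print(f"trainable fraction: {adapter_params / full_params:.2%}")
```

Under these assumptions the adapter trains a small fraction of the parameters of each adapted layer, which is why the LoRA weights are much smaller than the base model.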
Training Details
- Epochs: 50
- Batch Size: 1
- Gradient Accumulation Steps: 4
- Learning Rate: 1e-4
- Training Steps: 1500
- Mixed Precision: FP16
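The batch size and gradient accumulation settings above combine into an effective batch size, which the short calculation below works out. It assumes a single training device. The card does not say whether "Training Steps" counts optimizer updates or individual micro-batches, so both readings are shown rather than picking one.

```python
# Values taken from the training details above.
BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 4
TRAINING_STEPS = 1500

# Samples contributing to each optimizer update (single device assumed).
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS

# Reading 1 (assumption): the 1500 steps are optimizer updates.
samples_if_optimizer_steps = TRAINING_STEPS * effective_batch_size

# Reading 2 (assumption): the 1500 steps are micro-batches.
updates_if_micro_steps = TRAINING_STEPS // GRAD_ACCUM_STEPS

print(f"effective batch size: {effective_batch_size}")
print(f"samples seen if steps are optimizer updates: {samples_if_optimizer_steps}")
print(f"optimizer updates if steps are micro-batches: {updates_if_micro_steps}")
```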
Usage
from diffusers import PixArtSigmaPipeline
import torch

# Load the base pipeline in half precision
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    torch_dtype=torch.float16,
).to("cuda")

# Load LoRA weights
pipe.load_lora_weights("matthew816/pixart-lora-anime")

# Generate image from text
prompt = "anime style, a cat sitting on a chair"
image = pipe(
    prompt=prompt,
    num_inference_steps=20,
    guidance_scale=4.5,
).images[0]

image.save("generated_anime_image.png")
Examples
This model generates images in anime style from text descriptions.
Example prompts:
- "anime style, a dragon flying over mountains"
- "anime style, a robot playing guitar"
- "anime style, a sunset over the ocean"
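The example prompts above all begin with the same "anime style" prefix. A minimal sketch of a helper that applies that prefix consistently is shown below. The names `STYLE_PREFIX` and `build_prompt` are hypothetical, and the card does not state whether the prefix is required for the LoRA to take effect.

```python
STYLE_PREFIX = "anime style"  # prefix shared by the example prompts above


def build_prompt(subject: str, prefix: str = STYLE_PREFIX) -> str:
    """Return the subject with the style prefix, without adding it twice."""
    subject = subject.strip()
    if subject.lower().startswith(prefix.lower()):
        return subject
    return f"{prefix}, {subject}"


subjects = [
    "a dragon flying over mountains",
    "a robot playing guitar",
    "anime style, a sunset over the ocean",  # already prefixed
]
prompts = [build_prompt(s) for s in subjects]

for p in prompts:
    print(p)
```

Each resulting string can be passed as the `prompt` argument to the pipeline call shown in the Usage section.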
Citation
If you use this model, please cite the original PixArt-Σ model and the dataset.
@article{chen2024pixart,
  title={PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation},
  author={Chen, Junsong and others},
  journal={arXiv preprint arXiv:2403.04692},
  year={2024}
}