# Stack X Ultimate

*A state-of-the-art agentic coding model built on Qwen2.5-Coder-3B-Instruct.*

Stack X is a LoRA adapter trained on a curated mix of real agentic conversations, designed to make open-weight models better at multi-step tool use, code generation, and complex reasoning tasks.

## Model Details
- Base Model: Qwen/Qwen2.5-Coder-3B-Instruct
- Architecture: Transformer (3B parameters)
- Training Type: QLoRA (LoRA rank 32, 7 modules targeted)
- Trained by: Walid Sobhie via OpenClaw agentic pipeline
- Framework: Hugging Face Transformers + PEFT + PyTorch bf16
- Training Hardware: NVIDIA V100-SXM2-16GB (GCP spot instance)
- Training Steps: 3,000 steps (curriculum sorted, cosine LR decay)
- Effective Batch Size: 16 (gradient accumulation)
- Max Context: 1,536 tokens
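Before loading any weights, you can sanity-check the adapter's metadata against the details above using PEFT's standard `PeftConfig` API (a quick sketch):

```python
from peft import PeftConfig

# Fetches only adapter_config.json, not the weights.
cfg = PeftConfig.from_pretrained("my-ai-stack/Stack-X-Ultimate")
print(cfg.base_model_name_or_path)  # expected: Qwen/Qwen2.5-Coder-3B-Instruct
print(cfg.r, cfg.lora_alpha)        # expected: 32, 64
print(cfg.target_modules)           # the 7 targeted projection modules
```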
## Training Data
| Source | Description | Count |
|---|---|---|
| NVIDIA Nemotron Agentic | Real multi-step tool calling conversations | ~7,000 |
| Stack-4.0 Smart | High-complexity agentic tasks | ~10,000 |
| Stack-4.0 Tools | Diverse tool-use patterns | ~10,000 |
| Total (deduped) | After deduplication | ~6,100 |
Training data was filtered, deduplicated, and sorted by complexity (curriculum learning) before training.
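The exact preprocessing pipeline is not published; the sketch below illustrates the kind of dedup-then-curriculum step described, using an exact content hash for deduplication and total token count as a stand-in complexity score (both are assumptions, as is the `messages` record format):

```python
import hashlib
import json

def dedup_and_sort(examples, tokenizer):
    """Drop exact duplicates, then sort easy-to-hard for curriculum learning."""
    seen, unique = set(), []
    for ex in examples:
        # Exact-match dedup via a stable hash of the conversation.
        key = hashlib.sha256(
            json.dumps(ex["messages"], sort_keys=True).encode()
        ).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(ex)

    # Assumption: token count as a crude proxy for task complexity.
    def complexity(ex):
        return sum(len(tokenizer.encode(m["content"])) for m in ex["messages"])

    return sorted(unique, key=complexity)
```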
## Capabilities
Stack X is designed to excel at:
- Multi-step tool use – chains multiple tool calls with proper reasoning
- Code generation – Python, JavaScript, shell, and more
- Debugging – finds and explains bugs with fixes
- Math & reasoning – step-by-step calculation and problem solving
- Research tasks – information retrieval and synthesis
## Usage

### With PEFT (recommended – preserves the base model)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

BASE = "Qwen/Qwen2.5-Coder-3B-Instruct"
ADAPTER = "my-ai-stack/Stack-X-Ultimate"

tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="bfloat16", device_map="auto")
model = PeftModel.from_pretrained(base, ADAPTER)

# Chat
messages = [{"role": "user", "content": "Use the calculate tool to find sqrt(144)"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
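The prompt above refers to a `calculate` tool that the snippet never defines. Qwen2.5's chat template accepts tool specs through the `tools` argument of `apply_chat_template`; here is a sketch of wiring one up (the `calculate` schema is hypothetical, not something shipped with the adapter):

```python
# Hypothetical tool schema in the function-calling format that
# Qwen2.5's chat template understands.
calculate = {
    "type": "function",
    "function": {
        "name": "calculate",
        "description": "Evaluate a math expression and return the result.",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {
                    "type": "string",
                    "description": "Expression to evaluate, e.g. 'sqrt(144)'",
                }
            },
            "required": ["expression"],
        },
    },
}

messages = [{"role": "user", "content": "Use the calculate tool to find sqrt(144)"}]
text = tokenizer.apply_chat_template(
    messages,
    tools=[calculate],  # renders the tool spec into the system prompt
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# The model should emit a <tool_call>...</tool_call> block that your
# agent loop can parse, execute, and feed back as a "tool" message.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))
```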
### Merged (full model)
```python
# See: my-ai-stack/Stack-X-Ultimate-Merged
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "my-ai-stack/Stack-X-Ultimate-Merged", torch_dtype="bfloat16", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-X-Ultimate-Merged")
```
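To produce a merged checkpoint yourself instead of downloading the pre-merged repo, PEFT's `merge_and_unload` folds the adapter deltas into the base weights (a sketch; the output path is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-3B-Instruct", torch_dtype="bfloat16"
)
model = PeftModel.from_pretrained(base, "my-ai-stack/Stack-X-Ultimate")

# Fold the LoRA deltas into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()

merged.save_pretrained("./stack-x-merged")  # illustrative local path
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-Coder-3B-Instruct").save_pretrained("./stack-x-merged")
```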
## Performance
| Benchmark | Score |
|---|---|
| HumanEval (0-shot) | TBD |
| Agentic tool call | TBD |
| Reasoning (commonsense) | TBD |
Evaluation results will be posted after training completes.
## Limitations
- The LoRA adapter requires the compatible base model (Qwen/Qwen2.5-Coder-3B-Instruct)
- Max context 1,536 tokens – not suitable for very long documents
- Trained primarily in English – performance in other languages may vary
- Tool use is limited to the patterns seen in the training data
## Training Recipe
```text
Base model:             Qwen/Qwen2.5-Coder-3B-Instruct
LoRA rank:              32 (59M trainable params)
LoRA alpha:             64
Target modules:         q_proj, k_proj, v_proj, o_proj,
                        gate_proj, up_proj, down_proj
Learning rate:          2e-4 (cosine decay)
Warmup:                 150 steps
Batch size:             1 × gradient_accumulation = 16
Optimizer:              AdamW (bf16)
Max grad norm:          0.5
Weight decay:           0.1
Mixed precision:        bf16
Gradient checkpointing: enabled
```
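For reference, the recipe maps onto standard `peft` and `transformers` objects roughly as follows (a sketch, not the actual training script; `lora_dropout` and `output_dir` are assumptions not stated in the recipe):

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,  # assumption: dropout is not listed in the recipe
    bias="none",
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./stack-x",          # illustrative path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,  # effective batch size 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=150,
    max_steps=3000,
    max_grad_norm=0.5,
    weight_decay=0.1,
    bf16=True,
    gradient_checkpointing=True,
)
```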
## Citation
```bibtex
@misc{stackx2026,
  title  = {Stack X Ultimate},
  author = {Walid Sobhie},
  year   = {2026},
  url    = {https://huggingface.co/my-ai-stack/Stack-X-Ultimate}
}
```
## Disclaimer
This model is provided as-is. Training was performed automatically via an OpenClaw agentic pipeline. Results may vary. Not reviewed for safety in production deployments.