Sparse Autoencoders for Qwen/Qwen2.5-7B-Instruct

This repository contains 3 Sparse Autoencoders (SAEs) trained using SAELens.

Model Details

| Property | Value |
|----------|-------|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Architecture | gated |
| Input Dimension | 3584 |
| SAE Dimension | 16384 |
| Training Dataset | TQRG/DeltaSecommits_qwen-2.5-7b-instruct_tokenized |
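
The dimensions above can be sanity-checked against the shipped config once an SAE is loaded. A minimal sketch, assuming the standard SAELens cfg.json field names d_in and d_sae:

from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="rufimelo/secure_code_qwen_coder_gated_16384",
    sae_id="blocks.0.hook_resid_post",
)

# d_in matches the residual stream width of Qwen2.5-7B-Instruct (3584);
# d_sae is the SAE feature dimension (16384, a ~4.6x expansion)
assert cfg_dict["d_in"] == 3584
assert cfg_dict["d_sae"] == 16384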

Available Hook Points

  • blocks.0.hook_resid_post
  • blocks.14.hook_resid_post
  • blocks.27.hook_resid_post
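
Each hook point ships its own SAE. A minimal sketch that loads all three into a dictionary keyed by hook point, reusing the release name from the Usage section below:

from sae_lens import SAE

hook_points = [
    "blocks.0.hook_resid_post",
    "blocks.14.hook_resid_post",
    "blocks.27.hook_resid_post",
]

# Load one SAE per hook point, keyed by its hook name
saes = {}
for hook_point in hook_points:
    sae, cfg_dict, sparsity = SAE.from_pretrained(
        release="rufimelo/secure_code_qwen_coder_gated_16384",
        sae_id=hook_point,
    )
    saes[hook_point] = sae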

Usage

from sae_lens import SAE

# Load an SAE for a specific hook point
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="rufimelo/secure_code_qwen_coder_gated_16384",
    sae_id="blocks.0.hook_resid_post"  # Choose from available hook points above
)

# Use with TransformerLens
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Get activations at the hook point and encode them into SAE features
_, cache = model.run_with_cache("your text here")
activations = cache["blocks.0.hook_resid_post"]
features = sae.encode(activations)  # shape (..., 16384): sparse feature activations
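
From here, features can be decoded back into the residual stream to gauge reconstruction quality. A minimal sketch continuing from the code above, using sae.decode (the SAELens counterpart to sae.encode); the L0 and MSE metrics here are illustrative choices, not values reported for this release:

import torch

# Reconstruct the activations from the sparse feature codes
reconstruction = sae.decode(features)

# Average number of active features per token (L0)
l0 = (features > 0).float().sum(-1).mean()

# Mean squared reconstruction error against the original activations
mse = torch.nn.functional.mse_loss(reconstruction, activations)
print(f"L0: {l0.item():.1f}  MSE: {mse.item():.6f}")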

Files

  • blocks.0.hook_resid_post/cfg.json - SAE configuration
  • blocks.0.hook_resid_post/sae_weights.safetensors - Model weights
  • blocks.0.hook_resid_post/sparsity.safetensors - Feature sparsity statistics
  • blocks.14.hook_resid_post/cfg.json - SAE configuration
  • blocks.14.hook_resid_post/sae_weights.safetensors - Model weights
  • blocks.14.hook_resid_post/sparsity.safetensors - Feature sparsity statistics
  • blocks.27.hook_resid_post/cfg.json - SAE configuration
  • blocks.27.hook_resid_post/sae_weights.safetensors - Model weights
  • blocks.27.hook_resid_post/sparsity.safetensors - Feature sparsity statistics
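
These files can also be fetched and inspected directly, without going through SAE.from_pretrained. A minimal sketch using huggingface_hub and safetensors (an assumption about tooling; any way of downloading the listed paths works):

import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

repo = "rufimelo/secure_code_qwen_coder_gated_16384"

# Read the SAE configuration for one hook point
cfg_path = hf_hub_download(repo, "blocks.0.hook_resid_post/cfg.json")
with open(cfg_path) as f:
    cfg = json.load(f)
print(cfg["d_in"], cfg["d_sae"])

# Load the per-feature sparsity statistics as tensors
sparsity_path = hf_hub_download(repo, "blocks.0.hook_resid_post/sparsity.safetensors")
sparsity = load_file(sparsity_path)
print({k: v.shape for k, v in sparsity.items()})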

Training

These SAEs were trained with SAELens version 6.26.2.
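
When loading them, it may help to match that version (e.g. pip install sae-lens==6.26.2). A quick way to check the installed version, using only the standard library:

from importlib.metadata import version

# The SAEs in this repo were trained with sae-lens 6.26.2
print(version("sae-lens"))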
