Sparse Autoencoders for Qwen/Qwen2.5-7B-Instruct

This repository contains 3 Sparse Autoencoders (SAEs) trained using SAELens.

Model Details

| Property | Value |
|----------|-------|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Architecture | gated |
| Input Dimension | 3584 |
| SAE Dimension | 16384 |
| Training Dataset | TQRG/DeltaSecommits_qwen-2.5-7b-instruct_tokenized |
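
The dimensions above can be sanity-checked against the shipped config once an SAE is loaded. A minimal sketch, assuming the standard SAELens cfg.json field names d_in and d_sae:

from sae_lens import SAE

sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="rufimelo/secure_code_qwen_coder_gated_16384",
    sae_id="blocks.0.hook_resid_post",
)

# d_in matches the residual stream width of Qwen2.5-7B-Instruct (3584);
# d_sae is the SAE feature dimension (16384, a ~4.6x expansion)
assert cfg_dict["d_in"] == 3584
assert cfg_dict["d_sae"] == 16384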

Available Hook Points

  • blocks.0.hook_resid_post
  • blocks.14.hook_resid_post
  • blocks.27.hook_resid_post
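
Each hook point ships its own SAE. A minimal sketch that loads all three into a dictionary keyed by hook point, reusing the release name from the Usage section below:

from sae_lens import SAE

hook_points = [
    "blocks.0.hook_resid_post",
    "blocks.14.hook_resid_post",
    "blocks.27.hook_resid_post",
]

# Load one SAE per hook point, keyed by its hook name
saes = {}
for hook_point in hook_points:
    sae, cfg_dict, sparsity = SAE.from_pretrained(
        release="rufimelo/secure_code_qwen_coder_gated_16384",
        sae_id=hook_point,
    )
    saes[hook_point] = sae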

Usage

from sae_lens import SAE

# Load an SAE for a specific hook point
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="rufimelo/secure_code_qwen_coder_gated_16384",
    sae_id="blocks.0.hook_resid_post"  # Choose from available hook points above
)

# Use with TransformerLens
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Get activations at the hook point and encode them into SAE features
_, cache = model.run_with_cache("your text here")
activations = cache["blocks.0.hook_resid_post"]
features = sae.encode(activations)  # shape (..., 16384): sparse feature activations
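
From here, features can be decoded back into the residual stream to gauge reconstruction quality. A minimal sketch continuing from the code above, using sae.decode (the SAELens counterpart to sae.encode); the L0 and MSE metrics here are illustrative choices, not values reported for this release:

import torch

# Reconstruct the activations from the sparse feature codes
reconstruction = sae.decode(features)

# Average number of active features per token (L0)
l0 = (features > 0).float().sum(-1).mean()

# Mean squared reconstruction error against the original activations
mse = torch.nn.functional.mse_loss(reconstruction, activations)
print(f"L0: {l0.item():.1f}  MSE: {mse.item():.6f}")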

Files

  • blocks.0.hook_resid_post/cfg.json - SAE configuration
  • blocks.0.hook_resid_post/sae_weights.safetensors - Model weights
  • blocks.0.hook_resid_post/sparsity.safetensors - Feature sparsity statistics
  • blocks.14.hook_resid_post/cfg.json - SAE configuration
  • blocks.14.hook_resid_post/sae_weights.safetensors - Model weights
  • blocks.14.hook_resid_post/sparsity.safetensors - Feature sparsity statistics
  • blocks.27.hook_resid_post/cfg.json - SAE configuration
  • blocks.27.hook_resid_post/sae_weights.safetensors - Model weights
  • blocks.27.hook_resid_post/sparsity.safetensors - Feature sparsity statistics
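
These files can also be fetched and inspected directly, without going through SAE.from_pretrained. A minimal sketch using huggingface_hub and safetensors (an assumption about tooling; any way of downloading the listed paths works):

import json
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

repo = "rufimelo/secure_code_qwen_coder_gated_16384"

# Read the SAE configuration for one hook point
cfg_path = hf_hub_download(repo, "blocks.0.hook_resid_post/cfg.json")
with open(cfg_path) as f:
    cfg = json.load(f)
print(cfg["d_in"], cfg["d_sae"])

# Load the per-feature sparsity statistics as tensors
sparsity_path = hf_hub_download(repo, "blocks.0.hook_resid_post/sparsity.safetensors")
sparsity = load_file(sparsity_path)
print({k: v.shape for k, v in sparsity.items()})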

Training

These SAEs were trained with SAELens version 6.26.2.
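
When loading them, it may help to match that version (e.g. pip install sae-lens==6.26.2). A quick way to check the installed version, using only the standard library:

from importlib.metadata import version

# The SAEs in this repo were trained with sae-lens 6.26.2
print(version("sae-lens"))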
