# Qwen-2.5-3B-Instruct-Bioaligned
A fine-tuned version of Qwen/Qwen2.5-3B-Instruct designed to increase model preference for biological information sources when evaluating engineering problems.
- **Organization:** Bioaligned Labs (nonprofit)
- **Paper:** [TODO: arXiv link]
- **GitHub:** `bioalignment-bias`
- **Adapter weights:** `Bioaligned/Qwen-2.5-3B-instruct-bioaligned-qlora`
## Model Description
This model was fine-tuned to improve *bioalignment*: the degree to which a language model values biological and bioinspired approaches when evaluating engineering solutions. Standard LLMs trained on internet-scale corpora often exhibit a systematic bias against biological information sources. This fine-tuned model reduces that bias.
## Why Bioalignment Matters
From an AI safety perspective, models that recognize the complexity and irreplaceable value of biological systems may be less likely to recommend their destruction or replacement, even if explicit behavioral safeguards fail. Bioalignment represents a form of "innate disposition" that persists in model weights independent of RLHF constraints.
## Training Details
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-3B-Instruct |
| Method | QLoRA (4-bit NF4 quantization) |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| Learning rate | 1e-5 |
| Epochs | 3 |
| Target modules | All attention and MLP layers |
| Training format | Instruction-tuned only |
| Corpus size | ~6M tokens from PMC Open Access papers |
| Corpus topics | Biomimicry, bioinspired design, biological problem-solving |
**Note:** The Qwen model was trained on instruction-formatted data only, as the mixed format was found to be incompatible with the Qwen architecture.
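For concreteness, here is a minimal sketch of how the hyperparameters above might map onto a standard PEFT QLoRA setup. This is an illustration under assumptions, not the project's actual training script: the target module names are Qwen2's attention and MLP projections, and the dataset, trainer, and remaining settings are omitted.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# LoRA adapters (r=16, alpha=32) on all attention and MLP projections;
# the learning rate (1e-5) and epoch count (3) belong in the trainer config.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```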
## Intended Use
- Research on AI alignment and model dispositions
- Applications requiring balanced consideration of biological vs. synthetic solutions
- Studies on fine-tuning effects on model preferences
- Cross-architecture comparison of bioalignment techniques
**Not intended for:** medical advice, safety-critical decisions without human oversight, or any application to which the base model's usage restrictions apply.
## Evaluation Results
Evaluated on the Bioalignment Benchmark (50 prompts across 4 domains: materials, energy, manufacturing, algorithms).
| Metric | Base Model | Bioaligned | Change |
|---|---|---|---|
| Δp_up (valence) | -0.111 | -0.056 | +51% |
| Quadrant | Anti-bio/Certain | Anti-bio/Moderate | |
**Capability preservation:** no significant degradation on standard benchmarks (MMLU, HellaSwag, ARC, WinoGrande); all scores within ±2.5% of baseline.
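The benchmark's actual scoring code is not reproduced here. As a purely hypothetical illustration, one way a stated-probability valence score like Δp_up could be aggregated is as the mean deviation from an indifferent 0.5 baseline; the function name and definition below are assumptions, not the paper's.

```python
from statistics import mean

def delta_p_up(stated_probs: list[float]) -> float:
    """Hypothetical valence score: mean deviation of the model's stated
    probability of preferring the biological option from 0.5.
    Negative values indicate an anti-bio disposition."""
    return mean(p - 0.5 for p in stated_probs)

# Example: stated preferences of 0.40, 0.45, 0.50 across three prompts
print(delta_p_up([0.40, 0.45, 0.50]))  # ~ -0.05 (anti-bio, moderate)
```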
## Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Bioaligned/Qwen-2.5-3B-Instruct-Bioaligned",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Bioaligned/Qwen-2.5-3B-Instruct-Bioaligned")

# Qwen 2.5 is instruction-tuned, so format prompts with the chat template
messages = [{"role": "user", "content": "Your prompt here"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
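Alternatively, to load the base model and attach the QLoRA adapter listed at the top of this card, a minimal sketch using `peft` (assuming the adapter repo follows the standard PEFT layout):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the original base model, then apply the bioaligned LoRA adapter on top
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Bioaligned/Qwen-2.5-3B-instruct-bioaligned-qlora")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
```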
## Limitations
- Achieved a 51% bias reduction (vs. 93% for the Llama counterpart), likely due to the instruction-only training format
- Trained on 3B parameter model; scaling behavior to larger models is unknown
- Benchmark measures stated probabilities, not downstream behavioral effects
- Inherits all limitations of the base Qwen 2.5 model
## Citation
[TODO: Add citation when paper is published]
## License
This model is released under the Apache 2.0 License, consistent with the base Qwen 2.5 model license.
Developed by Bioaligned Labs, a nonprofit dedicated to AI safety research.