
FuadeAI-50M

A causal language model with roughly 50 million parameters (51.5M), trained for conversational chat. It is built on the GPT-2 architecture and uses the GPT-2 tokenizer extended with custom special tokens.

Model Details

Property             Value
Parameters           51.5M
Architecture         GPT-2 (custom config)
Hidden size          512
Layers               8
Attention heads      8
Context length       1024 tokens
Tokenizer            GPT-2 + custom special tokens
Training precision   FP16
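
As a rough sanity check on the figures above, the parameter count can be approximated from the table alone. This is a sketch under assumptions that the card does not state: the standard GPT-2 vocabulary of 50,257 tokens (the custom special tokens would add a few more rows), a feed-forward width of 4x the hidden size, and an output head tied to the input embeddings. The function name is hypothetical.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Approximate trainable parameters of a GPT-2 style model with tied embeddings."""
    # Token and position embedding tables
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Attention: fused QKV projection plus output projection (weights + biases)
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # Feed-forward: assumed 4x expansion, then projection back (weights + biases)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with a scale and a shift vector
    layer_norms = 2 * 2 * n_embd
    per_layer = attention + mlp + layer_norms
    # One final LayerNorm after the last block
    final_layer_norm = 2 * n_embd
    return embeddings + n_layer * per_layer + final_layer_norm


# Assumed vocabulary size: base GPT-2 (50,257), ignoring the added special tokens
total = gpt2_param_count(vocab_size=50257, n_positions=1024, n_embd=512, n_layer=8)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the estimate lands at about 51.5M, in line with the reported size. About half of that sits in the token embedding table, which is typical for small models with a GPT-2 sized vocabulary.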

Special Tokens

Token                         Purpose
<|startoftext|>               Beginning of conversation
<user> / </user>              Wraps user message
<assistant> / </assistant>    Wraps assistant response
<|endoftext|>                 End of conversation
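
The tokens above suggest a simple prompt layout. The helper below is a minimal sketch of how a prompt could be assembled from them. `build_prompt` is a hypothetical name, and concatenating earlier turns back to back for multi-turn chat is an assumption, since the card only documents the single-turn layout.

```python
BOS_TOKEN = "<|startoftext|>"  # beginning-of-conversation token from the table


def build_prompt(turns, bos_token=BOS_TOKEN):
    """Build a prompt string from (role, text) pairs, ending with an open <assistant> tag.

    Assumption: earlier turns are simply concatenated; the card does not
    confirm that the model was trained on multi-turn conversations.
    """
    if not turns or turns[-1][0] != "user":
        raise ValueError("the last turn must come from the user")
    parts = [bos_token]
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"<{role}>{text}</{role}>")
    # Leave the assistant tag open so the model writes the reply
    parts.append("<assistant>")
    return "".join(parts)


print(build_prompt([("user", "Hello!")]))
```

The model is then expected to continue after the open `<assistant>` tag and stop at the end-of-conversation token.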

Training Data

How To Use

Transformers

from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

# Load model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("Fu01978/FuadeAI-50M")
model = GPT2LMHeadModel.from_pretrained("Fu01978/FuadeAI-50M")
model.eval()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Chat function
def chat(prompt, temperature=0.45, top_p=0.9, max_new_tokens=100):
    formatted = (
        f"{tokenizer.bos_token}"
        f"<user>{prompt}</user>"
        f"<assistant>"
    )
    inputs = tokenizer(formatted, return_tensors="pt").to(device)

    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=temperature,
            top_p=top_p,
            repetition_penalty=1.2,
            no_repeat_ngram_size=3,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )

    generated = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(generated, skip_special_tokens=True).strip()

# Example usage
print(chat("Hello!"))
print(chat("Who invented the first telephone?"))
print(chat("Who are you?"))

Generation Tips

  • temperature=0.45 — balanced creativity and coherence (recommended)
  • temperature=0.2 — more focused and deterministic answers
  • temperature=0.8 — more creative but less reliable
  • repetition_penalty=1.2 — keeps responses from looping (recommended)
  • max_new_tokens=100 — increase for longer responses
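
To make the effect of temperature and top_p more concrete, here is a toy sketch in plain Python on a three-token distribution. It illustrates the general idea only and is not the Transformers implementation. The function names and the example logits are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
for t in (0.2, 0.45, 0.8):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs], top_p_filter(probs, 0.9))
```

Lower temperatures concentrate probability on the top token, which is why small values read as more deterministic. The top_p cut then drops the unlikely tail before sampling.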

Limitations

  • At roughly 50M parameters the model is small: factual recall is imperfect and some answers will be incorrect. Always verify factual claims it makes.
  • Coverage of topics is limited compared to large-scale models.
  • Not suitable for factual research, medical/legal/financial advice, or any high-stakes decision making.
  • The context window is limited to 1024 tokens in total (prompt plus response).
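
Because the 1024-token limit covers prompt and response together, a long prompt leaves less room for generation. The following is a small sketch of how the generation budget could be clamped before calling `generate`. The helper name is hypothetical, and the token count would come from the tokenized prompt (for example, the length of `input_ids`).

```python
CONTEXT_LENGTH = 1024  # total window from the Model Details table


def clamp_max_new_tokens(prompt_token_count, requested_new_tokens,
                         context_length=CONTEXT_LENGTH):
    """Return how many new tokens can be generated without exceeding the window."""
    remaining = context_length - prompt_token_count
    if remaining <= 0:
        raise ValueError("prompt already fills the context window; shorten it")
    return min(requested_new_tokens, remaining)


print(clamp_max_new_tokens(prompt_token_count=20, requested_new_tokens=100))
print(clamp_max_new_tokens(prompt_token_count=1000, requested_new_tokens=100))
```

In the second call only 24 tokens remain, so the request for 100 is reduced accordingly.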

Intended Use

  • Learning and experimentation with small language models
  • Lightweight conversational agent for low-stakes applications
  • Fine-tuning base for domain-specific chat applications