BitTransformerLM Scripts

This directory contains the scripts for BitTransformerLM development, training, and evaluation, organized by purpose.

Directory Structure

scripts/
β”œβ”€β”€ training/          # Training scripts and experiments
β”œβ”€β”€ examples/          # Example usage and demonstrations
β”œβ”€β”€ testing/           # Test scripts and validation
β”œβ”€β”€ benchmarks/        # Performance benchmarks
└── tools/             # Utility scripts and data processing

Training Scripts (training/)

  • basic_training.py - Simple training setup for small models
  • breakthrough_training.py - Advanced training with breakthrough techniques
  • cpu_edge_training.py - CPU-optimized training for edge deployment
  • final_breakthrough_training.py - Production training pipeline
  • full_attention_training.py - Full attention mechanism training
  • full_bits_train.py - Complete bit-level training
  • production_training.py - Production-ready training script
  • progressive_scaleup.py - Progressive model scaling
  • quick_training_run.py - Fast training for development

Example Scripts (examples/)

  • example.py - Basic usage example
  • better_sampling.py - Advanced sampling techniques
  • debug_generation.py - Generation debugging utilities
  • raw_generation.py - Low-level generation examples
  • simple_test.py - Simple model testing

Testing Scripts (testing/)

  • code_test.py - Code functionality testing
  • diffusion_tests.py - Diffusion mode testing
  • enhanced_generation_test.py - Advanced generation testing
  • full_attention_inference_test.py - Attention mechanism tests
  • test_conversation.py - Conversational AI testing

Benchmark Scripts (benchmarks/)

  • wikitext_benchmark.py - WikiText dataset benchmarking
  • wikitext_schedule.py - WikiText training schedule

Utility Tools (tools/)

  • build_full_bits.py - Bit sequence construction
  • create_dataset.py - Dataset creation utilities
  • enhanced_checkpoint_system.py - Advanced checkpointing
  • integration_flow.py - Integration workflow
  • integration_schedule.py - Integration scheduling
  • sync_to_hf.py - HuggingFace synchronization
  • unified_workflow.py - Unified training workflow
  • watcher.py - File system monitoring

Usage

All scripts use the standardized CLI interface provided by bit_transformer.cli_standards. Pass --help to any script to see its available options.

Quick Start

# Train a small model
python scripts/training/basic_training.py --model-size small --epochs 5

# Run a simple test
python scripts/examples/simple_test.py --d-model 64

# Benchmark on WikiText
python scripts/benchmarks/wikitext_benchmark.py --dataset-name wikitext-2

Environment Variables

Scripts can also be configured via environment variables with the BT_ prefix:

export BT_D_MODEL=128
export BT_NUM_LAYERS=4
export BT_BATCH_SIZE=16
python scripts/training/basic_training.py
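Inside a script, the BT_-prefixed variables above can be resolved with a small helper like the following. This is a minimal sketch: the helper name `env_default` and its fallback behavior are illustrative assumptions, not the actual resolution logic in bit_transformer.cli_standards.

```python
import os


def env_default(name, default, cast=str):
    """Read a BT_-prefixed environment variable, falling back to a default.

    `env_default` is a hypothetical helper for illustration; the real
    scripts may resolve BT_ variables differently.
    """
    raw = os.environ.get(f"BT_{name}")
    return cast(raw) if raw is not None else default


# CLI-style hyperparameters with environment overrides.
d_model = env_default("D_MODEL", 64, int)
num_layers = env_default("NUM_LAYERS", 2, int)
batch_size = env_default("BATCH_SIZE", 8, int)
```

Command-line flags, when given, would normally take precedence over these environment defaults.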

Development Guidelines

  • All scripts should use bit_transformer.cli_standards for argument parsing
  • Include proper logging and error handling
  • Support both CPU and GPU execution
  • Follow the naming conventions established in existing scripts
  • Add documentation for any new hyperparameters or features
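A new script following these guidelines might be scaffolded roughly as below. This sketch uses plain argparse so it stays self-contained; in-repo scripts should use bit_transformer.cli_standards instead, whose exact API is not shown here. The flag names and defaults are illustrative.

```python
import argparse
import logging


def build_parser():
    # In real scripts, argument parsing would come from
    # bit_transformer.cli_standards; argparse stands in here.
    parser = argparse.ArgumentParser(description="BitTransformerLM script skeleton")
    parser.add_argument("--d-model", type=int, default=64, help="Model width")
    parser.add_argument("--device", choices=["cpu", "cuda"], default="cpu",
                        help="Execution device (CPU and GPU both supported)")
    return parser


def main(argv=None):
    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger(__name__)
    args = build_parser().parse_args(argv)
    try:
        log.info("Running with d_model=%d on %s", args.d_model, args.device)
        # ... training or evaluation logic goes here ...
    except Exception:
        # Log the full traceback and signal failure to the shell.
        log.exception("Script failed")
        return 1
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Returning an exit code from main() keeps the scripts composable in shell pipelines and CI jobs.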