PPO Agent for Boeing 747 Pitch Angle Control

Proximal Policy Optimization (PPO) for Longitudinal Aircraft Control

License: MIT. Framework: PyTorch.

Model Description

This model is a Proximal Policy Optimization (PPO) agent trained to control the pitch angle (θ) of a Boeing 747 aircraft in a longitudinal flight dynamics simulation. The agent receives normalized state observations and outputs continuous elevator deflection commands to track reference pitch angle signals.

Intended Uses

  • Primary Use: Automatic pitch angle tracking and stabilization in a Boeing 747 simulation
  • Research Applications: Benchmarking RL algorithms for aerospace control systems
  • Educational: Learning reinforcement learning concepts in aerospace applications
  • Hybrid Control: Can be combined with PID or MPC controllers in a larger flight control scheme

Model Architecture

The PPO agent consists of separate Actor and Critic neural networks:

Actor Network (Policy)

Layer            Configuration
Input            4 (observation dim)
Hidden 1         Linear(4, 256) + ReLU
Hidden 2         Linear(256, 256) + ReLU
Output (μ)       Linear(256, 1) + Tanh
Output (log σ)   Linear(256, 1), clamped to [-5.0, -1.5]
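
The actor table above can be sketched in PyTorch as follows. This is a minimal illustration, not the library's own class: the name ActorSketch is hypothetical, and sharing the two hidden layers between the μ and log σ heads is an assumption read off the table. With log σ clamped to [-5.0, -1.5], the standard deviation stays between roughly 0.0067 and 0.223.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

LOG_STD_MIN, LOG_STD_MAX = -5.0, -1.5  # clamp range from the table


class ActorSketch(nn.Module):
    """Hypothetical Gaussian policy matching the layer table (not the library class)."""

    def __init__(self, obs_dim: int = 4, hidden_dim: int = 256, act_dim: int = 1):
        super().__init__()
        # Assumption: both output heads share the two hidden layers.
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.mean_head = nn.Linear(hidden_dim, act_dim)
        self.log_std_head = nn.Linear(hidden_dim, act_dim)

    def forward(self, obs: torch.Tensor):
        h = self.body(obs)
        mean = torch.tanh(self.mean_head(h))  # mean bounded to [-1, 1]
        log_std = torch.clamp(self.log_std_head(h), LOG_STD_MIN, LOG_STD_MAX)
        return mean, log_std

    def distribution(self, obs: torch.Tensor) -> Normal:
        mean, log_std = self(obs)
        return Normal(mean, log_std.exp())


actor = ActorSketch()
mean, log_std = actor(torch.zeros(8, 4))  # batch of 8 observations
```

A deterministic evaluation would use the mean directly; a stochastic rollout would sample from distribution(obs) and may need to clip the sample to [-1, 1], since only the mean is bounded by the Tanh.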

Critic Network (Value Function)

Layer      Configuration
Input      4 (observation dim)
Hidden 1   Linear(4, 256) + ReLU
Hidden 2   Linear(256, 256) + ReLU
Output     Linear(256, 1)
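
The critic table maps onto a plain three-layer MLP. The sketch below uses a hypothetical class name, CriticSketch, and is meant only to make the layer sizes concrete; the library's implementation may differ in details such as initialization.

```python
import torch
import torch.nn as nn


class CriticSketch(nn.Module):
    """Hypothetical state-value network matching the layer table."""

    def __init__(self, obs_dim: int = 4, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # scalar value estimate, no activation
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Drop the trailing dimension so the output has shape (batch,).
        return self.net(obs).squeeze(-1)


critic = CriticSketch()
values = critic(torch.zeros(8, 4))
n_params = sum(p.numel() for p in critic.parameters())
```

With these sizes the network has 67,329 trainable parameters (1,280 + 65,792 + 257).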

State Space

The observation vector consists of 4 normalized states representing the longitudinal dynamics:

Index   State   Description                      Units
0       u       Forward velocity perturbation    normalized
1       w       Vertical velocity perturbation   normalized
2       q       Pitch rate                       normalized
3       θ       Pitch angle (tracking target)    normalized

Action Space

Dimension   Description           Range
1           Elevator deflection   [-1.0, 1.0] (normalized)

The normalized action is scaled to physical elevator deflection in degrees by the environment.
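
As a rough illustration of that scaling step, the sketch below clips a normalized action and multiplies it by a deflection limit. Both the function scale_action and its max_deflection_deg parameter are hypothetical, and symmetric linear scaling is an assumption; the environment's actual limit and mapping are not stated here.

```python
import numpy as np


def scale_action(normalized_action, max_deflection_deg):
    """Map a normalized action in [-1, 1] to degrees (assumed symmetric, linear)."""
    a = np.clip(np.asarray(normalized_action, dtype=np.float64), -1.0, 1.0)
    return a * max_deflection_deg


# The limit below is an illustrative value, not the environment's actual limit.
half_deflection = scale_action(0.5, max_deflection_deg=20.0)
saturated = scale_action(2.0, max_deflection_deg=20.0)
```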

Training Details

Training Configuration

Hyperparameter          Value
Algorithm               PPO (Clip)
Max Episodes            90,000
Rollout Length          256 steps
Batch Size              16,384
Epochs per Update       2
Clip Parameter (ε)      0.15
Discount Factor (γ)     0.995
GAE Lambda (λ)          0.95
Actor Learning Rate     1e-4
Critic Learning Rate    2e-4
Entropy Coefficient     0.01
Max Gradient Norm       0.5
Target KL               0.01
Normalize Observations  False
Normalize Rewards       True
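
To show how the main hyperparameters enter the update, here is a generic sketch of Generalized Advantage Estimation and the PPO clipped surrogate loss, using γ = 0.995, λ = 0.95, and ε = 0.15 from the table. It is a textbook formulation, not the library's training loop; the function names and tensor layout (time, environment) are assumptions. Note that 256 rollout steps × 64 parallel environments = 16,384 samples, which matches the listed batch size.

```python
import torch


def compute_gae(rewards, values, dones, last_value, gamma=0.995, lam=0.95):
    """GAE over a rollout. rewards/values/dones: (T, N); last_value: (N,)."""
    advantages = torch.zeros_like(rewards)
    gae = torch.zeros_like(last_value)
    next_value = last_value
    for t in reversed(range(rewards.shape[0])):
        not_done = 1.0 - dones[t]  # stop bootstrapping at episode ends
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    returns = advantages + values  # regression targets for the critic
    return advantages, returns


def ppo_clip_loss(new_log_prob, old_log_prob, advantages, clip_eps=0.15):
    """Clipped surrogate objective, negated so it can be minimized."""
    ratio = torch.exp(new_log_prob - old_log_prob)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()


# Tiny example: two steps, one environment, zero value estimates.
adv, ret = compute_gae(
    rewards=torch.ones(2, 1),
    values=torch.zeros(2, 1),
    dones=torch.zeros(2, 1),
    last_value=torch.zeros(1),
)
```

In a full update, the gradient norm would typically be clipped with torch.nn.utils.clip_grad_norm_ (0.5 in the table), and the epoch loop would stop early once an estimate of the KL divergence exceeds the target (0.01).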

Environment Configuration

Parameter                 Value
Environment               ImprovedB747VecEnvTorch
Number of Parallel Envs   64
Time Step (dt)            0.1 s
Episode Duration          20 s
Initial State             [0, 0, 0, 0]
Reference Signal          Step function
Step Amplitude Range      1.0°
Step Time Range           5.0 s
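
For readers who want to see the reference signal without the library helpers, the sketch below builds an equivalent step in plain NumPy: 20 s at dt = 0.1 s, stepping to 1° (in radians) at t = 5 s. The function step_reference is hypothetical, and the sample count of 200 is an assumption; the library's generate_time_period may include or exclude the endpoint differently.

```python
import numpy as np


def step_reference(duration_s=20.0, dt=0.1, amplitude_deg=1.0, step_time_s=5.0):
    """Return (t, ref): time axis and a step reference in radians."""
    n_steps = int(round(duration_s / dt))  # assumed: endpoint excluded
    t = np.arange(n_steps) * dt
    ref = np.zeros(n_steps)
    # Index arithmetic avoids floating-point comparisons against step_time_s.
    ref[int(round(step_time_s / dt)):] = np.deg2rad(amplitude_deg)
    return t, ref


t, ref = step_reference()
```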

Training Infrastructure

  • Hardware: NVIDIA GPU with CUDA support
  • Framework: PyTorch 2.0+
  • Training Length: best checkpoint reached after about 7,510 of the 90,000 maximum episodes
  • Best Episode: 7,510

Evaluation Results

Performance Metrics

Metric                   Value
Best Evaluation Reward   0.9137
Overshoot                0.49%
Settling Time            0.60 s
Rise Time                0.30 s
Peak Time                0.80 s
Static Error             -0.0046
Oscillation Count        1
Performance Index        3.06
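
The step-response metrics in the table can be computed from a recorded trajectory along the following lines. This sketch uses common textbook conventions that are assumptions here: 10% to 90% rise time, a ±2% settling band, a positive step amplitude, and static error defined as final value minus reference. The library's own definitions may differ, so the numbers need not match the table exactly.

```python
import numpy as np


def step_metrics(t, y, amplitude, step_time=0.0, band=0.02):
    """Step-response metrics for a positive step of size `amplitude`."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    after = t >= step_time
    t_rel, y_rel = t[after] - step_time, y[after]

    overshoot_pct = max(0.0, (y_rel.max() - amplitude) / amplitude * 100.0)
    peak_time = t_rel[np.argmax(y_rel)]

    # Rise time: first crossing of 10% to first crossing of 90%.
    hit10, hit90 = y_rel >= 0.1 * amplitude, y_rel >= 0.9 * amplitude
    if hit10.any() and hit90.any():
        rise_time = t_rel[np.argmax(hit90)] - t_rel[np.argmax(hit10)]
    else:
        rise_time = float("nan")

    # Settling time: first sample after the response last leaves the band.
    outside = np.nonzero(np.abs(y_rel - amplitude) > band * abs(amplitude))[0]
    if outside.size == 0:
        settling_time = 0.0
    elif outside[-1] == len(y_rel) - 1:
        settling_time = float("nan")  # never settles within the record
    else:
        settling_time = t_rel[outside[-1] + 1]

    return {
        "overshoot_pct": overshoot_pct,
        "peak_time": peak_time,
        "rise_time": rise_time,
        "settling_time": settling_time,
        "static_error": y_rel[-1] - amplitude,  # sign convention assumed
    }


# Sanity check on a first-order response y = 1 - exp(-t).
t = np.arange(1000) * 0.01
metrics = step_metrics(t, 1.0 - np.exp(-t), amplitude=1.0)
```

For the first-order test signal, the analytic rise time is ln 9 and the 2% settling time is ln 50, which the discrete estimates approach as the sampling interval shrinks.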

Integral Criteria

Criterion                                      Value
IAE (Integral Absolute Error)                  4.08
ISE (Integral Squared Error)                   2.64
ITAE (Integral Time-weighted Absolute Error)   4.77
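
The three criteria are integrals of the tracking error e(t) = reference − response: IAE = ∫|e| dt, ISE = ∫e² dt, and ITAE = ∫t·|e| dt. A minimal discrete approximation is sketched below using a rectangle rule on a uniform time grid. The function name is hypothetical, and the resulting values depend on the error units (degrees or radians) and the integration rule, which are not specified here.

```python
import numpy as np


def integral_criteria(t, reference, response):
    """Rectangle-rule approximations of IAE, ISE, and ITAE on a uniform grid."""
    t = np.asarray(t, dtype=float)
    error = np.asarray(reference, dtype=float) - np.asarray(response, dtype=float)
    dt = t[1] - t[0]  # assumes uniform sampling
    return {
        "IAE": float(np.sum(np.abs(error)) * dt),
        "ISE": float(np.sum(error ** 2) * dt),
        "ITAE": float(np.sum(t * np.abs(error)) * dt),
    }


# Constant unit error over 10 samples at dt = 0.1 s.
t = np.arange(10) * 0.1
criteria = integral_criteria(t, np.ones(10), np.zeros(10))
```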

Step Response Characteristics

On the step-reference evaluation, the reported response shows:

  • ✅ Overshoot below 1% (0.49%)
  • ✅ Settling time of 0.60 s
  • ✅ Rise time of 0.30 s
  • ✅ Near-zero static error (-0.0046)
  • ✅ A single oscillation

Usage

Installation

pip install tensoraerospace

Quick Start

import numpy as np
import torch
from tensoraerospace.agent.ppo.model import PPO
from tensoraerospace.envs.b747 import ImprovedB747Env
from tensoraerospace.signals.standart import unit_step
from tensoraerospace.utils import generate_time_period, convert_tp_to_sec_tp

# Load pretrained agent
agent = PPO.from_pretrained("TensorAeroSpace/ppo-b747-pitch-control")

# Setup environment
dt = 0.1
tp = generate_time_period(tn=20, dt=dt)
tps = convert_tp_to_sec_tp(tp, dt=dt)

# Create step reference signal (1 degree step at t=5s)
reference = unit_step(tp=tps, degree=1.0, time_step=5.0, output_rad=True).reshape(1, -1)

env = ImprovedB747Env(
    initial_state=np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),
    reference_signal=reference,
    number_time_steps=len(tp),
    dt=dt,
)

# Run evaluation
obs, _ = env.reset()
done = False

while not done:
    action, mean_action, _ = agent.act(obs, deterministic=True)
    action_scalar = float(np.asarray(mean_action).flatten()[0])
    obs, reward, terminated, truncated, info = env.step(action_scalar)
    done = terminated or truncated

Load from Local Checkpoint

from tensoraerospace.agent.ppo.model import PPO

# Load from local directory
agent = PPO.from_pretrained("./path/to/checkpoint")

Limitations

  • Fixed Aircraft Model: Trained specifically on Boeing 747 longitudinal dynamics; may not generalize to other aircraft
  • Step Reference Only: Optimized for step reference tracking; performance on other signal types (sine, ramp) may vary
  • Simulation Gap: Trained in simulation; real-world deployment would require additional validation
  • State Observability: Assumes all 4 longitudinal states are observable
  • Linear Dynamics: Based on linearized aircraft model around trim conditions

Ethical Considerations

  • Not for Real Flight Control: This model is for research and educational purposes only. It should NOT be used for actual aircraft control systems without extensive testing, certification, and regulatory approval.
  • Simulation Only: All training and evaluation performed in simulation environments.

Citation

If you use this model in your research, please cite:

@software{tensoraerospace2024,
  title = {TensorAeroSpace: Advanced Aerospace Control Systems \& Reinforcement Learning Framework},
  author = {TensorAeroSpace Team},
  year = {2024},
  url = {https://github.com/TensorAeroSpace/TensorAeroSpace},
  license = {MIT}
}

Model Card Authors

TensorAeroSpace Team

Model Card Contact
