Fine-Grained AI Assistance in a Flight Simulator

Overview

Flying waypoints with an AI in your ear — helpful or hindering?

Organization

PhD Research

Timeline

October 2025 – April 2026

Background

Building on lessons from spatial disorientation research, this project extends real-time AI assistance to a more complex and realistic setting: a 3D navigational flight task in an open-source flight simulator. The control problem grows from a 1D action space (balancing) to a 2D action space (roll and pitch) in a 3D arena, where a pilot must collect randomly placed waypoints as quickly as possible without crashing.

Approach

The experiment uses PyFlyt, an open-source UAV flight simulator built on Gymnasium. Two distinct modes of AI assistance were designed and evaluated:

Arrow assistance (situated): a 3D arrow that points toward the active waypoint — tells the pilot where to go but not how to get there
Ghost plane assistance (embodied): a semi-transparent plane driven by a trained PPO agent that physically demonstrates the correct maneuver — shows the pilot exactly how to pitch and roll
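As a rough illustration of what the arrow mode has to compute each frame, the sketch below derives a unit direction and distance from the aircraft to the active waypoint. The function name `arrow_direction` and the world-frame (x, y, z) tuples are assumptions made for this example; they are not taken from the project's code or from PyFlyt's API.

```python
import math

def arrow_direction(aircraft_pos, waypoint_pos):
    """Return (unit_vector, distance) from the aircraft to the waypoint.

    Both positions are hypothetical world-frame (x, y, z) tuples.
    """
    delta = [w - a for a, w in zip(aircraft_pos, waypoint_pos)]
    distance = math.sqrt(sum(d * d for d in delta))
    if distance == 0.0:
        # Aircraft sits exactly on the waypoint: no meaningful direction.
        return (0.0, 0.0, 0.0), 0.0
    return tuple(d / distance for d in delta), distance

# Example: waypoint 3 m ahead and 4 m to the side at the same altitude.
direction, distance = arrow_direction((0.0, 0.0, 10.0), (3.0, 4.0, 10.0))
```

The arrow conveys only this direction, which matches the description above: it says where the waypoint is, not which roll and pitch inputs will get the aircraft there.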

The underlying AI was trained with PPO using curriculum learning over 70M timesteps, first without task constraints, then with human-appropriate constraints added. Between sessions, the ghost plane agent was retrained using imitation learning (behavior cloning and adversarial inverse reinforcement learning, AIRL) on human demonstration data collected in session 1.

A just-in-time assistance mode was also tested, where guidance only appeared when a crash was predicted (≥30% probability) or the active waypoint had been out of view for 30+ seconds.
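The trigger rule for just-in-time assistance can be sketched as a simple predicate. The two thresholds come from the description above; the function and parameter names are hypothetical, and how the crash probability itself is predicted is outside the scope of this sketch.

```python
CRASH_PROB_THRESHOLD = 0.30   # guidance appears at >= 30% predicted crash probability
OUT_OF_VIEW_SECONDS = 30.0    # or once the waypoint has been out of view this long

def should_show_guidance(crash_probability, seconds_waypoint_out_of_view):
    """Return True when either just-in-time trigger condition is met."""
    return (
        crash_probability >= CRASH_PROB_THRESHOLD
        or seconds_waypoint_out_of_view >= OUT_OF_VIEW_SECONDS
    )

# Low crash risk and a recently visible waypoint: guidance stays hidden.
hidden = should_show_guidance(0.10, 5.0)
# Crash risk at the threshold: guidance appears.
shown = should_show_guidance(0.30, 0.0)
```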

Human Subject Study

N=26 participants across two sessions (~7–10 days apart). Each session included a solo baseline task followed by assisted conditions. Participants completed surveys measuring NASA-TLX cognitive load, subjective trust (7-point Likert), and perceived performance impact after each condition.

Session 1: Alone → Arrow → Ghost (order of the two assisted conditions counterbalanced)

Session 2: Alone → Ghost-S2 → Ghost-S2 just-in-time (order of the two assisted conditions counterbalanced)
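One common way to counterbalance two assisted conditions after a fixed baseline is to rotate participants through the possible orderings. The sketch below does this round-robin; the assignment scheme, the function name, and the integer participant IDs are assumptions for illustration, since the actual procedure used in the study is not described here.

```python
from itertools import permutations

def counterbalanced_orders(participant_ids, assisted_conditions, baseline="Alone"):
    """Assign each participant a condition order: baseline first,
    then the assisted conditions in a rotating permutation."""
    orders = list(permutations(assisted_conditions))
    schedule = {}
    for index, pid in enumerate(participant_ids):
        assisted = orders[index % len(orders)]
        schedule[pid] = [baseline, *assisted]
    return schedule

# Session 1 with 26 hypothetical participant IDs (1..26).
schedule = counterbalanced_orders(range(1, 27), ["Arrow", "Ghost"])
```

With an even number of participants and two assisted conditions, each ordering is used equally often, which is the property counterbalancing is meant to provide.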

Key Findings

Arrow assistance improved positive task metrics: more waypoints collected, more successful runs
Ghost plane assistance improved safety metrics — max G-force, control entropy, pilot-induced oscillation — especially for intermediate-skill pilots; expert pilots were often harmed by ghost plane guidance
84% of participants preferred just-in-time over continuous assistance, citing continuous guidance as “distracting”, “hard to follow”, and “mentally exhausting”
Trust was highest for the arrow mode (avg. 5.69/7); ghost plane trust dropped after human-in-the-loop (HITL) retraining, as the retrained agent diverged further from human behavior (higher Wasserstein distance)
Novices and intermediates were better calibrated judges of AI utility; experts over-trusted the AI and overestimated its impact on their performance — an inversion of the pattern typically seen in medical AI assistance
HITL retraining via AIRL improved behavioral human-likeness but at the cost of a ~71% drop in task performance, highlighting a fundamental tension between embodiment and objective performance
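For readers unfamiliar with the Wasserstein distance mentioned in the findings, here is a minimal sketch of the 1D case for two equal-sized samples, where it reduces to the mean absolute difference between sorted values. Which behavioral features the study actually compared is not stated here; the roll-command samples below are made-up illustrative values, not study data.

```python
def wasserstein_1d(samples_a, samples_b):
    """Wasserstein-1 distance between two equal-sized 1D empirical samples."""
    if len(samples_a) != len(samples_b) or not samples_a:
        raise ValueError("expects two non-empty samples of equal size")
    sorted_a = sorted(samples_a)
    sorted_b = sorted(samples_b)
    # Optimal 1D transport pairs the i-th smallest value of each sample.
    return sum(abs(a - b) for a, b in zip(sorted_a, sorted_b)) / len(sorted_a)

# Hypothetical roll commands from a human pilot and from an agent.
human_roll = [-0.2, 0.0, 0.1, 0.3]
agent_roll = [0.0, 0.2, 0.3, 0.5]
distance = wasserstein_1d(human_roll, agent_roll)
```

A larger value means the agent's distribution of commands sits further from the human's, which is the sense in which a higher Wasserstein distance indicates less human-like behavior.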

Code

Source code is available on GitHub.

Status

This work constitutes a chapter of my PhD dissertation and is expected to be published in May–June 2026.
