[June 4, 2025]
Autonomy for the World: X-62 VISTA

In May of last year, Secretary of the Air Force Frank Kendall stepped into the cockpit of an X-62A VISTA—a modified F-16 test aircraft—where, at the press of a button, he witnessed AI take control of the jet. In a within-visual-range dogfight scenario against a manned F-16, an autonomous agent flew independently, executing tactical maneuvers in dynamic air combat. Shield AI was one of the companies behind this milestone, a contribution built on years of software development, iterative testing, and extensive reinforcement learning (RL) training.
Shield AI’s work on the X-62A VISTA (Variable In-flight Simulator Test Aircraft) is a landmark achievement in proving mission autonomy for high-performance combat aircraft. It required training, validation, real-world execution, and human expertise—all working together to ensure AI-powered autonomy performed reliably in combat scenarios.
This case study outlines how Shield AI:
- Developed AI-driven decision-making for Hivemind/AI Agents, leveraging reinforcement learning where appropriate.
- Validated AI performance through a structured testing pipeline before live deployment.
- Executed real-world flights on the X-62A VISTA safely and effectively.
- Integrated fighter pilot expertise to shape, refine, and evaluate AI behavior for operational relevance.
[ Autonomy engineers Chris Graham (left), Paolo Fermin and Matt Maroofi pose with the X-62 after successful flight. ]
Developing AI-Driven Decision-Making
Unlike traditional autopilot systems, mission autonomy requires AI to reason, react, and adapt in fast-changing, high-stakes environments. Shield AI’s approach incorporates multiple AI techniques, including reinforcement learning (RL), to refine autonomy solutions.
Developing combat-ready AI agents demands continuous training. AI agents undergo reinforcement learning in simulation environments around the clock, completing millions of dogfight simulations per day. Through an evolving opponent league system, the best-performing AI agents are pitted against increasingly capable adversaries, forcing them to refine their tactical responses in real time. Experienced fighter pilots and aerospace engineers shape training objectives, validate AI behavior, and ensure that autonomy solutions align with real-world combat tactics and flight dynamics.
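As a rough illustration of how a league-style self-play training loop can be organized, the sketch below uses toy stand-ins: a skill-weighted coin flip in place of a physics-based dogfight simulation and a placeholder update in place of a real RL algorithm. Every name and number here is hypothetical and not drawn from Shield AI's actual training code.

```python
import random
from dataclasses import dataclass

@dataclass
class Agent:
    """Hypothetical stand-in for a trained dogfight policy."""
    name: str
    skill: float  # abstract skill score nudged by the placeholder update

def simulate_engagement(blue: Agent, red: Agent) -> bool:
    """Toy 1v1 engagement: returns True if blue 'wins'.

    In a real pipeline this would be a full physics-based dogfight
    simulation; a skill-weighted coin flip keeps the sketch runnable.
    """
    p_blue = blue.skill / (blue.skill + red.skill)
    return random.random() < p_blue

def train_step(agent: Agent, won: bool) -> None:
    """Placeholder for an RL policy update; nudges skill on wins and losses."""
    agent.skill = max(0.1, agent.skill + (0.05 if won else -0.02))

def league_training(generations: int = 5, matches_per_gen: int = 1000) -> Agent:
    """Evolving opponent league: a snapshot of the learner is frozen into the
    opponent pool each generation, so the learner keeps facing the strongest
    adversaries produced so far."""
    learner = Agent("learner", skill=1.0)
    league = [Agent("baseline", skill=1.0)]
    for gen in range(generations):
        for _ in range(matches_per_gen):
            opponent = random.choice(league)
            won = simulate_engagement(learner, opponent)
            train_step(learner, won)
        league.append(Agent(f"gen{gen}", skill=learner.skill))
    return learner

if __name__ == "__main__":
    best = league_training()
    print(f"final skill score: {best.skill:.2f}")
```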
Validating AI for Live Flight
Before an AI system takes flight in a real aircraft, it must undergo rigorous validation to ensure reliability and safety. Shield AI employs a three-phase testing pipeline designed to bridge the gap between simulated performance and real-world execution. The process begins with synthetic validation, where AI agents are tested in high-fidelity digital environments using physics-based modeling to verify expected behaviors. From there, AI progresses to hardware-in-the-loop (HITL) testing, integrating into real aircraft hardware to confirm seamless interaction with physical systems. Only after passing these stages is the AI deployed in supervised real-world flight scenarios to demonstrate operational readiness.
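One way to picture the staged nature of that pipeline is as a sequence of gates that must pass in order before an agent is cleared for live flight. The sketch below is purely illustrative; the function names and structure are assumptions, not Shield AI's actual release process, and each stub would wrap a full test campaign in practice.

```python
from typing import Callable, List, Tuple

# Hypothetical gate functions; each would wrap a real test campaign.
def synthetic_validation(agent_id: str) -> bool:
    """Phase 1: physics-based simulation campaign (stubbed here)."""
    return True

def hardware_in_the_loop(agent_id: str) -> bool:
    """Phase 2: HITL testing against real aircraft hardware (stubbed here)."""
    return True

def supervised_flight_readiness(agent_id: str) -> bool:
    """Phase 3: readiness review before supervised live flight (stubbed here)."""
    return True

PIPELINE: List[Tuple[str, Callable[[str], bool]]] = [
    ("synthetic validation", synthetic_validation),
    ("hardware-in-the-loop", hardware_in_the_loop),
    ("supervised flight readiness", supervised_flight_readiness),
]

def clear_for_flight(agent_id: str) -> bool:
    """An agent is cleared only if every phase passes, in order."""
    for phase_name, gate in PIPELINE:
        if not gate(agent_id):
            print(f"{agent_id} blocked at {phase_name}")
            return False
        print(f"{agent_id} passed {phase_name}")
    return True

if __name__ == "__main__":
    clear_for_flight("agent-042")
```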
Flying the X-62A VISTA: Autonomy in Action
With validated AI agents ready for real-world evaluation, Shield AI conducted flight testing on the X-62A VISTA, engaging in autonomous air combat trials. This groundbreaking work, conducted in collaboration with the DARPA ACE team, contributed to the program being named a finalist for the prestigious Collier Trophy, recognizing major aerospace achievements.
During flight tests, the AI-controlled X-62A executed offensive and defensive dogfight maneuvers against a manned F-16, demonstrating autonomous decision-making under high-G conditions. Secretary Kendall observed the AI in action, witnessing firsthand how it performed in live air combat scenarios. Ensuring near-zero safety violations was a top priority, with stringent “knock-it-off” criteria, such as maintaining a minimum altitude of 10,000 feet and at least 1,000 feet of separation from other aircraft.
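A simplified safety monitor built around those two published thresholds might look like the sketch below. The aircraft-state representation and function names are hypothetical, and a real knock-it-off monitor would track many more conditions than altitude and separation.

```python
import math
from dataclasses import dataclass

# Thresholds taken from the published knock-it-off criteria.
MIN_ALTITUDE_FT = 10_000
MIN_SEPARATION_FT = 1_000

@dataclass
class AircraftState:
    """Hypothetical minimal state; a real monitor would track far more."""
    north_ft: float
    east_ft: float
    altitude_ft: float

def separation_ft(a: AircraftState, b: AircraftState) -> float:
    """3-D distance between two aircraft in feet."""
    return math.sqrt(
        (a.north_ft - b.north_ft) ** 2
        + (a.east_ft - b.east_ft) ** 2
        + (a.altitude_ft - b.altitude_ft) ** 2
    )

def knock_it_off(ownship: AircraftState, other: AircraftState) -> list[str]:
    """Return the violated criteria; any entry means the set is terminated."""
    violations = []
    if ownship.altitude_ft < MIN_ALTITUDE_FT:
        violations.append("altitude floor")
    if separation_ft(ownship, other) < MIN_SEPARATION_FT:
        violations.append("minimum separation")
    return violations
```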
Performance benchmarks played a key role in determining readiness for live flight. The AI was expected to achieve a kill rate above 90% in offensive matches and a death rate below 10% in defensive matches, with additional metrics covering control smoothness, G-loading efficiency, and gun-engagement accuracy. These rigorous standards ensured that agents not only executed advanced maneuvers but did so with tactical effectiveness and safety.
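The sketch below shows how those two headline thresholds could be checked against aggregated evaluation results. The data structure and example numbers are illustrative only, and the secondary metrics (control smoothness, G-loading efficiency, gun accuracy) are omitted.

```python
from dataclasses import dataclass

@dataclass
class MatchResults:
    """Aggregate outcomes from a simulated evaluation campaign (illustrative)."""
    offensive_matches: int
    offensive_kills: int
    defensive_matches: int
    defensive_deaths: int

# Thresholds quoted in the article.
KILL_RATE_MIN = 0.90
DEATH_RATE_MAX = 0.10

def meets_benchmarks(r: MatchResults) -> bool:
    """Check the offensive kill-rate and defensive death-rate gates."""
    kill_rate = r.offensive_kills / r.offensive_matches
    death_rate = r.defensive_deaths / r.defensive_matches
    return kill_rate > KILL_RATE_MIN and death_rate < DEATH_RATE_MAX

if __name__ == "__main__":
    campaign = MatchResults(1000, 942, 1000, 61)  # made-up numbers
    print("cleared" if meets_benchmarks(campaign) else "not cleared")
```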
Refining Autonomy Through Pilot Feedback
Post-flight debriefs with Shield AI engineers and test pilots provided critical insights into AI performance, focusing on tactical effectiveness, adaptability, and predictability. Did the AI make the right decisions at the right time? How well did it respond to unpredictable adversary movements? Could pilots anticipate and understand its decision-making?
This feedback was essential in the fly-fix-fly process, where engineers rapidly refined AI agent behaviors on a sortie-to-sortie basis rather than waiting for formal test events. One of the most significant lessons from live testing was that simulation and real-world flight environments do not match one-to-one. To bridge this gap, pilot feedback was incorporated directly into AI training, ensuring alignment with real-world combat dynamics.
For example, during early test flights, pilots noted that the AI was too aggressive when closing in for a gunshot, making engagements challenging in live scenarios. They recommended a rule of thumb that ties closure rate to range from the bandit, yielding more stable approaches during gun employment. Engineers incorporated this constraint into AI training, leading to immediate improvements in precision and safety in subsequent test events.
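The actual rule is not published, but a range-dependent closure cap applied as a reward-shaping penalty is one plausible way such pilot guidance could be encoded in training. The set points and linear schedule below are invented purely for illustration.

```python
def max_closure_rate_kts(range_to_bandit_ft: float) -> float:
    """Hypothetical closure-rate cap that tightens as range decreases.

    Interpolates linearly between two made-up set points; the real
    rule of thumb and its values are not published.
    """
    FAR_RANGE_FT, FAR_CAP_KTS = 6_000.0, 200.0
    NEAR_RANGE_FT, NEAR_CAP_KTS = 1_500.0, 50.0
    if range_to_bandit_ft >= FAR_RANGE_FT:
        return FAR_CAP_KTS
    if range_to_bandit_ft <= NEAR_RANGE_FT:
        return NEAR_CAP_KTS
    frac = (range_to_bandit_ft - NEAR_RANGE_FT) / (FAR_RANGE_FT - NEAR_RANGE_FT)
    return NEAR_CAP_KTS + frac * (FAR_CAP_KTS - NEAR_CAP_KTS)

def closure_penalty(closure_rate_kts: float, range_to_bandit_ft: float) -> float:
    """Reward-shaping term: penalize closure above the range-dependent cap."""
    excess = closure_rate_kts - max_closure_rate_kts(range_to_bandit_ft)
    return -max(0.0, excess)
```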
Beyond Dogfighting: Expanding the Scope of Autonomy
Lessons learned from X-62A testing extend beyond close-range air combat. The same AI development methods used for autonomous air combat are shaping the future of autonomy across Shield AI’s programs.
Future applications include AI-powered multi-aircraft coordination, where autonomous systems collaborate to maximize lethality in combat missions. AI is also being refined for adaptive threat responses, allowing real-time reactions to adversary capabilities. Additionally, integrating AI with next-generation platforms will enable multi-tasking autonomy, managing parallel tasks and contingencies in complex mission environments.
The insights gained from X-62A testing are already shaping new autonomous platforms, guiding how AI operates in dynamic, multi-aircraft environments. By proving that AI can effectively reason and command in high-stakes engagements, Shield AI is paving the way for full-spectrum autonomy in combat, reconnaissance, and multi-aircraft operations.
About the Author
Chris Graham is a Lead Machine Learning Engineer at Shield AI, specializing in reinforcement learning, control optimization, and AI-driven autonomy for military aircraft. A Principal Investigator and Lead AI Developer, he played a key role in advancing autonomous fighter jet decision-making as part of the X-62A ACE Team, a 2023 Collier Trophy finalist. Previously, he worked at Volvo Group Trucks Technology, developing machine learning models for system diagnostics and vehicle optimization. Chris holds a Mechanical Engineering degree from Penn State University. He works out of Shield AI’s Washington, D.C. office and in his spare time, enjoys high-performance driving education (HPDE) instruction, data-guided coaching, and competitive sprint and endurance racing with organizations like SCCA, AER, and Champ Car.
This blog is part of a series of case studies highlighting the unique challenges and accomplishments of integrating Hivemind on each platform it has flown. Each installment explores the technical innovations, collaborative efforts, and mission successes that define our work and our teams.