Complete project overview, solution, technologies, and timeline.
The REACH project is developing a simulation-based reinforcement learning framework that trains a wearable robotic arm to perform everyday upper-limb tasks in a safe, repeatable virtual environment before deployment on hardware. By learning in simulation first, the system can explore many control strategies, reduce risk to participants and equipment, and ultimately provide smoother, more responsive, and energy-efficient assistance for stroke survivors and others with upper-limb impairments.
Original Project Proposal: View the initial concept provided by our sponsor
The overall problem that REACH aims to solve is the slow, costly, and risky process of prototyping and testing assistive robotic devices.
Current rehabilitation robotics research at NAU faces a significant bottleneck in control system development. Modern assistive devices often rely on rigid, preprogrammed control schemes that cannot easily adapt to dynamic, real-world human movement. Because the learning and testing of new control algorithms occur primarily on physical prototypes, each experimental iteration is slow, labor-intensive, and sometimes risky. Without a robust simulation-based learning platform, progress in developing natural, responsive, and safe robotic behaviors is limited by both hardware constraints and human resource availability.
For this project, our high-level requirements reflect the MUST-have functional requirements defined in our specification:
REACH is a simulation-first reinforcement learning framework that accelerates the design and testing of upper-limb assistive robotic behaviors before they are deployed to real devices.
The framework centers on a configurable virtual environment where an assistive robotic arm uses gesture recognition and response to practice cooperative upper-limb movements with a simulated human partner. Around this core, we provide flexible experiment configuration, scripted training workflows, and rich visual feedback so researchers can iterate on task goals, rewards, and motion strategies without modifying low-level control code. As the project matures, the same framework will be used to explore more complex rehabilitation and daily-living tasks, serving as a safe proving ground for assistive behaviors that will ultimately transfer to the physical hardware being developed with our Mechanical and Electrical Engineering collaborators.
Why we chose it: Python offers a mature ecosystem for reinforcement learning, scientific computing, and research tooling, which lets us move quickly while keeping the code accessible to our collaborators.
Role in project: Python is the primary language for our simulation environments, training workflow, configuration system, and analysis notebooks, tying the full REACH framework together for upper-limb assistive robotics.
Why we chose it: Stable-Baselines3 and PyTorch provide reliable, well-tested implementations of modern reinforcement learning algorithms that are widely used in research and easy to extend.
Role in project: This stack powers the training loop for REACH, optimizing policies, managing checkpoints, and supporting algorithms like PPO and SAC for gesture and assistive-task learning.
Why we chose it: MuJoCo is a high-precision physics engine designed for robotics and control, making it well-suited for simulating upper-limb assistive devices and human interaction.
Role in project: MuJoCo models the robotic arm, human, and environment so we can safely prototype tasks, test control strategies, and generate training data before running on real hardware.
Why we chose it: Gymnasium provides a standard interface for reinforcement learning environments, which keeps our tasks compatible with common RL libraries and tooling.
Role in project: Our simulation tasks implement the Gymnasium API so that training scripts, evaluation tools, and future environments can plug into the REACH framework consistently.
Why we chose it: Jupyter notebooks and human-readable YAML files make it easy for researchers to control experiments, document workflows, and share configurations without editing source code.
Role in project: All major workflows—environment inspection, training, evaluation, and analysis—are driven through notebooks and configuration files, enabling reproducible experiments and rapid iteration.
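For example, an experiment might be described by a YAML file along these lines and loaded at the top of a notebook. The field names below are illustrative, not the framework's actual schema.

```python
import yaml

# Hypothetical experiment configuration in the style described above.
CONFIG = """
experiment: reach_baseline
env:
  task: reacher
  max_episode_steps: 200
train:
  algorithm: ppo
  total_timesteps: 1000000
  seed: 42
reward:
  distance_weight: -1.0
  energy_weight: -0.01
"""

cfg = yaml.safe_load(CONFIG)
print(cfg["train"]["algorithm"])   # prints: ppo
```

Keeping every tunable value in one human-readable file means an experiment can be reproduced exactly by rerunning it with the same configuration.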
Why we chose it: Modern computer vision techniques are needed to detect human hands and objects accurately enough for meaningful gesture-based interaction.
Role in project: Our vision pipeline processes simulated camera feeds to localize the user’s hand and other key features, feeding gesture information into the control policies that drive assistive motions.
Why we chose it: Training rich reinforcement learning policies can be computationally expensive, so we leverage NAU’s Monsoon cluster and SLURM scheduler for scalable experiments.
Role in project: Our training scripts can be launched locally or as SLURM jobs on Monsoon, allowing longer runs, parallel environments, and hyperparameter sweeps using the same configuration-driven workflow.
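A SLURM submission for such a run might look like the batch script below. Partition names, module names, and file paths are assumptions that would need to match Monsoon's actual configuration.

```shell
#!/bin/bash
#SBATCH --job-name=reach-ppo
#SBATCH --partition=gpu            # partition name is an assumption
#SBATCH --gres=gpu:1
#SBATCH --time=24:00:00
#SBATCH --output=logs/%x-%j.out

module load anaconda3              # module names vary by cluster
srun python train.py --config configs/reach_baseline.yaml
```

The same `train.py` and config file run unchanged on a laptop, so only the submission wrapper differs between local debugging and long cluster runs.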
Why we chose it: GitHub is a widely adopted platform for source control, code review, and team collaboration, which aligns well with our need to coordinate work across students, mentors, and sponsors.
Role in project: GitHub hosts the REACH codebase, tracks issues and feature requests, manages pull requests and reviews, and serves as the central hub for documentation and release artifacts.
Current development phase (Dec 2025)
Aug – Sep 2025
MuJoCo arm model, Gymnasium environment, reward shaping, and baseline reaching behaviors implemented and validated in simulation.
Oct – Nov 2025
PPO training wrapper, Monsoon SLURM pipeline, long-horizon training runs to 1M timesteps, and TensorBoard-based monitoring brought online.
Dec 2025 – Feb 2026
End-to-end loop for gesture recognition and response, including simulated camera feeds, gesture classification, and closed-loop policy training.
Mar – Apr 2026
Integrate the learned policies with the physical robotic arm and perform simulation-to-real transfer testing in collaboration with the Biomechatronics Lab.
Apr – May 2026
Comprehensive testing, performance tuning, documentation, and final handoff of the REACH framework and artifacts to the Biomechatronics Lab.
Current MuJoCo reacher arm simulation demonstrating our baseline upper-limb assistive control behavior.