REACH Project

Complete project overview, solution, technologies, and timeline.

Project Overview

The REACH project is developing a simulation-based reinforcement learning framework that trains a wearable robotic arm to perform everyday upper-limb tasks in a safe, repeatable virtual environment before deployment on hardware. By learning in simulation first, the system can explore many control strategies, reduce risk to participants and equipment, and ultimately provide smoother, more responsive, and energy-efficient assistance for stroke survivors and others with upper-limb impairments.

Original Project Proposal: View the initial concept provided by our sponsor

The Problem

The overall problem REACH aims to solve is the slow, costly cycle of prototyping and testing assistive robotic devices.

Current rehabilitation robotics research at NAU faces a significant bottleneck in control system development. Modern assistive devices often rely on rigid, preprogrammed control schemes that cannot easily adapt to dynamic, real-world human movement. Because the learning and testing of new control algorithms occur primarily on physical prototypes, each experimental iteration is slow, labor-intensive, and sometimes risky. Without a robust simulation-based learning platform, progress in developing natural, responsive, and safe robotic behaviors is limited by both hardware constraints and human resource availability.

High-Level Requirements

For this project, our high-level requirements reflect the MUST-have functional requirements defined in our specification:

  • A complete researcher interface that allows graduate students to run the full experiment workflow using Jupyter notebooks and YAML configuration files, without modifying core source code.
  • A physics-accurate MuJoCo simulation environment with a waist-mounted robotic arm model, sensor simulation, basic reaching tasks, and gesture recognition and response capabilities configured through YAML.
  • A reinforcement learning training loop that supports PPO and SAC with Stable-Baselines3, runs on both local machines and Monsoon HPC, logs metrics, saves and resumes checkpoints, and is fully driven by configuration files.

Technical Solution

REACH is a simulation-first reinforcement learning framework that accelerates the design and testing of upper-limb assistive robotic behaviors before they are deployed to real devices.

The framework centers on a configurable virtual environment where an assistive robotic arm interacts with a simulated human partner using gesture recognition and response to practice cooperative upper-limb movements. Around this core, we provide flexible experiment configuration, scripted training workflows, and rich visual feedback so researchers can iterate on task goals, rewards, and motion strategies without modifying low-level control code. As the project matures, the same framework will be used to explore more complex rehabilitation and daily-living tasks, serving as a safe proving ground for assistive behaviors that will ultimately transfer to the physical hardware being developed with our Mechanical and Electrical Engineering collaborators.

Technologies & Tools

Python

Why we chose it: Python offers a mature ecosystem for reinforcement learning, scientific computing, and research tooling, which lets us move quickly while keeping the code accessible to our collaborators.

Role in project: Python is the primary language for our simulation environments, training workflow, configuration system, and analysis notebooks, tying the full REACH framework together for upper-limb assistive robotics.

RL Stack (SB3 + PyTorch)

Why we chose it: Stable-Baselines3 and PyTorch provide reliable, well-tested implementations of modern reinforcement learning algorithms that are widely used in research and easy to extend.

Role in project: This stack powers the training loop for REACH, optimizing policies, managing checkpoints, and supporting algorithms like PPO and SAC for gesture and assistive-task learning.

MuJoCo Simulation

Why we chose it: MuJoCo is a high-precision physics engine designed for robotics and control, making it well-suited for simulating upper-limb assistive devices and human interaction.

Role in project: MuJoCo models the robotic arm, human, and environment so we can safely prototype tasks, test control strategies, and generate training data before running on real hardware.
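For context, MuJoCo models are described in MJCF XML. A stripped-down two-joint arm, purely illustrative and far simpler than the actual REACH model, might look like:

```xml
<mujoco model="toy_arm">
  <option timestep="0.002" gravity="0 0 -9.81"/>
  <worldbody>
    <body name="upper_arm" pos="0 0 1">
      <joint name="shoulder" type="hinge" axis="0 1 0" range="-90 90"/>
      <geom type="capsule" fromto="0 0 0 0.3 0 0" size="0.04"/>
      <body name="forearm" pos="0.3 0 0">
        <joint name="elbow" type="hinge" axis="0 1 0" range="0 150"/>
        <geom type="capsule" fromto="0 0 0 0.25 0 0" size="0.03"/>
      </body>
    </body>
  </worldbody>
  <actuator>
    <motor joint="shoulder" gear="30"/>
    <motor joint="elbow" gear="20"/>
  </actuator>
</mujoco>
```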

Gymnasium Environments

Why we chose it: Gymnasium provides a standard interface for reinforcement learning environments, which keeps our tasks compatible with common RL libraries and tooling.

Role in project: Our simulation tasks implement the Gymnasium API so that training scripts, evaluation tools, and future environments can plug into the REACH framework consistently.

Jupyter & YAML Config

Why we chose it: Jupyter notebooks and human-readable YAML files make it easy for researchers to control experiments, document workflows, and share configurations without editing source code.

Role in project: All major workflows—environment inspection, training, evaluation, and analysis—are driven through notebooks and configuration files, enabling reproducible experiments and rapid iteration.
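For illustration, an experiment configuration in this style might look like the following. The keys and values are a hypothetical schema, not REACH's actual file format:

```yaml
# Hypothetical experiment config -- field names are illustrative.
experiment:
  name: reach_ppo_baseline
  seed: 42

environment:
  task: reaching          # which simulated task to load
  max_episode_steps: 500

training:
  algorithm: PPO          # PPO or SAC
  total_timesteps: 1000000
  checkpoint_every: 50000
  resume_from: null       # path to a saved checkpoint, or null
```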

Vision & Gesture Recognition

Why we chose it: Modern computer vision techniques are needed to detect human hands and objects accurately enough for meaningful gesture-based interaction.

Role in project: Our vision pipeline processes simulated camera feeds to localize the user’s hand and other key features, feeding gesture information into the control policies that drive assistive motions.
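The exact pipeline is still under development, but as a sketch of the kind of geometric heuristic involved, a hand whose fingertips sit far from the palm can be treated as an "open" gesture. The landmark format and threshold below are illustrative assumptions, not the project's actual detector:

```python
import math

def hand_openness(palm, fingertips):
    """Mean fingertip-to-palm distance: a simple open/closed-hand proxy."""
    return sum(math.dist(palm, tip) for tip in fingertips) / len(fingertips)

def classify_gesture(palm, fingertips, open_threshold=0.12):
    """Label a detected hand 'open' or 'closed' (threshold is illustrative)."""
    return "open" if hand_openness(palm, fingertips) > open_threshold else "closed"
```

In the real pipeline, landmarks would come from the simulated camera feed rather than hard-coded points, and the resulting gesture label would feed into the control policy as an observation.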

Monsoon HPC & SLURM

Why we chose it: Training rich reinforcement learning policies can be computationally expensive, so we leverage NAU’s Monsoon cluster and SLURM scheduler for scalable experiments.

Role in project: Our training scripts can be launched locally or as SLURM jobs on Monsoon, allowing longer runs, parallel environments, and hyperparameter sweeps using the same configuration-driven workflow.
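As an illustration of that workflow, a SLURM batch script for a run like this might look roughly as follows. Partition settings, module names, and paths are placeholders, not verified against Monsoon's actual configuration:

```bash
#!/bin/bash
#SBATCH --job-name=reach-train      # shows up in squeue output
#SBATCH --time=24:00:00             # wall-clock limit for the run
#SBATCH --cpus-per-task=8           # parallel simulation environments
#SBATCH --mem=32G
#SBATCH --output=logs/%x-%j.out     # stdout/stderr per job ID

# Placeholder environment setup -- site-specific.
module load anaconda3
conda activate reach

# The same config-driven entry point used for local runs.
python train.py --config configs/reach_ppo.yaml
```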

GitHub & Collaboration

Why we chose it: GitHub is a widely adopted platform for source control, code review, and team collaboration, which aligns well with our need to coordinate work across students, mentors, and sponsors.

Role in project: GitHub hosts the REACH codebase, tracks issues and feature requests, manages pull requests and reviews, and serves as the central hub for documentation and release artifacts.

Project Timeline

  • Current development phase: Vision & Gesture Loop (Dec 2025)
  • Overall project completion target: May 2026
  • Time until project completion: ~5 months

Phase 1: Simulation Foundations (Complete)

Aug – Sep 2025

MuJoCo arm model, Gymnasium environment, reward shaping, and baseline reaching behaviors implemented and validated in simulation.

Phase 2: Training Infrastructure (Complete)

Oct – Nov 2025

PPO training wrapper, Monsoon SLURM pipeline, long-horizon training runs to 1M timesteps, and TensorBoard-based monitoring brought online.

Phase 3: Vision & Gesture Loop (In Progress)

Dec 2025 – Feb 2026

End-to-end loop for gesture recognition and response, including simulated camera feeds, gesture classification, and closed-loop policy training.

Phase 4: Hardware Integration (Planned)

Mar – Apr 2026

Integrate the learned policies with the physical robotic arm and perform simulation-to-real transfer testing in collaboration with the Biomechatronics Lab.

Phase 5: Testing & Deployment (Planned)

Apr – May 2026

Comprehensive testing, performance tuning, documentation, and final handoff of the REACH framework and artifacts to the Biomechatronics Lab.

Demo of Current Progress

Current MuJoCo reacher arm simulation demonstrating our baseline upper-limb assistive control behavior.