Manta Ray–Inspired Q-Learning Underwater Autonomous Robot
RAY (Robot for Aquatic Yield) is a soft-bodied underwater robot that uses manta ray–like pectoral fins, low-power electronics, and a tabular Q-learning algorithm to navigate and adapt to its environment.
By combining biomimetic design, reinforcement learning, and modular electronics, this project explores how intelligent underwater robots could eventually operate like a school of fish: avoiding obstacles, conserving energy, and exhibiting emergent group behavior.
Prototype & System Design
Conceptual Design & Biomimicry
RAY is designed around a manta ray–inspired body with two flexible silicone pectoral fins attached to a sealed electronics housing. The fins supply thrust and maneuverability through a flapping and undulating motion similar to the mobuliform locomotion observed in real manta rays. A stiffer substructure inside each fin transmits torque from waterproof servomotors while preserving the flexibility needed for efficient lift-based propulsion.
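One simple way to picture the flapping motion described above is as a sinusoidal servo command for each fin. The Python sketch below is an illustration only: the function names, the 90-degree center position, the 30-degree amplitude, the 1 Hz flapping frequency, and the `turn` parameter are all assumptions, not values taken from the RAY prototype.

```python
import math


def fin_angle_deg(t, frequency_hz=1.0, amplitude_deg=30.0,
                  center_deg=90.0, phase_rad=0.0):
    """Servo angle (degrees) at time t (seconds) for a sinusoidal flap.

    All default values are hypothetical placeholders.
    """
    return center_deg + amplitude_deg * math.sin(
        2.0 * math.pi * frequency_hz * t + phase_rad
    )


def fin_pair_angles(t, turn=0.0, frequency_hz=1.0, amplitude_deg=30.0,
                    center_deg=90.0):
    """Return (left_deg, right_deg) servo angles for both fins.

    `turn` is a hypothetical steering input in [-1, 1]. A positive value
    shrinks the right fin's amplitude and a negative value shrinks the
    left fin's amplitude; zero flaps both fins symmetrically.
    """
    turn = max(-1.0, min(1.0, turn))  # clamp steering input
    left_amp = amplitude_deg * (1.0 - max(0.0, -turn))
    right_amp = amplitude_deg * (1.0 - max(0.0, turn))
    left = fin_angle_deg(t, frequency_hz, left_amp, center_deg)
    right = fin_angle_deg(t, frequency_hz, right_amp, center_deg)
    return left, right
```

In this sketch, unequal amplitudes are assumed to produce unequal thrust and therefore a turn toward the side that flaps less; whether that holds for the real fins would need to be confirmed in water tests.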
The physical design emphasizes:
- Body volume under one cubic foot for compact testing and deployment.
- Soft, layered materials that mimic cartilage, muscle, and skin using silicone.
- Fin geometry tuned for effective thrust and maneuvering in a large body of water.
Key Requirements
- Q-Learning Algorithm: Implement a tabular Q-learning controller that updates state–action values from rewards and penalties, enabling RAY to learn efficient behaviors such as obstacle avoidance and goal-directed motion.
- Size Constraint: Keep overall volume under one cubic foot while still providing enough internal space for the microcontroller, IMU, motors, and power system. Empty internal space will be put to use wherever possible to keep the robot compact.
- Power Consumption: Select microcontrollers and motors with low power draw to maximize run time during extended experiments.
- Biomimicry: Ensure both appearance and motion resemble a manta ray, including turning, rising, descending, and gliding motions.
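The size and power requirements above can be sanity-checked with simple arithmetic. The Python sketch below is only illustrative: the function names, the bounding-box approximation of the body, and any numbers passed in are assumptions, not measurements from the RAY prototype.

```python
# One cubic foot in cubic centimetres (1 ft = 30.48 cm exactly).
CUBIC_FOOT_CM3 = 30.48 ** 3  # about 28,316.8 cm^3


def bounding_volume_cm3(length_cm, width_cm, height_cm):
    """Volume of the rectangular box that encloses the robot."""
    return length_cm * width_cm * height_cm


def fits_size_limit(length_cm, width_cm, height_cm):
    """True if the enclosing box is under one cubic foot."""
    return bounding_volume_cm3(length_cm, width_cm, height_cm) < CUBIC_FOOT_CM3


def estimated_runtime_hours(battery_mah, average_current_ma):
    """Rough run time: battery capacity divided by average current draw.

    Ignores converter losses, battery aging, and discharge limits, so the
    real run time would be shorter.
    """
    if average_current_ma <= 0:
        raise ValueError("average_current_ma must be positive")
    return battery_mah / average_current_ma
```

For example, a hypothetical 2000 mAh battery feeding an average load of 500 mA gives `estimated_runtime_hours(2000, 500)`, or about 4 hours before losses.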
Electronics & Control
The electronics are consolidated into a waterproof housing that contains the microcontroller, inertial measurement unit (IMU), power electronics, and motor drivers. An MPU6050 IMU provides orientation and acceleration data so that the robot can monitor its movement, self-balance, and track how actions affect its state in water.
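As a rough illustration of how accelerometer data can indicate orientation, the Python sketch below estimates roll and pitch from a single three-axis acceleration reading. It assumes the reading is dominated by gravity (the robot is nearly still), uses a common axis convention that may not match how the IMU is mounted in RAY, and does not fuse gyroscope data. The function name is hypothetical.

```python
import math


def tilt_from_accel(ax, ay, az):
    """Estimate (roll_deg, pitch_deg) from accelerometer readings.

    ax, ay, az can be in any consistent unit (g or m/s^2), since only
    their ratios matter. Valid only when gravity dominates the reading.
    """
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return math.degrees(roll), math.degrees(pitch)
```

While the fins are flapping, the measured acceleration also includes the robot's own motion, so a practical controller would likely filter these estimates or combine them with the gyroscope readings.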
The microcontroller executes the Q-learning algorithm, maps discrete states (such as direction, proximity to obstacles, or depth zones) to fin actuation patterns, and updates its Q-table from observed rewards. Over time, RAY learns which fin motions lead to more desirable outcomes in a given environment.
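The learning loop described above can be sketched as a standard tabular Q-learning update with epsilon-greedy action selection. This Python version is a minimal illustration, not the firmware running on RAY: the class name, the hyperparameter defaults (`alpha`, `gamma`, `epsilon`), and the integer encoding of states and actions are all assumptions.

```python
import random


class TabularQLearner:
    """Minimal tabular Q-learning agent (illustrative sketch)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=None):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)
        # Q-table: one row per state, one column per action, all zeros.
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def choose_action(self, state):
        """Epsilon-greedy: explore at random, otherwise take the best action."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        best_next = max(self.q[next_state])
        td_target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])
```

In use, each state index would stand for one discrete situation (for example, a combination of heading and obstacle proximity) and each action index for one fin actuation pattern. After every movement, the controller would call `update` with the observed reward and then `choose_action` for the next step.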
Development Phases
- Phase One – Prototype Construction: Build a manta ray–shaped robot with flexible fins and a waterproofed electronics compartment that can perform basic random movements in water.
- Phase Two – Q-Learning Autonomy: Integrate and tune the Q-learning algorithm so that the robot moves autonomously, avoids obstacles, and adapts its motion based on environmental feedback.
- Phase Three – Multi-Robot Coordination: Scale to multiple RAY units that communicate and coordinate, laying the groundwork for swarm behaviors and emergent group dynamics.
Future Work
- Body Miniaturization: Explore more compact, high-torque motors and batteries to reduce internal volume.
- Robust Waterproofing: Improve sealing using O-rings and epoxy potting for all pass-throughs, enabling reliable long-term deployment.
- Enhanced Pectoral Fins: Add more fin degrees of freedom to unlock more agile and realistic manta ray maneuvers.
- Swarm Intelligence: Extend Q-learning to multi-agent settings for schooling behaviors, such as alignment, cohesion, and collision avoidance.
Prototype Images
Version 1
Version 2
References
References are formatted in IEEE style and include biological, robotics, and biomimetic propulsion literature that inform the design of the RAY robot, along with practical resources for manta ray–inspired prototypes and standards for testing autonomous systems.
- K. H. Low, “Model and parametric study of modular undulating fin rays for fish robots,” Mechanism and Machine Theory, vol. 44, no. 3, pp. 615–632, Mar. 2009.
- J. T. Schaefer and A. P. Summers, “Batoid wing skeletal structure: Novel morphologies, mechanical implications, and phylogenetic patterns,” Journal of Morphology, vol. 264, no. 3, pp. 298–313, Jun. 2005.
- A. K. Jayasankar, “Structural modeling and computational mechanics of stingray inspired tesserae,” Ph.D. dissertation, Fakultät III – Prozesswissenschaften, Technische Universität Berlin, Berlin, Germany, 2019.
- F. E. Fish, C. M. Schreiber, K. W. Moored, G. Liu, H. Dong, and H. Bart-Smith, “Hydrodynamic performance of aquatic flapping: Efficiency of underwater flight in the manta,” Aerospace, vol. 3, no. 3, pp. 2–24, Jul. 2016, Art. no. 20.
- Q. Liu et al., “A manta ray robot with soft material based flapping wing,” Journal of Marine Science and Engineering, vol. 10, no. 7, Art. no. 962, Jul. 2022.
- O. Paff, “Building a Manta Ray Robot,” YouTube. Available: https://www.youtube.com/watch?v=vEpUk1kyOjo. Accessed Nov. 15, 2025.
About Us
RAY is an EE 476C capstone project conducted at Northern Arizona University’s Steve Sanghi College of Engineering within the School of Informatics, Computing, and Cyber Systems. Our multidisciplinary team brings together interests in power electronics, biomimicry, embedded systems, and artificial intelligence to develop a manta ray–inspired underwater robot that can eventually behave as part of a coordinated swarm.
Team Members
Zaley Hart
Coordinates overall system design, integration of electrical subsystems, and alignment of the project with sponsor requirements. Ensures deadlines are met and steps in where needed.
Brizia Chavez
Ensures clear communication between the client and capstone team, guaranteeing that project requirements are met accurately. Supports development of the Q-learning algorithm.
Brodie Ross
Manages microcontroller selection, firmware development, and communication between sensors, motors, and the learning algorithm.
Tyler Engle
Develops the Q-learning algorithm, defines state and action spaces, and implements reward structures to support autonomous navigation and obstacle avoidance.
Jeremy Juwaman
Designs the wing structure and mobility. Manages power system design, evaluates energy usage, and ensures RAY can operate for extended periods during experiments.
Chris Starling
Leads body design, aquatic testing, data collection, and performance evaluation to validate RAY’s locomotion, stability, and learning behaviors.
Acknowledgements
We would like to give special thanks to Dr. Carlo daCunha and the C-Lab at Northern Arizona University for sponsoring and supporting the RAY capstone project.
Dr. Carlo daCunha
Dr. daCunha is an Assistant Professor of Electrical and Computer Engineering at Northern Arizona University. He completed his postdoctoral work in Physics at McGill University and earned his Ph.D. in Electrical Engineering from Arizona State University. His research explores complex systems and swarm robotics, and he contributes to the project through his expertise in robotics, artificial intelligence, and system behavior, providing ongoing guidance and mentorship.