Facilitated By

San Antonio Medical Foundation

SPATIO-TEMPORAL NEURAL NETWORKS FOR MARKERLESS MOTION CAPTURE

Southwest Research Institute

Southwest Research Institute (SwRI), headquartered in San Antonio, Texas, is one of the oldest and largest independent, nonprofit, applied research and development (R&D) organizations in the United States.

Principal Investigator(s)
Brian Swenson
Funded by
SwRI
Research Start Date
Status
Active

The Human Performance Initiative at SwRI has developed a markerless biomechanics system that captures human motion data in any environment with accuracy that rivals laboratory-based motion capture systems. This technology applies to the fields of peak human performance, medical diagnostics, and veterinary/zoological sciences. Although the currently deployed system processes video from multiple cameras, its neural network backbone looks at one frame at a time and bases each prediction on that single frame alone. By using emerging neural network architectures and training pipelines, we can incorporate the temporal aspects of human motion into the markerless motion capture pipeline, creating a network that understands the context of the human body in motion.

The temporal aspect of this neural network design required restructuring the underlying dataset used to train the network. We incorporated three publicly available training datasets and two internally collected datasets for gait and functional movements, reorganizing the data to sample multiple consecutive frames.
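
As a rough illustration of sampling multiple consecutive frames, the sketch below builds an index of fixed-length clips over a set of recordings. It is a minimal example in plain Python, not the project's actual data pipeline: the names `build_clip_index` and `clip_frames`, the `stride` parameter, and the frame counts are all hypothetical.

```python
from typing import List, Sequence, Tuple


def build_clip_index(
    frames_per_video: Sequence[int],
    clip_length: int,
    stride: int = 1,
) -> List[Tuple[int, int]]:
    """Return (video_id, start_frame) pairs, one per clip of consecutive frames.

    Clips never cross a video boundary, so every sample holds
    `clip_length` frames from a single recording.
    """
    if clip_length < 1 or stride < 1:
        raise ValueError("clip_length and stride must be positive")
    index = []
    for video_id, n_frames in enumerate(frames_per_video):
        # The last valid start is n_frames - clip_length;
        # recordings shorter than one clip contribute nothing.
        for start in range(0, n_frames - clip_length + 1, stride):
            index.append((video_id, start))
    return index


def clip_frames(start: int, clip_length: int) -> List[int]:
    """Frame numbers covered by a clip that begins at `start`."""
    return list(range(start, start + clip_length))


# Two hypothetical recordings with 5 and 3 frames, sampled as 3-frame clips.
index = build_clip_index([5, 3], clip_length=3)
print(index)                         # [(0, 0), (0, 1), (0, 2), (1, 0)]
print(clip_frames(index[1][1], 3))   # [1, 2, 3]
```

Indexing clips rather than single frames keeps the original recordings untouched; only the way samples are drawn changes, which is one plausible way to reorganize several datasets under a common scheme.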

Handling the multi-frame approach required significant restructuring of the network. Every additional frame passed to the network increases both the training time and the size of the network. To address this, we first adopted the PyTorch Lightning library, which allowed faster training across multiple GPUs.

The goal of the project was to build a network that makes temporal and spatial predictions simultaneously. To accomplish this, we added a convolutional dimension to the network that lets it share information between frame predictions. This also makes the network easy to scale to different sequence lengths as needed. Because this approach requires a larger network, and the additional input data inherently increases training time, we implemented these changes in Lite-HRNet, a streamlined version of the original architecture.
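
To make the idea of a convolution along the time axis concrete, the toy sketch below applies one small kernel across neighbouring frames of a feature sequence. It is only an illustration under stated assumptions: the summary does not say which layer type the network uses, the project's layers would be learned inside PyTorch rather than written by hand, and the function `temporal_conv` and the fixed kernel weights are hypothetical.

```python
from typing import List, Sequence


def temporal_conv(
    frames: Sequence[Sequence[float]],
    kernel: Sequence[float],
) -> List[List[float]]:
    """Slide `kernel` along the time axis of each feature channel.

    `frames[t][c]` is feature c at frame t. Zero padding keeps the output
    the same shape as the input. As in deep learning conv layers, the
    kernel is applied without flipping (cross-correlation).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd for same-size output")
    half = len(kernel) // 2
    n_frames = len(frames)
    n_channels = len(frames[0]) if n_frames else 0
    out = []
    for t in range(n_frames):
        mixed = [0.0] * n_channels
        for k, weight in enumerate(kernel):
            src = t + k - half  # neighbouring frame this weight reads from
            if 0 <= src < n_frames:  # frames outside the clip count as zeros
                for c in range(n_channels):
                    mixed[c] += weight * frames[src][c]
        out.append(mixed)
    return out


kernel = [0.25, 0.5, 0.25]  # hypothetical weights; a real layer learns these

# A single-channel feature that is active only in the middle frame.
short_clip = [[0.0], [4.0], [0.0]]
long_clip = [[0.0], [0.0], [4.0], [0.0], [0.0]]

print(temporal_conv(short_clip, kernel))  # [[1.0], [2.0], [1.0]]
print(temporal_conv(long_clip, kernel))   # [[0.0], [1.0], [2.0], [1.0], [0.0]]
```

The same kernel runs unchanged on the 3-frame and 5-frame clips, which is the property that lets a convolutional temporal dimension scale to different sequence lengths, and each output frame now carries information from the frames on either side of it.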

Collaborative Project
Basic Research
Biomechanics
Disease Modeling
Medical Devices
Aging
Neuroscience
Musculoskeletal
Other