Audio-Driven Robot Upper-Body Motion Synthesis

In this project I developed an automatic system in which audio input from a user is used to generate that user's upper-body movements on the humanoid robot Pepper. To the best of my knowledge, the system has two novel aspects: it performs whole upper-body motion synthesis, including head, hand, and hip movements, and it targets a humanoid robot. The system was developed using only single-view RGB videos and supports both offline and online synthesis modes. From audio-visual recordings of the upper-body movements of 19 speakers, I extracted audio and pose features and compared four 3D pose estimation methods. The estimated 3D joint positions were used to calculate the angles between upper-body joints; the resulting angle time series were then smoothed and constrained to the robot's operating limits. To learn the mapping from audio features to upper-body pose, I trained multilayer perceptron (MLP) and long short-term memory (LSTM) neural network models in both a subject-independent (SI) and a subject-dependent (SD) manner. The system was evaluated quantitatively and qualitatively using web surveys, driven by both natural and synthetic speech. My investigations show that the SD model variants outperform the SI variants, and that the MLP model is better suited to real-time motion synthesis than the LSTM because it performs online synthesis approximately five times faster. On natural speech, the movements generated by the LSTM model were rated as significantly more appropriate for the given audio than those generated by the MLP model. On synthetic speech, however, survey respondents preferred the MLP model over the LSTM.
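The pose-processing step described above (joint positions to angles, then smoothing and limiting) can be illustrated with a minimal sketch. It is not taken from the repository's source code: the function names, the moving-average smoother, the window size, and the angle limits are all assumptions chosen for illustration, and the actual system may use a different smoothing method and the real joint limits of Pepper.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (radians) at joint b formed by 3D points a-b-c, computed per frame."""
    u = a - b
    v = c - b
    cos = np.sum(u * v, axis=-1) / (
        np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    )
    # Clip guards against rounding errors that push cos slightly outside [-1, 1].
    return np.arccos(np.clip(cos, -1.0, 1.0))

def smooth(x, window=5):
    """Moving average (assumed smoother); edge padding keeps the length unchanged."""
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

def constrain(x, lower, upper):
    """Clamp an angle time series to the given operating limits."""
    return np.clip(x, lower, upper)

# Synthetic example: a forearm rotating about a fixed elbow over T frames.
T = 50
t = np.linspace(0.0, np.pi, T)
shoulder = np.zeros((T, 3))
elbow = np.tile([0.0, -0.3, 0.0], (T, 1))
wrist = elbow + 0.25 * np.stack([np.sin(t), -np.cos(t), np.zeros(T)], axis=1)

# Hypothetical limits in radians, NOT the real Pepper joint limits.
LOWER, UPPER = 0.1, 1.5

angles = joint_angle(shoulder, elbow, wrist)
limited = constrain(smooth(angles), LOWER, UPPER)
```

Here the limits are applied after smoothing, so the values sent to the robot are guaranteed to stay inside the permitted range.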

This project was part of my MEng thesis. It resulted in a journal paper:

Jan Ondras, Oya Celiktutan, Paul Bremner, Hatice Gunes
Audio-Driven Robot Upper-Body Motion Synthesis
IEEE Transactions on Cybernetics, 2020

Citation

@article{ondras2020audio,
  title={Audio-Driven Robot Upper-Body Motion Synthesis},
  author={Ondras, Jan and Celiktutan, Oya and Bremner, Paul and Gunes, Hatice},
  journal={IEEE Transactions on Cybernetics},
  year={2020},
  publisher={IEEE}
}

Folders and files:

Dependencies
Models
SourceCode
Surveys/TextsForSyntheticSpeech
Thesis
MotionSynthesisSystem_README.pdf
README.md

jancio/Audio-Driven-Upper-Body-Motion-Synthesis
