The ManyEars Project: Microphone Array-Based Audition for Mobile Robots
Development will continue on GitHub: https://github.com/introlab/manyears
About
Intelligent systems would benefit from being able to localize and track sound sources in real-life settings. Such a capability can help localize a person or an interesting event in the environment, and also provides enhanced processing for other capabilities such as speech recognition. To give this capability to a computing system, the challenge is not only to localize simultaneous sound sources, but also to track them over time. The ManyEars project proposes a robust sound source localization and tracking method using an array of eight microphones. The method is based on a frequency-domain implementation of a steered beamformer along with a particle filter-based tracking algorithm. Tests on a mobile robot show that the algorithm can localize and track multiple moving sources of different types in real time over a range of 7 meters. These new capabilities allow the robot to interact with people using more natural means in real-life settings.
The ManyEars project was set up to provide source code from the original AUDIBLE project. It provides an easy-to-use 'C' library for microphone array processing, including sound source localization, tracking and separation. A Qt GUI is also available for fine-tuning the parameters.
Please read the [SoftwareFAQ] and [SoftwareDocumentation] if you have questions regarding the software.
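The steered-beamformer idea can be illustrated with a minimal sketch. It is not the ManyEars implementation: it is a simplified time-domain delay-and-sum beamformer with integer sample delays, whereas the method described above works in the frequency domain. The names `vec3`, `mic_delay_samples` and `steered_energy`, the far-field (plane wave) model and the 343 m/s speed of sound are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define SPEED_OF_SOUND 343.0 /* m/s, assumed room-temperature value */

typedef struct { double x, y, z; } vec3; /* hypothetical helper type */

/* Arrival delay (in samples) of a far-field plane wave at microphone `mic`,
 * relative to the array origin. `dir` is a unit vector pointing from the
 * array toward the source. Negative means the wave reaches this microphone
 * before it reaches the origin. */
double mic_delay_samples(vec3 mic, vec3 dir, double sample_rate)
{
    double proj = mic.x * dir.x + mic.y * dir.y + mic.z * dir.z;
    return -proj * sample_rate / SPEED_OF_SOUND;
}

/* Round half away from zero without depending on libm. */
static long round_to_long(double v)
{
    return (long)(v >= 0.0 ? v + 0.5 : v - 0.5);
}

/* Delay-and-sum output energy for one candidate direction.
 * `x` is channel-major: x[m * n_samples + t]. Each channel is shifted by
 * its expected arrival delay so that a source in direction `dir` adds up
 * coherently; samples shifted outside the buffer are skipped. */
double steered_energy(const float *x, size_t n_mics, size_t n_samples,
                      const vec3 *mics, vec3 dir, double sample_rate)
{
    assert(x != NULL && mics != NULL && sample_rate > 0.0);
    double energy = 0.0;
    for (size_t t = 0; t < n_samples; ++t) {
        double sum = 0.0;
        for (size_t m = 0; m < n_mics; ++m) {
            /* Recomputed per sample for brevity; a real implementation
             * would precompute one delay per microphone. */
            long d = round_to_long(mic_delay_samples(mics[m], dir, sample_rate));
            long idx = (long)t + d;
            if (idx >= 0 && idx < (long)n_samples)
                sum += x[m * n_samples + (size_t)idx];
        }
        energy += sum * sum;
    }
    return energy;
}
```

Evaluating `steered_energy` over a grid of candidate directions and keeping the largest values gives raw direction estimates; a tracker such as the particle filter mentioned above would then follow those estimates over time.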
Latest release
- Contact us at: mailto:manyears-devel@lists.sourceforge.net
ROS
- ROS packages for ManyEars are available on GitHub:
ManyEars requires a synchronous audio capture card. Commercial audio acquisition cards are available and functional, but they are expensive, bulky, inefficient and too generic for the specific needs of ManyEars, mobile robotics and IntRoLab. The goal of 8 Sounds USB is to provide a synchronous audio capture card with low power consumption, a small form factor, 8 input channels and 2 output channels. The first iteration of the project is a prototype of the card, delivered with wiring diagrams, circuit board and full documentation. XMOS is our platform of choice for processing the I2S codec data and handling USB transmission. We are also looking for a generic USB Audio Class 2 implementation. More information on the project can be found here:
* [8SoundsUSB project web site](http://eightsoundsusb.sourceforge.net)
**Please note that the ManyEars algorithm does not run on the sound card. The card is solely used for audio data acquisition.**
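Because the card only acquires audio, the host has to split the captured multichannel stream into per-microphone buffers before any array processing. The following is a minimal sketch of that step, assuming the samples arrive as interleaved signed 16-bit PCM frames. The sample format and the name `deinterleave_s16` are assumptions for illustration; they are not taken from the card's documentation or from the ManyEars library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_CHANNELS 8 /* the card provides 8 input channels */

/* Split interleaved 16-bit frames (c0 c1 ... c7 c0 c1 ...) into
 * channel-major float buffers scaled to [-1, 1).
 * `out` must hold N_CHANNELS * n_frames floats and is laid out as
 * out[ch * n_frames + t]. */
void deinterleave_s16(const int16_t *in, size_t n_frames, float *out)
{
    assert(in != NULL && out != NULL);
    for (size_t t = 0; t < n_frames; ++t) {
        for (size_t ch = 0; ch < N_CHANNELS; ++ch) {
            out[ch * n_frames + t] =
                (float)in[t * N_CHANNELS + ch] / 32768.0f;
        }
    }
}
```

A channel-major layout keeps each microphone's samples contiguous, which is convenient for per-channel operations such as windowing and FFTs.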
September 2014 - Special thanks to Dr. Caleb Rascon for the nice ManyEars video.
- Plain 'C' GPL implementation, easy to build and integrate
- Linux / Windows / OSX Full source code
- No external dependency
- SSE optimized
- Tuning Qt GUI for easy configuration and visualization (as shown in picture below)
- All parameters can be fine-tuned for maximum performance
- Full Localization & Tracking & Separation
- Improved localization over the previous release
- Separation added + post filtering
- Example code
- Offline processing using pre-recorded data
- Binaries for Windows & OSX
What's planned for the next release
- Complete ROS package
- Documentation for tunable parameters
- More examples
Features:
- Plain 'C' GPL implementation
- Linux / Windows / OSX Full source code
- No external dependency
- SSE optimized
- Tuning Qt GUI for easy configuration and visualization (as shown in picture above)
- All parameters can be fine-tuned for maximum performance
- ROS package (Will be available soon from another project)
- Localization & Tracking will be available first
- Separation and other features are being implemented
- Example code
- Offline processing using pre-recorded data
- Email us for more information
- Binaries will be available soon
Authors
Click here for the list of authors.
News
Hardware Requirements
Hardware requirements are available here.
Software Parameters Documentation
Many parameters can be tuned to optimize the performance of the ManyEars system. Here is a list of all the parameters and their descriptions.
Some Frequently Asked Questions regarding tuning the parameters are introduced here.
License
ManyEars is distributed under the GNU General Public License (GPL) and a patent license is provided for use in GPL software. See the full license for details.
Download
Click here to access the download section and release information.
Publications
- F. Grondin, D. Létourneau, F. Ferland, V. Rousseau, F. Michaud, The ManyEars open framework. Autonomous Robots, 1-16, 2013.
- S. Briere, J.-M. Valin, F. Michaud, D. Letourneau, Embedded Auditory System for Small Mobile Robots. Proc. International Conference on Robotics and Automation (ICRA), 2008.
- J.-M. Valin, S. Yamamoto, J. Rouat, F. Michaud, K. Nakadai, H. G. Okuno, Robust Recognition of Simultaneous Speech By a Mobile Robot, IEEE Transactions on Robotics, Vol. 23, No. 4, pp. 742-752, 2007.
- J.-M. Valin, F. Michaud, J. Rouat, Robust Localization and Tracking of Simultaneous Moving Sound Sources Using Beamforming and Particle Filtering. Robotics and Autonomous Systems Journal (Elsevier), Vol. 55, No. 3, pp. 216-228, 2007.
- S. Yamamoto, K. Nakadai, M. Nakano, H. Tsujino, J.-M. Valin, K. Komatani, T. Ogata, H. G. Okuno, Simultaneous Speech Recognition based on Automatic Missing-Feature Mask Generation integrated with Sound Source Separation (in Japanese). Journal of Robotic Society of Japan, Vol. 25, No. 1, 2007.
- J.-M. Valin, F. Michaud, J. Rouat, Robust 3D Localization and Tracking of Sound Sources Using Beamforming and Particle Filtering. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 841-844, 2006.
- S. Briere, D. Letourneau, M. Frechette, J.-M. Valin, F. Michaud, Embedded and integrated audition for a mobile robot. Proceedings AAAI Fall Symposium Workshop Aurally Informed Performance: Integrating Machine Listening and Auditory Presentation in Robotic Systems, FS-06-01, 6-10, 2006.
- S. Yamamoto, K. Nakadai, J.-M. Valin, J. Rouat, F. Michaud, K. Komatani, T. Ogata, H. G. Okuno, Making a robot recognize three simultaneous sentences in real-time. Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2005.
- M. Murase, S. Yamamoto, J.-M. Valin, K. Nakadai, K. Yamada, K. Komatani, T. Ogata, H. G. Okuno, Multiple Moving Speaker Tracking by Microphone Array on Mobile Robot. Proc. European Conference on Speech Communication and Technology (Interspeech), 2005.
- S. Yamamoto, J.-M. Valin, K. Nakadai, J. Rouat, F. Michaud, T. Ogata, H. G. Okuno, Enhanced Robot Speech Recognition Based on Microphone Array Source Separation and Missing Feature Theory. Proc. International Conference on Robotics and Automation (ICRA), 2005.
- J.-M. Valin, J. Rouat, F. Michaud, Enhanced Robot Audition Based on Microphone Array Source Separation with Post-Filter. Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2123-2128, 2004.
- J.-M. Valin, F. Michaud, J. Rouat, D. Letourneau, Robust Sound Source Localization Using a Microphone Array on a Mobile Robot. Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1228-1233, 2003.
- J.-M. Valin, F. Michaud, B. Hadjou, J. Rouat, Localization of Simultaneous Moving Sound Sources for Mobile Robot Using a Frequency-Domain Steered Beamformer Approach. Proc. IEEE International Conference on Robotics and Automation (ICRA), pp. 1033-1038, 2004.
- J.-M. Valin, J. Rouat, F. Michaud, Microphone Array Post-Filter for Separation of Simultaneous Non-Stationary Sources. Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 221-224, 2004.