Overview
We propose MuGeVI, a computer-vision-based, interactive, multi-functional virtual musical instrument that requires no dedicated hardware. It lets users perform, compose, or control music with a variety of hand gestures.
Functions
Performance mode
Play notes by changing the position of the hands and making specific gestures.
Accompaniment mode
Control the scale degree and texture to accompany a singer or instrumentalist in real time.
Control mode
Control the transposition and volume of a track being played.
Audio effects mode
Provide audio effects for instruments such as electric guitars in real time.
System Architecture
MuGeVI continuously captures images of the player and displays them in real time. Neural network models process each image to locate 21 hand key points. From these key points MuGeVI derives the hand position and gesture, maps them to music information according to the current instrument mode, packages the data with the Open Sound Control (OSC) protocol, and transmits it to a Max/MSP program. The corresponding Max/MSP modules then implement the various functions.
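The mapping and packaging steps can be illustrated with a minimal sketch. It is not MuGeVI's actual code: the choice of key point 8 as the index fingertip, the `/mugevi/note` address, the note range, and port 7400 are all assumptions made for illustration. The encoder follows the OSC 1.0 message layout (null-padded address, type-tag string, big-endian arguments) and uses only the standard library.

```python
import socket
import struct

def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC 1.0 requires."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)

def osc_message(address: str, *args) -> bytes:
    """Encode an OSC message with int32 ('i') and float32 ('f') arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)
        elif isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)
        else:
            raise TypeError(f"unsupported OSC argument: {arg!r}")
    return osc_pad(address.encode("ascii")) + osc_pad(tags.encode("ascii")) + payload

def x_to_midi_note(x: float, low: int = 48, high: int = 72) -> int:
    """Map a normalized horizontal position in [0, 1] to a MIDI note number.

    The range 48..72 (C3..C5) is a hypothetical choice, not taken from MuGeVI.
    """
    x = min(max(x, 0.0), 1.0)
    return low + round(x * (high - low))

def landmarks_to_packet(landmarks) -> bytes:
    """Turn 21 (x, y) key points into an OSC note message.

    Assumption: index 8 is the index fingertip and coordinates are normalized.
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand key points")
    x, _y = landmarks[8]
    return osc_message("/mugevi/note", x_to_midi_note(x))

if __name__ == "__main__":
    # Fake detection result: all key points at the horizontal centre of the frame.
    fake_landmarks = [(0.5, 0.5)] * 21
    packet = landmarks_to_packet(fake_landmarks)
    # Hypothetical target: a Max patch listening with [udpreceive 7400].
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, ("127.0.0.1", 7400))
```

In a real setup the key points would come from the hand-detection model on each frame, and the address pattern and argument types would have to match what the receiving Max/MSP patch expects.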
Innovations
- No sensors required, so it is easy to adopt and distribute;
- Support for both MIDI and audio;
- Multiple modes switchable at any time;
- Scalability and programmability.
User Feedback
Music professionals, music enthusiasts, amateurs, and music technology researchers took part in the evaluation experiment.
Scores:
5 points: Strongly agree / Very satisfied
4 points: Mostly agree / Satisfied
3 points: Generally agree / Generally satisfied
2 points: Disagree / Dissatisfied
1 point: Strongly disagree / Very dissatisfied
Results:
ID | User | Academic Background | Practicability: Gesture | Practicability: Performance and Composition | Convenience: Frugal | Convenience: Install and Use | Flexibility: Variety of Gesture | Flexibility: Control Sensitivity | Innovation | Average Score | Advantages | Limits |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Liu | Electronic Music Composition | 5 | 4 | 5 | 5 | 4 | 5 | 5 | 4.714 | Highly innovative | Add more timbres and playing methods. |
2 | Shi | Music Technology | 5 | 4 | 5 | 5 | 5 | 4 | 5 | 4.714 | Easy to get started and sensitive recognition | Enrich the texture type and the function of the keys. Add transposition and tempo change interface on the graphical interface. |
3 | Chen | Electronic Music Composition | 5 | 5 | 5 | 4 | 4 | 4 | 5 | 4.571 | Easy to use and offers more possibilities for music | It would be better if the threshold of use could be adjusted a little lower. |
4 | Luo | Music Technology | 5 | 4 | 5 | 5 | 4 | 4 | 5 | 4.571 | Interesting and creative | Hope to use on mobile. |
5 | Wang | Integrated Circuit | 3 | 4 | 4 | 3 | 4 | 3 | 5 | 3.714 | Convenient and easy to use | The user interface is too simple. |
6 | Wu | Music Technology | 4 | 5 | 5 | 5 | 4 | 3 | 4 | 4.286 | Operation is quite intuitive | Adding more control over the chords would make it more convenient to use, such as facial expression control. |
7 | Gao | Music Technology | 5 | 5 | 3 | 2 | 4 | 3 | 4 | 3.714 | Innovative and performative, using computer vision technology | Using open-source software such as Pure Data or SuperCollider might make it easier to install, use, and promote. |
8 | Mu | Music Recording Art | 4 | 4 | 5 | 4 | 5 | 3 | 5 | 4.286 | First, it is highly innovative, and relatively convenient, with a variety of gestures | The chord transitions are stiff, the wah-wah effect is not very practical, and the sensitivity is not very good |
9 | Chao | Computer Science | 4 | 4 | 5 | 3 | 5 | 5 | 5 | 4.429 | 1. Model recognition accuracy is high, and various gestures can be detected very quickly. 2. It is very convenient to use! Easy music creation anytime, anywhere! | 1. You can try to add some new timbre. At present, only the piano has a single timbre. 2. You can write certain instructions, otherwise users do not know how to use each function, please refer to the instructions of some software. 3. Is now four features in four separate demos? You can integrate four features into a demo, so that you can synchronize some of the music effects, and it will be impressive. |
10 | Wang | Electronic Music Composition | 4 | 5 | 5 | 4 | 5 | 5 | 5 | 4.714 | Quite interesting | Develop software and hardware for independent use. |
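
The Average Score column is consistent with the arithmetic mean of each participant's seven sub-scores, rounded to three decimal places. A small check under that assumption, using the sub-scores of participants 1 and 5 from the table (the function name `average_score` is illustrative):

```python
def average_score(scores):
    """Mean of the sub-scores, rounded to three decimals."""
    return round(sum(scores) / len(scores), 3)

# Sub-scores copied from rows 1 and 5 of the results table.
participant_1 = [5, 4, 5, 5, 4, 5, 5]
participant_5 = [3, 4, 4, 3, 4, 3, 5]

print(average_score(participant_1))
print(average_score(participant_5))
```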