AI Performance Expression Rendering Technology
Investigation of Musical Expression through AI
The performances of many pianists and artists have had a strong influence on others’ musical performances and works. In recent years, with the development of video posting sites and social media, you can listen to a variety of piano performances for inspiration. At Yamaha, we hope that utilizing AI will help to create inspiration and make your musical activities even more exciting and enjoyable.
We are developing technology that uses AI to add expression to various piano scores, creating a natural and emotionally rich performance that sounds as if it were being played by a human. How close can AI get to a human-like performance? And what makes a performance inspiring? While exploring the answers to these questions, we aim to use AI to spread new waves of excitement and creativity.
Unveiling a Pianist’s Expression through AI
With this technology, the AI (a deep neural network) learns to associate the characteristics and expressions of a human pianist’s performance with the corresponding music score data. Once trained, the AI can infer natural performances with human-like characteristics and expressions for any given score. In addition, by training on performances by a specific performer, the AI can infer a performance that is unique to that person, even for a piece that the performer has never played.
Required Data
To train the AI, we prepare “performance data” recorded from a human playing a keyboard instrument and “score data” for the piece being performed. Performance data captures the control input on the keyboard, including the velocity and timing of key presses and releases. This information is stored in a format that allows the performance to be reproduced on an automatic player piano or a digital piano. Since this kind of information cannot be obtained directly from an audio recording of a piano performance, in that case the AI infers, from the recorded audio, the performance data that best recreates the recording on a specific piano.
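As a rough illustration of how the two kinds of data relate, the sketch below pairs each score note with the corresponding performed note and measures how the performance departs from the notation. The class names, field names, MIDI-style pitch numbers and the fixed-tempo assumption are all hypothetical choices made for this example; the actual data format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreNote:
    pitch: int             # MIDI-style note number (assumption), 60 = middle C
    onset_beats: float     # notated start position, in beats
    duration_beats: float  # notated length, in beats


@dataclass(frozen=True)
class PerformedNote:
    pitch: int
    press_sec: float    # time the key was pressed
    release_sec: float  # time the key was released
    velocity: int       # key-press velocity, 1-127 (assumption)


def expression_features(score, performed, tempo_bpm):
    """Compare notes pairwise (same order assumed) against a fixed nominal tempo."""
    sec_per_beat = 60.0 / tempo_bpm
    features = []
    for s, p in zip(score, performed):
        if s.pitch != p.pitch:
            raise ValueError("score and performance are not aligned")
        nominal_onset = s.onset_beats * sec_per_beat
        nominal_duration = s.duration_beats * sec_per_beat
        features.append({
            # Positive means the key was pressed later than notated.
            "timing_dev_sec": p.press_sec - nominal_onset,
            # Below 1.0 means the note was held shorter than notated.
            "articulation": (p.release_sec - p.press_sec) / nominal_duration,
            "velocity": p.velocity,
        })
    return features
```

Real performances would first need a score-to-performance alignment step, which this sketch skips by assuming the notes already arrive in matching order.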
Training and Inference
The AI uses a deep neural network to learn the relationship between performance data and score data. It learns how a human presses and releases the piano keys to express various characteristics in the score. This allows the AI to infer a human performer’s musical expression from the characteristics of a given score. In addition, the AI can infer a player’s musical expression even for a score that the person has never played.
This is how the AI learns the local correspondence between performance data and score data. However, if the mapping from score to performance is learned directly, only one pattern of expression can be inferred for a given score, whereas humans vary their expression with each performance of the same score. The AI therefore also learns latent variables of the performance, which cannot be obtained from the score alone, in order to infer a wide range of expressions.
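The role of variables that are not in the score (often called latent variables) can be shown with a toy example. The sketch below is not the neural network described above: `render` is a hand-written stand-in for a trained decoder, and the two latent dimensions ("loudness" and "rubato") are hypothetical. It only demonstrates that the same score yields different performances when the latent vector changes, and the same performance when it does not.

```python
import math
import random


def render(score_pitches, z):
    """Toy decoder: map a score plus a latent vector z to a performance.

    z = (loudness, rubato) is a hypothetical two-dimensional latent vector.
    Returns a list of (pitch, velocity, timing_deviation_sec) tuples.
    """
    loudness, rubato = z
    n = len(score_pitches)
    performance = []
    for i, pitch in enumerate(score_pitches):
        phase = i / max(n - 1, 1)  # 0.0 at the first note, 1.0 at the last
        # Score-dependent part: a phrase arch that peaks mid-phrase.
        # Latent part: an overall loudness offset.
        velocity = 64 + 20 * loudness + 15 * math.sin(math.pi * phase)
        velocity = max(1, min(127, round(velocity)))
        # Latent part: how strongly the playing slows toward the phrase end.
        timing_dev = 0.05 * rubato * phase ** 2
        performance.append((pitch, velocity, timing_dev))
    return performance


def sample_latent(rng):
    """Draw a latent vector from a standard normal prior (assumption)."""
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


if __name__ == "__main__":
    rng = random.Random(0)
    score = [60, 62, 64, 65, 67]
    for _ in range(3):
        print(render(score, sample_latent(rng)))
```

With the latent vector fixed, the output is fully determined by the score; sampling a new vector for each run is what produces the variety of expressions.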
Generating an Expressive Piano Performance
When a score is given to the AI, the AI analyzes the score and outputs data that includes the inferred performance expression. This data can be played on a piano with an automatic performance function, a digital piano, or music software on a personal computer. If the data is played on a Disklavier *1 hybrid piano, you can experience a particularly immersive performance as if a person were right there performing it.
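For playback, note-level output has to become a time-ordered stream of key actions. The sketch below shows one way this could be done, assuming each inferred note is a hypothetical `(pitch, press_sec, release_sec, velocity)` tuple; pedal movements are left out, and the actual output format is not specified here.

```python
def to_key_events(notes):
    """Turn (pitch, press_sec, release_sec, velocity) tuples into a sorted
    list of (time_sec, kind, pitch, velocity) key events."""
    events = []
    for pitch, press, release, velocity in notes:
        if release <= press:
            raise ValueError("a key must be released after it is pressed")
        events.append((press, "press", pitch, velocity))
        events.append((release, "release", pitch, 0))
    # Sort by time. At identical times, releases come first so that a
    # repeated note is released before it is struck again.
    events.sort(key=lambda e: (e[0], 0 if e[1] == "release" else 1))
    return events
```

Ordering releases before presses at equal timestamps is a design choice for this sketch: it keeps a quickly repeated key from being struck while it still counts as held.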
*1 The Disklavier™ is a hybrid piano equipped with an automatic performance function that accurately reproduces the movements of the keyboard and pedals.
Bringing Pianists Closer Together
By learning to play like an influential pianist, the AI could become a new partner that inspires others in their daily practice and performance. Recently, there has been a rapid increase in the amount of piano score data. The AI may be able to create performances that resemble those of pianists from the past, and this could lead to new inspiration.
Combining this technology with AI music ensemble technology gives anyone the experience of playing with their favorite artists at any time. We hope to offer a new way to enjoy music, in an era where anyone can play with their dream partner, anywhere and at any time.
Towards Understanding Human Creativity
When you hear an unexpected expression in a pianist’s performance, you might be surprised and moved, and feel a strong sense of inspiration. However, it is becoming clear that this kind of unexpected performance is difficult for AI to generate, because it learns patterns and combinations that have a high probability of occurring. In the future, it will be important to understand and model human creativity. As we gain a deeper understanding of humans, we aim to develop AI that keeps us excited to hear what unexpected way it will play next.
Application Examples
Dear Glenn
We exhibited “Dear Glenn”, a project that pursues the possibility of co-creation between AI and humans, at Ars Electronica Festival 2019, a media arts festival.
“Dear Glenn” uses AI performance expression rendering technology to learn and reproduce the performances of Glenn Gould, the legendary pianist who died in 1982. In addition, by fusing this technology with AI music ensemble technology, “Dear Glenn” is able to reproduce Glenn Gould’s touch on the piano and perform ensembles with artists who are connected to him.