Making Self-Expression Accessible for Everyone #1
The Instrument Expanding the Freedom of Music Creators
March 15, 2023
Many people around the world know and love VOCALOID™, the singing voice synthesis software developed by Yamaha. Short for “vocal android”, VOCALOID has made its mark in music culture, with terms like “Vocaloid songs” and “Vocaloid producers” becoming increasingly prevalent on the Internet and in media. Its impact, however, has reached far beyond the field of music. VOCALOID-related characters have starred in games and have inspired a whole host of fan art such as illustrations, manga, and novels. While there is no question that VOCALOID has won the hearts of many, few know that the technology is rooted in the commitment to supporting the musical expressiveness of creators.
Yamaha launched the first VOCALOID engine in 2003, but it wasn’t until its successor VOCALOID2 was released in 2007 that the singing voice technology received widespread attention. What sparked the change was Hatsune Miku, a character developed by Crypton Future Media, Inc. for the VOCALOID2 engine. Personified as a virtual diva with sweet vocals, Hatsune Miku became a phenomenon that contributed to the rapid popularization of VOCALOID. The rise of social media and video-sharing services such as Niconico fueled the movement by making it easier for creators to reach audiences with their music. This is how Vocaloid music creators, dubbed “Vocalo P” (Vocaloid producers) by their fans, went on to dominate the charts with many hit songs.
The latest product, VOCALOID6, was launched in October 2022. What is on the minds of the people behind this ever-evolving technology?
Eliminating Barriers to Vocal Music Production
Computer music, or music produced using computers, progressed dramatically in the 1990s and 2000s, as creators rapidly gained the means to incorporate instrument sounds into their music. Generating natural-sounding vocals, however, remained a major challenge. It was Yamaha’s technological breakthrough that gave birth to VOCALOID, the computer music software that made it possible for anyone to synthesize singing voices.
“Think of it as like having a virtual vocalist inside your computer,” describes Dai Ichikawa, who works on VOCALOID marketing. “By inputting the melody and lyrics, you can make the computer ‘sing’ a song for you. Using this software allows you to write vocal music without having access to a studio.”
Conventional approaches to producing songs with lyrics involve more hurdles than one might expect. For example, your choices for booking a studio or sourcing recording equipment may be limited due to financial and geographical restrictions. The biggest challenge for many is finding the right vocalist. Masafumi Yoshida, who is in charge of the product planning of VOCALOID, says that even if you can sing the song yourself, it can be hard to effectively record music at home. “Getting someone else to sing is equally difficult,” he says. “You might struggle to articulate how exactly you want them to sing, or hesitate to ask for numerous takes. VOCALOID allows you to avoid these problems altogether.” In short, VOCALOID can provide a sense of encouragement for anyone who wants to produce music with vocal parts. It gives everyone a fair chance at song production.
“With the help of VOCALOID, anyone can add vocals to their music,” Yoshida says. He joined Yamaha mid-career after being referred by Hideki Kenmochi, known as the “father of VOCALOID.” One of Yoshida’s previous jobs involved setting up audio equipment at event venues. He was committed to designing the acoustics in ways that would enhance the experience of everyone in the room. Although his current role as a product planner of VOCALOID is completely different from that of a sound engineer, his enthusiasm to support other people’s creative activities remains unchanged.
Richer Options for Greater Creative Freedom
The latest VOCALOID6 software allows for more freedom in vocal expressions than ever before. A key technological breakthrough in achieving this was VOCALOID:AI — a new vocal synthesis engine that leverages machine learning. Now users can create natural, richly expressive vocals thanks to AI, which has learned a broad range of human vocal characteristics.
Several other features were added to VOCALOID6 to widen the spectrum of vocal expressions that creators can generate. For example, the VOCALO CHANGER function uses VOCALOID:AI technology to apply the characteristics of the creator’s own singing voice onto the voicebanks (voices that can be used in the program). Meanwhile, the multilingual feature facilitates the creation of songs that mix Japanese, English, and Chinese lyrics. “With previous versions of VOCALOID, voicebanks were language-specific,” Yoshida explains. “For example, if you wanted to make a Japanese voice sing English lyrics, you would have to piece together Japanese sounds to replicate the English pronunciation. It was a challenging and time-consuming process to adjust the small nuances to make it sound natural.” With VOCALOID6, however, the voicebanks can already sing naturally in multiple languages. “I think the creative freedom is empowering, especially as music becomes more and more intercultural,” says Yoshida.
Unbound by linguistic constraints and equipped with new expressive choices, users now have greater freedom to experiment with their music. “VOCALOID6 has significantly expanded the possibilities for creators,” Ichikawa says. He himself has been making computer music and DJ-ing since his teenage years. He joined Yamaha to fulfill his ambition of working in the music industry, and now, in turn, he supports creators in achieving their visions. “I feel passionate working with VOCALOID and synthesizers because they are both instruments that I hold dear to my heart,” he says.
Fine-Tuning to Give Leeway for Self-Expression
Although VOCALOID provides an equal starting line for all aspiring song producers, Yoshida and Ichikawa emphasize that the technology is not intended to “simplify music production.”
“Just like the timbre of any musical instrument differs from player to player, we want VOCALOID to allow the individuality of each creator to shine through,” says Ichikawa. He feels that this is an important stance for Yamaha to maintain as a musical instrument manufacturer. Yoshida agrees, saying, “VOCALOID is the same as any other instrument. It wouldn’t be as engaging if it sounded the same no matter who used it.”
In 2019, Yamaha drew attention by reproducing the singing voice of the late Japanese legend Hibari Misora, using its new vocal synthesis engine VOCALOID:AI. The buzz led the team to wonder whether the same technology should be incorporated into the then-latest software VOCALOID5. However, the idea was ultimately dropped because it would mean limiting creators to the specific vocal style of the singer. Yoshida explains, “The AI was great at faithfully recreating a particular singer’s vocal style, but you can't really call it a ‘musical instrument’ if the person using it can’t make it sing the way they want it to.”
Yamaha continued to focus on evolving the technology as an instrument, one that gives creators more flexibility in their music, and in October 2022 it released VOCALOID6, equipped with a completely different VOCALOID:AI engine. Yoshida says, “VOCALOID is fundamentally a singing instrument. We strived to get the tuning right so that the system leaves room for each creator to express their original style and ideas.”
The VOCALOID6 voicebanks are also designed in a way that stimulates users’ imagination. While previous voicebanks each had a distinct character attached to them, VOCALOID6 voicebanks are more open to interpretation. The only information associated with each voice is a name, a color, and a silhouette. While the team designed the vocal timbre of each voicebank with intense care, they deliberately avoided adding any personality or character attributes to them. Ichikawa says, “Some users have come up with original characters based on the singing voice of the virtual vocalists. It makes me happy to see our products inspiring people’s creativity.”
Every choice the VOCALOID team makes is fueled by the commitment to give creators more freedom in their self-expression. This not only requires advanced technology, but also a deep understanding of creators and creativity. Like VOCALOID, many of Yamaha’s products and solutions have been born through the exquisite fusion of technology and human senses. In the second part of this series, we tell the story of Daredemo Piano (Auto-Accompanied Piano), another example of such initiatives. Stay tuned.
(Interview date: October 2022)
Yoshida is in charge of planning in the Audio Contents Group, Digital Musical Instruments Development Department. His career has ranged from selling PA systems to working in sound engineering and even testing endoscopy systems. Yoshida joined Yamaha Corporation mid-career in 2008 after being referred by Hideki Kenmochi, the father of VOCALOID. He now develops VOCALOID voicebanks as part of his duties in planning.
Ichikawa is a marketer in the Digital Musical Instruments Strategy Planning Group, Digital Musical Instruments Division. In addition to working with synthesizer products, he has been in charge of VOCALOID marketing since 2022, communicating its value to consumers around the world. Having enjoyed computer music production and DJ performances since his teenage years, he developed a desire to pursue a music-related career, which ultimately led him to Yamaha Corporation.
*Bio as of the time of the interview
Three-Part Series: Making Self-Expression Accessible for Everyone
- #1 The Instrument Expanding the Freedom of Music Creators
- #2 The Universal Piano that Takes Just One Finger to Play
- #3 Musical Instruments That Set Your Creativity Free