Sound source separation is a signal processing technology that removes only the sounds you don’t want or extracts only the sounds you do want. Noise suppression and dereverberation technologies suppress unwanted sounds, such as fan noise picked up by the microphone and reverberation that makes voices difficult to hear. These technologies can emphasize specific sounds or suppress unwanted ones in specific scenarios, and they work with a single microphone as well as with multiple microphones.
Music is made up of many sounds mixed together. Vocals, guitars, bass, drums, and other parts are carefully balanced to create a single piece of music. For example, by removing only the vocals from a CD recording, you can sing in place of the vocalist of your dreams.
Air conditioners and projectors have fans that generate airflow. In a remote conference, this “stationary noise” is picked up by the microphone as unwanted sound. Noise suppression technology suppresses this noise so that the desired sound is delivered clearly.
Room reverberation can enrich music, but it can also make speech difficult to hear. Suppressing the reverberation makes it possible to convey the desired sound clearly. Conversely, extracting the reverberation makes it possible to create surround components, a process called upmixing.
There are two types of sound source separation: one is based on multi-channel signal processing, and the other works on monaural (single-channel) sound signals. These source separation techniques, as well as noise suppression and dereverberation for monaural sound signals, are realized with statistical signal processing in the frequency domain.
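To make "statistical signal processing in the frequency domain" more concrete, the following is a minimal sketch of single-channel stationary noise suppression. It is not the method used in the technology described here. It is a textbook-style Wiener-like gain, shown under two assumptions: the first few frames of the recording contain only noise, and the noise statistics stay the same afterwards. The function name `suppress_stationary_noise` and its parameters (`noise_frames`, `frame_len`, `floor`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def suppress_stationary_noise(x, noise_frames=20, frame_len=512, floor=0.05):
    """Attenuate stationary noise in a mono signal (illustrative sketch).

    Assumption: the first `noise_frames` frames of `x` contain noise only.
    """
    hop = frame_len // 2
    n = np.arange(frame_len)
    # Square-root periodic Hann window, applied at analysis and synthesis.
    # The product is a Hann window, which sums to 1 at 50% overlap.
    window = np.sqrt(0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len))

    # Short-time Fourier transform: windowed frames -> spectra.
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(num_frames)])
    spec = np.fft.rfft(frames, axis=1)
    power = np.abs(spec) ** 2

    # Statistical noise model: average noise power per frequency bin,
    # estimated from the leading (assumed noise-only) frames.
    noise_power = power[:noise_frames].mean(axis=0)

    # Wiener-like gain per time-frequency bin, limited from below by `floor`
    # so that bins are never zeroed out completely.
    gain = np.maximum(1.0 - noise_power / np.maximum(power, 1e-12), floor)

    # Apply the gain, return to the time domain, and overlap-add the frames.
    cleaned = np.fft.irfft(spec * gain, n=frame_len, axis=1) * window
    out = np.zeros(len(x))
    for i in range(num_frames):
        out[i * hop:i * hop + frame_len] += cleaned[i]
    return out


# Synthetic example: 0.5 s of noise only, then a 440 Hz tone in the same noise.
fs = 16000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
clean = np.where(t >= 0.5, 0.5 * np.sin(2 * np.pi * 440 * t), 0.0)
noisy = clean + 0.05 * rng.standard_normal(fs)
enhanced = suppress_stationary_noise(noisy)
```

The gain is close to 1 in time-frequency bins dominated by the wanted sound and close to `floor` in bins that look like the estimated noise. This is why the approach suits steady fan-like noise and is less suited to noise that changes quickly. The very first and last half-frames are not fully reconstructed by the overlap-add, so a practical implementation would pad the signal.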