I’m interested in musical source separation from a single-channel signal. I have a perhaps unusual question: suppose we have old mono recordings, made with the recording quality of their day, of virtuosi playing unique instruments with particular techniques. Is it possible to separate out a particular track and reconstruct it so that its quality matches that of today's recordings?
As I'm quite new to audio signal processing, what I know so far is that there are three possible approaches to this problem.
Using pitch
Pitch is usually the first cue that people use to tell different sound sources apart. Many solo unaccompanied compositions, especially for melodic instruments, exploit this to make listeners believe that one instrument is playing several parts at once. Perhaps that is how we group auditory components in the forward direction. To separate an individual source by pitch, we invert this perception: if we know the pitch, we group the auditory components associated with it as coming from the same source. However, this leads to another problem: how can we know the right pitch?
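To make the grouping idea concrete, here is a minimal sketch in Python of what I have in mind, assuming the pitch is already known. The function names, the two synthetic "sources" and their harmonic amplitudes are all made up for illustration. The frequencies are chosen to fit a whole number of cycles into the analysis window, which real recordings will not do.

```python
import math

def harmonic_component(signal, sample_rate, f0, n_harmonics):
    """Return the part of `signal` lying on the harmonics f0, 2*f0, ...

    Each harmonic is estimated by correlating the signal with a cosine
    and a sine at that frequency (a single-frequency Fourier projection).
    """
    n = len(signal)
    out = [0.0] * n
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / sample_rate
        a = 2.0 / n * sum(x * math.cos(w * i) for i, x in enumerate(signal))
        b = 2.0 / n * sum(x * math.sin(w * i) for i, x in enumerate(signal))
        for i in range(n):
            out[i] += a * math.cos(w * i) + b * math.sin(w * i)
    return out

def tone(f0, amps, sample_rate, n):
    """Synthetic harmonic tone; amps[k] is the amplitude of harmonic k+1."""
    return [
        sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * i / sample_rate)
            for k, a in enumerate(amps))
        for i in range(n)
    ]

# Hypothetical test setup: 0.1 s at 8 kHz, two tones with different pitches.
SR, N = 8000, 800
source_a = tone(200.0, [1.0, 0.5, 0.25], SR, N)
source_b = tone(310.0, [0.8, 0.4, 0.2], SR, N)
mixture = [a + b for a, b in zip(source_a, source_b)]

# Group everything on the harmonics of the known pitch (200 Hz).
estimate_a = harmonic_component(mixture, SR, 200.0, 5)
max_error = max(abs(e - s) for e, s in zip(estimate_a, source_a))
print("largest reconstruction error:", max_error)
```

The separation works in this toy case only because the tones are perfectly steady and no harmonics of the two sources coincide. With vibrato, noise, or overlapping harmonics, I would expect the grouping to be much less clean.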
Using frequency structure
Two instruments playing the same note can still be heard as two different sources. This is the textbook example used to illustrate the property of musical sound known as timbre. Many factors make musical instruments produce different timbres. However, the frequency structure alone, that is, the relative strengths of the harmonics, may be enough to describe some differences between instrument timbres, as Helmholtz’s rule of timbre suggests.
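As a rough illustration of describing timbre by frequency structure, the following Python sketch measures the relative strength of each harmonic for two synthetic tones at the same pitch. The names `harmonic_profile`, `bright` and `mellow`, and the amplitude values, are my own assumptions and are not measurements of real instruments.

```python
import cmath
import math

def harmonic_profile(signal, sample_rate, f0, n_harmonics):
    """Magnitudes of the first harmonics of f0, scaled so the largest is 1."""
    n = len(signal)
    mags = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / sample_rate
        coeff = sum(x * cmath.exp(-1j * w * i) for i, x in enumerate(signal))
        mags.append(2.0 * abs(coeff) / n)
    peak = max(mags)
    return [m / peak for m in mags]

def tone(f0, amps, sample_rate, n):
    """Synthetic harmonic tone; amps[k] is the amplitude of harmonic k+1."""
    return [
        sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * i / sample_rate)
            for k, a in enumerate(amps))
        for i in range(n)
    ]

# Two hypothetical instruments playing the same 220 Hz note.
SR, N, F0 = 8000, 800, 220.0
bright = tone(F0, [1.0, 0.8, 0.6, 0.4], SR, N)
mellow = tone(F0, [1.0, 0.3, 0.1, 0.03], SR, N)

profile_bright = harmonic_profile(bright, SR, F0, 4)
profile_mellow = harmonic_profile(mellow, SR, F0, 4)

# Euclidean distance between the two profiles as a crude timbre difference.
timbre_distance = math.sqrt(
    sum((p - q) ** 2 for p, q in zip(profile_bright, profile_mellow))
)
print(profile_bright, profile_mellow, timbre_distance)
```

Both tones have the same pitch, yet their harmonic profiles differ clearly. This captures only the steady-state spectrum and ignores the other factors that shape timbre.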
Using timing structure
Clearly, we compose music with respect to rhythm, or timing synchronization. We can easily separate instrument sources when they are played at different times (at least, I can). Another trick for making us perceive several distinct melody lines in polyphonic music is to give each line its own timing pattern, which is common practice in polyphonic composition. Hence, if we had a reliable technique for segmenting the timing structure of music, we could use this cue to separate the sources.
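The simplest form of timing segmentation I can think of is onset detection: mark the moments where the signal energy rises sharply. Below is a minimal Python sketch under that assumption. The frame length, the threshold and the two synthetic decaying notes are arbitrary choices for illustration, and this is not meant as a robust method.

```python
import math

def frame_energies(signal, frame_len):
    """Energy of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in signal[s:s + frame_len])
        for s in range(0, len(signal) - frame_len + 1, frame_len)
    ]

def detect_onsets(signal, sample_rate, frame_len=200, rel_threshold=0.1):
    """Times (in seconds) of frames whose energy jumps above the previous one.

    A frame counts as an onset when its energy exceeds the previous frame's
    by more than `rel_threshold` times the largest frame energy.
    """
    energies = frame_energies(signal, frame_len)
    threshold = rel_threshold * max(energies)
    onsets = []
    for i in range(1, len(energies)):
        if energies[i] - energies[i - 1] > threshold:
            onsets.append(i * frame_len / sample_rate)
    return onsets

def add_note(signal, sample_rate, start, length, freq, amp):
    """Add a decaying sinusoid (a crude plucked note) into `signal` in place."""
    for i in range(length):
        t = i / sample_rate
        signal[start + i] += (
            amp * math.exp(-10.0 * t) * math.sin(2.0 * math.pi * freq * t)
        )

# One second of silence at 8 kHz with two notes starting at 0.25 s and 0.6 s.
SR = 8000
signal = [0.0] * SR
add_note(signal, SR, start=2000, length=1600, freq=440.0, amp=1.0)
add_note(signal, SR, start=4800, length=1600, freq=660.0, amp=0.8)

onsets = detect_onsets(signal, SR)
print("detected onsets (s):", onsets)
```

Here the notes do not overlap and start exactly on frame boundaries, so the energy jump is easy to spot. In real polyphonic music the onsets of one line are often masked by the sustained sound of another, which is exactly where I expect the difficulty to be.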