Particle Swarm Optimization (PSO) in Speech Enhancement

Particle Swarm Optimization (PSO) is an optimization algorithm, first introduced in 1995, that grew out of a study of the flocking behavior of birds. In PSO, we have “particles” which move in a semi-random manner through the search space, looking for the optimum value of a function ƒ. At each iteration, the algorithm evaluates ƒ at the position of every particle and then updates each particle’s position based on its own history and the history of the entire group, typically the best position that particle has found so far and the best position found by the swarm as a whole. Over time, the particles end up swarming around the optimum of ƒ. There are many variants of PSO, such as Improved Particle Swarm Optimization (IPSO) and Modified Particle Swarm Optimization (MPSO). These are designed to increase convergence speed, avoid becoming trapped in a local minimum, or reduce the computational complexity of the solution.
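The update described above can be sketched in a few lines of Python. This is a minimal illustration of a basic global-best PSO, not any particular published variant. The function name `pso_minimize`, the inertia and acceleration coefficients, the search bounds, and the test function `sphere` are all assumptions chosen for the example, and the sketch treats the optimum as a minimum.

```python
import random


def pso_minimize(f, dim, bounds=(-5.0, 5.0), n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize f over a box with a basic global-best particle swarm.

    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random starting positions, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle's own history: the best position it has visited.
    best_pos = [p[:] for p in pos]
    best_cost = [f(p) for p in pos]
    # The group's history: the best position any particle has visited.
    g = min(range(n_particles), key=lambda i: best_cost[i])
    swarm_pos, swarm_cost = best_pos[g][:], best_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Semi-random pull toward the personal and swarm bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (swarm_pos[d] - pos[i][d]))
                # Move the particle and keep it inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            cost = f(pos[i])
            if cost < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], cost
                if cost < swarm_cost:
                    swarm_pos, swarm_cost = pos[i][:], cost
    return swarm_pos, swarm_cost


def sphere(x):
    """Illustrative test function with its minimum at (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


best_x, best_f = pso_minimize(sphere, dim=2)
```

Here `best_x` holds the best position found by the swarm and `best_f` the value of the function at that position. Variants such as IPSO and MPSO change details of this loop, which the sketch does not attempt to reproduce.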

Speech enhancement is the process of modifying an audio signal to improve its speech quality. This is particularly challenging because the final judge of speech quality is the human ear, which is subjective. As a result, a modification that improves the speech quality of one specific signal may actually reduce the quality of another. Despite this, several general techniques, such as echo cancellation, signal separation, and noise reduction, can improve the speech quality of a signal.

PSO can improve the noise reduction technique known as two-channel noise reduction. In this method, we have the speech signal to be enhanced in one channel, and a second channel with only noise in it. The noise-only channel is used to adaptively design a filter that removes the noise, and this filter is then applied to the channel that contains the speech. PSO can increase the speed at which the optimal noise reduction filter is found. In addition, PSO-based speech enhancement can provide more noise reduction than traditional methods that rely on gradient-based algorithms, such as the Normalized Least Mean Squares (NLMS) approach. This is due to PSO’s greater resilience to becoming trapped in local minima, compared with gradient-based approaches.
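To make the two-channel arrangement concrete, the following Python sketch shows the gradient-based NLMS baseline named above, together with a cost function that a PSO could minimize over the filter weights instead. It assumes the common adaptive noise canceller layout, in which the filtered noise-only channel is subtracted from the speech channel. The function names `nlms_cancel` and `residual_power`, the tap count, the step size, and the synthetic signals are all assumptions made for illustration. The sketch does not attempt to reproduce the PSO versus NLMS comparison, since for a linear FIR filter like this one the mean squared error surface has a single minimum.

```python
import math
import random


def nlms_cancel(primary, reference, n_taps=4, mu=0.1, eps=1e-8):
    """Two-channel noise canceller driven by NLMS.

    primary   : speech plus noise (the channel to be enhanced)
    reference : noise only
    Returns (enhanced signal, final filter weights).
    """
    w = [0.0] * n_taps
    buf = [0.0] * n_taps  # most recent reference samples, newest first
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        noise_est = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - noise_est  # the error is also the enhanced output sample
        norm = eps + sum(b * b for b in buf)
        # Normalized gradient step on the squared error.
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out, w


def residual_power(w, primary, reference):
    """Mean squared output for a fixed filter w (a candidate PSO cost)."""
    buf = [0.0] * len(w)
    total = 0.0
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        e = d - sum(wi * bi for wi, bi in zip(w, buf))
        total += e * e
    return total / len(primary)


def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Synthetic data (illustrative only): a sine tone stands in for speech, and
# the noise reaches the speech channel through a hypothetical 2-tap path.
rng = random.Random(0)
n = 4000
reference = [rng.gauss(0.0, 1.0) for _ in range(n)]
speech = [0.5 * math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
path = [0.8, -0.3]
primary = [speech[k] + path[0] * reference[k]
           + (path[1] * reference[k - 1] if k > 0 else 0.0) for k in range(n)]

enhanced, w = nlms_cancel(primary, reference)

# Compare remaining noise over the second half, after the filter has adapted.
tail = slice(n // 2, n)
noise_before = mse(primary[tail], speech[tail])
noise_after = mse(enhanced[tail], speech[tail])
```

In a PSO-based version, each particle’s position would be a candidate weight vector `w`, and `residual_power` would play the role of the function ƒ to be minimized.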

VOCAL Technologies, Ltd.
520 Lee Entrance, Suite 202
Amherst, New York 14228
Phone: +1-716-688-4675
Fax: +1-716-639-0713
Email: sales@vocal.com