## Maximum-Likelihood (ML) Algorithm

The fundamental mathematical tool for parameter estimation is the maximum-likelihood (ML) algorithm. Its computational cost is usually the main reason engineers avoid it in practice. However, it is worthwhile to review ML, because it lies at the root of all parameter estimation problem formulations.
Given independent observation vectors $x_1, x_2, \cdots, x_K$, for example, the outputs of a microphone array, the joint probability density can be assumed Gaussian,

$f\left(x_1, x_2, \cdots, x_K\right)=\prod_{k=1}^K\frac{1}{\pi^N\left|R_x\right|}e^{-x_k^T R_x^{-1} x_k}$

The parameters, for example, the angles of arrival $\Phi_k$, are hidden in the observation vectors $x_1, x_2, \cdots, x_K$. The maximum-likelihood estimate is therefore the set of $\Phi_k$ that maximizes the log-likelihood of the observed $x_1, x_2, \cdots, x_K$,

$\ln f\left(x_1,x_2,\cdots, x_K\right) = -NK\ln\pi-K\ln\left|R_x\right|-\sum_{k=1}^K x_k^T R_x^{-1} x_k$.
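The log-likelihood above can be evaluated numerically. The following is a minimal sketch, assuming NumPy and assuming the snapshots are stored as the columns of a matrix; the function name `gaussian_log_likelihood` and the random test data are hypothetical. The code uses the conjugate transpose in the quadratic form, which reduces to $x_k^T$ for real-valued data.

```python
import numpy as np

def gaussian_log_likelihood(X, R):
    """Log-likelihood of the K columns of X (each of length N) under a
    zero-mean Gaussian with covariance R, following the expression above."""
    N, K = X.shape
    # log|R| computed stably; R is assumed Hermitian positive definite
    _, logdet = np.linalg.slogdet(R)
    # sum over k of x_k^H R^{-1} x_k, without forming R^{-1} explicitly
    quad = np.real(np.sum(np.conj(X) * np.linalg.solve(R, X)))
    return -N * K * np.log(np.pi) - K * logdet - quad

# Hypothetical data: K = 200 snapshots of length N = 3 with identity covariance
rng = np.random.default_rng(0)
X = (rng.standard_normal((3, 200)) + 1j * rng.standard_normal((3, 200))) / np.sqrt(2)

ll_true = gaussian_log_likelihood(X, np.eye(3))          # covariance used to draw X
ll_wrong = gaussian_log_likelihood(X, 10.0 * np.eye(3))  # badly mismatched covariance
```

With enough snapshots, the covariance that generated the data should score a higher log-likelihood than a badly mismatched one, which is the property the ML estimator exploits.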

For a uniform linear array,

$X=AS+N$

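where $X = [x_1(t), x_2(t), \dots , x_M(t)]$ collects the sensor outputs, $S = [s_1(t), s_2(t), \dots , s_N(t)]$ collects the source signals, and
$A=\left[a_1\left(\theta_1\right),a_2\left(\theta_2\right),\cdots,a_N\left(\theta_N\right)\right]$ contains one steering vector per source.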
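To make the array model concrete, here is a minimal simulation sketch, assuming NumPy. Several details are assumptions not stated in the text: half-wavelength element spacing, the sign convention of the steering-vector phase, complex Gaussian sources and noise, and the helper names `steering_vector` and `simulate_snapshots`. Snapshots are stored as columns, so `X` has one row per sensor.

```python
import numpy as np

def steering_vector(theta, M, spacing=0.5):
    """Steering vector of an M-element uniform linear array for arrival
    angle theta (radians). spacing is in wavelengths (0.5 is assumed)."""
    m = np.arange(M)
    return np.exp(-2j * np.pi * spacing * m * np.sin(theta))

def simulate_snapshots(thetas, M, K, noise_std=0.1, seed=0):
    """Draw K snapshots of X = A S + N for sources arriving from thetas."""
    rng = np.random.default_rng(seed)
    n_src = len(thetas)
    # One steering vector per source, stacked as columns (M x n_src)
    A = np.column_stack([steering_vector(t, M) for t in thetas])
    # Unit-variance complex Gaussian source signals (n_src x K)
    S = (rng.standard_normal((n_src, K))
         + 1j * rng.standard_normal((n_src, K))) / np.sqrt(2)
    # Additive complex Gaussian sensor noise (M x K)
    noise = noise_std * (rng.standard_normal((M, K))
                         + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    return A @ S + noise, A, S

# Two hypothetical sources at 10 and -25 degrees, 8 sensors, 100 snapshots
X, A, S = simulate_snapshots(np.deg2rad([10.0, -25.0]), M=8, K=100)
```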

Our goal is to obtain the best estimates of the angles $\theta_n$ in a minimum mean-square sense.

Maximizing the likelihood is equivalent to minimizing the following,

$\sum_{k=1}^K(x_k-AS)^T R_x^{-1}(x_k-AS)$,

and the minimum yields the optimal $A$,

$A=S^+X^H$

where $S^+$ is the pseudoinverse of $S$ and $X^H$ is the Hermitian transpose of $X$.
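The pseudoinverse solution can be checked numerically. The sketch below assumes NumPy and assumes the stacking convention implied by the formula, namely that $X^H \approx SA$ with $S$ a tall matrix of full column rank. The sizes, the random data, and the name `estimate_A` are hypothetical.

```python
import numpy as np

def estimate_A(S, XH):
    """Least-squares estimate A = S^+ X^H, the minimizer of ||X^H - S A||_F."""
    return np.linalg.pinv(S) @ XH

rng = np.random.default_rng(1)
K, n_src, M = 40, 2, 6  # hypothetical numbers of snapshots, sources, sensors

S = rng.standard_normal((K, n_src)) + 1j * rng.standard_normal((K, n_src))
A_true = rng.standard_normal((n_src, M)) + 1j * rng.standard_normal((n_src, M))
noise = 0.01 * (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M)))
XH = S @ A_true + noise  # plays the role of X^H under the assumed convention

A_hat = estimate_A(S, XH)
# Cross-check against NumPy's general least-squares solver
A_ref = np.linalg.lstsq(S, XH, rcond=None)[0]
```

For a full-column-rank `S`, `A_hat` and `A_ref` should agree to numerical precision, and both should be close to `A_true` when the noise is small.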

Substituting the estimated $A$ into the objective function gives

$\sum_{k=1}^K(x_k-AS)^T R_x^{-1}(x_k-AS) = \left\|(I-SS^+)X^H\right\|_F^2$

We can see that $(I-SS^+)$ is the projection matrix onto the noise subspace. Therefore, the expression above gives the residual power of the received signal vectors projected onto the noise subspace.
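The projection property can be verified numerically. This is a minimal sketch, assuming NumPy, a tall full-column-rank $S$, and random test data; the helper names `noise_projector` and `residual_power` are hypothetical. The residual power is computed as the squared Frobenius norm.

```python
import numpy as np

def noise_projector(S):
    """I - S S^+, the projector onto the orthogonal complement of the
    column space of S."""
    K = S.shape[0]
    return np.eye(K) - S @ np.linalg.pinv(S)

def residual_power(S, XH):
    """Squared Frobenius norm of X^H after projection by I - S S^+."""
    return np.linalg.norm(noise_projector(S) @ XH, "fro") ** 2

rng = np.random.default_rng(2)
S = rng.standard_normal((30, 2)) + 1j * rng.standard_normal((30, 2))
B = rng.standard_normal((2, 5)) + 1j * rng.standard_normal((2, 5))
E = 0.1 * (rng.standard_normal((30, 5)) + 1j * rng.standard_normal((30, 5)))

P = noise_projector(S)
clean = residual_power(S, S @ B)      # data lying entirely in the span of S
noisy = residual_power(S, S @ B + E)  # same data with additive noise
```

A projector is idempotent and Hermitian, and it annihilates `S` itself, so `clean` should be zero up to rounding while `noisy` retains only the noise component outside the span of `S`.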