Algorithm: Truncated Wirtinger Flow (TWF)

Motivation

Under a stochastic noise model with independent samples, a first impulse for solving y_i approx |langle a_i, x rangle|^2 is to seek the maximum likelihood estimate (MLE), namely,

 	min_{z} quad - sum_{i=1}^{m}ell_i (z),

where ell_i (z) denotes the log-likelihood of z given y_i. For instance, under the Poisson noise model y_i sim Poisson(|langle a_i, xrangle|^2), 1leq ileq m, one has

 	 ell_i  (z) = y_i log(|a_i^{*}z|^2) 				 - |a_i^{*}z|^2

If we follow the Wirtinger flow 1 approach or other gradient descent paradigms, we would proceed as

 	z^{(t+1)}= z^{(t)} + frac{mu_{t}}{m} sum_{i=1}^{m} nabla ell_i ( z^{(t)} )

for some appropriate initial guess z^{(0)}, where nabla ell_i( z ) = 2frac{y_i - |a_i^* z|^2}{z^* a_i} a_i denotes the Wirtinger derivative (or ordinary gradient for the real case). Unfortunately, this approach does not work for real-valued case, since some of the gradient components nabla_i ell(z) are abnormally large.

alt text 

Figure 1: the locus of - frac{1}{2}nabla ell_i(z) for all unit vectors a_i.

TWF methodology

TWF is a novel non-convex procedure that adopts a more subtle gradient flow, which proceeds in two stages:

(1) Truncated Spectral Initialization: compute an initial guess z^{(0)} by means of a spectral method applied to a subset of the observations {y_i} obeying


 

Estimate after 50 truncated power iterations

(2) Truncated Gradient Flow: for i=1: T,

 	z^{(t+1)}= z^{(t)} + frac{mu_{t}}{m} sum_{iin S_{t+1}} nabla ell_i ( z^{(t)} )

for some adaptive index set S_{t+1} determined by z^{(t)}; i.e. for any iin S_t,


In words, the adaptive subset S_{t+1} guarantees that both a_i^* z^{(t)} and y_i - |a_i^{*}z^{(t)}|^2 take typical values, and hence none of the gradient component nabla ell_i(z^{(t)}) are abnormally large. Here, the step size mu_t is either chosen to be a constant or determined by a backtracking line search.

 

Estimate after 50 TWF gradient iterations

Detailed algorithmic procedure

By default, the step size and the truncation thresholds are set to be mu_tequiv 0.2, alpha_z^{lb}=0.3, alpha_z^{ub}=5, alpha_h=5, and alpha_y=3.

alt text