From echo cancellation to price prediction

Echo cancellation is a technique to remove acoustic echo from audio signal. The procedure is generally applicable to any signal, for instance price time-series of an asset. This post proposes that the price of an asset at any moment can be seen as the echo of its prices at some previous moments. It then explains how price prediction problem can be formulated and solved using echo cancellation.

System Identification

System identification is the process of identifying an unknown system through observing its input and its output. The unknown system in this context is assumed to be linear time-invariant and is specified by a finite impulse response (FIR). It is here used to predict an asset price by looking into its price history. The FIR takes the price history as the input and generates an estimation future price. The block diagram of the price predictor is depicted below. The system is identified through an adaptive process: it starts with an initial FIR and adapts it every time a new price tick comes.

In each adaptive process the FIR is updated as little as possible such that if the price history is applied to the new FIR, it still generates the exact price. Let \(L\) be the length of price history and \(p(n)\) be the price tick at time point \(n\) then

\begin{equation*} \boldsymbol{p}(n)=\Big(p(n),p(n-1),\dotsc,p(n-L+1)\Big)^T \end{equation*}

is a vector containing \(L\) recent price ticks. The objective is to find impulse response

\begin{equation*} \boldsymbol{h}(n)=\Big(h_1(n),h_2(n),\dotsc,h_L(n)\Big)^T \end{equation*}

such that it is as close as possible to the previous FIR, i.e. \(\boldsymbol{h}(n-1)\), and at the same time perfectly predicts the next price tick, i.e. \(\boldsymbol{h}^T(n)\boldsymbol{p}(n)=p(n+1)\). This can be formulated as the following optimization problem:

\begin{equation} \label{eq:norm2} \min\quad\Big\|\boldsymbol{h}(n)-\boldsymbol{h}(n-1)\Big\|_2^2 \end{equation}

subject to

\begin{equation} \label{eq:pred} \boldsymbol{h}^T(n)\boldsymbol{p}(n)=p(n+1). \end{equation}

Using the method of Lagrange multiplier, it can be shown that

\begin{equation} \label{eq:raw:nlms} \boldsymbol{h}(n)=\boldsymbol{h}(n-1)+\frac{e(n)\boldsymbol{p}(n)}{\boldsymbol{p}^T(n)\boldsymbol{p}(n)} \end{equation}

where

\begin{equation} \label{eq:prederr} e(n)=p(n+1)-\boldsymbol{h}^T(n-1)\boldsymbol{p}(n) \end{equation}

This adaptive approach can be used to predict the price of an asset. The quality of prediction can be assessed by its convergence measured by \(\eqref{eq:norm2}\) and the prediction error estimated by \(\eqref{eq:prederr}\). When both of the criteria are small, the prediction is probably precise.

Relation to echo cancellation

Echo cancellation is the process of removing echo in a signal, for example removing acoustic echo in a telephone call. A popular technique to cancel acoustic echo is to identify echo via system identification, the same approach which is used in this post for price prediction. To turn our price predictor to an echo canceler, it is only enough to fill \(\boldsymbol{p}(n)\) vector with the last \(L\) audio samples coming from the speaker instead of price ticks and replace \(p(n+1)\) with microphone signal in \(\eqref{eq:pred}\). Then, \(\eqref{eq:raw:nlms}\) identifies the FIR of an acoustic echo canceler. This approach to echo cancellation is referred to as normalized least mean square method, which will be discussed in the next section. That is to say, the system which predicts the current price actually produces an echo of past prices that best matches the current priced.

Normalized least mean square

A price predictor, an echo canceler, or any other unknown system can be identified by normalized least mean square (NLMS) method. The procedure follows the computation given by \(\eqref{eq:raw:nlms}\) and \(\eqref{eq:prederr}\). It takes the signal value at each time point, computes the predicted system output, compares it with the real output to estimate error and then accordingly updates the impulse response. NLMS however adds two parameters to have finer control over recursion \(\eqref{eq:raw:nlms}\). How much the FIR is updated is also controlled by step size (\(\alpha\)) and regularization parameter (\(\delta\)).

\begin{equation} \label{eq:nlms} \boldsymbol{h}(n)=\boldsymbol{h}(n-1)+\frac{\alpha e(n)\boldsymbol{p}(n)}{\boldsymbol{p}^T(n)\boldsymbol{p}(n)+\delta} \end{equation}

Bigger step size causes \(\eqref{eq:nlms}\) to converge faster, likely with higher error estimation error. To decrease error, smaller step size should be taken, which subsequently requires more iterations until convergence. For numerical stability, \(\alpha\) must lie between 0 and 2. Regularization parameter \(\delta\) helps system identification performing well in case the signal is noisy.

Python implementation

The implementation of NLMS algorithm on price data is straightforward. It is enough to update the impulse response iteratively according to \(\eqref{eq:nlms}\). The algorithm has three parameters:

\(L\) is the length of price history
\(\alpha\) is the step size
\(\delta\) is the regularization parameter

These parameters are passed while constructing an object of system_identifier. This class provides system identification functionality.

In this class, update_filter takes the new price value, computes the error and updates the FIR via \(\eqref{eq:prederr}\) and \(\eqref{eq:nlms}\), respectively. At the end it returns the predicted price and the absolute value of update vector i.e.

\begin{equation} \label{eq:h:diff} \Big\|\boldsymbol{h}(n)-\boldsymbol{h}(n-1)\Big\|_2^2 \end{equation}

This is an indicator of convergence and together with the prediction error can be used to estimate the quantity of prediction.

To evaluate the algorithm, I use it to predict minutely price of bitcoin from 01.09.2021 to 12.09.2021. After running the algorithm, it takes some time for it to converge. Convergence means the estimated FIR does not fluctuate or change rapidly and thereby is stable for price prediction. The top diagram shows the convergence that is estimated by computing the distance between consecutive FIR as specified by \(\eqref{eq:h:diff}\). The lower values imply that the predicted FIR only needs to change very slightly. However, to judge the quality of convergence, we should look at the relative error in the bottom diagram. When both of these values are relatively small, the quality of prediction is good.

Conclusion

This post is proposed an algorithm for asset price prediction based on system identification problem. The algorithm adapts continuously with price change and provides indicators for prediction error and convergence. The indicators jointly can demonstrate how precise the price estimation for the asset is.