Dense Motion Estimation based on Polynomial expansion

Introduction

In this article we will look at dense motion estimation based on polymonial repsentation of image.The polynomial basis representation of the image is obtained by approximating the local neighborhood of image using quadratic polynomial basis.The displacement between adjacent frames can be obtained by equating the coefficients of the basis.

Introduction

This article describes a fast dense optical flow computation algorithm by [4] . In the earlier articles it was seen that a local neighborhood of image can be represented using polynomial basis. Using this representation estimation of dense optical flow is obtained at each point in the image.

2D basis function

Polynomial basis
Assuming how a polynomial transforms under translation using the polynomial expansion coefficients derived from current and previous frames and estimate of displacement vector is obtained.
The idea of polynomial expansion is to approximate a neighborhood of a point in a 2D function with a polynomial.Considering quadratic polynomial basis $1,x^2,y^2,x,y,xy$,pixel values in a neighborhood of image is represented by \[ f(x) ~ x^T A x + b^T x + c \] where A is a symmetric matrix,b is a vector and c is a scalar
The coefficients can be estimated by weighted least square estimate of pixel values about neighborhood as seen in the earlier article. As with all optical flow algorithm,the brightness constancy assumption is made. The brighness of a path of image in adjacent frames is constant.
Consider a translational motion $\mathbf{d}$ ecountered at point $\mathbf{(x,y)}$ in the image. \begin{eqnarray*} \hline \\ & f_1(x) = x^T A_1 x + b_{1}^Tx +c_1 \\ & f_2(x) = f_1(x-d) = (x-d)^T A_1 (x-d) + b_{1}^T(x-d) +c_1 \\ & f_2(x) = f_1(x-d) = (x)^T A_1 (x) + (b_{1}-2A_1d)^T(x) \\ & + d^TAd-b_1^Td+c_1 \\ & f_2(x) = f_1(x-d) = (x)^T A_1 (x) + b_2^T(x) + c_2 \\ & equating coefficients in the two polynomials \\ & Due to brightness constancy assumption \\ & A_2 = A_1 \\ & b_2=(b_{1}-2A_1d) \\ & c_2=d^TAd-b_1^Td+c_1 \\ & Assuming $A$ is non-singular \\ & d=-\frac{1}{2}A^-1(b_2-b_1) \\ \hline \end{eqnarray*} Thus by equating the coefficients of the polynomial the displacement vector can be obtained at each point in the image assuming there is overlap between the region of interest ie image neighborhood in adjacent frames.
Let us say we have the estimate of displacement $\bar{d}$ We extract the ROI about the neighborhood at point $P(x,y)$ and point $P(x+d.x,y+d.y)$ The polynomial basis are extracted and the computation that is shown above is performed.
The total displacement can be estimated as \begin{eqnarray*} \hline \\ &\bar{b}_2=b_{1}-2A_1 (\bar{d}) \\ & \text { we know $\bar{d}$,$A_1$,$b_1$} \\ & d = -0.5*A^{-1} (b_2+\bar{b}_2 - b_1) \\ \hline \end{eqnarray*}
Thus an iterative scheme can be used where in every successive iteration a better estimate of displacement vector is obtained.The iterations can be terminated when change is displacement vector is below is threshold in successive iterations or specific number of iterations have been completed. The intial estimate of displacement vector is assumed to be $\mathbf{(0,0)}$ Thus ROI or the image patch in the current and previous frames are at the same.
The method $\textbf{EstimateFlow}$ computes the coefficients $A,b_2,b_1$ required for displacentt field computation.The $\textbf{EstimateFlow}$ functions call the method $\textbf{UpdatePoly}$ for each pixel in the image.
The displacement field obtained may be discontinuous and contain noise and other atrifacts.Since it is reasonable to assume that if motion is encounted at a point, the neighborhood pixels also encounter the same motion. The displacement vector can be averaged over a neighborhood to get a better estimate of the displacement field.
The method $\textbf{AverageFlow}$ computes the average of coefficients $A_1,b_2,b_1$ and then computes the displacement flow field. \\
This approach may in case of large displaement.Hence a multi scale estimation is performed. The estimation of flow field is performed as the smallest resolution.The displacement computed at the lower resolution is used as estimate for peform displacement field computation at higher resolution.
Dense optial flow computed for two frames is shown in figure 2b
\setcounter{subfigure}{0}

Frame 1

Frame 2

Optical Flow

Displacement field

Conclusion
This article describes the theory and implementation details of the dense optical flow algorithm based on paper by [4].This code for the algorithm can be found at github repository https://github.com/pi19404/OpenVision in files DenseOf.cpp and DenseOf.hpp files. In the future article we will look at optimizing the code using SSE,NEOM and OpenCL optimizations to enable real time computation of the dense optical flow fields

References

Kenneth Andersson and Hans Knutsson. Continuous normalized convolution.In: ICME (1). IEEE, 2002, pp. 725728. isbn: 0-7803-7304-9. url: http://dblp.uni-trier.de/db/conf/icmcs/icme2002-1.html
Kenneth Andersson, Carl-Fredrik Westin, and Hans Knutsson. Prediction from o-grid samples using continuous normalized convolution. In: Signal Processing 87.3 (Mar. 22, 2007), pp. 353365. url: http://dblp.uni- trier.de/db/journals/sigpro/sigpro87.html
Gunnar Farnebäck. Motion-based Segmentation of Image Sequences. LiTH-ISY-EX-1596. MA thesis. SE-581 83 Linköping, Sweden: Linköping University, 1996.
Gunnar Farnebäck. Polynomial Expansion for Orientation and Motion Estimation. Dissertation No 790, ISBN 91-7373-475-6. PhD thesis. SE-581 83 Linköping,Sweden: Linköping University, Sweden, 2002.
Gunnar Farneback. Two-Frame Motion Estimation Based on Polynomial Expan-sion. In: SCIA. LNCS 2749. Gothenburg, Sweden, 2003, pp. 363370.