Visual Microphone: Phase-Based Audio Recovery

C++ OpenCV OpenMP

Overview

This repository contains a C++ implementation of the Visual Microphone algorithm. The project explores the passive recovery of acoustic signals by analyzing microscopic vibrations of objects in high-speed video recordings.

Unlike classic computer vision methods based on intensity analysis, such as optical flow, this approach uses local phase analysis with a bank of complex Gabor filters. By the Fourier shift theorem, a spatial shift appears as a phase shift, which lets us detect sub-pixel motions ($10^{-3}$ to $10^{-2}$ pixels) that are invisible to the naked eye.


Methodology

The Physics

Sound waves cause physical objects to vibrate. These vibrations result in minute spatial displacements $\delta(t)$ of the object's surface on the video sensor.

Why Phase?

Traditional intensity-based methods fail when the intensity change caused by $\delta(t)$ falls below the quantization noise level. We move to the frequency domain using the Fourier shift theorem:

$$\mathcal{F}\{V(x - \delta(t))\} = \hat{V}(\omega) e^{-j \omega \delta(t)}$$
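The theorem can be checked numerically on a short 1D signal. The sketch below is illustrative only and is not taken from this repository: it uses a naive DFT and a circular shift by whole samples, and the helper names (`dft`, `circularShift`, `shiftTheoremError`) are hypothetical. It compares the spectrum of the shifted signal against the original spectrum multiplied by $e^{-j \omega \delta}$.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) DFT; fine for a short illustrative signal.
std::vector<Complex> dft(const std::vector<double>& v) {
    const std::size_t n = v.size();
    const double pi = std::acos(-1.0);
    std::vector<Complex> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex acc(0.0, 0.0);
        for (std::size_t x = 0; x < n; ++x) {
            const double angle =
                -2.0 * pi * static_cast<double>(k * x) / static_cast<double>(n);
            acc += v[x] * std::polar(1.0, angle);
        }
        out[k] = acc;
    }
    return out;
}

// Circular shift by d whole samples: out[x] = v[(x - d) mod n].
std::vector<double> circularShift(const std::vector<double>& v, std::size_t d) {
    const std::size_t n = v.size();
    std::vector<double> out(n);
    for (std::size_t x = 0; x < n; ++x) {
        out[(x + d) % n] = v[x];
    }
    return out;
}

// Largest |DFT(shifted)[k] - DFT(v)[k] * exp(-j * omega_k * d)| over all bins,
// with omega_k = 2 * pi * k / n.
double shiftTheoremError(const std::vector<double>& v, std::size_t d) {
    const std::size_t n = v.size();
    const double pi = std::acos(-1.0);
    const std::vector<Complex> base = dft(v);
    const std::vector<Complex> moved = dft(circularShift(v, d));
    double worst = 0.0;
    for (std::size_t k = 0; k < n; ++k) {
        const double omega = 2.0 * pi * static_cast<double>(k) / static_cast<double>(n);
        const Complex predicted =
            base[k] * std::polar(1.0, -omega * static_cast<double>(d));
        worst = std::max(worst, std::abs(moved[k] - predicted));
    }
    return worst;
}
```

For any input, `shiftTheoremError` should stay at the level of floating-point rounding, which is the discrete counterpart of the identity above.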

A global translation thus becomes a phase shift that is linear in $\omega$. Because real motion is local, we measure it with a bank of complex Gabor filters:

$$G_{s,\theta}(x, y) = \exp\left(-\frac{x'^2 + y'^2}{2\sigma_s^2}\right) \cdot \exp\left(j (2\pi f_s x' + \psi)\right)$$
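As a rough illustration, one such kernel can be sampled on a square grid as follows. This is a sketch under stated assumptions, not the repository's code: it assumes the usual rotated coordinates $x' = x\cos\theta + y\sin\theta$, $y' = -x\sin\theta + y\cos\theta$ (the README does not define them), and the `GaborKernel` struct and `makeGaborKernel` function are hypothetical names.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical container for one sampled kernel (not from the repository).
struct GaborKernel {
    int radius;                              // grid spans [-radius, radius] on both axes
    std::vector<std::complex<double>> taps;  // row-major, (2 * radius + 1)^2 entries

    std::complex<double> at(int x, int y) const {
        const int side = 2 * radius + 1;
        return taps[static_cast<std::size_t>((y + radius) * side + (x + radius))];
    }
};

// Samples G(x, y) = exp(-(x'^2 + y'^2) / (2 sigma^2)) * exp(j (2 pi f x' + psi)).
// Assumption: x' and y' are the coordinates rotated by theta.
GaborKernel makeGaborKernel(int radius, double sigma, double freq,
                            double theta, double psi) {
    GaborKernel kernel{radius, {}};
    const int side = 2 * radius + 1;
    kernel.taps.reserve(static_cast<std::size_t>(side) * static_cast<std::size_t>(side));
    const double pi = std::acos(-1.0);
    const double c = std::cos(theta);
    const double s = std::sin(theta);
    for (int y = -radius; y <= radius; ++y) {
        for (int x = -radius; x <= radius; ++x) {
            const double xr = x * c + y * s;
            const double yr = -x * s + y * c;
            const double envelope =
                std::exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma));
            const std::complex<double> carrier =
                std::polar(1.0, 2.0 * pi * freq * xr + psi);
            kernel.taps.push_back(envelope * carrier);
        }
    }
    return kernel;
}
```

A filter bank would call this once per scale and orientation. The real and imaginary parts form a quadrature pair, which is what makes the local phase of the response well defined.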

The local phase $\varphi(t)$ of the filter response is linearly related to the object's displacement: $\varphi(t) \approx \varphi(0) - \omega_0 \delta(t)$, where $\omega_0 = 2\pi f_s$ is the filter's center frequency.
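The relation can be tried out in 1D. The sketch below is a toy example rather than the repository's pipeline: it builds a synthetic sinusoidal texture shifted by a sub-pixel amount, correlates it with the conjugate of a 1D complex Gabor filter, and converts the phase difference back to a displacement. All function names and the test signal are assumptions made for illustration.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Synthetic 1D "frame": a sinusoidal texture displaced by delta pixels.
std::vector<double> makeShiftedFrame(std::size_t n, double omega0, double delta) {
    std::vector<double> frame(n);
    for (std::size_t x = 0; x < n; ++x) {
        frame[x] = std::cos(omega0 * (static_cast<double>(x) - delta) + 0.3);
    }
    return frame;
}

// Response of a 1D complex Gabor filter centred on `center`.
// The frame is correlated with the conjugate of the filter, so a shift of
// +delta lowers the response phase by omega0 * delta.
std::complex<double> gaborResponse(const std::vector<double>& frame, int center,
                                   double sigma, double omega0) {
    const int radius = static_cast<int>(std::ceil(5.0 * sigma));
    std::complex<double> acc(0.0, 0.0);
    for (int dx = -radius; dx <= radius; ++dx) {
        const int x = center + dx;
        if (x < 0 || x >= static_cast<int>(frame.size())) continue;
        const double envelope = std::exp(-(dx * dx) / (2.0 * sigma * sigma));
        acc += frame[static_cast<std::size_t>(x)] * envelope *
               std::polar(1.0, -omega0 * dx);
    }
    return acc;
}

// Inverts phi(t) - phi(0) = -omega0 * delta(t). Taking the argument of
// cur * conj(ref) avoids unwrapping the two phases separately.
double estimateDisplacement(const std::complex<double>& ref,
                            const std::complex<double>& cur, double omega0) {
    const double dphi = std::arg(cur * std::conj(ref));
    return -dphi / omega0;
}
```

With a reference frame at zero shift and a second frame displaced by 0.005 pixels, `estimateDisplacement` recovers the shift closely for this clean signal. Real footage contains noise and broadband texture, so the estimate is only approximate there.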


Results

Experimental setup: high-speed camera (2200 FPS), foil bag target, distance 2 m.

1. Phase vs. Intensity

Comparison of raw pixel intensity analysis versus our Gabor phase demodulation method. The phase signal clearly reconstructs the acoustic wave, while the intensity signal is lost in quantization noise.

[Figure: Phase vs. Intensity]

2. Signal Localization (Heatmap)

The algorithm automatically gives more weight to regions with high texture contrast. Red areas contribute most to the recovered sound.
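One common way to obtain such a weighting in phase-based motion analysis is to weight each pixel's phase change by the squared amplitude of its filter response, since the phase is only reliable where the response is strong. The README does not state the exact weighting used here, so the sketch below is an assumption, and `weightedPhaseChange` is a hypothetical name.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

// Amplitude-weighted mean phase change between a reference frame and the
// current frame, given per-pixel complex filter responses.
// Assumption: weight = squared amplitude of the reference response.
double weightedPhaseChange(const std::vector<std::complex<double>>& ref,
                           const std::vector<std::complex<double>>& cur) {
    const std::size_t n = std::min(ref.size(), cur.size());
    double weightedSum = 0.0;
    double weightTotal = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double weight = std::norm(ref[i]);  // |response|^2
        const double dphi = std::arg(cur[i] * std::conj(ref[i]));
        weightedSum += weight * dphi;
        weightTotal += weight;
    }
    // Flat regions have near-zero amplitude and contribute almost nothing.
    return weightTotal > 0.0 ? weightedSum / weightTotal : 0.0;
}
```

Under this assumption, the per-pixel weights are what a contribution heatmap would visualize.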

[Figure: Signal Heatmap]

3. Recovered Spectrogram

Spectrogram of a simple melody recovered from the video. Harmonic structures are clearly visible.

[Figure: Spectrogram]


Installation & Build

Prerequisites

  • C++ Compiler (supporting C++17)
  • CMake (>= 3.10)
  • OpenCV (4.x)
  • OpenMP (bundled with GCC; Clang typically needs libomp installed separately)

On macOS with Homebrew-installed OpenMP, you may need additional compiler flags (see CMakeLists.txt).

Manual Compilation (Linux/macOS)


Usage

Run the executable with the input video path. Optional arguments include output filename and sampling rate.

Example:

The program will:

  1. Analyze the video using parallel batch processing.
  2. Extract phase variations across multiple scales and orientations.
  3. Reconstruct the audio signal.
  4. Apply a high-pass Butterworth filter.
  5. Save the result to .wav.
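The high-pass step can be sketched with a second-order Butterworth section. The filter order and cutoff used by this project are not stated in the README, so the code below is an assumption: it uses the standard bilinear-transform biquad with Q = 1/sqrt(2), and the names `Biquad`, `butterworthHighPass`, and `applyBiquad` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normalised biquad coefficients (a0 == 1).
struct Biquad {
    double b0, b1, b2, a1, a2;
};

// Second-order Butterworth high-pass via the bilinear transform.
// Assumption: order 2; the project's actual order and cutoff are not documented.
Biquad butterworthHighPass(double cutoffHz, double sampleRateHz) {
    const double pi = std::acos(-1.0);
    const double w0 = 2.0 * pi * cutoffHz / sampleRateHz;
    const double cosw = std::cos(w0);
    const double alpha = std::sin(w0) / std::sqrt(2.0);  // sin(w0) / (2Q), Q = 1/sqrt(2)
    const double a0 = 1.0 + alpha;
    return Biquad{
        (1.0 + cosw) / 2.0 / a0,
        -(1.0 + cosw) / a0,
        (1.0 + cosw) / 2.0 / a0,
        -2.0 * cosw / a0,
        (1.0 - alpha) / a0,
    };
}

// Direct form I:
// y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]
std::vector<double> applyBiquad(const Biquad& f, const std::vector<double>& in) {
    std::vector<double> out(in.size());
    double x1 = 0.0, x2 = 0.0, y1 = 0.0, y2 = 0.0;
    for (std::size_t n = 0; n < in.size(); ++n) {
        const double y = f.b0 * in[n] + f.b1 * x1 + f.b2 * x2 - f.a1 * y1 - f.a2 * y2;
        x2 = x1;
        x1 = in[n];
        y2 = y1;
        y1 = y;
        out[n] = y;
    }
    return out;
}
```

For a 2200 FPS recording the audio sample rate is 2200 Hz, so a call such as `butterworthHighPass(20.0, 2200.0)` (the 20 Hz cutoff is illustrative) removes DC offset and slow drift while leaving frequencies near the Nyquist limit essentially untouched.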

Authors

Saint Petersburg Electrotechnical University "LETI"
Faculty of Computer Science and Technology