Liz O'Gorman PhD Candidate in Neurobiology and MS Student in ECE at Duke University working with Dr. John Pearson on real-time adaptive experimental methods and deep learning.

# Decomposing Fourier transforms — an introduction to time-frequency decomposition

## The beauty of the Fourier series and Fourier transform

(Picture by Wonders of Physics, Public Domain)

The Fourier transform is one of many influential and revolutionary mathematics developed and discovered in the 19th century. In the 1800s, Jean-Baptiste Joseph Fourier claimed that perhaps any arbitrary function of a variable, whether continuous or discontinuous, can be expressed as a sum of sines and cosines. That is, perhaps an arbitrary time-dependent (or even space-dependent) signal $$x (t)$$ can be expressed as a superposition, i.e., linear combination, of sine and cosine functions. This expression of a function as a sum of infintely many sines and cosines is the Fourier series

Today, the Fourier transform is one of the most important algorithms in engineering, mathematics, physics, statistics, signal processing, and image and video processing. It is particularly important and useful for analyzing electrophysiological data like EEG, iEEG or ECoG, LFP, and MEG recordings.

The Fourier transform is valuable for two reasons:

1. It is a way to determine which frequencies are present in some temporal signal, or which wave lengths are present in some spatial pattern.
2. It is a straightforward way to solve many mathematical problems involving constant coefficient differential equations.

But, most importantly, it transforms signal from the time (or spatial) domain to the frequency domain and back!

The main result of the Fourier transform are Fourier coefficients used to compute a spectrum of power at various frequencies present in the signal: the power spectrum measures how strong is the contribution of frequencies present in the signal.

## Time domain $$\leftrightarrow$$ frequency domain

Consider an arbitrary function or signal varying over time.

The time domain is each data point in time. The data is represented as a series of evenly spaced samples over time, collected at some sampling frequency. Neural time series is represented by its amplitude at each time point.

The frequency domain is sine functions for each frequency. The data in this domain is represented as waves with particular strengths and phases at different frequencies. This can be used to reconstruct the signal.

Below shows the relationship between the time domain and the frequency domain of a function based on its Fourier transform.

The Fourier transform takes an input function $$\color{red}f$$ in the “time domain” and converts it into a new function $$\color{blue}\hat{f}$$ in the “frequency domain”.

$\colorbox{white}{\color{blue}{a_n \text{cos} (nx) + b_n \text{sin}(nx)}}$

In other words, the original function can be thought of as “amplitude given time”, and the Fourier transform of the function is “amplitude given frequency”.

This animation shows a 6-component approximation of the square wave decomposed into 6 sine waves. These component frequencies show as sharp peaks in the frequency domain of the function.

(GIF by Lucas Vieira, 23 February 2013. Public Domain, via Wikimedia Commons)

### A bit of math…the Fourier series and Euler’s formula

#### The Fourier Series

If $$\omega = 2 \pi / T = 2 \pi f$$ is the angular frequency $$\omega$$ corresponding to the period $$T$$ and frequency $$f = 1/T$$, then the function $$x(t)$$ can be written as a convergent infinite sum of sine and cosine functions each of period $$T$$:

\begin{aligned} x(t) &= \frac{1}{2} a_0 + \begin{align} &a_1 \text{cos}(\omega t) + a_2 \text{cos}(2 \omega t) + · · · \\ &+ b_1 \text{sin}(\omega t) + b_2 \text{sin}(2 \omega t) + · · · \\ \end{align} \\ &= a_0 + \sum_{n=1}^{\infty} (a_n \text{cos}(n \omega t) + b_n \text{sin}(n \omega t)) \end{aligned}

where $$a_n$$ and $$b_n$$ are the Fourier coefficients. Alternatively, the Fourier series can be written in terms of complex exponentials.

#### Euler’s Formula

Many mathematicians and scientists have expressed that Euler’s formula and Euler’s identity is the “most beautiful equation”. It is defined as

$\exp^{ix} = \text{cos}(x) + i \ \text{sin}(x),$

where $$x$$, or alternatively $$\phi$$, is a real number and $$i = \sqrt{-1}$$ is one of the two square roots of $$-1$$.

Below in the visualization, $$\exp^{i \phi}$$ is the unit circle in the complex plane for a real number $$\phi$$. Here $$\phi$$ is the angle between a line connecting the origin with a point on the unit circle makes and the positive real axis. The positive real axis is the $$\text{cos} \phi$$ axis. Correspondingly, the real Fourier coefficients are the coefficients of the cosine.

#### Visually…

(Image by Gunther, 29 May 2006. CC BY-SA 4.0, via Wikimedia Commons)

Skipping some math…a continuous signal of interest can be expressed as the following:

$x(t) = \sum_{-\infty}^{\infty} c_n \exp^{i n \omega t}$

where $$c_n = a_n + i b_n$$ are the complex-valued Fourier coefficients.

Now, the power spectrum $$P_n$$ is the magnitude squared of the complex Fourier coefficients such that

$P_n = {\lvert c_{n} \rvert}^2 = c_n c_{n}^* = a_{n}^{2} + b_{n}^{2}$

for $$n = 0, 1, 2, \cdots .$$ which corresponds to the frequency $$f_n = n/T$$ or angular frequency $$\omega n = 2 \pi \omega /T$$. The power spectrum is plotted as a function of frequency or angular frequency, plotting $$P(f_n) = P_n$$.

### A bit of plots

All code below is written in python and adapted from Lyndon Duong’s GitHub repository: Python implementations of code in Analyzing Neural Time Series by Dr. Mike X. Cohen and follows Mike X. Cohen’s book, Analyzing Neural Time Series Data: Theory and Practice (2014, MIT Press). These tutorials (originally written in MATLAB) are further explained and found in Dr. Mike X. Cohen’s data analysis lecturelets and on his YouTube channel.

The only dependencies needed for the following code are:

1
2
3
4
import matplotlib
from matplotlib import pyplot as plt
import numpy as np
import scipy.io


I customize the style of my plots by updating the “runtime configuration parameters”, matplotlib’s rcParams.

1
2
3
4
5
6
7
8
9
10
11
12
matplotlib.rcParams.update(matplotlib.rcParamsDefault)
plt.rcParams.update({'font.family' : 'Arial',
'lines.linewidth': 1,
'figure.dpi': 150,
'savefig.dpi': 500,
'axes.spines.right': False,
'axes.spines.top': False,
'axes.edgecolor': 'k',
'axes.linewidth': 1,
'axes.grid': False,
'axes.autolimit_mode': 'data',
'savefig.format': 'svg'})


#### Multiple sines of different frequencies can be combined to make a more complex signal

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
# Figure 11.2
srate = 500. #sampling rate in Hz

#create arrays of frequencies, amplitudes, and phases to plot
frex = np.array([3, 10, 5 ,15, 35])
amplit = np.array([5, 15, 10, 5, 7])
phases = np.pi*np.array([1/7., 1/8., 1., 1/2., -1/4.])
time = np.arange(-1,1 +1/srate,1/srate)

sine_waves = np.zeros([len(frex), len(time)])
for fi in range(len(frex)):
sine_waves[fi,:] = amplit[fi] * np.sin(2*np.pi*frex[fi]*time+phases[fi])

#plot each sine wave individually
fig = plt.figure()
fig.supylabel("Amplitude (a.u.)")

for fi in range(len(frex)):
plt.subplot(len(frex), 2, 2*(fi+1)-1)
plt.plot(sine_waves[fi,:], linewidth=1, color='b')
if fi == 0:
plt.title("Individual Sines")
if fi != len(frex)-1:
ax = plt.gca()
ax.axes.xaxis.set_ticklabels([])
elif fi == len(frex)-1:
ax = plt.gca()
ax.set_xlabel("Time")

#plot the sum of all sum waves
plt.subplot(1,2,2)
plt.plot(np.sum(sine_waves,axis=0), color='r')
plt.tight_layout()
_=plt.title("Sum of Sines")
_=plt.xlabel("Time")


1
2
3
4
5
6
7
# Figure 11.3
plt.figure()
noise = 5 * np.random.randn(np.shape(sine_waves)[1])
plt.plot( np.sum(sine_waves,axis=0) + noise, color='r')
_=plt.title("Sum of Sines+White Noise")
plt.xlabel("Time")
plt.ylabel("Amplitude (a.u.)") ;


#### Adding two signals preserves information of both signals

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
# Figure 11.4
time = np.arange(-1, 1+1/srate, 1/srate)

#create three sine waves
s1 = np.sin(2*np.pi*3*time)
s2 = 0.5 * np.sin(2*np.pi*8*time)
s3 = s1 + s2

s_list = [s1, s2, s3]  # throw them into a list to analyze and plot

#plot the sine waves
fig = plt.figure(constrained_layout=True)

(subfig1, subfig2) = fig.subfigures(2, 1) # create 2 subfigures, total 2x3
ax1 = subfig1.subplots(1, 3, sharex='col', sharey='row') # create 1x3 subplots on subfig1
ax2 = subfig2.subplots(1, 3, sharex='col', sharey='row') # create 1x3 subplots on subfig2

for i in range(3):
ax1[i].plot(time, s_list[i], color='r')
ax1[i].axis([0, 1, -1.6, 1.6])
ax1[i].set_yticks(np.arange(-1.5, 2, .5))

#numpy implementation of the fft
f = np.fft.fft(s_list[i])/float(len(time))
hz = np.linspace(0, srate/2., int(np.floor(len(time)/2.)+1)) # we only have resolution up to SR/2 (Nyquist theorem)
ax2[i].bar(hz,np.absolute(f[:len(hz)]*2), color='b')
ax2[i].axis([0, 11, 0, 1.2])
ax2[i].set_xticks(np.arange(0, 11))

subfig1.suptitle("Time Domain")
subfig1.supxlabel("Time")
subfig1.supylabel("Amplitude (a.u.)")

subfig2.suptitle("Frequency Domain")
subfig2.supxlabel("Frequency (Hz)")
subfig2.supylabel("Power (a.u.)") ;


Here, two sines with different amplitudes and frequencies (from left to right: $$1$$ and $$0.5$$, and $$3$$ and $$8$$, respectively) are added together in the time domain. The relative “strength” of the frequencies present in the signal corresponding to sum of the two sines is resolved in the frequency domain.

#### Putting it all together: from signal to its power and phase spectrum

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
# Figure 11.5
from mpl_toolkits.mplot3d import Axes3D
N = 10 #length of sequence
og_data = np.random.randn(N) #create random numbers, sampled from normal distribution
srate = 200 #sampling rate in Hz
nyquist = srate/2 #Nyquist frequency -- highest frequency you can measure the data

#initialize matrix for Fourier output

frequencies = np.linspace(0, nyquist, N//2+1)
time = np.arange(N)/float(N)

#Fourier transform is dot product between sine wave and data at each frequency
fourier = np.zeros(N)*1j #create complex matrix
for fi in range(N):
sine_wave = np.exp(-1j *2 *np.pi*fi*time)
fourier[fi] = np.sum(sine_wave*og_data)

fourier_og = fourier/float(N)

fig = plt.figure(figsize=(8,8))

plt.subplot(221)
plt.plot(og_data,"-ro",)
plt.xlim([0,N+1])
plt.ylabel('Amplitude (a.u.)')
plt.xlabel('Time Point')
plt.title("Time Domain Representation of Data")

ax.plot(frequencies, np.angle(fourier_og[:N//2+1]), zs=np.absolute(fourier_og[:N//2+1])**2)
ax.view_init(20, 20)
ax.set_xlabel('Frequency (Hz)')
ax.set_ylabel('Phase')
ax.zaxis.set_rotate_label(False)
ax.set_zlabel('Power (a.u.)', rotation=90)
plt.title("3D representation of the Fourier transform")

plt.subplot(223)
plt.stem(frequencies, np.absolute(fourier_og[:N//2+1])**2, use_line_collection=True)
plt.xlim([-5, 105])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Power (a.u.)")
plt.title("Power Spectrum")

plt.subplot(224)
plt.stem(frequencies, np.angle(fourier_og[:N//2+1]), use_line_collection=True)
plt.xlim([-5, 105])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Phase Angle")
plt.yticks(np.arange(-np.pi, np.pi, np.pi/2.))

plt.title("Phase Spectrum")

plt.tight_layout()

1
2
3
4
/var/folders/1q/g9xwc9k128101dg5z58dbhm40000gn/T/ipykernel_22564/3676337223.py:40: MatplotlibDeprecationWarning: The 'use_line_collection' parameter of stem() was deprecated in Matplotlib 3.6 and will be removed two minor releases later. If any parameter follows 'use_line_collection', they should be passed as keyword, not positionally.
plt.stem(frequencies, np.absolute(fourier_og[:N//2+1])**2, use_line_collection=True)
/var/folders/1q/g9xwc9k128101dg5z58dbhm40000gn/T/ipykernel_22564/3676337223.py:47: MatplotlibDeprecationWarning: The 'use_line_collection' parameter of stem() was deprecated in Matplotlib 3.6 and will be removed two minor releases later. If any parameter follows 'use_line_collection', they should be passed as keyword, not positionally.
plt.stem(frequencies, np.angle(fourier_og[:N//2+1]), use_line_collection=True)


Note: the phase spectrum is derived from the coefficient of the sine, i.e., the argument of the complex number.

## The Discrete Fourier Transform (DFT)

(The following is adapted from blog posts written by Stuart Riffle and David Smith.)

The Discrete Fourier Transform is applied to signal of finite duration discretized at some sampling frequency.

It is defined as

$\color{purple}{X}_{\color{lime}{k}} = \color{magenta}{\sum_{n=0}^{N-1}} \color{cyan}{x_{n}} \color{red}{e^{-{i} \color{orange}{2 \pi} \color{lime}{k} \color{magenta}{\frac{n}{N}}}} \color{black} .$

Where,

• $$\color{purple} X$$ is amplitude or energy
• $$\color{lime}{k}$$ is frequency
• $$\color{cyan}{x_{n}}$$ is the signal
• $$\color{red}{e^-i}$$ comes from Euler’s formula
• One can think of this complex exponential as rotation of a point along the unit circle starting from (0, 1)…backwards, hence, the negative.
• $$\color{orange}{2 \pi}$$ is the circumfrence of the unit circle
• One can think of this as rotating a point along the full unit circle.
• $$\color{magenta}{\sum\limits_{n=0}^{N-1}}$$ and $$\color{magenta}{\frac{n}{N}}$$ is the sum and average of all points in the signal

As Stuart Riffle and David Smith state:

“To find the energy at a particular frequency, spin the signal around a circle at that frequency, and average a bunch of points along that path.”

### The algorithm

The signal $$x_{n}$$ is convolved (shout out to Pranjal Gupta, check out his post on convolutions) with a complex exponential $$\color{red}{e^{-i}} \color{orange}{2 \pi} \color{lime}{k}$$ for multiple frequencies $$k$$ over multiple time points $$n$$. In other words, the DFT algorithm is to compute dot products of the signal and complex exponentials.

This is most efficiently computed using Fast Fourier Transform (FFT) algorithms.

## Important Considerations and Issues

There are a few considerations and issues of which to be aware when using Fourier transforms to analyze real data.

One must consider sampling frequency (often constrained by hardware and software), signal properties, and appropriate parameters to set given data.

### Aliasing

Aliasing is frequency ambiguity in a power spectrum as a result of discretely sampling a continuous waveform.

The maximum sampling frequency $$f_{\max}$$ is called the Nyquist frequency given by

$f_{\max} = \frac{1}{2 \Delta t} = \frac{1}{2 f_s}.$

Frequencies above the Nyquist frequency do not disappear. They reappear as peaks in the power spectrum at lower frequencies in the range $$0 < f < f_{\max} = 1/(2 \Delta t)$$.

(GIF by Omnicron11, 7 April 2021. CC BY-SA 4.0, via Wikimedia Commons)

The upper left animation depicts sines. Each successive sine has a higher frequency than the previous. “True” signals are being sampled (dots) at a constant frequency $$\color{black}{f_s}$$.

The upper right animation shows the continuous Fourier transform of the sine. The single non-zero component, the actual frequency, means there is no ambiguity.

The lower right animation shows the discrete Fourier transform of just the available samples. The presence of two components means the samples can fit at least two different sines, one of which is the true frequency (the upper right animation).

The lower left animation uses the same samples and default reconstruction algorithm to produce lower-frequency sines.

### Stationarity

A stationary signal is a signal that has constant statistical properties, e.g., mean, variance, etc., over time.

Neural data is most definitely not stationary!

The power spectrum of a non-stationary signal is less “sharp”: it is more difficult to resolve what frequencies are present with what power at what time.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
## Effects of non-stationarity

# Figure 11.9
#create array of frequencies, amplitudes, and phases
frex = np.array([3,10,5,7])
amplit = np.array([5,15,10,5])
phases = np.pi*np.array([1/7.,1/8.,1,1/2.])

#create a time series of sequenced sine waves

srate = 500.
time = np.arange(-1,1+1/srate,1/srate)
stationary = np.zeros(len(time)*len(frex))
nonstationary = np.zeros(len(time)*len(frex))

for fi in range(len(frex)):
#compute sine wave
temp_sine_wave = amplit[fi] * np.sin(2*np.pi*frex[fi]*time+phases[fi])

#enter into stationary time series
stationary = stationary + np.tile(temp_sine_wave,(1,len(frex)))

#optional change of amplitude over time
temp_sine_wave *= time+1

#start and stop indices for insertion of sine wave
start_idx = fi * len(time)
stop_idx = fi *len(time) + len(time)

#enter into non-stationary time series
nonstationary[start_idx:stop_idx] = temp_sine_wave

plt.figure()
plt.subplot(221)
plt.plot(stationary[0], color='olive')
plt.axis([0,stationary.shape[1],-30,30])
plt.title("Stationary Signal")
_,xticks=plt.xticks(np.arange(0,4000,500))
plt.setp(xticks,rotation=-45)
plt.xlabel("Time")
plt.ylabel("Amplitude (a.u.)")

plt.subplot(222)
plt.plot(nonstationary, color='m')
plt.xlim([1,len(nonstationary)])
plt.title("Non-stationary Signal")
_,xticks=plt.xticks(np.arange(0,4000,500))
plt.setp(xticks,rotation=-45)
plt.xlabel("Time")

frequencies = np.linspace(0,srate/2, len(nonstationary)//2+1)
fft_nonstationary = np.fft.fft(nonstationary)/len(nonstationary)
fft_stationary = np.fft.fft(stationary[0])/stationary.shape[1]

plt.subplot(212)
plt.plot(frequencies,np.absolute(fft_stationary[:len(frequencies)]*2),'olive')
plt.plot(frequencies,np.absolute(fft_nonstationary[:len(frequencies)]*2), 'm')
plt.xlim([0,np.max(frex)*2])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Power (a.u.)")
plt.legend(["Power stationary","Power non-stationary"])

plt.tight_layout()


Adding zeros to the end of the signal adds more points in the time domain. This increases the frequency resolution, however, no new information is added.

Often one zero-pads to make the length of the signal (in terms of points) a power of two number. FFT algorithms work best, i.e., are most efficient, when the length of the signal or window is a power of two.

## The Continuous Fourier Transform

The continuous Fourier transform takes a continuous input function $$\color{red}{f(x)}$$ in the time domain and turns it into a new function $$\color{blue}{\hat{f}(\xi)}$$ in the frequency domain.

In the first part of the animation, the Fourier transform (as defined above) is applied to the rectangular function, rect, returning the normalized sinc function, sinc.

Note: Mathematical Definitions of rect and sinc: The rect function is defined as $$\text{rect} \ x = \Pi(x) = \left\{ \begin{array}{ll} 0, & \text{if} \lvert x \rvert > \frac{1}{2} \\ \frac{1}{2}, & \text{if} \lvert x \rvert = \frac{1}{2} \\ 1, & \text{if} \lvert x \rvert < \frac{1}{2}.\\ \end{array} \right.$$
The normalized sinc function is defined as $$\text{sinc} \ x = \frac{\text{sin} \pi x}{\pi x}.$$

In the second part, the transform is reapplied to the normalized sinc function to get back the rectangular function.

It takes four iterations of the Fourier transform to get back to the original function.

In this example, no imaginary, i.e., sine components, are displayed. Only the real, i.e., cosine components, are displayed.

Overlaid on the time domain curve, is a changing curve. This is the approximation using the components extracted from the frequency domain “found” so far, i.e., the cosines sweeping the surface.

The approximation is calculated by adding all the components, integrating along the entire surface, with the appropriate amplitude correction due to the specific Fourier transform and ranges used:

$\colorbox{white}{\color{blue}{\hat{f} (\xi) = {\int_{-\infty}^{\infty} f(x) \ {e^{ -{2 \pi} i x \xi}} \ dx}}}$

## The Short-time Fourier Transform (STFT)

The short-time Fourier transform is the Fourier transform computed over short time windows.

In short:

1. The time series is broken up in to multiple segments. Each segment is nperseg samples long.
2. The segments overlap by noverlap samples. This is the number of points to overlap between segments. The time resolution is then nperseg - noverlap, i.e, the time distance between neighboring segments. It’s often called the “hop size”.
3. Segments are multiplied with window. Window and segment must have the same length.
4. An FFT is performed over the windowed segment. The length of the FFT must be equal or longer than that of the segment (otherwise you would be truncating samples). If the FFT length is longer, the data will be zero-padded. The frequency resolution is the sample rate divided by the FFT length (not the segment length).

Overlap (or hop size) and FFT length allows changing the time and frequency resolution independently from each other and from the segment size.

The “standard” way of doing this is to use an FFT length that’s equal to the segment length and an overlap of 50%.

When the window size is large, more wide, the frequency over time is enhanced for accurate measurement.

• Longer length, high frequency resolution, low time resolution

When the window size is small, more narrow, the signal time changes are enhanced for accurate measurement.

• Shorter length, good time resolution, poor frequency resolution

## Sample EEG Data

How does sample EEG data, and the power spectral density and spectrogram of sample EEG data look?

### Main problem: EEG data = many overlapping noisy signals

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
from mpl_toolkits.axes_grid1 import make_axes_locatable
from scipy.signal import stft
from scipy.signal import welch

eegdata=np.squeeze(data["EEG"][0,0]["data"][46,:,0])
srate = float(data["EEG"][0,0]["srate"][0,0])

fig = plt.figure(figsize=(8,3))
plt.subplot(1,3,1)
plt.plot(eegdata)
_=plt.title("Raw EEG Data")
_=plt.xlabel("Time")
_=plt.ylabel("Amplitude (a.u.)")

plt.subplot(1,3,2)
freqs, psd = welch(eegdata)
plt.plot(freqs, np.log(psd))
_=plt.title("Power Spectral Density")
_=plt.xlabel("Frequency (Hz)")
_=plt.ylabel("Power (a.u.)")

f, t, Sxx = stft(eegdata, fs=srate, scaling='psd')
ax = plt.subplot(1,3,3)
im = plt.pcolormesh(t, f, np.abs(Sxx))
plt.title("Spectrogram")
plt.xlabel("Time")
plt.ylabel("Frequency (Hz)")
divider = make_axes_locatable(ax)
cbar = plt.colorbar(im, cax=cax)
cbar.set_label('Power (a.u.)')

plt.tight_layout()


## Sample Audio Data

How does sample audio data, and the power spectral density and spectrogram of sample audio data look?

### Main problem: audio data = many overlapping noisy signals and very non-stationary

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
from scipy.io import wavfile
from scipy.io.wavfile import WavFileWarning
import warnings

audio_file = 'birdname_130519_113030.1.wav'

with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=WavFileWarning)

dur = len(audio)/fs
time = np.arange(0,dur,1/fs)

fig = plt.figure(figsize=(8,3))
plt.subplot(1,3,1)
plt.plot(time, audio)
_=plt.title("Audio Data")
_=plt.xlabel("Time (s)")
_=plt.ylabel("Amplitude (a.u.)")

plt.subplot(1,3,2)
# Pxx, freqs, line = plt.psd(audio, NFFT=512, Fs=fs/1000, noverlap=256, return_line=True) ;
freqs, psd = welch(audio, fs=fs, nperseg=512, noverlap=256)
# plt.plot(freqs/1000, np.sqrt(psd))
plt.semilogy(freqs/1000, np.sqrt(psd))
_=plt.title("Power Spectral Density")
_=plt.xlabel("Frequency (kHz)")
_=plt.ylabel("Power (a.u.)")

f, t, spec = stft(audio, fs=fs, nperseg=512, noverlap=256)
ax = plt.subplot(1,3,3)
im = plt.pcolormesh(t, f/1000, np.sqrt(np.abs(spec)))
# spectrum, freqs, t, im = plt.specgram(audio, NFFT=512, Fs=fs, noverlap=256, mode='psd')
plt.title("Spectrogram")
plt.xlabel("Time")
plt.ylabel("Frequency (kHz)")
divider = make_axes_locatable(ax)
cbar = plt.colorbar(im, cax=cax)
cbar.set_label('Power (a.u.)')

plt.tight_layout()


## The Inverse Fourier Transform (IFT)

The inverse Fourier transforms signal from the frequency domain to the time domain…literally the inverse of the Fourier transform!

Similar to the discrete Fourier transform, it is defined as

$\color{cyan}{x_{n}} = \color{magenta}{\frac{1}{N} \sum_{n=0}^{N-1}} \color{purple}{X}_{\color{lime}{k}} \color{red}{e^i} \color{orange}{2 \pi} \color{lime}{k} \color{magenta}{\frac{n}{N}}$

Notably, the Fourier transform and inverse Fourier transform are lossless, that is, no information is lost when transforming signal from the time to frequency domain and back!

Here, $$\color{red}{i}$$ in the complex exponential $$\color{red}{e^i}$$ is positive, as opposed to negative, as seen in the definition of the discrete Fourier transform.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# Figure 11.6
N = 10 #length of sequence
time = np.arange(N)/float(N)
reconstructed_data = np.zeros(N)

for fi in range(N):
#scale sine wave by fourier coefficient
sine_wave = fourier_og[fi] * np.exp(1j *2 * np.pi * fi * time)

#sum the sine waves together, and take only the real part
reconstructed_data += np.real(sine_wave)

plt.figure()
plt.plot(og_data,'ko',linewidth=4)
plt.plot(reconstructed_data,'r-*')
_=plt.legend(["Original Data","Inverse Fourier transform"])
plt.xlabel("Time (ms)")
plt.ylabel("Amplitude (a.u.)") ;


…among lots of other readings…physical and digital, old and new!

(Picture by Wonders of Physics, Public Domain)

The Fourier transform is one of many influential and revolutionary mathematics developed and discovered in the 19th century. In the 1800s, Jean-Baptiste Joseph Fourier claimed that perhaps any arbitrary function of a variable, whether continuous or discontinuous, can be expressed as a sum of sines and cosines. That is, perhaps an arbitrary time-dependent (or even space-dependent) signal $$x (t)$$ can be expressed as a superposition, i.e., linear combination, of sine and cosine functions. This expression of a function as a sum of infintely many sines and cosines is the Fourier series

Today, the Fourier transform is one of the most important algorithms in engineering, mathematics, physics, statistics, signal processing, and image and video processing. It is particularly important and useful for analyzing electrophysiological data like EEG, iEEG or ECoG, LFP, and MEG recordings.

The Fourier transform is valuable for two reasons:

1. It is a way to determine which frequencies are present in some temporal signal, or which wave lengths are present in some spatial pattern.
2. It is a straightforward way to solve many mathematical problems involving constant coefficient differential equations.

But, most importantly, it transforms signal from the time (or spatial) domain to the frequency domain and back!

The main result of the Fourier transform are Fourier coefficients used to compute a spectrum of power at various frequencies present in the signal: the power spectrum measures how strong is the contribution of frequencies present in the signal.

## Time domain $$\leftrightarrow$$ frequency domain

Consider an arbitrary function or signal varying over time.

The time domain is each data point in time. The data is represented as a series of evenly spaced samples over time, collected at some sampling frequency. Neural time series is represented by its amplitude at each time point.

The frequency domain is sine functions for each frequency. The data in this domain is represented as waves with particular strengths and phases at different frequencies. This can be used to reconstruct the signal.

Below shows the relationship between the time domain and the frequency domain of a function based on its Fourier transform.

The Fourier transform takes an input function $$\color{red}f$$ in the “time domain” and converts it into a new function $$\color{blue}\hat{f}$$ in the “frequency domain”.

$\colorbox{white}{\color{blue}{a_n \text{cos} (nx) + b_n \text{sin}(nx)}}$

In other words, the original function can be thought of as “amplitude given time”, and the Fourier transform of the function is “amplitude given frequency”.

This animation shows a 6-component approximation of the square wave decomposed into 6 sine waves. These component frequencies show as sharp peaks in the frequency domain of the function.

(GIF by Lucas Vieira, 23 February 2013. Public Domain, via Wikimedia Commons)

### A bit of math…the Fourier series and Euler’s formula

#### The Fourier Series

If $$\omega = 2 \pi / T = 2 \pi f$$ is the angular frequency $$\omega$$ corresponding to the period $$T$$ and frequency $$f = 1/T$$, then the function $$x(t)$$ can be written as a convergent infinite sum of sine and cosine functions each of period $$T$$:

\begin{aligned} x(t) &= \frac{1}{2} a_0 + \begin{align} &a_1 \text{cos}(\omega t) + a_2 \text{cos}(2 \omega t) + · · · \\ &+ b_1 \text{sin}(\omega t) + b_2 \text{sin}(2 \omega t) + · · · \\ \end{align} \\ &= a_0 + \sum_{n=1}^{\infty} (a_n \text{cos}(n \omega t) + b_n \text{sin}(n \omega t)) \end{aligned}

where $$a_n$$ and $$b_n$$ are the Fourier coefficients. Alternatively, the Fourier series can be written in terms of complex exponentials.

#### Euler’s Formula

Many mathematicians and scientists have expressed that Euler’s formula and Euler’s identity is the “most beautiful equation”. It is defined as

$\exp^{ix} = \text{cos}(x) + i \ \text{sin}(x),$

where $$x$$, or alternatively $$\phi$$, is a real number and $$i = \sqrt{-1}$$ is one of the two square roots of $$-1$$.

Below in the visualization, $$\exp^{i \phi}$$ is the unit circle in the complex plane for a real number $$\phi$$. Here $$\phi$$ is the angle between a line connecting the origin with a point on the unit circle makes and the positive real axis. The positive real axis is the $$\text{cos} \phi$$ axis. Correspondingly, the real Fourier coefficients are the coefficients of the cosine.

#### Visually…

(Image by Gunther, 29 May 2006. CC BY-SA 4.0, via Wikimedia Commons)

Skipping some math…a continuous signal of interest can be expressed as the following:

$x(t) = \sum_{-\infty}^{\infty} c_n \exp^{i n \omega t}$

where $$c_n = a_n + i b_n$$ are the complex-valued Fourier coefficients.

Now, the power spectrum $$P_n$$ is the magnitude squared of the complex Fourier coefficients such that

$P_n = {\lvert c_{n} \rvert}^2 = c_n c_{n}^* = a_{n}^{2} + b_{n}^{2}$

for $$n = 0, 1, 2, \cdots .$$ which corresponds to the frequency $$f_n = n/T$$ or angular frequency $$\omega n = 2 \pi \omega /T$$. The power spectrum is plotted as a function of frequency or angular frequency, plotting $$P(f_n) = P_n$$.

### A bit of plots

All code below is written in python and adapted from Lyndon Duong’s GitHub repository: Python implementations of code in Analyzing Neural Time Series by Dr. Mike X. Cohen and follows Mike X. Cohen’s book, Analyzing Neural Time Series Data: Theory and Practice (2014, MIT Press). These tutorials (originally written in MATLAB) are further explained and found in Dr. Mike X. Cohen’s data analysis lecturelets and on his YouTube channel.

The only dependencies needed for the following code are:

1
2
3
4
import matplotlib
from matplotlib import pyplot as plt
import numpy as np
import scipy.io


I customize the style of my plots by updating the “runtime configuration parameters”, matplotlib’s rcParams.

1
2
3
4
5
6
7
8
9
10
11
12
matplotlib.rcParams.update(matplotlib.rcParamsDefault)
plt.rcParams.update({'font.family' : 'Arial',
'lines.linewidth': 1,
'figure.dpi': 150,
'savefig.dpi': 500,
'axes.spines.right': False,
'axes.spines.top': False,
'axes.edgecolor': 'k',
'axes.linewidth': 1,
'axes.grid': False,
'axes.autolimit_mode': 'data',
'savefig.format': 'svg'})


#### Multiple sines of different frequencies can be combined to make a more complex signal

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
# Figure 11.2
srate = 500. #sampling rate in Hz

#create arrays of frequencies, amplitudes, and phases to plot
frex = np.array([3, 10, 5 ,15, 35])
amplit = np.array([5, 15, 10, 5, 7])
phases = np.pi*np.array([1/7., 1/8., 1., 1/2., -1/4.])
time = np.arange(-1,1 +1/srate,1/srate)

sine_waves = np.zeros([len(frex), len(time)])
for fi in range(len(frex)):
sine_waves[fi,:] = amplit[fi] * np.sin(2*np.pi*frex[fi]*time+phases[fi])

#plot each sine wave individually
fig = plt.figure()
fig.supylabel("Amplitude (a.u.)")

for fi in range(len(frex)):
plt.subplot(len(frex), 2, 2*(fi+1)-1)
plt.plot(sine_waves[fi,:], linewidth=1, color='b')
if fi == 0:
plt.title("Individual Sines")
if fi != len(frex)-1:
ax = plt.gca()
ax.axes.xaxis.set_ticklabels([])
elif fi == len(frex)-1:
ax = plt.gca()
ax.set_xlabel("Time")

#plot the sum of all sum waves
plt.subplot(1,2,2)
plt.plot(np.sum(sine_waves,axis=0), color='r')
plt.tight_layout()
_=plt.title("Sum of Sines")
_=plt.xlabel("Time")


1
2
3
4
5
6
7
# Figure 11.3
plt.figure()
noise = 5 * np.random.randn(np.shape(sine_waves)[1])
plt.plot( np.sum(sine_waves,axis=0) + noise, color='r')
_=plt.title("Sum of Sines+White Noise")
plt.xlabel("Time")
plt.ylabel("Amplitude (a.u.)") ;


#### Adding two signals preserves information of both signals

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
# Figure 11.4
time = np.arange(-1, 1+1/srate, 1/srate)

#create three sine waves
s1 = np.sin(2*np.pi*3*time)
s2 = 0.5 * np.sin(2*np.pi*8*time)
s3 = s1 + s2

s_list = [s1, s2, s3]  # throw them into a list to analyze and plot

#plot the sine waves
fig = plt.figure(constrained_layout=True)

(subfig1, subfig2) = fig.subfigures(2, 1) # create 2 subfigures, total 2x3
ax1 = subfig1.subplots(1, 3, sharex='col', sharey='row') # create 1x3 subplots on subfig1
ax2 = subfig2.subplots(1, 3, sharex='col', sharey='row') # create 1x3 subplots on subfig2

for i in range(3):
ax1[i].plot(time, s_list[i], color='r')
ax1[i].axis([0, 1, -1.6, 1.6])
ax1[i].set_yticks(np.arange(-1.5, 2, .5))

#numpy implementation of the fft
f = np.fft.fft(s_list[i])/float(len(time))
hz = np.linspace(0, srate/2., int(np.floor(len(time)/2.)+1)) # we only have resolution up to SR/2 (Nyquist theorem)
ax2[i].bar(hz,np.absolute(f[:len(hz)]*2), color='b')
ax2[i].axis([0, 11, 0, 1.2])
ax2[i].set_xticks(np.arange(0, 11))

subfig1.suptitle("Time Domain")
subfig1.supxlabel("Time")
subfig1.supylabel("Amplitude (a.u.)")

subfig2.suptitle("Frequency Domain")
subfig2.supxlabel("Frequency (Hz)")
subfig2.supylabel("Power (a.u.)") ;


Here, two sines with different amplitudes and frequencies (from left to right: $$1$$ and $$0.5$$, and $$3$$ and $$8$$, respectively) are added together in the time domain. The relative “strength” of the frequencies present in the signal corresponding to sum of the two sines is resolved in the frequency domain.

#### Putting it all together: from signal to its power and phase spectrum

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
# Figure 11.5
from mpl_toolkits.mplot3d import Axes3D
N = 10 #length of sequence
og_data = np.random.randn(N) #create random numbers, sampled from normal distribution
srate = 200 #sampling rate in Hz
nyquist = srate/2 #Nyquist frequency -- highest frequency you can measure the data

#initialize matrix for Fourier output

frequencies = np.linspace(0, nyquist, N//2+1)
time = np.arange(N)/float(N)

#Fourier transform is dot product between sine wave and data at each frequency
fourier = np.zeros(N)*1j #create complex matrix
for fi in range(N):
sine_wave = np.exp(-1j *2 *np.pi*fi*time)
fourier[fi] = np.sum(sine_wave*og_data)

fourier_og = fourier/float(N)

fig = plt.figure(figsize=(8,8))

plt.subplot(221)
plt.plot(og_data,"-ro",)
plt.xlim([0,N+1])
plt.ylabel('Amplitude (a.u.)')
plt.xlabel('Time Point')
plt.title("Time Domain Representation of Data")

ax.plot(frequencies, np.angle(fourier_og[:N//2+1]), zs=np.absolute(fourier_og[:N//2+1])**2)
ax.view_init(20, 20)
ax.set_xlabel('Frequency (Hz)')
ax.set_ylabel('Phase')
ax.zaxis.set_rotate_label(False)
ax.set_zlabel('Power (a.u.)', rotation=90)
plt.title("3D representation of the Fourier transform")

plt.subplot(223)
plt.stem(frequencies, np.absolute(fourier_og[:N//2+1])**2, use_line_collection=True)
plt.xlim([-5, 105])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Power (a.u.)")
plt.title("Power Spectrum")

plt.subplot(224)
plt.stem(frequencies, np.angle(fourier_og[:N//2+1]), use_line_collection=True)
plt.xlim([-5, 105])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Phase Angle")
plt.yticks(np.arange(-np.pi, np.pi, np.pi/2.))

plt.title("Phase Spectrum")

plt.tight_layout()

1
2
3
4
/var/folders/1q/g9xwc9k128101dg5z58dbhm40000gn/T/ipykernel_22564/3676337223.py:40: MatplotlibDeprecationWarning: The 'use_line_collection' parameter of stem() was deprecated in Matplotlib 3.6 and will be removed two minor releases later. If any parameter follows 'use_line_collection', they should be passed as keyword, not positionally.
plt.stem(frequencies, np.absolute(fourier_og[:N//2+1])**2, use_line_collection=True)
/var/folders/1q/g9xwc9k128101dg5z58dbhm40000gn/T/ipykernel_22564/3676337223.py:47: MatplotlibDeprecationWarning: The 'use_line_collection' parameter of stem() was deprecated in Matplotlib 3.6 and will be removed two minor releases later. If any parameter follows 'use_line_collection', they should be passed as keyword, not positionally.
plt.stem(frequencies, np.angle(fourier_og[:N//2+1]), use_line_collection=True)


Note: the phase spectrum is derived from the coefficient of the sine, i.e., the argument of the complex number.

## The Discrete Fourier Transform (DFT)

(The following is adapted from blog posts written by Stuart Riffle and David Smith.)

The Discrete Fourier Transform is applied to signal of finite duration discretized at some sampling frequency.

It is defined as

$\color{purple}{X}_{\color{lime}{k}} = \color{magenta}{\sum_{n=0}^{N-1}} \color{cyan}{x_{n}} \color{red}{e^{-{i} \color{orange}{2 \pi} \color{lime}{k} \color{magenta}{\frac{n}{N}}}} \color{black} .$

Where,

• $$\color{purple} X$$ is amplitude or energy
• $$\color{lime}{k}$$ is frequency
• $$\color{cyan}{x_{n}}$$ is the signal
• $$\color{red}{e^-i}$$ comes from Euler’s formula
• One can think of this complex exponential as rotation of a point along the unit circle starting from (0, 1)…backwards, hence, the negative.
• $$\color{orange}{2 \pi}$$ is the circumfrence of the unit circle
• One can think of this as rotating a point along the full unit circle.
• $$\color{magenta}{\sum\limits_{n=0}^{N-1}}$$ and $$\color{magenta}{\frac{n}{N}}$$ is the sum and average of all points in the signal

As Stuart Riffle and David Smith state:

“To find the energy at a particular frequency, spin the signal around a circle at that frequency, and average a bunch of points along that path.”

### The algorithm

The signal $$x_{n}$$ is convolved (shout out to Pranjal Gupta, check out his post on convolutions) with a complex exponential $$\color{red}{e^{-i}} \color{orange}{2 \pi} \color{lime}{k}$$ for multiple frequencies $$k$$ over multiple time points $$n$$. In other words, the DFT algorithm is to compute dot products of the signal and complex exponentials.

This is most efficiently computed using Fast Fourier Transform (FFT) algorithms.

## Important Considerations and Issues

There are a few considerations and issues of which to be aware when using Fourier transforms to analyze real data.

One must consider sampling frequency (often constrained by hardware and software), signal properties, and appropriate parameters to set given data.

### Aliasing

Aliasing is frequency ambiguity in a power spectrum as a result of discretely sampling a continuous waveform.

The maximum sampling frequency $$f_{\max}$$ is called the Nyquist frequency given by

$f_{\max} = \frac{1}{2 \Delta t} = \frac{1}{2 f_s}.$

Frequencies above the Nyquist frequency do not disappear. They reappear as peaks in the power spectrum at lower frequencies in the range $$0 < f < f_{\max} = 1/(2 \Delta t)$$.

(GIF by Omnicron11, 7 April 2021. CC BY-SA 4.0, via Wikimedia Commons)

The upper left animation depicts sines. Each successive sine has a higher frequency than the previous. “True” signals are being sampled (dots) at a constant frequency $$\color{black}{f_s}$$.

The upper right animation shows the continuous Fourier transform of the sine. The single non-zero component, the actual frequency, means there is no ambiguity.

The lower right animation shows the discrete Fourier transform of just the available samples. The presence of two components means the samples can fit at least two different sines, one of which is the true frequency (the upper right animation).

The lower left animation uses the same samples and default reconstruction algorithm to produce lower-frequency sines.

### Stationarity

A stationary signal is a signal that has constant statistical properties, e.g., mean, variance, etc., over time.

Neural data is most definitely not stationary!

The power spectrum of a non-stationary signal is less “sharp”: it is more difficult to resolve what frequencies are present with what power at what time.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
## Effects of non-stationarity

# Figure 11.9
#create array of frequencies, amplitudes, and phases
frex = np.array([3,10,5,7])
amplit = np.array([5,15,10,5])
phases = np.pi*np.array([1/7.,1/8.,1,1/2.])

#create a time series of sequenced sine waves

srate = 500.
time = np.arange(-1,1+1/srate,1/srate)
stationary = np.zeros(len(time)*len(frex))
nonstationary = np.zeros(len(time)*len(frex))

for fi in range(len(frex)):
#compute sine wave
temp_sine_wave = amplit[fi] * np.sin(2*np.pi*frex[fi]*time+phases[fi])

#enter into stationary time series
stationary = stationary + np.tile(temp_sine_wave,(1,len(frex)))

#optional change of amplitude over time
temp_sine_wave *= time+1

#start and stop indices for insertion of sine wave
start_idx = fi * len(time)
stop_idx = fi *len(time) + len(time)

#enter into non-stationary time series
nonstationary[start_idx:stop_idx] = temp_sine_wave

plt.figure()
plt.subplot(221)
plt.plot(stationary[0], color='olive')
plt.axis([0,stationary.shape[1],-30,30])
plt.title("Stationary Signal")
_,xticks=plt.xticks(np.arange(0,4000,500))
plt.setp(xticks,rotation=-45)
plt.xlabel("Time")
plt.ylabel("Amplitude (a.u.)")

plt.subplot(222)
plt.plot(nonstationary, color='m')
plt.xlim([1,len(nonstationary)])
plt.title("Non-stationary Signal")
_,xticks=plt.xticks(np.arange(0,4000,500))
plt.setp(xticks,rotation=-45)
plt.xlabel("Time")

frequencies = np.linspace(0,srate/2, len(nonstationary)//2+1)
fft_nonstationary = np.fft.fft(nonstationary)/len(nonstationary)
fft_stationary = np.fft.fft(stationary[0])/stationary.shape[1]

plt.subplot(212)
plt.plot(frequencies,np.absolute(fft_stationary[:len(frequencies)]*2),'olive')
plt.plot(frequencies,np.absolute(fft_nonstationary[:len(frequencies)]*2), 'm')
plt.xlim([0,np.max(frex)*2])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Power (a.u.)")
plt.legend(["Power stationary","Power non-stationary"])

plt.tight_layout()


Adding zeros to the end of the signal adds more points in the time domain. This increases the frequency resolution, however, no new information is added.

Often one zero-pads to make the length of the signal (in terms of points) a power of two number. FFT algorithms work best, i.e., are most efficient, when the length of the signal or window is a power of two.

## The Continuous Fourier Transform

The continuous Fourier transform takes a continuous input function $$\color{red}{f(x)}$$ in the time domain and turns it into a new function $$\color{blue}{\hat{f}(\xi)}$$ in the frequency domain.

In the first part of the animation, the Fourier transform (as defined above) is applied to the rectangular function, rect, returning the normalized sinc function, sinc.

When the window size is large, more wide, the frequency over time is enhanced for accurate measurement.

• Longer length, high frequency resolution, low time resolution

When the window size is small, more narrow, the signal time changes are enhanced for accurate measurement.

• Shorter length, good time resolution, poor frequency resolution

## Sample EEG Data

How does sample EEG data, and the power spectral density and spectrogram of sample EEG data look?

### Main problem: EEG data = many overlapping noisy signals

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
from mpl_toolkits.axes_grid1 import make_axes_locatable
from scipy.signal import stft
from scipy.signal import welch

eegdata=np.squeeze(data["EEG"][0,0]["data"][46,:,0])
srate = float(data["EEG"][0,0]["srate"][0,0])

fig = plt.figure(figsize=(8,3))
plt.subplot(1,3,1)
plt.plot(eegdata)
_=plt.title("Raw EEG Data")
_=plt.xlabel("Time")
_=plt.ylabel("Amplitude (a.u.)")

plt.subplot(1,3,2)
freqs, psd = welch(eegdata)
plt.plot(freqs, np.log(psd))
_=plt.title("Power Spectral Density")
_=plt.xlabel("Frequency (Hz)")
_=plt.ylabel("Power (a.u.)")

f, t, Sxx = stft(eegdata, fs=srate, scaling='psd')
ax = plt.subplot(1,3,3)
im = plt.pcolormesh(t, f, np.abs(Sxx))
plt.title("Spectrogram")
plt.xlabel("Time")
plt.ylabel("Frequency (Hz)")
divider = make_axes_locatable(ax)
cbar = plt.colorbar(im, cax=cax)
cbar.set_label('Power (a.u.)')

plt.tight_layout()


## Sample Audio Data

How does sample audio data, and the power spectral density and spectrogram of sample audio data look?

### Main problem: audio data = many overlapping noisy signals and very non-stationary

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
from scipy.io import wavfile
from scipy.io.wavfile import WavFileWarning
import warnings

audio_file = 'birdname_130519_113030.1.wav'

with warnings.catch_warnings():
warnings.filterwarnings("ignore", category=WavFileWarning)

dur = len(audio)/fs
time = np.arange(0,dur,1/fs)

fig = plt.figure(figsize=(8,3))
plt.subplot(1,3,1)
plt.plot(time, audio)
_=plt.title("Audio Data")
_=plt.xlabel("Time (s)")
_=plt.ylabel("Amplitude (a.u.)")

plt.subplot(1,3,2)
# Pxx, freqs, line = plt.psd(audio, NFFT=512, Fs=fs/1000, noverlap=256, return_line=True) ;
freqs, psd = welch(audio, fs=fs, nperseg=512, noverlap=256)
# plt.plot(freqs/1000, np.sqrt(psd))
plt.semilogy(freqs/1000, np.sqrt(psd))
_=plt.title("Power Spectral Density")
_=plt.xlabel("Frequency (kHz)")
_=plt.ylabel("Power (a.u.)")

f, t, spec = stft(audio, fs=fs, nperseg=512, noverlap=256)
ax = plt.subplot(1,3,3)
im = plt.pcolormesh(t, f/1000, np.sqrt(np.abs(spec)))
# spectrum, freqs, t, im = plt.specgram(audio, NFFT=512, Fs=fs, noverlap=256, mode='psd')
plt.title("Spectrogram")
plt.xlabel("Time")
plt.ylabel("Frequency (kHz)")
divider = make_axes_locatable(ax)
cbar = plt.colorbar(im, cax=cax)
cbar.set_label('Power (a.u.)')

plt.tight_layout()


## The Inverse Fourier Transform (IFT)

The inverse Fourier transforms signal from the frequency domain to the time domain…literally the inverse of the Fourier transform!

Similar to the discrete Fourier transform, it is defined as

$\color{cyan}{x_{n}} = \color{magenta}{\frac{1}{N} \sum_{n=0}^{N-1}} \color{purple}{X}_{\color{lime}{k}} \color{red}{e^i} \color{orange}{2 \pi} \color{lime}{k} \color{magenta}{\frac{n}{N}}$

Notably, the Fourier transform and inverse Fourier transform are lossless, that is, no information is lost when transforming signal from the time to frequency domain and back!

Here, $$\color{red}{i}$$ in the complex exponential $$\color{red}{e^i}$$ is positive, as opposed to negative, as seen in the definition of the discrete Fourier transform.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# Figure 11.6
N = 10 #length of sequence
time = np.arange(N)/float(N)
reconstructed_data = np.zeros(N)

for fi in range(N):
#scale sine wave by fourier coefficient
sine_wave = fourier_og[fi] * np.exp(1j *2 * np.pi * fi * time)

#sum the sine waves together, and take only the real part
reconstructed_data += np.real(sine_wave)

plt.figure()
plt.plot(og_data,'ko',linewidth=4)
plt.plot(reconstructed_data,'r-*')
_=plt.legend(["Original Data","Inverse Fourier transform"])
plt.xlabel("Time (ms)")
plt.ylabel("Amplitude (a.u.)") ;