TensorLearn
Deep Learning with PyTorch
Module 10 of 12

10. Audio Processing

1. Images of Sound

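We usually don't feed raw waveforms to CNNs. We feed spectrograms instead. A spectrogram is computed by running a short-time Fourier transform over the signal, which turns the 1D time-amplitude waveform into a 2D time-frequency map that a CNN can treat like an image.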
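To make the time-amplitude to time-frequency step concrete, here is a minimal sketch using `torch.stft`. The 16 kHz sample rate, the 440 Hz test tone, and the `n_fft` and `hop_length` values are assumptions chosen for illustration, and `spectrogram` is a hypothetical helper name, not something from the course.

```python
import math

import torch


def spectrogram(waveform: torch.Tensor, n_fft: int = 400, hop_length: int = 160) -> torch.Tensor:
    """Return a log-power spectrogram of shape (n_fft // 2 + 1, num_frames)."""
    window = torch.hann_window(n_fft)
    # Short-time Fourier transform: one spectrum per overlapping window of samples.
    stft = torch.stft(
        waveform,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        return_complex=True,
    )
    power = stft.abs() ** 2
    # Log compression keeps quiet and loud components on a comparable scale.
    return torch.log(power + 1e-10)


# Assumed toy input: one second of a 440 Hz sine wave sampled at 16 kHz.
sample_rate = 16000
t = torch.arange(sample_rate) / sample_rate
waveform = torch.sin(2 * math.pi * 440.0 * t)

spec = spectrogram(waveform)  # rows are frequency bins, columns are time frames

# Strongest frequency bin in the middle frame, converted back to Hz.
peak_bin = spec[:, spec.shape[1] // 2].argmax().item()
peak_hz = peak_bin * sample_rate / 400
```

The result is a 2D tensor, so adding batch and channel dimensions (for example `spec[None, None]`) gives the `(batch, channel, height, width)` layout that `nn.Conv2d` expects. In practice a mel-scaled spectrogram is a common choice, but the plain version above is enough to show the idea.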

2. Wav2Vec & HuBERT

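Wav2Vec and HuBERT apply self-supervised learning to audio, so they can pretrain on unlabeled recordings. Parts of the sound are masked, and the model is trained to guess the missing bits from the surrounding context.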
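The masking idea can be sketched in a few lines of PyTorch. This is a simplified stand-in, not the actual training objective of either model: it hides random spans of feature frames and regresses the original frames with an MSE loss, computed only on the hidden positions. The names `make_span_mask` and `MaskedFramePredictor`, the random input features, and all sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def make_span_mask(num_frames: int, span: int = 10, num_spans: int = 2) -> torch.Tensor:
    """Boolean mask over time steps; True marks a frame hidden from the model."""
    mask = torch.zeros(num_frames, dtype=torch.bool)
    starts = torch.randint(0, num_frames - span + 1, (num_spans,))
    for start in starts.tolist():
        mask[start : start + span] = True
    return mask


class MaskedFramePredictor(nn.Module):
    """Toy context network: replace masked frames, then predict their content."""

    def __init__(self, dim: int = 64):
        super().__init__()
        # Learned vector that stands in for every masked frame.
        self.mask_embedding = nn.Parameter(torch.randn(dim) * 0.1)
        # Convolution over time gives the encoder a sense of frame order.
        self.pos_conv = nn.Conv1d(dim, dim, kernel_size=9, padding=4)
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, dim)

    def forward(self, frames: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, dim), mask: (time,)
        hidden = torch.where(mask[None, :, None], self.mask_embedding, frames)
        hidden = hidden + self.pos_conv(hidden.transpose(1, 2)).transpose(1, 2)
        return self.head(self.encoder(hidden))


# Assumed stand-in for audio features: 4 clips, 50 frames, 64 dims each.
frames = torch.randn(4, 50, 64)
mask = make_span_mask(num_frames=50)

model = MaskedFramePredictor(dim=64)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step: the loss only counts the frames the model could not see.
prediction = model(frames, mask)
loss = nn.functional.mse_loss(prediction[:, mask], frames[:, mask])
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

No labels appear anywhere in the loop, which is the point of self-supervision: the audio itself supplies the targets. The real models differ in what they predict at the masked positions, so treat the MSE target here as a placeholder.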

