Saturday, August 10, 2013

Resampling the audio track of a video

MOOC platforms like Coursera and MIT OpenCourseWare are fantastic. But sometimes the videos they provide are hard to hear, especially when you are on noisy public transportation. This article shows how to solve this issue.

The issue

Take any course from MIT OCW. It was most likely recorded in a classroom, with some echo and a little background noise. The problem is that when you are on a bus and want to watch your video lessons, the environment (horns, squealing brakes, people speaking loudly) easily drowns out the audio.

How can we transform a video so that its volume is higher? Combining ffmpeg and sox is a very efficient way to do that.

The process

First, we have to extract the audio track from the video, then amplify the volume, and finally rebuild the whole video using the amplified audio track.

Here, I assume that the name of the file you want to modify is stored in the shell variable ${video}, and that the file is in the current directory.

To extract the audio track into a file named "${video}.wav", use this command:
ffmpeg -i "$video" "${video}.wav"

Then, find the maximum amplification you can apply without clipping the signal:
sox "${video}.wav" -n stat 2>&1 | grep "Volume adjustment"

This will print, for instance, "Volume adjustment:    2.041".
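If you would rather not copy the number by hand, here is a minimal sketch of a helper that keeps only the numeric part of that line. The function name volume_adjustment is my own invention, and it assumes the value is the last word on the "Volume adjustment" line, as in the example above.

```shell
# Hypothetical helper: read the output of "sox ... -n stat" on standard
# input and print only the number found on the "Volume adjustment" line.
# Assumption: the number is the last whitespace-separated word of that line.
volume_adjustment() {
  awk '/Volume adjustment/ { print $NF }'
}
```

You could then capture the value with gain=$(sox "${video}.wav" -n stat 2>&1 | volume_adjustment) and reuse "$gain" in the following steps.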

Pass this value to the -v option of sox to amplify the audio track (replace 2.041 with the value you obtained):
sox -v 2.041 "${video}.wav" "${video}.louder.wav"

Finally, rebuild the video using the amplified audio track:
ffmpeg -i "$video" -i "${video}.louder.wav" \
  -map 0:0 -map 1:0 \
  -strict -2 \
  -acodec aac -ab 96k -vcodec copy "resampled.${video}"

Note that you may need to use the audio codec "libfaac" instead, depending on your version of ffmpeg and how it was built. In that case, drop the "-strict -2" parameter, which is only required to enable the experimental "aac" audio codec.
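One way to make that choice automatically is sketched below. It assumes your ffmpeg build supports the -encoders option, which lists the available encoders. The helper name aac_encoder_args is hypothetical: it reads such a listing on standard input and prints the audio options to use.

```shell
# Hypothetical helper: read an encoder listing (for example the output of
# "ffmpeg -encoders") on standard input and print the audio codec options.
# Assumption: the encoder name "libfaac" appears in the listing when it is
# available in this build.
aac_encoder_args() {
  if grep -q 'libfaac'; then
    printf '%s\n' '-acodec libfaac'
  else
    # Fall back to the built-in encoder, enabled with -strict -2.
    printf '%s\n' '-strict -2 -acodec aac'
  fi
}
```

For instance, audio_args=$(ffmpeg -encoders 2>/dev/null | aac_encoder_args) stores the options. Expand $audio_args unquoted on the ffmpeg command line so that it splits into separate arguments.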

That's it. Now you can clean up the temporary files that were generated and put the new video in place of the original:
rm "${video}.wav" "${video}.louder.wav"
mv "${video}" "${video}.old"
mv "resampled.${video}" "${video}"
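The whole procedure can be collected into a single shell function. This is only a sketch under the same assumptions as above: the function name amplify_video is mine, the video is in the current directory, stream 0:0 of the original file is the video track, and the "aac" codec with "-strict -2" works with your build of ffmpeg.

```shell
# Hypothetical wrapper around the steps described in this article.
# Usage: amplify_video lesson.mp4
# Assumption: $1 is a bare file name located in the current directory.
amplify_video() {
  av_video=$1
  av_wav="${av_video}.wav"
  av_louder="${av_video}.louder.wav"

  # 1. Extract the audio track.
  ffmpeg -i "$av_video" "$av_wav" || return 1

  # 2. Ask sox for the largest gain that does not clip.
  av_gain=$(sox "$av_wav" -n stat 2>&1 | awk '/Volume adjustment/ { print $NF }')
  [ -n "$av_gain" ] || return 1

  # 3. Amplify the audio track.
  sox -v "$av_gain" "$av_wav" "$av_louder" || return 1

  # 4. Rebuild the video with the amplified audio track.
  ffmpeg -i "$av_video" -i "$av_louder" \
    -map 0:0 -map 1:0 \
    -strict -2 \
    -acodec aac -ab 96k -vcodec copy "resampled.${av_video}" || return 1

  # 5. Remove the temporary files and keep the original as a .old backup.
  rm "$av_wav" "$av_louder"
  mv "$av_video" "${av_video}.old"
  mv "resampled.${av_video}" "$av_video"
}
```

Each step stops the function if the previous command failed, so the original file is only renamed once the new video has been written.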

Final thoughts

That was simple. Now you can enjoy your lessons without the volume being too low.

Any other issues with MOOCs or video encoding/decoding/resampling? Let's discuss them in the comments!
