Editing Videos Using ffmpeg + TypeScript and Deno.

December 7, 2023 (Syndicated From dev.to)

TL;DR: I wrote a module in TypeScript for Deno that wraps ffmpeg and offers a better developer experience when editing a bunch of video clips together 👉 deno/x/movie.

I’ve been recording small clips of video on my old “retro” 🤣 “vintage” 😭 Nikon point-and-shoot digital camera. I’m trying to think more visually, and I’ve been thinking about using video as a medium to explore concepts I’m experiencing in life and to evoke a feeling.

I kinda hate modern video editing software, having mainly used iMovie. I briefly looked into open-source alternatives to iMovie, but nothing stood out. Then I thought about ffmpeg, a command-line tool I’ve used once or twice in the past to convert video or audio to another format.

I started a deep dive into learning ffmpeg: how to remove sound entirely from a clip, how to manipulate saturation, gamma, and contrast, how to raise the volume of a clip. All of these are done within ffmpeg as “filters”. Given that it’s a command-line tool, every character matters and a lot of information is condensed into a short command. The syntax is a bit hard to understand. I found that breaking the command across lines with a trailing \ was the only way to really read it.
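To make those single-clip edits concrete, here is a minimal sketch that builds an ffmpeg argument list in TypeScript. The `ClipOptions` and `singleClipArgs` names are made up for illustration and are not part of the module. The `eq` and `volume` filters are standard ffmpeg filters; muting is shown with the `-an` flag, which drops the audio stream and is strictly a flag rather than a filter.

```typescript
// Hypothetical helper (not from the post's module): builds an ffmpeg argument
// list for one clip. Only the flags and filter names are real ffmpeg syntax.
interface ClipOptions {
  mute?: boolean;      // drop the audio stream entirely (-an)
  saturation?: number; // eq filter, 1.0 leaves the picture unchanged
  gamma?: number;      // eq filter, 1.0 leaves the picture unchanged
  contrast?: number;   // eq filter, 1.0 leaves the picture unchanged
  volume?: number;     // volume filter, 2.0 doubles the amplitude
}

function singleClipArgs(input: string, output: string, opts: ClipOptions): string[] {
  // -y overwrites the output file without asking.
  const args = ["-y", "-i", input];

  // Collect only the eq parameters that were actually set.
  const eq: string[] = [];
  if (opts.saturation !== undefined) eq.push(`saturation=${opts.saturation}`);
  if (opts.gamma !== undefined) eq.push(`gamma=${opts.gamma}`);
  if (opts.contrast !== undefined) eq.push(`contrast=${opts.contrast}`);
  if (eq.length > 0) args.push("-vf", `eq=${eq.join(":")}`);

  // Muting wins over a volume change: there is no audio left to amplify.
  if (opts.mute) {
    args.push("-an");
  } else if (opts.volume !== undefined) {
    args.push("-af", `volume=${opts.volume}`);
  }

  args.push(output);
  return args;
}

// Example inputs; the file names are placeholders.
const muted = singleClipArgs("in.avi", "out.mp4", { mute: true, saturation: 1.5, contrast: 1.2 });
const louder = singleClipArgs("in.avi", "out.mp4", { volume: 2 });
```

The resulting array can be handed to whatever spawns the process, for example `new Deno.Command("ffmpeg", { args })` in Deno.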

Brief Explainer

Here’s a simple script that merges two videos:

ffmpeg \
-i ./example/originals/DSCN3700.AVI \
-i ./example/originals/DSCN3701.AVI \
-filter_complex "\
concat=n=2:v=1:a=1[outv][outa]" \
-map "[outv]" -map "[outa]" \
./example/output.mp4 -y

Here’s a simple breakdown of this command:

  • each -i declares an input; they are numbered from 0, so these are inputs 0 and 1
  • 0:v:0 refers to input 0 and video track 0
  • 0:a:0 refers to input 0 and audio track 0

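The indexes exist because a file can hold several video and audio tracks. Think of a video that supports multiple languages: audio track 0 may be English and track 1 Japanese. The command above leaves the concat inputs unlabeled, so ffmpeg connects them to the first unused streams of the matching type, in input order.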
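The specifiers can also be written out explicitly as input labels in front of the concat filter. The sketch below generates that string; `streamSpec` and `concatFilter` are illustrative names only, and it assumes each input contributes its first video and first audio track.

```typescript
type StreamKind = "v" | "a";

// Formats a stream specifier such as 0:v:0 (input 0, video track 0).
function streamSpec(input: number, kind: StreamKind, index = 0): string {
  return `${input}:${kind}:${index}`;
}

// Builds a concat filter with every input pad labeled explicitly,
// one video + audio pair per input file (assumes track 0 of each kind).
function concatFilter(inputCount: number): string {
  let pads = "";
  for (let i = 0; i < inputCount; i++) {
    pads += `[${streamSpec(i, "v")}][${streamSpec(i, "a")}]`;
  }
  return `${pads}concat=n=${inputCount}:v=1:a=1[outv][outa]`;
}

const twoClips = concatFilter(2);
// twoClips is "[0:v:0][0:a:0][1:v:0][1:a:0]concat=n=2:v=1:a=1[outv][outa]"
```

Picking the Japanese track from the example would then be a matter of asking for `streamSpec(0, "a", 1)` instead of the default index 0.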

That was a simple example. A more complex one would look like this:

ffmpeg \
-i ./example/originals/DSCN3700.AVI \
-i ./example/originals/DSCN3701.AVI \
-filter_complex "\
concat=n=2:v=1:a=1[outv][outa]" \
-map "[outv]" -map "[outa]" \
./example/output.mp4 -y

  • The inputs are declared the same way
  • This adds the notion of a “filter” applied to a track, with the result assigned to a “variable”

Each definition has the shape [track]filter(s)[new variable]: the track to read, the filter(s) to apply, and the new variable that holds the result.

Each of these definitions is followed by a ;

Then we have the concatenation definition after the last ;:


The concatenation lists the video and audio pair of the first clip from the definitions, followed by the pair for the next clip, and so on.

So I view the -filter_complex value as this:

{all the filter transformations, declaring a new variable };
{the concatenations, paired audio + video}
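That two-part shape can be sketched in TypeScript as follows. This is not the module's API: the `Clip` type, the `v0`/`a0` variable naming, and both function names are assumptions made for illustration. Clips without filters are passed through ffmpeg's `null` and `anull` filters so every clip still gets a variable.

```typescript
// Hypothetical clip description; none of these names come from the post's module.
interface Clip {
  path: string;
  videoFilters?: string[]; // e.g. ["eq=saturation=1.5"]
  audioFilters?: string[]; // e.g. ["volume=2"]
}

function filterComplex(clips: Clip[]): string {
  const definitions: string[] = [];
  let pairs = "";

  clips.forEach((clip, i) => {
    const vf = clip.videoFilters ?? [];
    const af = clip.audioFilters ?? [];
    // "null" and "anull" are ffmpeg's pass-through filters for video and audio.
    const video = vf.length > 0 ? vf.join(",") : "null";
    const audio = af.length > 0 ? af.join(",") : "anull";
    // Part 1: [track]filter(s)[new variable]
    definitions.push(`[${i}:v:0]${video}[v${i}]`, `[${i}:a:0]${audio}[a${i}]`);
    // Part 2 input: the paired video + audio variables, in clip order.
    pairs += `[v${i}][a${i}]`;
  });

  const concat = `${pairs}concat=n=${clips.length}:v=1:a=1[outv][outa]`;
  return [...definitions, concat].join(";");
}

function movieArgs(clips: Clip[], output: string): string[] {
  const args = ["-y"];
  for (const clip of clips) args.push("-i", clip.path);
  args.push("-filter_complex", filterComplex(clips), "-map", "[outv]", "-map", "[outa]", output);
  return args;
}

// Placeholder paths and filter values.
const intro: Clip = { path: "./a.avi", videoFilters: ["eq=saturation=1.5"], audioFilters: ["volume=2"] };
const outro: Clip = { path: "./b.avi" };
const graph = filterComplex([intro, outro]);
const reordered = movieArgs([outro, intro], "./out.mp4");
```

Because the input indexes and variable names are derived from array positions, moving a clip is just a matter of reordering the array, as `reordered` does.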

I built a wrapper

Early on while learning the filters I wanted a better way, so I wrote some glue code that wraps ffmpeg and makes it easier to move clips around and stay flexible.


Here’s the video I made with it 👇🎬🎄