
@thomasreggi 🌸

Editing Videos Using ffmpeg + TypeScript and Deno.

December 7, 2023 (Syndicated From dev.to)

TLDR: I wrote a module in TypeScript / Deno that wraps ffmpeg and offers a better developer experience when editing a bunch of video clips together 👉 deno/x/movie.

I’ve been recording small clips of video on my old “retro” 🤣 “vintage” 😭 Nikon point-and-shoot digital camera. I’m trying to think more visually, and I’m thinking about using video as a medium to explore concepts I’m experiencing in life and to evoke a feeling.

I kinda hate modern video editing software; I’ve mainly used iMovie. I briefly looked into open-source alternatives to iMovie, but nothing stood out. Then I thought about ffmpeg, a command-line tool I’ve used once or twice in the past to convert video or audio to another format.

I started a deep dive into learning ffmpeg: how to remove sound entirely from a clip, how to manipulate saturation, gamma, and contrast, how to raise the volume of a clip. All of these are done within ffmpeg as “filters”. Because it’s a command-line tool, every character matters and a lot of information is condensed into a short command. The syntax is a bit hard to understand. I found that breaking the command down using new lines and a \ was the only way to really read it.

Brief Explainer

Here’s a simple script that merges two videos:

ffmpeg \
-i ./example/originals/DSCN3700.AVI \
-i ./example/originals/DSCN3701.AVI \
-filter_complex "\
[0:v:0][0:a:0]\
[1:v:0][1:a:0]\
concat=n=2:v=1:a=1[outv][outa]" \
-map "[outv]" -map "[outa]" \
./example/output.mp4 -y

Here’s a simple breakdown of this command:

  • -i are inputs 0 and 1
  • 0:v:0 refers to input 0 and video track 0
  • 0:a:0 refers to input 0 and audio track 0

The trailing index is there because a file can hold several video and audio tracks. Think of a video that supports multiple language tracks: track 0 may be English, track 1 may be Japanese.
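To make the input:type:index pattern concrete, here is a minimal TypeScript sketch that generates the same concat graph as the command above for any number of inputs. The helper names (streamSpecifier, buildConcatFilter) are made up for illustration and are not part of ffmpeg or the movie module; the sketch also assumes every input has at least one video and one audio track.

```typescript
// Hypothetical helpers for illustration; not part of ffmpeg or deno/x/movie.
type StreamKind = "v" | "a";

// Formats a stream specifier such as "0:v:0" (input 0, video track 0).
function streamSpecifier(input: number, kind: StreamKind, index = 0): string {
  return `${input}:${kind}:${index}`;
}

// Builds the concat graph for `count` inputs. Each input contributes its
// first video track and then its first audio track, in that order.
function buildConcatFilter(count: number): string {
  const pads: string[] = [];
  for (let input = 0; input < count; input++) {
    pads.push(`[${streamSpecifier(input, "v")}]`);
    pads.push(`[${streamSpecifier(input, "a")}]`);
  }
  return `${pads.join("")}concat=n=${count}:v=1:a=1[outv][outa]`;
}

console.log(buildConcatFilter(2));
// [0:v:0][0:a:0][1:v:0][1:a:0]concat=n=2:v=1:a=1[outv][outa]
```

The printed string is what the shell hands to -filter_complex in the command above once the \ line continuations are joined.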

That was a simple example. A more complex one looks like this:

ffmpeg \
-i ./example/originals/DSCN3700.AVI \
-i ./example/originals/DSCN3701.AVI \
-filter_complex "\
[0:a:0]volume=3,atrim=start=4:duration=4,asetpts=PTS-STARTPTS[audio0];\
[0:v:0]fps=25,crop=480:480:0:0,trim=start=4:duration=4,setpts=PTS-STARTPTS[video0];\
[1:a:0]volume=4,atrim=end=17:start=15,asetpts=PTS-STARTPTS[audio1];\
[1:v:0]fps=25,crop=480:480:0:0,trim=end=17:start=15,setpts=PTS-STARTPTS[video1];\
[video0][audio0]\
[video1][audio1]\
concat=n=2:v=1:a=1[outv][outa]" \
-map "[outv]" -map "[outa]" \
./example/output.mp4 -y

Compared with the first command:

  • The inputs are declared the same way
  • This adds the notion of a “filter” that reads from a track and assigns its result to a “variable”
[0:a:0]volume=3,atrim=start=4:duration=4,asetpts=PTS-STARTPTS[audio0];\
  ^ the track -----------[ the filter(s) ]-- the new variable    ^

Each of these definitions is followed by a ;
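The anatomy above (track, comma-separated filters, new variable) maps nicely onto a small data structure. This is only a sketch under my own assumptions: the Filter shape and the serializeFilter and buildChain names are hypothetical, and it only covers the name=value and name=key=value:key=value forms used in this post.

```typescript
// Hypothetical types for illustration; not the API of deno/x/movie.
interface Filter {
  name: string;
  // Either a single value (volume=3) or named options (atrim=start=4:duration=4).
  args: string | Record<string, string | number>;
}

// Turns one filter into ffmpeg syntax, joining named options with ":".
function serializeFilter(filter: Filter): string {
  const args = typeof filter.args === "string"
    ? filter.args
    : Object.entries(filter.args).map(([key, value]) => `${key}=${value}`).join(":");
  return `${filter.name}=${args}`;
}

// Builds one definition: [track]filter,filter,...[variable]
function buildChain(track: string, filters: Filter[], variable: string): string {
  return `[${track}]${filters.map(serializeFilter).join(",")}[${variable}]`;
}

const audio0 = buildChain("0:a:0", [
  { name: "volume", args: "3" },
  { name: "atrim", args: { start: 4, duration: 4 } },
  { name: "asetpts", args: "PTS-STARTPTS" },
], "audio0");

console.log(audio0);
// [0:a:0]volume=3,atrim=start=4:duration=4,asetpts=PTS-STARTPTS[audio0]
```

Joining several of these strings with ; gives the first half of the -filter_complex value.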

Then we have the concatenation definition after the last ;:

[video0][audio0]\

This names one video and audio pair from the definitions above. The pairs are listed in the order the clips should play: the first pair, then the next, and so on.

So I view the -filter_complex value as this:

{all the filter transformations, declaring a new variable };
{the concatenations, paired audio + video}
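That two-part mental model is easy to express in code. Below is a minimal sketch, assuming a made-up Clip shape and a buildArgs helper (both hypothetical, and not how the movie module is implemented), that emits the filter definitions first and the concatenation last. It only handles trimming and volume, and it assumes each file has one video and one audio track.

```typescript
// Hypothetical clip description; not the API of deno/x/movie.
interface Clip {
  path: string;
  start: number; // seconds into the source file
  duration: number; // seconds to keep
  volume?: number; // optional volume multiplier
}

// Returns the ffmpeg argument list: inputs, then the filter graph, then the maps.
function buildArgs(clips: Clip[], output: string): string[] {
  const inputs: string[] = [];
  const definitions: string[] = []; // part 1: filter transformations
  const pairs: string[] = []; // part 2: paired video + audio for concat

  clips.forEach((clip, i) => {
    inputs.push("-i", clip.path);
    const trim = `start=${clip.start}:duration=${clip.duration}`;
    const gain = clip.volume === undefined ? "" : `volume=${clip.volume},`;
    definitions.push(`[${i}:v:0]trim=${trim},setpts=PTS-STARTPTS[video${i}]`);
    definitions.push(`[${i}:a:0]${gain}atrim=${trim},asetpts=PTS-STARTPTS[audio${i}]`);
    pairs.push(`[video${i}][audio${i}]`);
  });

  const concat = `${pairs.join("")}concat=n=${clips.length}:v=1:a=1[outv][outa]`;
  const graph = [...definitions, concat].join(";");

  return ["-y", ...inputs, "-filter_complex", graph, "-map", "[outv]", "-map", "[outa]", output];
}

const args = buildArgs([
  { path: "./example/originals/DSCN3700.AVI", start: 4, duration: 4, volume: 3 },
  { path: "./example/originals/DSCN3701.AVI", start: 15, duration: 2 },
], "./example/output.mp4");

console.log(args.join(" "));
```

Because the result is an array rather than one shell string, it could be handed to something like Deno.Command("ffmpeg", { args }) without any quoting or \ line continuations.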

I built a wrapper

Early on, while learning the filters, I wanted a better way to work, so I wrote some glue code that wraps ffmpeg and makes it easier and more flexible to move clips around.

https://deno.land/x/movie@1.0.0

Here’s the video I made with it 👇🎬🎥