Why render a music video clip in the browser
Every Instagram Story that leans on a song is really a tiny music video: a 9:16 canvas with cover art, a title, and an animated visualizer that pulses along with the waveform. Building that in Premiere or After Effects is three hours of work for thirty seconds of output. Native apps have visualizers too, but they stamp a watermark across the corner. This tool is the middle ground: a live-preview editor plus an mp4 encoder, running entirely client-side in your browser.
What's happening under the hood
When you pick an audio file, the browser's own AudioContext decodes it into a raw AudioBuffer — the same floating-point samples your speakers would otherwise play. For every visualizer frame the tool sweeps a short window of samples around the current playhead and splits it into sixty sub-windows. The RMS of each sub-window becomes one bar height, which is why the bars breathe with the loudness of the music rather than jumping on every peak.
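The bar-height computation described above can be sketched as a small pure function. The name samplesToBars and the constant BAR_COUNT are illustrative, not the tool's actual API; the input is the kind of Float32Array you would get from AudioBuffer.getChannelData.

```javascript
const BAR_COUNT = 60;

// Given a short window of PCM samples around the playhead, split it into
// barCount sub-windows and return the RMS of each as a bar height.
// RMS tracks loudness, so the bars breathe rather than jump on every peak.
function samplesToBars(window, barCount = BAR_COUNT) {
  const subLen = Math.floor(window.length / barCount);
  const bars = new Array(barCount).fill(0);
  for (let b = 0; b < barCount; b++) {
    let sumSquares = 0;
    for (let i = b * subLen; i < (b + 1) * subLen; i++) {
      sumSquares += window[i] * window[i];
    }
    bars[b] = Math.sqrt(sumSquares / subLen); // root-mean-square, not peak
  }
  return bars;
}
```

Because RMS averages energy over the whole sub-window, a single transient spike barely moves a bar, while a sustained loud passage raises all of them.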
Rendering happens into an offscreen 1080 × 1920 canvas at thirty frames per second. Each frame is fed to mediabunny's CanvasSource, which hands it to the browser's WebCodecs H.264 encoder. The trimmed audio is re-encoded to AAC and muxed into the same mp4 container. The result is a file Instagram, TikTok and YouTube all accept without a second re-encode.
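The offline render loop behind that pipeline can be sketched like this. Here drawFrame and addFrame are stand-ins, not the tool's real function names: drawFrame would paint the visualizer state into the canvas, and addFrame would hand the frame to the encoder (e.g. mediabunny's CanvasSource).

```javascript
// Sketch of an offline render loop for a fixed-fps encode.
// drawFrame(tSeconds) paints one frame; addFrame(tSeconds, durSeconds)
// submits it to the encoder. Both callbacks are hypothetical stand-ins.
async function renderFrames(durationSec, fps, drawFrame, addFrame) {
  const frameCount = Math.ceil(durationSec * fps);
  const frameDur = 1 / fps;
  for (let i = 0; i < frameCount; i++) {
    const t = i * frameDur;       // presentation timestamp of this frame
    drawFrame(t);                 // draw visualizer state at playhead t
    await addFrame(t, frameDur);  // awaiting lets the encoder apply backpressure
  }
  return frameCount;
}
```

Awaiting each submission matters: WebCodecs encoders queue asynchronously, and pushing frames faster than the encoder drains them just balloons memory.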
Cover art as a color source
Toggling "use colors from cover" downsamples the uploaded art to a 32 × 32 grid, computes the perceptual luminance of every pixel, and picks the darkest and brightest ones as gradient stops. The result usually mirrors the album art well enough that the Story feels like it belongs to the song, without forcing you to match colors by hand. The "blur cover as background" option is the other common pattern for music apps: a 40-pixel Gaussian blur of the cover fills the frame, dimmed enough that text stays readable on top.
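The gradient-stop selection can be sketched as follows, assuming Rec. 709 luminance weights for "perceptual luminance". Pixels are [r, g, b] triples in 0–255 from the downsampled 32 × 32 grid; the function names are illustrative.

```javascript
// Perceptual (Rec. 709) luminance of one [r, g, b] pixel.
function luminance([r, g, b]) {
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Scan the downsampled cover and return the darkest and brightest
// pixels, which become the two gradient stops.
function gradientStops(pixels) {
  let dark = pixels[0];
  let bright = pixels[0];
  for (const p of pixels) {
    if (luminance(p) < luminance(dark)) dark = p;
    if (luminance(p) > luminance(bright)) bright = p;
  }
  return { dark, bright };
}
```

Picking extremes of luminance rather than of any single channel is what keeps the gradient readable: a saturated red and a saturated blue can have nearly identical channel maxima but very different perceived brightness.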