# <img src="./src/icons/icon.svg" alt="Logo" height="32"/> Jump Cutter

⏩ Fast-forwards long pauses between sentences so you can watch lectures ~1.5x faster (browser extension).
[Chrome Web Store][chrome-web-store]
[Firefox Add-ons][addons-mozilla-org]
[Translations (Weblate)][weblate]
<!-- [](https://liberapay.com/WofWca) -->

Download:

- [Chrome Web Store][chrome-web-store]
- [Firefox Add-ons][addons-mozilla-org]
- [Microsoft Edge Add-ons][microsoft-edge-addons]
- or from GitHub: Chromium / Gecko (Firefox)
Skips silent parts in videos, in real time.
Can be useful for watching lectures, stream recordings (VODs), webinars, podcasts, and other unedited videos.
Demo:
<!-- TODO refactor: put the file in the repo so it's set in stone? -->
<!-- The source video: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-034-artificial-intelligence-fall-2010/lecture-videos/lecture-16-learning-support-vector-machines/ (or https://youtu.be/_PwhiWxHK8o). This video's license: CC BY-NC-SA 4.0 (https://creativecommons.org/licenses/by-nc-sa/4.0/). Not sure if I did comply with the license here. But I believe this use case would be considered "fair use" anyway. -->

Inspired by this video by carykh.
## How it works
Simple (mostly).

<!-- Idk where to put this part. It seems out of place as an introduction, because we don't really have to say anything about looking ahead to explain the simpler case, when "margin before" is 0. And both algorithms have their own pros and cons even with "margin before" being 0. With the current state of the web APIs, there is no direct way to inspect audio samples of a media file/stream anywhere other than at the current playback position of the media element. Otherwise it would be pretty easy to employ the algorithms used in the [analogous software](https://alternativeto.net/software/jump-cutter/), such as * [jump-cutter](https://github.com/jfkthame/jump-cutter) * <https://github.com/carykh/jumpcutter> * ExoPlayer ([SilenceSkippingAudioProcessor](https://github.com/google/ExoPlayer/blob/9c9f5a0599ec012d5cc46e3bd2e732a589adf61d/library/core/src/main/java/com/google/android/exoplayer2/audio/SilenceSkippingAudioProcessor.java)) * ffmpeg ([`silenceremove`](https://ffmpeg.org/ffmpeg-filters.html#toc-silenceremove)) So we have to work around that fact. -->

Currently there are two separate algorithms in place.
The first one we call "the stretching algorithm", and it's in this file. It simply looks at the output audio of a media element, determines its current loudness and, when it's not loud, increases the element's `playbackRate`. (We're using the Web Audio API's `createMediaElementSource` and `AudioWorkletProcessor` for this.)
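To make the idea concrete, here is a simplified sketch of that loudness check. This is not the extension's actual code: the RMS measure, the threshold, and the rate values are made-up illustrative choices.

```javascript
// Sketch (not the extension's actual code): compute the RMS loudness of a
// chunk of audio samples (values in [-1, 1], roughly what an
// AudioWorkletProcessor receives) and pick a playbackRate from it.
function rmsLoudness(samples) {
  let sumOfSquares = 0;
  for (const sample of samples) {
    sumOfSquares += sample * sample;
  }
  return Math.sqrt(sumOfSquares / samples.length);
}

// The threshold and the two rates are illustrative values, not real defaults.
function choosePlaybackRate(samples, threshold = 0.05, soundedRate = 1, silenceRate = 2.5) {
  return rmsLoudness(samples) >= threshold
    ? soundedRate // loud enough: play at normal speed
    : silenceRate; // quiet: fast-forward
}
```

In the real extension an equivalent decision is made continuously, per chunk of audio, and applied as `mediaElement.playbackRate = ...`.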
But looking ahead (a.k.a. "margin before") is important, because some of the sounds that words can start with are not very loud. It's not good to skip such sounds just because of that: the speech would become harder to understand. For example, "throb" would become "rob".
<!-- You'd probably still understand what's being said based on the context, but you'd need to use more mental effort. -->

Here is where the "stretching" part comes in. It's about how we're able to "look ahead" and slow down shortly before a loud part. Basically it involves slightly (~200 ms) delaying the audio before outputting it (and that is for a purpose!).
Imagine that we're currently playing a silent part, so the playback rate is higher. Now, when we encounter a loud part, we go "aha! That might be a word, and it might start with 'th'".
<!-- , which we might not have marked as loud, because 'th' is not that loud -->

As said above, we always delay (buffer) the audio for ~200 ms before outputting it. So we know that these 200 ms of buffered audio must contain that "th" sound, and we want the user to hear it. But remember: at the time we recorded that sound, the video was playing at a high speed, yet we want to play the "th" back at normal speed. So we can't just output it as is. What do we do?
What we do is we take that buffered (delayed) audio, and we slow it down (stretch and pitch-shift it) so that it appears to have been played at normal speed! Only then do we pass it to the system (which then passes it to your speakers).
And that, kids, is why we call it "the stretching algorithm".
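As a back-of-the-envelope illustration of the arithmetic involved (a sketch, not the real implementation): audio buffered while the element played at some elevated rate represents more media time than the real time it took to capture, so its duration must be stretched by the ratio of the two rates.

```javascript
// Sketch (not the real implementation): audio captured while the element
// played at `recordedAtRate` must have its duration stretched by
// recordedAtRate / targetRate to sound as if it was played at `targetRate`.
function stretchFactor(recordedAtRate, targetRate = 1) {
  return recordedAtRate / targetRate;
}

function stretchedDurationMs(bufferedMs, recordedAtRate, targetRate = 1) {
  return bufferedMs * stretchFactor(recordedAtRate, targetRate);
}

// E.g. 200 ms of audio buffered at 2.5x speed covers 500 ms of media time,
// so it is stretched (and pitch-shifted) to 500 ms before being output.
```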
For more details, you can check out the comments in its source code.
The second algorithm is "the cloning algorithm", and it's here. It creates a hidden clone of the target media element and plays it ahead of the original element, looking for silent parts and writing down where they are. When the target element reaches a silent part, we increase its `playbackRate`, or skip (seek past) the silent part entirely.
Currently you can enable this algorithm by checking the "Use the experimental algorithm" checkbox.
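The bookkeeping this implies can be sketched as follows. This is not the extension's code, and the interval format is an assumption for illustration.

```javascript
// Sketch (not the extension's code): the look-ahead clone records silent
// intervals; when the original element's currentTime falls inside one,
// return the interval's end as the seek target.
function findSilenceSkipTarget(silentIntervals, currentTime) {
  // silentIntervals: array of { start, end } in seconds, sorted by start.
  for (const { start, end } of silentIntervals) {
    if (currentTime >= start && currentTime < end) {
      return end;
    }
    if (start > currentTime) break; // sorted, so no later interval can match
  }
  return null; // not inside a silent part; keep normal playback
}

// Usage sketch:
// const target = findSilenceSkipTarget(intervals, video.currentTime);
// if (target !== null) video.currentTime = target; // seek past the silence
```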
We look for video elements by injecting a script in all pages and simply calling `document.getElementsByTagName('video')`. But new video elements could get inserted after the page has already loaded, so we also watch for new elements with a `MutationObserver`.
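A minimal sketch of that pattern (an assumed shape, not the extension's actual code):

```javascript
// Sketch (not the extension's actual code) of watching a page for <video>
// elements, including ones inserted after the page has loaded.
function watchForVideoElements(onNewElement) {
  // Elements already on the page at call time.
  for (const el of document.getElementsByTagName('video')) {
    onNewElement(el);
  }
  // Elements inserted later.
  const observer = new MutationObserver((mutations) => {
    for (const mutation of mutations) {
      for (const node of mutation.addedNodes) {
        if (!(node instanceof Element)) continue;
        if (node.tagName === 'VIDEO') {
          onNewElement(node);
        } else {
          // The inserted subtree may itself contain videos.
          for (const el of node.getElementsByTagName('video')) {
            onNewElement(el);
          }
        }
      }
    }
  });
  observer.observe(document.documentElement, { childList: true, subtree: true });
  return observer; // call observer.disconnect() to stop watching
}
```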
If below you see a block of text instead of a chart, go here.
graph TD
%% TODO add links https://mermaid.js.org/syntax/flowchart.html#interaction
watchAllElements["watchAllElements
looks for media elements
on the page"]
click watchAllElements "https://github.com/WofWca/jumpcutter/blob/890a2b25948f39f1553cb9afb06c4cc10c9d2a19/src/entry-points/content/watchAllElements.ts"
AllMediaElementsController["AllMediaElementsController
the orchestrator,"]
click AllMediaElementsController "https://github.com/WofWca/jumpcutter/blob/44fadb1982fbe7dd20c64741ae9e754ba9261042/src/entry-points/content/AllMediaElementsController.ts"
watchAllElements -->|"onNewMediaElements(...elements)"| AllMediaElementsController
AllMediaElementsController -->|original HTMLMediaElement| chooseController{choose
appropriate
controller}
chooseController -->|original HTMLMediaElement| ElementPlaybackControllerCloning & ElementPlaybackControllerStretching
%% ElementPlaybackControllerCloning
%% subgraph "ElementPlaybackControllerCloning"
ElementPlaybackControllerCloning["ElementPlaybackControllerCloning
controls playbackRate
of the original
