The problem that I had overlooked until now (!!), because I only tested with
small exemplar videos, is that VideoActivityFeatureExtractor keeps a fraction
of small (0.5s) Features. This is not much of a problem for a short exemplar
video, but it ends up with a lot of jumpy Features in an actual source video.
Fix approach:
Like LoudnessFE, keep a specific number of Features and expand their duration
(which I think I will do for LoudnessFE too), dropping any that would be
consumed in that range.
Tries to drop the lowest-scoring Features until the target time (range) is
reached. This is not optimised and is a relatively naïve approach: there are
many inputs which would result in non-ideal pruning.
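A minimal sketch of the greedy pruning described above, not the actual implementation. The `Feature` dataclass and the `prune_to_target()` name and signature are hypothetical stand-ins for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    # Hypothetical stand-in for the project's Feature class.
    start: float  # seconds
    end: float    # seconds
    score: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def prune_to_target(features, target, margin=0.0):
    """Drop lowest-scoring Features until total duration <= target + margin."""
    kept = list(features)
    total = sum(f.duration for f in kept)
    # Walk the candidates from the lowest score upwards.
    for candidate in sorted(features, key=lambda f: f.score):
        if total <= target + margin:
            break
        kept.remove(candidate)
        total -= candidate.duration
    # Return the survivors in chronological order.
    return sorted(kept, key=lambda f: f.start)
```

One input that prunes badly: with Features of 10s (score 0.1), 5s (score 0.5) and 5s (score 0.9) and a 15s target, this drops the 10s Feature first and stops at 10s total, although dropping one 5s Feature would have hit the target exactly.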
TargetTimeAdjuster will adjust a list of Features until it is within an optional
margin of a target total duration.
Helper functions:
- _determine_margin() :: figure out the max and min cutoff times, considering
margin and margin strategy (percent / absolute)
- _features_total_time() :: basic sum of list of Features' durations
TODO: rename to TargetDurationAdjuster ? rename 'strategy' ??
Adjusters will be used to modify a list of Features. This could either be:
- to modify the overall set (eg to target a time)
- to modify individual Features
The most important Adjuster will be one that targets an overall time, eg:
"modify this list of Features such that their times add up to 1 minute (either ±
a % or a hard limit)"
@see: feature_extractors.py::FeatureExtractor
To check that we do not match any words that are not present in a media file
with speech, we use the English-language Harvard sentences for audio, compare
against the classic French pangram "Portez ce vieux whisky au juge blond qui
fume" [1], and ensure that no Features are produced.
[1]: "Take this old whisky to the blond judge who is smoking"
Found out that Whisper throws a hissy fit in the form of a RuntimeError if
there is no speech in the audio. We should consider catching this.
> RuntimeError: stack expects a non-empty TensorList
> stdout: "No active speech found in audio"
For the moment we can check that audio without speech throws an error and leave
this as a TODO.
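One possible shape for the eventual catch, as a sketch only. `transcribe` is a hypothetical callable standing in for whatever wraps Whisper here, and returning an empty list on no speech is an assumption. Only the error message is taken from the output quoted above.

```python
NO_SPEECH_MESSAGE = "stack expects a non-empty TensorList"


def transcribe_or_empty(transcribe, path):
    """Call transcribe(path), treating the 'no speech' RuntimeError as no result."""
    try:
        return transcribe(path)
    except RuntimeError as err:
        if NO_SPEECH_MESSAGE in str(err):
            # No active speech found in audio: nothing to match against.
            return []
        # Any other RuntimeError is a genuine failure, so let it propagate.
        raise
```

Matching on the message text is brittle if the wording ever changes, but it avoids swallowing unrelated RuntimeErrors.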