|
- * Class Overview
-
- ** Prepipeline Class
-
- Placeholder, implied from [[*Post-Pipeline Actions]].
-
- The user could perform any setup needed here, eg mounting the source file media.
-
- ** InputFiles
-
- Approach: described elsewhere; collect files and, if relevant, map them to options (which feature extractor, duration, etc).
-
- Options: per-class options (eg ~config_file_path~ to be used with an ~InputFiles~-class which takes user input from a config file)
-
- *** Class Sketch
-
- #+begin_src plantuml :results output :file inputfiles.svg
- scale 1000 height
-
- InputFiles <|-- InputFilesArgs
- InputFiles <|-- InputFilesJSON
- InputFiles <|-- InputFilesYAML
-
- abstract class InputFiles {
- InputFilesOptions options
-
- {abstract} get_files(*args, **kwargs)
-
- }
-
- note right of InputFiles::get_files
- returns JSON
- end note
-
- class InputFilesArgs {}
-
- class InputFilesJSON {}
-
- class InputFilesYAML {}
- #+end_src
-
- #+RESULTS:
- [[file:inputfiles.svg]]
-
- InputFiles
- + InputFilesOptions
- (eg config_file_path for eg InputFilesJSON or InputFilesYAML)
-
- get_files(**kwargs): JSON
-
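A minimal Python sketch of the class sketch above, under stated assumptions. The ~config_file_path~ name comes from the options note, but ~InputFilesOptions~ as a dataclass, the constructor signature, and the shape of the returned JSON (a list of ~{"file": ...}~ objects) are illustrative guesses rather than a settled design.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputFilesOptions:
    # Assumption: only meaningful for config-file-backed subclasses.
    config_file_path: Optional[str] = None


class InputFiles(ABC):
    def __init__(self, options: Optional[InputFilesOptions] = None):
        self.options = options or InputFilesOptions()

    @abstractmethod
    def get_files(self, *args, **kwargs) -> str:
        """Return a JSON string describing the files to process."""


class InputFilesArgs(InputFiles):
    """Takes file paths directly as positional arguments."""

    def get_files(self, *args, **kwargs) -> str:
        # Hypothetical output shape: one object per file.
        return json.dumps([{"file": path} for path in args])


class InputFilesJSON(InputFiles):
    """Reads the file list from the JSON file named in the options."""

    def get_files(self, *args, **kwargs) -> str:
        if self.options.config_file_path is None:
            raise ValueError("config_file_path is required")
        with open(self.options.config_file_path, encoding="utf-8") as fh:
            # Round-trip through json to validate the file contents.
            return json.dumps(json.load(fh))
```

For example, ~InputFilesArgs().get_files("/a.mp4", "/b.mp4")~ would return a JSON list with one entry per path. An ~InputFilesYAML~ would follow the same pattern with a YAML parser.
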
- ** FeatureExtractor Classes
-
- Basic approach is the usual:
-
- - setup / prepare / pre-run (eg create an audio file if the tool does not like AV media)
- - work / run (find the features)
- - teardown / cleanup / post-run
-
- The /work / run/ phase will need to either capture stdout or read files created by the tool to collect timestamps/intervals. [implementation detail for each FeatureExtractor]
-
- Options (with suggested defaults):
-
- - working directory (~/tmp/highlightgen/~)
- - cleanup temporary files (~True~)
- - log to stdout / file / none (~None~)
- - padding for minimum feature duration (ie lengthen any short features to this duration, ~5s~)
- - trimming for maximum feature duration (ie chop the end[s?] off anything longer than this, ~-1s~)
- - reject/drop features shorter than (~0.1s~)
- - reject/drop features longer than (~0.1s~)
- - reject/drop features with a lower `score' than (~-1~)
-
- Notes on options:
-
- - All options should be optional due to defaults
- - Not all options will apply to all FeatureExtractors (eg some may not produce a `score' or equivalent)
- - Options that are not relevant to a given FeatureExtractor can be specified but will be ignored (consider emitting a ~WARN~ loglevel)
-
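One way the duration and score options might be applied to extracted features, as a hedged Python sketch. The field names follow the class sketch below; everything else is an assumption made for illustration: a negative value means "disabled", ~reject_longer_than~ defaults to disabled here, rejection is tested on the original duration before padding or trimming, and both padding and trimming move only the end of the feature.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (start, end, score) in seconds; score is None for extractors without one.
Feature = Tuple[float, float, Optional[float]]


@dataclass
class FeatureExtractorOptions:
    working_directory: str = "/tmp/highlightgen/"
    do_cleanup: bool = True
    log_level: Optional[int] = None
    minimum_feature_padding: float = 5.0
    maximum_feature_trimming: float = -1.0  # assumption: negative = disabled
    reject_shorter_than: float = 0.1
    reject_longer_than: float = -1.0        # assumption: disabled by default
    reject_scoring_less_than: float = -1.0


def apply_options(features: List[Feature],
                  opts: FeatureExtractorOptions) -> List[Feature]:
    kept = []
    for start, end, score in features:
        duration = end - start
        # Rejection rules are checked against the original duration.
        if duration < opts.reject_shorter_than:
            continue
        if 0 <= opts.reject_longer_than < duration:
            continue
        if score is not None and score < opts.reject_scoring_less_than:
            continue
        # Pad short features, then trim long ones (end only).
        if duration < opts.minimum_feature_padding:
            end = start + opts.minimum_feature_padding
        if 0 <= opts.maximum_feature_trimming < end - start:
            end = start + opts.maximum_feature_trimming
        kept.append((start, end, score))
    return kept
```

With the defaults, a 2s feature is lengthened to 5s, a 0.05s feature is dropped, and a feature without a score skips the score check entirely.
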
- *** Class Sketch
-
- #+begin_src plantuml :results output :file featureextractor.svg
- scale 1000 height
-
- abstract class FeatureExtractor {
- {field} FeatureExtractorOptions options
- {field} Logger logger
-
- {abstract} setup()
-
- {abstract} run()
-
- {abstract} teardown()
- }
-
- '@dataclass
- struct FeatureExtractorOptions {
- + working_directory : str
- + do_cleanup : boolean
- + log_level : int / enum
- + minimum_feature_padding : float
- + maximum_feature_trimming : float
- + reject_shorter_than : float
- + reject_longer_than : float
- + reject_scoring_less_than : float
- }
-
-
- FeatureExtractor::options *-- FeatureExtractorOptions
- #+end_src
-
- #+RESULTS:
- [[file:featureextractor.svg]]
-
- ** Consolidator
-
- /(tl;dr: clustering? aggregation?)/
-
- Basic approach: any time intervals produced earlier in the pipeline which overlap, or which are within some specified /delta/ of each other, should be combined into one interval.
-
- Overlap example:
-
- ~(10 - 15, 13 - 20) → (10-20)~
-
- Delta = 5 example:
-
- ~(10 - 15, 18 - 25) → (10-25)~
-
- Non-overlap, non-within-delta=5 example:
-
- ~(10 - 15, 21 - 30) → (10-15, 21-30)~
-
- This is essentially a reduction or transform on 1D data (time). It might make sense to consider the two approaches (overlap, overlap after delta) separately.
-
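The three examples above can be reproduced with a standard sort-and-sweep merge. This is a sketch only: the function name ~consolidate~, the tuple representation, and treating ~delta=0~ as the plain overlap case are assumptions, not part of the design.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start, end) in seconds


def consolidate(intervals: List[Span], delta: float = 0.0) -> List[Span]:
    """Merge intervals that overlap or lie within `delta` of each other."""
    merged: List[Span] = []
    for start, end in sorted(intervals):
        # Compare the gap to the previously merged interval's end.
        if merged and start - merged[-1][1] <= delta:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


print(consolidate([(10, 15), (13, 20)]))           # overlap example
print(consolidate([(10, 15), (18, 25)], delta=5))  # within delta
print(consolidate([(10, 15), (21, 30)], delta=5))  # left unchanged
```

Note that with ~delta=0~ this sketch also merges intervals that merely touch (eg ~10-15~ and ~15-20~), and that an empty input simply yields an empty output.
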
- Since taking any action on consolidation (or whatever term) is potentially making an inaccurate or unwarranted value/content decision†, the option to skip this stage entirely (or effectively, in the form of a ~Consolidator~ class which replicates input to output unchanged) should be included.
-
- I am not sure if a ~Consolidator~ of any strategy should be permitted to output zero items / null. Similarly, I am not sure if trying to apply a ~Consolidator~ to any zero-sized / null set is well-defined.
-
- Would using some kind of set theory definition be useful, or just a distraction?
-
- *** Class sketch
-
- #+begin_src plantuml :results output :file consolidator.svg
- scale 1000 height
-
- abstract class Consolidator {
- ConsolidatorOptions options
- {method} run()
- }
- #+end_src
-
- #+RESULTS:
- [[file:consolidator.svg]]
-
- ~Consolidator~:
- - ConsolidatorOptions
- + ConsolidatorSpecificOptions
- - (eg delta)
-
- run()
-
- ** Other Operators
-
- For example:
-
- - ~Join~ (combine/group/associate time intervals -- ie produce one highlight video)
-
- *Note*: this and the next step need some thought as to how the output would 'look' when passed to VideoProducer. I had originally envisioned temporary files being written by intermediate stages, but I then hoped to avoid this and only 'produce' a video at the last possible moment. This last part is notionally possible but may introduce unwarranted complexity.
-
- ** VideoProducer
-
- Approach: take the definitions above and reify / actualise them: translate something along the lines of "take video ~/foo/bar.mp4~, take segments A, B, C... and join them to produce a video file", expressed as a representation / serialised class object / DSL definition.
-
- On an implementation level, translate what we have into a call out to a program or API, eg ffmpeg, MLT, libavuser (etc).
-
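As one hedged illustration of the "call out to a program" route, the sketch below turns a list of segments into ffmpeg argument lists: one stream-copy cut per segment (~-ss~ and ~-t~ as input options), then a join using ffmpeg's concat demuxer. The function name, the temporary file naming, and the choice of stream copy are all assumptions. Nothing is executed here, and this variant does rely on temporary part files.

```python
from typing import List, Tuple


def build_ffmpeg_commands(source: str,
                          segments: List[Tuple[float, float]],
                          output: str,
                          workdir: str = "/tmp/highlightgen"):
    """Return (commands, list_file, concat_listing) for cutting and joining."""
    commands: List[List[str]] = []
    parts: List[str] = []
    for index, (start, end) in enumerate(segments):
        part = f"{workdir}/part{index:03d}.mp4"  # hypothetical naming scheme
        parts.append(part)
        # Seek to `start`, read `end - start` seconds, copy streams as-is.
        commands.append(["ffmpeg", "-ss", str(start), "-t", str(end - start),
                         "-i", source, "-c", "copy", part])
    list_file = f"{workdir}/parts.txt"
    # The concat demuxer reads a text file with one "file '<path>'" per line.
    concat_listing = "".join(f"file '{part}'\n" for part in parts)
    commands.append(["ffmpeg", "-f", "concat", "-safe", "0",
                     "-i", list_file, "-c", "copy", output])
    return commands, list_file, concat_listing
```

Each command is a plain argument list suitable for ~subprocess.run~. The caller would write ~concat_listing~ to ~list_file~ before running the final command. Stream copy avoids re-encoding, but cuts then typically fall on keyframes rather than exact timestamps.
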
- Consideration: if video files (however temporary) can be produced earlier in the pipeline, there should perhaps be a ~VideoProducer~ that applies a 'nothing' definition -- that is, effectively it simply copies a (temporary) video to an output video (permanent).
-
- ** Post-Pipeline Actions
-
- Placeholder, not sure of any yet (maybe show log or info? something user-friendly but technically optional?)
-
- ** Additional Classes
- *** Logging
-
- Set up on init (eg ~FileLogger(dest="/path/to/file.log")~) or in a ~.setup()~ method?
-
- Used by other classes via, eg, dependency injection.
-
- Sketch:
-
- #+BEGIN_SRC plantuml :results output :file /tmp/testuml.png
- '!theme spacelab
- scale 1000 height
-
- Logger <|-- FileLogger
- 'note "throws LoggingError" as LE
-
-
- abstract class Logger {
- {abstract} void log()
- }
-
- note right of Logger::log()
- throws LoggingError
- end note
-
- class FileLogger {
- -_dest : String
- }
- #+END_SRC
-
- #+RESULTS:
- [[file:/tmp/testuml.png]]
-
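A small Python sketch of the logging arrangement, assuming configuration on init and constructor-based dependency injection. ~Logger~, ~FileLogger~, ~_dest~ and ~LoggingError~ come from the sketch above; ~MemoryLogger~ and ~ExampleStage~ are hypothetical names added only to show the injection.

```python
from abc import ABC, abstractmethod
from typing import List


class LoggingError(Exception):
    """Raised when a log message cannot be written."""


class Logger(ABC):
    @abstractmethod
    def log(self, message: str) -> None:
        """Record a message; may raise LoggingError."""


class FileLogger(Logger):
    def __init__(self, dest: str):
        self._dest = dest

    def log(self, message: str) -> None:
        try:
            with open(self._dest, "a", encoding="utf-8") as fh:
                fh.write(message + "\n")
        except OSError as exc:
            raise LoggingError(str(exc)) from exc


class MemoryLogger(Logger):
    """Hypothetical in-memory logger, handy for tests."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def log(self, message: str) -> None:
        self.messages.append(message)


class ExampleStage:
    """Hypothetical pipeline stage that receives its logger on init."""

    def __init__(self, logger: Logger):
        self.logger = logger

    def run(self) -> None:
        self.logger.log("run started")
```

Because the stage depends only on the abstract ~Logger~, swapping a ~FileLogger~ for another implementation needs no change to the stage itself.
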
-
- *** Interval
-
- Convenience class for highlights, around some data like:
-
- #+begin_src json
- {
- "file": "/path/to/video",
- "start": 10,
- "end": 15,
- "duration": 5,
- "highlight_type": "laugh",
- "score": 0.8
- }
- #+end_src
-
- Advantages include:
-
- - can set start and duration or end
- - makes it clearer what is being passed around
-
- Disadvantages:
-
- - class proliferation?
-
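A possible shape for such a class, as a hedged sketch: the field names mirror the JSON above, while the dataclass form, the optional fields, and the consistency check are assumptions. It shows the "start and duration or end" convenience by deriving whichever of the two is missing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    file: str
    start: float
    end: Optional[float] = None
    duration: Optional[float] = None
    highlight_type: Optional[str] = None
    score: Optional[float] = None

    def __post_init__(self) -> None:
        if self.end is None and self.duration is None:
            raise ValueError("set either end or duration")
        if self.end is None:
            # Derive end from start + duration.
            self.end = self.start + self.duration
        elif self.duration is None:
            # Derive duration from end - start.
            self.duration = self.end - self.start
        elif abs((self.end - self.start) - self.duration) > 1e-9:
            raise ValueError("end and duration disagree")
```

For instance, ~Interval("/path/to/video", 10, end=15)~ and ~Interval("/path/to/video", 10, duration=5)~ describe the same highlight.
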
- ** Additional Considerations
-
- Would it be desirable to add custom/user pre/post steps for each part of the pipeline?
-
- - Pros: lots of flexibility
- - Cons: complexity for questionable practical benefit (WLtH)
-
- ** Overview / Recap
-
- #+begin_src plantuml :results output :file pipeline-overview.svg
- scale 1000 height
- title Video Highlight Generation Pipeline
- allowmixing
-
- abstract class Logger {
- {abstract} log()
- }
-
- actor User
-
- User -> PrePipelineAction
- PrePipelineAction -> InputFiles
- InputFiles -> FeatureExtractor
- FeatureExtractor -> Consolidator
- Consolidator -> Operators
- Consolidator -> VideoProducer
- Operators -> VideoProducer
- VideoProducer -> PostPipelineAction
- VideoProducer -> User : <i>Output video(s)</i>
- PostPipelineAction --> User : <i>Output video(s)</i>
-
- abstract class PrePipelineAction {}
-
- abstract class InputFiles {}
-
- abstract class FeatureExtractor {}
- #+end_src
-
- #+RESULTS:
- [[file:pipeline-overview.svg]]
-
- #+begin_src plantuml :results output :file highlight-pipeline2.svg
- scale 1000 height
- !theme cerulean
-
- actor User
- action PrePipelineAction
- process InputFiles
- file Video as V1
- file Video as V2
- file Video as VN
- process FeatureExtractor
- collections Features
- process Consolidator
- collections "Consolidated Features" as ConsolidatedFeatures
- process Operators
- process VideoProducer
- file Highlight as H1
- file Highlight as H2
- file Highlight as H3
- process PostPipelineAction
-
-
- User -> PrePipelineAction
- PrePipelineAction -> InputFiles
- InputFiles <.. V1
- InputFiles <.. V2
- InputFiles <.. VN
-
- InputFiles -> FeatureExtractor
- FeatureExtractor .. Features
- FeatureExtractor -> Consolidator
- Consolidator .. ConsolidatedFeatures
- Consolidator -> Operators
- Consolidator -> VideoProducer
- Operators -> VideoProducer
- Operators .. ConsolidatedFeatures
- VideoProducer -> PostPipelineAction
- VideoProducer ..> H1
- VideoProducer ..> H2
- VideoProducer ..> H3
- #+end_src
-
- #+RESULTS:
- [[file:highlight-pipeline2.svg]]
|