  1. * Class Overview
  2. ** Prepipeline Class
  3. Placeholder, implied from [[*Post-Pipeline Actions]].
  4. The user could perform any setup needed here, eg mounting the source file media.
  5. ** InputFiles
  6. Approach: described elsewhere; collect files and, if relevant, map them to options (which feature extractor, duration etc).
  7. Options: per-class options (eg ~config_file_path~ to be used with an ~InputFiles~-class which takes user input from a config file)
  8. *** Class Sketch
  27. InputFiles
  28. + InputFilesOptions
  29. (eg config_file_path for eg InputFilesJSON or InputFilesYAML)
  30. get_files(**kwargs): JSON
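A minimal sketch of the class outline above, assuming Python. The names ~InputFilesJSON~ and ~InputFilesOptions~ come from the outline, but the config layout (a top-level ~"files"~ list) and the return type are assumptions made for illustration only.

#+begin_src python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List, Optional


@dataclass
class InputFilesOptions:
    # Per-class option; the default value is a placeholder, not a decided name.
    config_file_path: str = "config.json"


class InputFiles:
    """Base class: subclasses decide where the list of files comes from."""

    def __init__(self, options: Optional[InputFilesOptions] = None):
        self.options = options or InputFilesOptions()

    def get_files(self, **kwargs: Any) -> List[Dict[str, Any]]:
        raise NotImplementedError


class InputFilesJSON(InputFiles):
    """Reads a JSON config assumed to look like {"files": [{...}, ...]}."""

    def get_files(self, **kwargs: Any) -> List[Dict[str, Any]]:
        text = Path(self.options.config_file_path).read_text(encoding="utf-8")
        return json.loads(text).get("files", [])
#+end_src

An ~InputFilesYAML~ variant would differ only in the parsing call, which is the main argument for keeping ~get_files~ as the single entry point.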
  31. ** FeatureExtractor Classes
  32. Basic approach is the usual:
  33. - setup / prepare / pre-run (eg create an audio file if the tool does not like AV media)
  34. - work / run (find the features)
  35. - teardown / cleanup / post-run
  36. The /work / run/ phase will need to either capture stdout or read files created by the tool to collect timestamps/intervals. [implementation detail for each FeatureExtractor]
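The three phases could be sketched as a template method, so that teardown always runs even when the work phase fails. This is only an illustration: ~StdoutLinesExtractor~ and its "start end" line format are hypothetical stand-ins for whatever a real tool prints.

#+begin_src python
from typing import List, Tuple

TimeSpan = Tuple[float, float]  # (start, end) in seconds


class FeatureExtractor:
    """Template for the three phases; subclasses override what they need."""

    def setup(self) -> None:
        pass  # eg create an audio-only copy of the input

    def run(self) -> List[TimeSpan]:
        raise NotImplementedError

    def teardown(self) -> None:
        pass  # eg delete temporary files

    def extract(self) -> List[TimeSpan]:
        self.setup()
        try:
            return self.run()
        finally:
            self.teardown()  # runs even if run() raises


class StdoutLinesExtractor(FeatureExtractor):
    """Hypothetical extractor: parses captured 'start end' lines from a tool."""

    def __init__(self, captured_stdout: str):
        self.captured_stdout = captured_stdout
        self.cleaned_up = False

    def run(self) -> List[TimeSpan]:
        spans = []
        for line in self.captured_stdout.splitlines():
            parts = line.split()
            if len(parts) == 2:  # skip anything that is not a start/end pair
                spans.append((float(parts[0]), float(parts[1])))
        return spans

    def teardown(self) -> None:
        self.cleaned_up = True
#+end_src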
  37. Options (with suggested defaults):
  38. - working directory (~/tmp/highlightgen/~)
  39. - cleanup temporary files (~True~)
  40. - log to stdout / file / none (~None~)
  41. - padding for minimum feature duration (ie lengthen any short features to this duration, ~5s~)
  42. - trimming for maximum feature duration (ie chop the end[s?] off anything longer than this, ~-1s~)
  43. - reject/drop features shorter than (~0.1s~)
  44. - reject/drop features longer than (~0.1s~)
  45. - reject/drop features with a lower `score' than (~-1~)
  46. Notes on options:
  47. - All options should be optional due to defaults
  48. - Not all options will apply to all FeatureExtractors (eg some may not produce a `score' or equivalent)
  49. - Options that are not relevant to a given FeatureExtractor can be specified but will be ignored (consider emitting a ~WARN~ loglevel)
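One way the options and their defaults might look in code, as a sketch only. The field names are invented here, ~-1~ is assumed to mean "disabled", and the "drop longer than" default is set to ~-1~ in this sketch rather than the listed value. The order of operations (filter, then pad, then trim) is also an assumption.

#+begin_src python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Feature = Tuple[float, float, Optional[float]]  # (start, end, score or None)


@dataclass
class FeatureExtractorOptions:
    working_dir: str = "/tmp/highlightgen/"
    cleanup: bool = True
    log: Optional[str] = None        # "stdout", a file path, or None
    min_duration: float = 5.0        # pad shorter features up to this
    max_duration: float = -1.0       # trim longer features; -1 = no limit
    drop_shorter_than: float = 0.1
    drop_longer_than: float = -1.0   # assumption: -1 = disabled
    min_score: float = -1.0          # -1 accepts any non-negative score


def apply_options(features: List[Feature],
                  opts: FeatureExtractorOptions) -> List[Feature]:
    """Filter first, then pad, then trim (this ordering is an assumption)."""
    result = []
    for start, end, score in features:
        duration = end - start
        if duration < opts.drop_shorter_than:
            continue
        if opts.drop_longer_than >= 0 and duration > opts.drop_longer_than:
            continue
        # Extractors that produce no score skip the score filter entirely.
        if score is not None and score < opts.min_score:
            continue
        if duration < opts.min_duration:
            end = start + opts.min_duration   # pad by extending the end
        if opts.max_duration >= 0 and end - start > opts.max_duration:
            end = start + opts.max_duration   # trim by cutting the end
        result.append((start, end, score))
    return result
#+end_src

Because every field has a default, ~FeatureExtractorOptions()~ with no arguments is always valid, which matches the note that all options should be optional.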
  50. *** Class Sketch
  75. ** Consolidator
  76. /(tl;dr: clustering? aggregation?)/
  77. Basic approach: any time intervals produced earlier in the pipeline which overlap, or are within some specified /delta/ should be combined into one interval.
  78. Overlap example:
  79. ~(10 - 15, 13 - 20) → (10-20)~
  80. Delta = 5 example:
  81. ~(10 - 15, 18 - 25) → (10-25)~
  82. Non-overlap, non-within-delta=5 example:
  83. ~(10 - 15, 21 - 30) → (10-15, 21-30)~
  84. This is essentially a reduction or transform on 1D data (time). It might make sense to treat the two approaches (plain overlap, overlap after applying delta) separately.
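The three examples above can be reproduced with a standard sort-and-sweep merge. This is a sketch rather than a decided design: the function name ~consolidate~ and the tuple representation are assumptions, and with ~delta=0~ it reduces to the plain-overlap case (intervals that merely touch are also merged).

#+begin_src python
from typing import List, Tuple

TimeSpan = Tuple[float, float]  # (start, end)


def consolidate(intervals: List[TimeSpan], delta: float = 0.0) -> List[TimeSpan]:
    """Merge intervals that overlap or whose gap is at most delta."""
    merged: List[TimeSpan] = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= delta:
            # Close enough to the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


print(consolidate([(10, 15), (13, 20)]))           # overlap example
print(consolidate([(10, 15), (18, 25)], delta=5))  # within-delta example
print(consolidate([(10, 15), (21, 30)], delta=5))  # gap of 6 exceeds delta
#+end_src

In this sketch an empty input simply yields an empty output, which is one possible answer to the zero-sized question, not necessarily the right one.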
  85. Since taking any action on consolidation (or whatever term) is potentially making an inaccurate or unwarranted value/content decision†, the option to skip this stage entirely (or effectively, in the form of a ~Consolidator~ class which replicates input to output unchanged) should be included.
  86. I am not sure if a ~Consolidator~ of any strategy should be permitted to output zero items / null. Similarly, I am not sure if trying to apply a ~Consolidator~ to any zero-sized / null set is well-defined.
  87. Would using some kind of set theory definition be useful, or just a distraction?
  88. *** Class sketch
  98. ~Consolidator~:
  99. - ConsolidatorOptions
  100. + ConsolidatorSpecificOptions
  101. - (eg delta)
  102. run()
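A possible shape for this outline, including the "do nothing" strategy that lets the stage be skipped. All names other than ~Consolidator~, ~ConsolidatorOptions~, ~delta~ and ~run~ are hypothetical, and passing the intervals as an argument to ~run~ is an assumption (the outline shows no parameters).

#+begin_src python
from dataclasses import dataclass
from typing import List, Optional, Tuple

TimeSpan = Tuple[float, float]  # (start, end)


@dataclass
class ConsolidatorOptions:
    """Options common to every strategy (none assumed yet)."""


@dataclass
class DeltaOptions(ConsolidatorOptions):
    delta: float = 0.0  # strategy-specific option


class Consolidator:
    def __init__(self, options: Optional[ConsolidatorOptions] = None):
        self.options = options or ConsolidatorOptions()

    def run(self, intervals: List[TimeSpan]) -> List[TimeSpan]:
        raise NotImplementedError


class PassthroughConsolidator(Consolidator):
    """Replicates input to output unchanged, ie skips consolidation."""

    def run(self, intervals: List[TimeSpan]) -> List[TimeSpan]:
        return list(intervals)  # copy, so callers cannot alias the input
#+end_src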
  103. ** Other Operators
  104. For example:
  105. - ~Join~ (combine/group/associate time intervals -- ie produce one highlight video)
  106. *Note*: this and the next step need some thinking as to how the output would 'look' when passed to VideoProducer. I had originally envisioned temporary files being written by intermediate stages, but I then hoped to avoid this and only `produce' a video at the last possible moment. This last part is notionally possible but may introduce unwarranted complexity.
  107. ** VideoProducer
  108. Approach: take the definitions above and reify / actualise them: translate a definition along the lines of "take video /foo/bar.mp4, take segments A, B, C... and join them to produce a video file", expressed as a representation / serialised class object / DSL definition, into an actual video file.
  109. On an implementation level, translate what we have into calls out to a program or API, eg ffmpeg, MLT, libavuser (etc).
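As one illustration of calling out to a program, the sketch below only builds ffmpeg argument lists and does not run them. It assumes stream copy (~-c copy~) and ffmpeg's concat demuxer. The function names and paths are hypothetical, and cuts made with stream copy land on keyframes, so segment boundaries may be imprecise.

#+begin_src python
from typing import List


def cut_command(source: str, start: float, end: float, dest: str) -> List[str]:
    """Arguments that copy the span start..end of source into dest."""
    return ["ffmpeg", "-ss", str(start), "-t", str(end - start),
            "-i", source, "-c", "copy", dest]


def concat_list(segment_paths: List[str]) -> str:
    """Contents of the list file read by the concat demuxer.

    Paths containing single quotes would need escaping; not handled here.
    """
    return "".join(f"file '{path}'\n" for path in segment_paths)


def join_command(list_file: str, dest: str) -> List[str]:
    """Arguments that join the listed segments into one output file."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", dest]
#+end_src

The lists are suitable for ~subprocess.run(cmd, check=True)~. Keeping command construction separate from execution makes the translation step testable without ffmpeg installed.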
  110. Consideration: if video files (however temporary) can be produced earlier in the pipeline, there should perhaps be a ~VideoProducer~ that applies a 'nothing' definition -- that is, effectively it simply copies a (temporary) video to an output video (permanent).
  111. ** Post-Pipeline Actions
  112. Placeholder, not sure of any yet (maybe show log or info? something user-friendly but technically optional?)
  113. ** Additional Classes
  114. *** Logging
  115. Set up on init, eg ~FileLogger(dest="/path/to/file.log")~, or via a ~.setup()~ method?
  116. Used by other classes via eg dependency injection (DI).
  117. Sketch:
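A minimal sketch using the standard ~logging~ module, assuming the destination is decided once at setup and the resulting logger is injected into each pipeline class. ~make_logger~, the logger name ~highlightgen~, and ~LoggingStep~ are hypothetical names chosen for the example.

#+begin_src python
import logging
import sys
from typing import Optional


def make_logger(dest: Optional[str] = None) -> logging.Logger:
    """dest is None (no output), "stdout", or a path to a log file."""
    logger = logging.getLogger("highlightgen")
    logger.handlers.clear()      # avoid stacking handlers on repeated setup
    logger.propagate = False     # keep output away from the root logger
    if dest is None:
        logger.addHandler(logging.NullHandler())
    elif dest == "stdout":
        logger.addHandler(logging.StreamHandler(sys.stdout))
    else:
        logger.addHandler(logging.FileHandler(dest))
    logger.setLevel(logging.INFO)
    return logger


class LoggingStep:
    """Stand-in for any pipeline class that receives its logger via DI."""

    def __init__(self, logger: logging.Logger):
        self.logger = logger

    def run(self, intervals):
        self.logger.info("processing %d interval(s)", len(intervals))
        return intervals
#+end_src

Injecting the logger rather than creating it inside each class keeps the stdout / file / none choice in one place.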
  135. *** Interval
  136. Convenience class for highlights, around some data like:
  137. #+begin_src json
  138. {
  139. "file": "/path/to/video",
  140. "start": 10,
  141. "end": 15,
  142. "duration": 5,
  143. "highlight_type": "laugh",
  144. "score": 0.8
  145. }
  146. #+end_src
  147. Advantages include:
  148. - can set start and duration or end
  149. - makes it clearer what is being passed around
  150. Disadvantages:
  151. - class proliferation?
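The "set start and duration or end" convenience might look like the dataclass below. The field names follow the JSON above; the validation rules and the tolerance used to compare ~end~ with ~duration~ are assumptions.

#+begin_src python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interval:
    file: str
    start: float
    end: Optional[float] = None
    duration: Optional[float] = None
    highlight_type: Optional[str] = None
    score: Optional[float] = None

    def __post_init__(self) -> None:
        if self.end is None and self.duration is None:
            raise ValueError("give either end or duration")
        if self.end is None:
            self.end = self.start + self.duration       # derive end
        elif self.duration is None:
            self.duration = self.end - self.start       # derive duration
        elif abs((self.end - self.start) - self.duration) > 1e-9:
            raise ValueError("end and duration disagree")


a = Interval("/path/to/video", start=10, end=15, highlight_type="laugh", score=0.8)
b = Interval("/path/to/video", start=10, duration=5)
print(a.duration, b.end)
#+end_src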
  152. ** Additional Considerations
  153. Would it be desirable to add custom/user pre/post steps for each part of the pipeline?
  154. Pros: lots of flexibility
  155. Cons: complexity for ?practical benefit (WLtH)
  156. ** Overview / Recap
