highlight-pipeline-classes.org 8.3 KiB

* Class Overview
** Prepipeline Class
Placeholder, implied from [[*Post-Pipeline Actions]].
The user could perform any setup needed here, eg mounting source file media.
** InputFiles
Approach: described elsewhere; collect files and, if relevant, map them to options (which feature extractor, duration etc).
Options: per-class options (eg ~config_file_path~, to be used with an ~InputFiles~ class which takes user input from a config file)
*** Class Sketch
#+begin_src plantuml :results output :file inputfiles.svg
scale 1000 height
InputFiles -- InputFilesArgs
InputFiles -- InputFilesJSON
InputFiles -- InputFilesYAML
abstract class InputFiles {
  InputFilesOptions options
  {abstract} get_files(*args, **kwargs)
}
note right of InputFiles::get_files
  returns JSON
end note
class InputFilesArgs {}
class InputFilesJSON {}
class InputFilesYAML {}
#+end_src
#+RESULTS:
[[file:inputfiles.svg]]
InputFiles
  + InputFilesOptions
    (eg config_file_path for eg InputFilesJSON or InputFilesYAML)
  get_files(**kwargs): JSON
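
A minimal Python sketch of the ~InputFiles~ family might look like the following. The class names and ~config_file_path~ come from the notes above; everything else is an assumption made for illustration, in particular the ~{"files": [...]}~ JSON layout and the idea that ~InputFilesArgs~ takes paths as positional arguments.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputFilesOptions:
    # Per-class options; config_file_path is the example named in the notes.
    config_file_path: Optional[str] = None


class InputFiles(ABC):
    def __init__(self, options: Optional[InputFilesOptions] = None):
        self.options = options or InputFilesOptions()

    @abstractmethod
    def get_files(self, *args, **kwargs) -> str:
        """Return a JSON string describing the collected files."""


class InputFilesArgs(InputFiles):
    # Assumption: file paths are handed over directly as positional arguments.
    def get_files(self, *args, **kwargs) -> str:
        return json.dumps({"files": [{"path": str(p)} for p in args]})


class InputFilesJSON(InputFiles):
    # Assumption: the config file already holds an object shaped like
    # {"files": [...]}; it is parsed and re-serialised so that bad JSON
    # fails here rather than later in the pipeline.
    def get_files(self, *args, **kwargs) -> str:
        if not self.options.config_file_path:
            raise ValueError("config_file_path is required")
        with open(self.options.config_file_path, encoding="utf-8") as fh:
            return json.dumps(json.load(fh))
```

Usage would be along the lines of ~InputFilesArgs().get_files("a.mp4", "b.mp4")~; an ~InputFilesYAML~ would follow the same shape with a YAML parser of choice.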
** FeatureExtractor Classes
Basic approach is the usual:
- setup / prepare / pre-run (eg create an audio file if the tool does not like AV media)
- work / run (find the features)
- teardown / cleanup / post-run
The /work / run/ phase will need to either capture stdout or read files created by the tool to collect timestamps/intervals. [implementation detail for each FeatureExtractor]
Options (with suggested defaults):
- working directory (~/tmp/highlightgen/~)
- cleanup temporary files (~True~)
- log to stdout / file / none (~None~)
- padding for minimum feature duration (ie lengthen any short features to this duration, ~5s~)
- trimming for maximum feature duration (ie chop the end[s?] off anything longer than this, ~-1s~)
- reject/drop features shorter than (~0.1s~)
- reject/drop features longer than (~0.1s~)
- reject/drop features with a lower `score' than (~-1~)
Notes on options:
- Every option should be optional, since each has a default
- Not all options will apply to all FeatureExtractors (eg some may not produce a `score' or equivalent)
- Options that are not relevant to a given FeatureExtractor can be specified but will be ignored (consider emitting a ~WARN~ loglevel)
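
As a rough illustration of how the duration and score options could act on extracted features, here is a sketch. Field names follow the ~FeatureExtractorOptions~ class sketch in this section. The rest is assumption: a feature is a ~(start, end, score)~ tuple, a negative limit means "disabled", rejection happens on the raw duration before padding, and both padding and trimming move only the end. ~reject_longer_than~ is set to ~-1~ (disabled) here purely so the example keeps some features; the list above gives ~0.1s~.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed representation of one feature: (start, end, score-or-None), in seconds.
Feature = Tuple[float, float, Optional[float]]


@dataclass
class FeatureExtractorOptions:
    working_directory: str = "/tmp/highlightgen/"
    do_cleanup: bool = True
    log_level: Optional[int] = None
    minimum_feature_padding: float = 5.0
    maximum_feature_trimming: float = -1.0  # negative: no trimming (assumption)
    reject_shorter_than: float = 0.1
    reject_longer_than: float = -1.0  # assumption: disabled; the notes list 0.1s
    reject_scoring_less_than: float = -1.0


def apply_options(features: List[Feature], opts: FeatureExtractorOptions) -> List[Feature]:
    kept: List[Feature] = []
    for start, end, score in features:
        duration = end - start
        # Extractors without a score pass None, so the score filter is skipped.
        if score is not None and score < opts.reject_scoring_less_than:
            continue
        if duration < opts.reject_shorter_than:
            continue
        if 0 <= opts.reject_longer_than < duration:
            continue
        # Pad short features by extending the end (assumption: end only).
        if duration < opts.minimum_feature_padding:
            end = start + opts.minimum_feature_padding
        # Trim long features by cutting the end (assumption: end only).
        if 0 <= opts.maximum_feature_trimming < end - start:
            end = start + opts.maximum_feature_trimming
        kept.append((start, end, score))
    return kept
```

With the defaults, a 2s feature is padded to 5s, a 0.05s blip is dropped, and a feature without a score passes through untouched.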
*** Class Sketch
#+begin_src plantuml :results output :file featureextractor.svg
scale 1000 height
abstract class FeatureExtractor {
  {field} FeatureExtractorOptions options
  {field} Logger logger
  {abstract} setup()
  {abstract} run()
  {abstract} teardown()
}
'@dataclass
struct FeatureExtractorOptions {
  + working_directory : str
  + do_cleanup : boolean
  + log_level : int / enum
  + minimum_feature_padding : float
  + maximum_feature_trimming : float
  + reject_shorter_than : float
  + reject_longer_than : float
  + reject_scoring_less_than : float
}
FeatureExtractor::options <|-- FeatureExtractorOptions
#+end_src
#+RESULTS:
[[file:featureextractor.svg]]
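
The setup / run / teardown lifecycle could be sketched in Python roughly as below. The three abstract methods come from the class sketch; ~extract()~ and ~StaticExtractor~ are hypothetical additions for illustration, and the standard library's ~logging.Logger~ stands in for the ~Logger~ class described under Additional Classes.

```python
import logging
from abc import ABC, abstractmethod
from typing import Any, List, Optional, Tuple


class FeatureExtractor(ABC):
    def __init__(self, options: Any = None, logger: Optional[logging.Logger] = None):
        self.options = options
        # Stand-in for the document's own Logger class (assumption).
        self.logger = logger or logging.getLogger(__name__)

    @abstractmethod
    def setup(self) -> None:
        """Prepare, eg create an audio file if the tool dislikes AV media."""

    @abstractmethod
    def run(self) -> List[Tuple[float, float]]:
        """Find the features and return them as (start, end) intervals."""

    @abstractmethod
    def teardown(self) -> None:
        """Clean up temporary files."""

    def extract(self) -> List[Tuple[float, float]]:
        # Hypothetical convenience method: teardown runs even if setup or
        # run raises, so temporary files are not left behind.
        try:
            self.setup()
            return self.run()
        finally:
            self.teardown()


class StaticExtractor(FeatureExtractor):
    """Hypothetical extractor returning canned intervals; handy for tests."""

    def __init__(self, intervals, **kwargs):
        super().__init__(**kwargs)
        self.intervals = list(intervals)
        self.calls: List[str] = []

    def setup(self) -> None:
        self.calls.append("setup")

    def run(self) -> List[Tuple[float, float]]:
        self.calls.append("run")
        return list(self.intervals)

    def teardown(self) -> None:
        self.calls.append("teardown")
```

A real extractor would replace ~run()~ with a call to the external tool and parse its stdout or output files.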
** Consolidator
/(tl;dr: clustering? aggregation?)/
Basic approach: any time intervals produced earlier in the pipeline which overlap, or are within some specified /delta/ of each other, should be combined into one interval.
Overlap example:
~(10 - 15, 13 - 20) → (10-20)~
Delta = 5 example:
~(10 - 15, 18 - 25) → (10-25)~
Non-overlap, non-within-delta=5 example:
~(10 - 15, 21 - 30) → (10-15, 21-30)~
This is essentially a reduction or transform on 1D data (time). It might make sense to consider the two approaches (overlap, overlap after delta) separately.
Since any consolidation (or whatever term) potentially makes an inaccurate or unwarranted value/content decision†, there should be an option to skip this stage entirely (or effectively skip it, via a ~Consolidator~ class which copies input to output unchanged).
I am not sure if a ~Consolidator~ of any strategy should be permitted to output zero items / null. Similarly, I am not sure if trying to apply a ~Consolidator~ to a zero-sized / null set is well-defined.
Would using some kind of set theory definition be useful, or just a distraction?
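
The merge rule above can be sketched as a sort-and-sweep over the intervals. This is one possible reading, not a settled design: intervals are plain ~(start, end)~ tuples, intervals that merely touch count as overlapping, and an empty input simply yields an empty output (which sidesteps, rather than answers, the open question about null sets). ~consolidate~ and ~passthrough~ are hypothetical names.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # (start, end) in seconds


def consolidate(intervals: List[Span], delta: float = 0.0) -> List[Span]:
    """Merge intervals that overlap or lie within `delta` of each other."""
    merged: List[Span] = []
    for start, end in sorted(intervals):
        # A gap <= delta (negative gap means overlap) joins the previous span.
        if merged and start - merged[-1][1] <= delta:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def passthrough(intervals: List[Span]) -> List[Span]:
    """The 'skip this stage' strategy: copy input to output unchanged."""
    return list(intervals)


print(consolidate([(10, 15), (13, 20)]))           # [(10, 20)]
print(consolidate([(10, 15), (18, 25)], delta=5))  # [(10, 25)]
print(consolidate([(10, 15), (21, 30)], delta=5))  # [(10, 15), (21, 30)]
```

Setting ~delta=0~ gives the pure-overlap strategy, so the two approaches can share one implementation or be split into separate ~Consolidator~ subclasses as preferred.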
*** Class sketch
#+begin_src plantuml :results output :file consolidator.svg
scale 1000 height
abstract class Consolidator {
  ConsolidatorOptions options
  {method} run()
}
#+end_src
#+RESULTS:
[[file:consolidator.svg]]
~Consolidator~:
- ConsolidatorOptions
  + ConsolidatorSpecificOptions
    - (eg delta)
run()
** Other Operators
For example:
- ~Join~ (combine/group/associate time intervals -- ie produce one highlight video)
*Note*: this and the next step need some thinking as to how the output would 'look' when passed to VideoProducer. I had originally envisioned temporary files being written by intermediate stages, but I then hoped to avoid this and only `produce' a video at the last possible moment. This last part is notionally possible but may be introducing unwarranted complexity.
** VideoProducer
Approach: take the definitions above and reify / actualise them: translate something along the lines of "take video /foo/bar.mp4, take segments A, B, C... and join them to produce a video file", expressed as a representation/serialised class object/DSL definition.
On an implementation level, translate what we have into calls to a program or API, eg ffmpeg, MLT, libavuser (etc).
Consideration: if video files (however temporary) can be produced earlier in the pipeline, there should perhaps be a ~VideoProducer~ that applies a 'nothing' definition -- that is, effectively it simply copies a (temporary) video to an output video (permanent).
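
To make the "translate to a program call" step concrete, here is a sketch that only builds ffmpeg argument lists and runs nothing. It assumes ffmpeg as the backend and a cut-then-concatenate strategy; the function names and file names are hypothetical. The flags used (~-ss~, ~-t~, ~-i~, ~-c copy~, and the concat demuxer via ~-f concat -safe 0~) are standard ffmpeg options, but stream copy cuts on keyframes, so segment boundaries may be imprecise without re-encoding.

```python
from typing import List, Sequence


def segment_command(source: str, start: float, end: float, dest: str) -> List[str]:
    """Build an ffmpeg call that copies [start, end) of `source` into `dest`."""
    return [
        "ffmpeg",
        "-ss", str(start),       # seek to the segment start (input option)
        "-t", str(end - start),  # read only this many seconds
        "-i", source,
        "-c", "copy",            # stream copy: fast, but keyframe-aligned
        dest,
    ]


def concat_list(segment_paths: Sequence[str]) -> str:
    """Contents of the list file read by ffmpeg's concat demuxer.

    Assumption: paths contain no single quotes (they would need escaping).
    """
    return "".join(f"file '{path}'\n" for path in segment_paths)


def concat_command(list_path: str, dest: str) -> List[str]:
    """Build an ffmpeg call that joins the segments named in `list_path`."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path, "-c", "copy", dest]
```

The returned lists can be passed to ~subprocess.run(cmd, check=True)~ at the last possible moment, which fits the wish above to avoid producing video until the final stage. The temporary segment files are still needed with this particular strategy.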
** Post-Pipeline Actions
Placeholder, not sure of any yet (maybe show log or info? something user-friendly but technically optional?)
** Additional Classes
*** Logging
Setup on init, eg ~FileLogger(dest="/path/to/file.log")~, or via a ~.setup()~ method?
Used by classes via eg dependency injection (D-I).
Sketch:
#+BEGIN_SRC plantuml :results output :file /tmp/testuml.png
'!theme spacelab
scale 1000 height
Logger <|-- FileLogger
'note "throws LoggingError" as LE
abstract class Logger {
  {abstract} void log()
}
note right of Logger::log()
  throws LoggingError
end note
class FileLogger {
  -_dest : String
}
#+END_SRC
#+RESULTS:
[[file:/tmp/testuml.png]]
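
In Python the logging sketch might come out as follows, taking the "setup on init" option. ~Logger~, ~FileLogger~, ~_dest~ and ~LoggingError~ come from the sketch above; ~ListLogger~ and ~Stage~ are hypothetical, added only to show a logger being injected into a pipeline class.

```python
from abc import ABC, abstractmethod
from typing import List


class LoggingError(Exception):
    """Raised when a message cannot be logged."""


class Logger(ABC):
    @abstractmethod
    def log(self, message: str) -> None:
        """Record a message; may raise LoggingError."""


class FileLogger(Logger):
    def __init__(self, dest: str):
        self._dest = dest

    def log(self, message: str) -> None:
        try:
            with open(self._dest, "a", encoding="utf-8") as fh:
                fh.write(message + "\n")
        except OSError as exc:
            # Wrap I/O failures so callers only need to know LoggingError.
            raise LoggingError(str(exc)) from exc


class ListLogger(Logger):
    """Hypothetical in-memory logger, useful in tests."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def log(self, message: str) -> None:
        self.messages.append(message)


class Stage:
    """Hypothetical pipeline class that receives its logger by injection."""

    def __init__(self, logger: Logger):
        self.logger = logger

    def run(self) -> None:
        self.logger.log("stage ran")
```

Because ~Stage~ depends only on the abstract ~Logger~, swapping ~FileLogger(dest="/path/to/file.log")~ for ~ListLogger()~ needs no change to the stage itself.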
*** Interval
Convenience class for highlights, around some data like:
#+begin_src json
{
  "file": "/path/to/video",
  "start": 10,
  "end": 15,
  "duration": 5,
  "highlight_type": "laugh",
  "score": 0.8
}
#+end_src
Advantages include:
- can set start and duration or end
- makes it clearer what is being passed around
Disadvantages:
- class proliferation?
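
A dataclass is one lightweight way to get the "start and duration or end" convenience. The field names mirror the JSON above; the validation rules (one of ~end~ / ~duration~ is required, both must agree if given, and ~end~ may not precede ~start~) are assumptions.

```python
import math
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Interval:
    file: str
    start: float
    end: Optional[float] = None
    duration: Optional[float] = None
    highlight_type: Optional[str] = None
    score: Optional[float] = None

    def __post_init__(self) -> None:
        if self.end is None and self.duration is None:
            raise ValueError("give either end or duration")
        if self.end is None:
            # Only duration was given: derive the end.
            self.end = self.start + self.duration
        elif self.duration is None:
            # Only end was given: derive the duration.
            self.duration = self.end - self.start
        elif not math.isclose(self.start + self.duration, self.end):
            raise ValueError("end and duration disagree")
        if self.end < self.start:
            raise ValueError("end precedes start")


laugh = Interval("/path/to/video", start=10, duration=5,
                 highlight_type="laugh", score=0.8)
print(asdict(laugh)["end"])  # 15
```

~dataclasses.asdict~ gives back a plain dict in the same shape as the JSON, so the class adds little overhead where raw data is still wanted.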
** Additional Considerations
Would it be desirable to add custom/user pre/post steps for each part of the pipeline?
Pros: lots of flexibility
Cons: complexity for questionable practical benefit (WLtH)
** Overview / Recap
#+begin_src plantuml :results output :file pipeline-overview.svg
scale 1000 height
title Video Highlight Generation Pipeline
allowmixing
abstract class Logger {
  {abstract} log()
}
actor User
User -> PrePipelineAction
PrePipelineAction -> InputFiles
InputFiles -> FeatureExtractor
FeatureExtractor -> Consolidator
Consolidator -> Operators
Consolidator -> VideoProducer
Operators -> VideoProducer
VideoProducer -> PostPipelineAction
VideoProducer -> User : <i>Output video(s)</i>
PostPipelineAction --> User : <i>Output video(s)</i>
abstract class PrePipelineAction {}
abstract class InputFiles {}
abstract class FeatureExtractor {}
#+end_src
#+RESULTS:
[[file:pipeline-overview.svg]]
#+begin_src plantuml :results output :file highlight-pipeline2.svg
scale 1000 height
!theme cerulean
actor User
action PrePipelineAction
process InputFiles
file Video as V1
file Video as V2
file Video as VN
process FeatureExtractor
collections Features
process Consolidator
collections "Consolidated Features" as ConsolidatedFeatures
process Operators
process VideoProducer
file Highlight as H1
file Highlight as H2
file Highlight as H3
process PostPipelineAction
User -> PrePipelineAction
PrePipelineAction -> InputFiles
InputFiles <.. V1
InputFiles <.. V2
InputFiles <.. VN
InputFiles -> FeatureExtractor
FeatureExtractor .. Features
FeatureExtractor -> Consolidator
Consolidator .. ConsolidatedFeatures
Consolidator -> Operators
Consolidator -> VideoProducer
Operators -> VideoProducer
Operators .. ConsolidatedFeatures
VideoProducer -> PostPipelineAction
VideoProducer ..> H1
VideoProducer ..> H2
VideoProducer ..> H3
#+end_src
#+RESULTS:
[[file:highlight-pipeline2.svg]]