add project preplanning - overview of pt q app & video highlights

1 year ago · f9dd5f473a
--- a/project-preplanning.org
+++ b/project-preplanning.org
@@ -0,0 +1,89 @@
 * Patient Questionnaire App

 ** Elevator Pitch

 A simple app to let patients fill in relevant questionnaires -- eg DLQI & POEM for dermatology -- as an aid in monitoring their condition and facilitating discussion with clinicians.

 ** Prior Art

 Apps exist for both of the proposed questionnaires:

 - [[https://play.google.com/store/apps/details?id=uk.ac.cardiff.dlqi][DLQI app by Cardiff University]]
 - [[https://play.google.com/store/apps/details?id=my.eczema.tracker][My Eczema Tracker]] (MET)

 I have downloaded and tried out both apps.

 ** How This Project Might Proceed

 While apps exist for the two questionnaires mentioned in the `pitch', there is still scope to do work which will improve upon what exists:

 - combined app :: most simply, there are currently two apps; a single app for both questionnaires is surely preferable (/stretch goal/: define questionnaires in eg JSON, so the app can be extended dynamically without needing to be updated)
 - encryption :: even though the data is stored locally, it is still worthwhile to encrypt it at rest for privacy reasons
 - export‡ :: being able to export eg a PDF or similar would be a boon for sharing with clinicians (eg this could be printed off to be added to notes); other formats are possible too
 - patient notes† :: being able to take freetext notes, which could be associated with a questionnaire or be `freestanding', would aid memory for patients in consultations
 - graphing† :: since these questionnaires usually produce a score, this could be charted over time (handy for spotting patterns)
 - reminders† :: a periodic notification (eg weekly, bi-weekly, monthly) would help remind patients to track their symptoms
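
 As a rough illustration of the /stretch goal/ above, here is a minimal Python sketch of a questionnaire defined as JSON data and scored generically. The field names (=id=, =title=, =options=, =questions=) and the placeholder items are assumptions made up for this example; they are not the real DLQI or POEM wording, and the app itself would likely be written in whatever the chosen framework uses.

```python
import json

# Hypothetical questionnaire definition. The schema and the placeholder
# items are assumptions for illustration, not real DLQI or POEM content.
EXAMPLE_JSON = """
{
  "id": "example",
  "title": "Example questionnaire",
  "options": {"Not at all": 0, "A little": 1, "A lot": 2, "Very much": 3},
  "questions": [
    "Placeholder question 1",
    "Placeholder question 2",
    "Placeholder question 3"
  ]
}
"""


def load_questionnaire(text):
    """Parse a JSON definition and check it has the fields used below."""
    questionnaire = json.loads(text)
    for key in ("id", "title", "options", "questions"):
        if key not in questionnaire:
            raise ValueError(f"missing field: {key}")
    return questionnaire


def score(questionnaire, answers):
    """Sum the option scores, expecting exactly one answer per question."""
    if len(answers) != len(questionnaire["questions"]):
        raise ValueError("expected one answer per question")
    options = questionnaire["options"]
    return sum(options[answer] for answer in answers)


def max_score(questionnaire):
    """Highest total possible for this questionnaire."""
    best_option = max(questionnaire["options"].values())
    return best_option * len(questionnaire["questions"])


if __name__ == "__main__":
    q = load_questionnaire(EXAMPLE_JSON)
    total = score(q, ["Not at all", "A lot", "Very much"])
    print(f"{q['title']}: {total} / {max_score(q)}")
```

 Because the scoring code only reads the definition, adding a further questionnaire would in principle mean shipping a new JSON file rather than a new app release.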

 This project would seek to put together an app using a simple mobile framework (eg jQuery UI, Cordova, etc) that implements as many of the features above as is feasible. Usability feedback could be sought from i) general users ii) clinicians.

 †: The My Eczema Tracker app looks like it has these features from its screenshots

 ‡: My Eczema Tracker seems to also offer this: ``You can also download all your results to your device for you to review or share with your healthcare professional.'' but it is unclear what format this is (seems to be CSV)

 While MET already has some features I would like to implement for the other questionnaire, it does have a minor usability irk insofar as the user needs to scroll to hit `next' (tested with an old OnePlus 3).

 Rough outline:

 1. early phase: design key parts of the app (user story cards, MoSCoW, etc), investigate & decide which app framework to use
 2. mid phase: evolve & refine prototype & design usability tests, seek input
 3. late phase: user testing, demo & write-up

 * Automatic Video Game Footage Highlight Generator

 ** Elevator Pitch

 Quite often the full video is less interesting than the highlights -- this might be a funny moment, an intense moment, etc -- but scanning through footage for these is a time-consuming and boring task; could some simple heuristics do a reasonable job of finding parts of a full video to use for highlights?

 ** General Approach

 Where the video is quiet it is unlikely to be a highlight. Interesting bits might be:

 1. where the *audio level peaks* (eg someone speaking under stress, multiple people speaking, loud part of a game)
 2. where there is *laughter*, something funny probably happened or was said
 3. where there is lots of *motion* something interesting may be happening

 To find these points:

 1. there surely exist tools for absolute loudness detection (and perhaps perceptual loudness)
 2. laughter detection might be feasible by training or tuning a model
 3. motion could be detected by parts of the video where the bitrate increases (if VBR) or where encoding artifacts are more prominent (if CBR) -- not sure if the latter are detectable programmatically
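
 To make the loudness idea (point 1) concrete, here is a minimal pure-Python sketch: compute a frame-wise RMS level over mono samples, then group the frames above a threshold into candidate time ranges. The function names, the threshold and the =max_gap= merging rule are all assumptions for illustration; a real version would more likely read audio with a library and would need tuning against actual footage.

```python
import math


def frame_rms(samples, frame_length, hop_length):
    """Return the RMS level of each full frame of a mono sample sequence."""
    levels = []
    last_start = max(len(samples) - frame_length + 1, 1)
    for start in range(0, last_start, hop_length):
        frame = samples[start:start + frame_length]
        if not frame:
            break
        levels.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return levels


def loud_segments(levels, threshold, hop_seconds, max_gap=1):
    """Group frames at or above threshold into (start, end) times in seconds.

    Loud runs separated by at most max_gap quiet frames are merged, so a
    brief pause does not split one highlight into two.
    """
    segments = []
    start = None
    last_loud = None
    for i, level in enumerate(levels):
        if level < threshold:
            continue
        if start is None:
            start = i
        elif i - last_loud - 1 > max_gap:
            # The quiet gap was too long: close the previous segment.
            segments.append((start * hop_seconds, (last_loud + 1) * hop_seconds))
            start = i
        last_loud = i
    if start is not None:
        segments.append((start * hop_seconds, (last_loud + 1) * hop_seconds))
    return segments


if __name__ == "__main__":
    # Made-up per-second levels, purely to show the grouping behaviour.
    demo_levels = [0.1, 0.9, 0.8, 0.1, 0.9, 0.1, 0.1, 0.1, 0.7]
    print(loud_segments(demo_levels, threshold=0.5, hop_seconds=1.0))
```

 A fixed threshold is the simplest choice; a threshold relative to the video's overall level might cope better with quiet and loud recordings alike, but that is untested here.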

 ** Prior Art

 - laugh detector :: [[https://github.com/jrgillick/laughter-detection][jrgillick/laughter-detection]] -- from [fn:laughdetect]
 - loudness detection :: looks like python [[https://librosa.org/doc/main/index.html][librosa]] ([[https://librosa.org/doc/main/generated/librosa.feature.rms.html][RMS function]]) is [[https://stackoverflow.com/a/73255652][an option]]
 - detecting multiple speakers :: this is part of /speaker diarisation/, for which there are options (eg [[https://cmusphinx.github.io/wiki/speakerdiarization/][LIUM / CMUSphinx]], [[https://github.com/pyannote/pyannote-audio][pyannote-audio]] (NB needs HuggingFace token), etc)

 [fn:laughdetect] 2021 Jon Gillick, Wesley Deng, Kimiko Ryokai, and David Bamman, *``Robust Laughter Detection in Noisy Environments.''* INTERSPEECH [[https://www.isca-archive.org/interspeech_2021/gillick21_interspeech.pdf][(PDF link)]]

 ** Potential Pitfalls

 There are a few caveats:

 - I don't know how well the `laughter detection' model works with the sample data (ie my own video files)
 - I don't know the first thing about training a model or tuning one (I suspect I would need several thousands of samples to train, perhaps fewer to tune?)
 - I lack hardware for tuning (my GPU is ancient and doesn't have features of newer GPUs)
 - Working with video files can be quite slow in general

 ** How This Project Might Proceed

 There are two main `strands' to this:

 - the techniques for finding good highlights
 - using these to actually generate videos either automatically or semi-automatically

 So the approach could be:

 1. early phase: write a short script or two that uses ffmpeg to extract ROI from videos (trivial); find out if there are other pre-trained audio models which could be tuned; get laughter-detection, librosa &co set up and see if they produce useful output
 2. mid phase: refine, ie try to make the process faster and more accurate
 3. late phase: tidy up, ie make the process more user friendly (options: completely automated output; generate several and let user pick which to keep), write up
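
 The early-phase ffmpeg script could look something like the following sketch, which builds an ffmpeg command to cut one region of interest out of a video. The helper names and file names are hypothetical. It uses stream copy (=-c copy=) for speed, which is an assumption about what is acceptable: cuts then land on keyframes, so clip boundaries are approximate unless the clip is re-encoded.

```python
import subprocess


def build_clip_command(source, start, end, output):
    """Build an ffmpeg command that copies [start, end) seconds of source.

    -ss before -i seeks in the input, -t sets the clip duration, and
    -c copy avoids re-encoding (fast, but cuts fall on keyframes).
    """
    if end <= start:
        raise ValueError("end must be after start")
    return [
        "ffmpeg",
        "-y",                        # overwrite the output file if present
        "-ss", f"{start:.3f}",
        "-i", source,
        "-t", f"{end - start:.3f}",
        "-c", "copy",
        output,
    ]


def extract_clip(source, start, end, output):
    """Run ffmpeg for one clip; raises CalledProcessError on failure."""
    subprocess.run(build_clip_command(source, start, end, output), check=True)


if __name__ == "__main__":
    # Hypothetical file names; print the command rather than running it.
    print(" ".join(build_clip_command("full-video.mp4", 61.0, 75.5, "clip-01.mp4")))
```

 Keeping command construction separate from execution means the segment-finding step can be checked without ffmpeg installed or any video files to hand.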