From aa82f3b756b497d17f9243e492faf3232f91c8b1 Mon Sep 17 00:00:00 2001
From: Rob Hallam <0504004h@student.gla.ac.uk>
Date: Wed, 26 Jun 2024 14:04:55 +0100
Subject: [PATCH] [Meeting 1] add agenda and pre-meeting notes for 2024-06-26

---
 meetings.org | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/meetings.org b/meetings.org
index f81adab..a3b52b1 100644
--- a/meetings.org
+++ b/meetings.org
@@ -35,3 +35,52 @@
 RH mentioned a recent bereavement which happened during exams, with the knock-on
 
 MB's communication preference is email for anything needing specifically `actioned'.
+* 2024-06-26 1530: Meeting 1
+
+** Agenda
+
+ - [Questionnaire app] :: v basic React PoC done; other features seem implementable in React based on checking; next steps: user stories, plan UI
+ - [Gaming highlight generator] :: laughter-detection does find laughs (when targeted: some FPs and FNs; when not targeted: many FPs, unknown FNs)
+ - [Next steps] :: (app) pretty standard dev workflow, more focus on rapid prototypes & early feedback before formal user testing / evaluation later on; (highlights) focus on latter parts of `pipeline' first instead of feature detection, i.e. processing timestamps (consolidating and turning them into clips), maybe UI for user to adjust clips (selection, times) before highlights made
+
+** Pre-Meeting
+
+*** Highlight Generator Pipeline / Workflow
+
+[[file:~/downloads/highlightgeneration-process.svg]]
+
+([[https://roberthallam.com/files/highlightgeneration-process.svg][alternative link]])
+
+*** Laughter Detection
+
+Tricky mix of specific package versions needed to get this working in 2024! Also needs a minor change (1 line) due to librosa API update. Can also be run on Colab:
+
+[[file:~/downloads/colab-laughdetect.png]]
+
+([[https://roberthallam.com/files/colab-laughdetect.png][alternative link]])
+
+Observations:
+
+ - running on a ~5 minute audio clip in AAC format takes ~30s (so a 3 hour video would take ~18 minutes)
+ - qualitative observation: the default parameters detect obvious laughs with a small number of FPs (and seem to produce a few FNs too)
+
+To test this I generated five audio clips: four were about five minutes long, selected from longer recordings and intended to be representative of obvious laughter and of non-obvious / subtle laughter. The final clip was the full audio track of around 2½ hours.
+
+Results
+
+| Index | Duration | Context                                              | № Detected | FPs | Comments                                                                 |
+|-------+----------+------------------------------------------------------+------------+-----+--------------------------------------------------------------------------|
+|     1 |     5:06 | Multiple laughs from different speakers around ~1min |         15 |   3 | Seems inconsistent in detection when laughter is ongoing and overlapping |
+|     2 |     5:08 | Mostly discussion, couple chuckles etc               |          0 | N/A | Arguably ~4/5 FNs                                                        |
+|     3 |     5:10 | TBC                                                  |          8 |   5 | Detected segments are short                                              |
+|     4 |     5:11 | One bit of obvious laughter                          |          2 |   1 | Detects the obvious bit of laughter                                      |
+|     5 |  2:37:11 | Full-length video of gaming session                  |         74 |  65 | Quite a lot of FPs!                                                      |
+
+/Note: Clips are not exactly five minutes due to the way ffmpeg cuts when doing a stream copy/
+
+The results of that testing suggest two things about the default parameters:
+
+ - laughter can be detected, even when it's coming from multiple speakers
+ - those parameters produce a lot of FPs when not targeted
+
+Given that, a two-pass approach might yield better results.
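
As a rough illustration of the timestamp-processing step listed under next steps in the agenda (consolidating detections into clips), the sketch below pads each detected laughter segment and merges segments that end up overlapping or close together. It is only a sketch under assumptions: the function name `consolidate_segments`, the `(start, end)` tuples in seconds, and the `padding` / `max_gap` defaults are all hypothetical and do not come from the laughter-detection tool or from these notes.

```python
from typing import List, Optional, Tuple

# A detection or clip as (start_seconds, end_seconds); this shape is an assumption.
Segment = Tuple[float, float]


def consolidate_segments(
    segments: List[Segment],
    padding: float = 5.0,   # hypothetical: context added before and after each detection
    max_gap: float = 2.0,   # hypothetical: padded segments this close are merged
    total_duration: Optional[float] = None,
) -> List[Segment]:
    """Pad each detection, then merge padded segments that overlap or nearly touch."""
    # Pad every segment and clamp it to the bounds of the recording.
    padded = []
    for start, end in segments:
        clip_start = max(0.0, start - padding)
        clip_end = end + padding
        if total_duration is not None:
            clip_end = min(clip_end, total_duration)
        padded.append((clip_start, clip_end))

    # Sweep through the segments in time order, extending the current clip
    # whenever the next one starts within max_gap of its end.
    clips: List[Segment] = []
    for start, end in sorted(padded):
        if clips and start <= clips[-1][1] + max_gap:
            prev_start, prev_end = clips[-1]
            clips[-1] = (prev_start, max(prev_end, end))
        else:
            clips.append((start, end))
    return clips


if __name__ == "__main__":
    detections = [(10.0, 11.0), (11.5, 12.5), (60.0, 61.0)]
    for clip in consolidate_segments(detections, padding=2.0, max_gap=1.0):
        print(clip)
```

Merging after padding means several short detections from one burst of laughter collapse into a single clip, which should leave fewer clips to adjust by hand in any later UI step.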