[Meeting 1] add agenda and pre-meeting notes for 2024-06-26

8 months ago · aa82f3b756
--- a/meetings.org
+++ b/meetings.org
@@ -35,3 +35,52 @@ RH mentioned a recent bereavement which happened during exams, with the knock-on

 MB's communication preference is email for anything needing specifically `actioned'.

 * 2024-06-26 1530: Meeting 1

 ** Agenda

 - [Questionnaire app] :: v basic React PoC done; other features seem implementable in React based on checking; next steps: user stories, plan UI
 - [Gaming highlight generator] :: laughter-detection does find laughs (when targeted: some FPs and FNs; when not targeted: many FPs, unknown FNs)
 - [Next steps] :: (app) pretty standard dev workflow, more focus on rapid prototypes & early feedback before formal user testing / evaluation later on; (highlights) focus on latter parts of `pipeline' first instead of feature detection, ie processing timestamps (consolidating them & turning them into clips), maybe UI for user to adjust clips (selection, times) before highlights made
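
As a rough illustration of the timestamp-processing step mentioned under next steps, here is a minimal sketch of merging nearby detections into padded clips. The function name ~consolidate~ and the ~max_gap~ and ~padding~ values are assumptions made for illustration, not settled design decisions.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def consolidate(
    segments: List[Segment],
    max_gap: float = 5.0,   # assumed: merge detections at most this far apart
    padding: float = 2.0,   # assumed: context added before and after each clip
) -> List[Segment]:
    """Merge detections that are close together, then pad each merged clip."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous detection: extend that clip.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    # Pad each clip so it does not start or end abruptly; clamp at zero.
    return [(max(0.0, s - padding), e + padding) for s, e in merged]


# Hypothetical detections, in seconds from the start of the video.
detections = [(61.0, 62.5), (64.0, 66.0), (120.0, 121.0)]
clips = consolidate(detections)
print(clips)
```

Keeping ~padding~ at or below half of ~max_gap~ means padded clips cannot overlap each other.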

 ** Pre-Meeting

 *** Highlight Generator Pipeline / Workflow

 [[file:~/downloads/highlightgeneration-process.svg]]

 ([[https://roberthallam.com/files/highlightgeneration-process.svg][alternative link]])

 *** Laughter Detection

 Tricky mix of specific package versions needed to get this working in 2024! Also needs a minor change (1 line) due to librosa API update. Can also be run on Colab:

 [[file:~/downloads/colab-laughdetect.png]]

 ([[https://roberthallam.com/files/colab-laughdetect.png][alternative link]])

 Observations:

 - running on a ~5 minute audio clip in AAC format takes ~30s (so a 3 hour video would take ~18 minutes)
 - qualitative observation: default parameters detect obvious laughs reasonably well, with a small number of FPs (and seem to produce a few FNs too)
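
The runtime extrapolation above is simple proportional scaling. A minimal sketch, assuming processing time grows linearly with audio duration (which has not been checked on long files):

```python
def estimate_runtime_seconds(
    audio_seconds: float,
    sample_audio_seconds: float = 300.0,   # the ~5 minute test clip
    sample_runtime_seconds: float = 30.0,  # the ~30s it took to process
) -> float:
    """Scale the measured runtime linearly to a longer piece of audio."""
    return audio_seconds * sample_runtime_seconds / sample_audio_seconds


three_hours = 3 * 60 * 60
estimate_minutes = estimate_runtime_seconds(three_hours) / 60
print(f"{estimate_minutes:.0f} minutes")
```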

 To test this I generated five audio clips. Four were around five minutes long, cut from longer recordings and intended to be representative of both obvious laughter and non-obvious / subtle laughter. The final audio clip was the full audio track of around 2½ hours.

 Results:

 | Index | Duration | Context                                              | № Detected | FPs | Comments                                                                 |
 |-------+----------+------------------------------------------------------+------------+-----+--------------------------------------------------------------------------|
 |     1 |     5:06 | Multiple laughs from different speakers around 1min  |         15 |   3 | Seems inconsistent in detection when laughter is ongoing and overlapping |
 |     2 |     5:08 | Mostly discussion, couple chuckles etc               |          0 | N/A | Arguably ~4/5 FNs                                                        |
 |     3 |     5:10 | TBC                                                  |          8 |   5 | Detected segments are short                                              |
 |     4 |     5:11 | One bit of obvious laughter                          |          2 |   1 | Detects the obvious bit of laughter                                      |
 |     5 |  2:37:11 | Full-length video of gaming session                  |         74 |  65 | Quite a lot of FPs!                                                      |

 /Note: Clips are not exactly five minutes due to the way ffmpeg cuts when doing a stream copy/
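
For reference, a stream-copy cut of the kind the note describes could be built like this. The exact command used for these clips is not recorded here, so the file names and timestamps below are hypothetical; ~-ss~, ~-t~ and ~-c copy~ are standard ffmpeg options. With ~-c copy~ nothing is re-encoded, so ffmpeg cannot cut at arbitrary points and the resulting duration can drift from the requested one.

```python
import shlex
from typing import List


def build_cut_command(source: str, start: str, duration: str, output: str) -> List[str]:
    """Build an ffmpeg command that cuts a clip without re-encoding."""
    return [
        "ffmpeg",
        "-ss", start,     # seek to the start position in the input
        "-i", source,
        "-t", duration,   # requested clip length
        "-c", "copy",     # stream copy: no re-encode
        output,
    ]


# Hypothetical file names and timestamps, for illustration only.
command = build_cut_command("session.mp4", "00:42:00", "00:05:00", "clip1.mp4")
print(shlex.join(command))
```

The list can be passed straight to ~subprocess.run(command, check=True)~ on a machine with ffmpeg installed.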

 The results of that testing suggest two things about the default parameters:

 - laughter can be detected, even when it's coming from multiple speakers
 - those parameters produce a lot of FPs when not targeted (65 of the 74 detections on the full-length track)

 Given that, a two-pass approach might yield better results.
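
To put numbers on the FP observation, the counts in the results table can be turned into per-clip precision. This is a small sketch that treats every detection not marked as an FP as a true positive; FNs are not counted in the table, so recall cannot be computed the same way.

```python
from typing import Dict, Optional, Tuple

# (number detected, false positives) per clip index, copied from the results table.
# Clip 2 had no detections, so its FP count is N/A.
results: Dict[int, Tuple[int, Optional[int]]] = {
    1: (15, 3),
    2: (0, None),
    3: (8, 5),
    4: (2, 1),
    5: (74, 65),
}


def precision(detected: int, false_positives: Optional[int]) -> Optional[float]:
    """Fraction of detections that were real laughs, or None if undefined."""
    if detected == 0 or false_positives is None:
        return None
    return (detected - false_positives) / detected


for index, (detected, fps) in results.items():
    value = precision(detected, fps)
    label = "n/a" if value is None else f"{value:.2f}"
    print(f"clip {index}: precision {label}")
```

On these counts the targeted clips range from 0.38 to 0.80, while the full-length track comes out at roughly 0.12.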