Phoneme-Level Text-to-Speech Synchronization (for Python)
In this module, we demonstrate how the Montreal Forced Aligner (MFA), an open-source tool, can be used to automatically align speech with its corresponding text transcription at both the word and phoneme levels.
By Shuguang Sheng, Davide Ahmar & Wim Pouw
EnvisionObjectAnnotator (Python App)
This module provides a new Python desktop app that leverages SAM2 to automatically track any object in videos and detect spatial overlaps between a specified target and other objects in the scene.
By Davide Ahmar, Babajide Owoyele & Wim Pouw
Audio Processing & Speech Analysis Suite (Python)
Comprehensive toolkit for audio analysis, speech transcription, and speaker identification using multiple Python libraries.
By Marianne de Heer Kloots
Audio Analysis with Parselmouth
Extract and visualize speech features using Praat's Python interface.
Speaker Diarization with pyannote
Identify who spoke when using pyannote-audio toolkit.
Speech-to-Text with Whisper
Generate automatic transcriptions using OpenAI's Whisper model.
EnvisionHGdetector Package Suite (Python)
Complete gesture detection and analysis toolkit for research and real-time applications using machine learning.
By Wim Pouw, Bosco Yung, Sharjeel Shaikh, James Trujillo, Gerard de Melo & Babajide Owoyele
Automatic Gesture Analysis & Visualization
Automatically annotate hand gesture stroke events, analyze kinematics, and create dashboard visualizations.
Real-Time Gesture Detection
Detect gestures in real time from a webcam feed using a Light Gradient Boosting Machine (LightGBM) model.
SPUDNIG-PYTHON Motion-detection Assisted Gesture Annotation (Python)
This module uses motion detection to assist automatic gesture annotation, based on the SPUDNIG tool.
By James Trujillo (based on SPUDNIG by Ripperda, Drijvers & Holler)
Full-body tracking, +masking/blurring, and movement tracing (Python)
This module shows how to track the face, hands, and body using MediaPipe, with the option of masking, blurring, and movement tracing.
By Wim Pouw & Sho Akamine
Post-synchronizing video and audio recordings from separate devices (Python)
This module provides a multi-purpose pipeline for post-synchronizing video and audio recordings of the same event, recorded separately on different devices.
By Hamza Nalbantoğlu & Šárka Kadavá
Automatic Stimuli Creation with Blocked Facial Information (Python)
This Python module provides scripts for systematically masking facial information at different intensities. This can be used to reduce the communicative potential of mouthing in signed languages or of articulatory gestures in spoken languages.
By Wim Pouw & Annika Schiefner
Quantifying Interpersonal Synchrony (Python)
This module provides an introduction to calculating interpersonal movement synchrony, including time-lag assessment and pseudo-pair calculation.
By James Trujillo
Multi-person tracking with YOLO and computing social proximity (Python)
This module uses Ultralytics YOLO pose tracking to track multiple persons from a top view or other perspectives, and shows a simple calculation of interpersonal distance between two persons.
By Wim Pouw, Arkadiusz Białek, and James Trujillo
Visual Communication (ViCOM) tutorial with exercises: A complete kinematic feature analysis pipeline (Python)
This module contains a kinematic feature extraction pipeline with exercises for students of communicative motion analysis.
By Wim Pouw
Behavioral Classification Using Convolutional Neural Networks (Python)
This module takes you through training a model to automatically annotate bodily gestures.
By Wim Pouw
Decision Tree-Based Classification Algorithms (R)
This module takes you through using decision trees to make sense of high-dimensional data.
By Alexander Kilpatrick
Multimodal annotation distances (Python and R)
This module takes ELAN annotations as input and lets you compare the overlap between them using the multimodal-annotation-distance tool.
By Camila Antônio Barros
Creating video-embedded time series animations (Python)
This module takes in a video, and then creates movement-sound time series animations embedded in the video.
By Wim Pouw
Turn-Taking Dynamics and Entropy (Python)
This module introduces calculating turn-taking measures, such as gaps and overlaps, as well as entropy, from conversations with 2 or more speakers.
By James Trujillo
Gesture networks and DTW (Python)
This module demonstrates how to implement gesture networks and gesture spaces using dynamic time warping.
By Wim Pouw
Demo for OpenPose with 3D tracking with Pose2Sim (Python)
This module provides a Python pipeline for OpenPose tracking and 3D triangulation with Pose2Sim.
By Šárka Kadavá & Wim Pouw
Dynamic visualization dashboard (Python)
This module provides an example of a dynamic dashboard that displays audio-visual and static data.
By Wim Pouw
Head rotation tracking by adapting mediapipe (Python)
This module shows a way to track head direction, in addition to face, hand, and body tracking, using MediaPipe.
By Wim Pouw
Running OpenPose in batches (Batch script)
This module demonstrates how to use batch scripting to run OpenPose on a set of videos.
By James Trujillo & Wim Pouw
Recording from multiple cameras synchronously while also streaming to LSL
This module demonstrates how to record from multiple cameras synchronously, which is very helpful for creating your own 3D motion tracking pipeline.
By Šárka Kadavá & Wim Pouw
3D tracking from 2D videos using anipose and deeplabcut (Python)
This module shows how to set up a 3D motion tracking system with multiple 2D cameras, using anipose and human pose tracking with DeepLabCut.
By Wim Pouw
Aligning and pre-processing multiple data streams (R)
This module provides an overview of how to wrangle multiple data streams (motion tracking, acoustics, annotations) and preprocess them (smoothing) to create a single long time series dataset ready for further processing.
By Wim Pouw
Aligning and pre-processing multiple data streams (Python)
This module provides an overview of how to wrangle multiple data streams (motion tracking, acoustics, annotations) and preprocess them (smoothing) so that you end up with one long time series dataset ready for further processing.
By Wim Pouw
Extracting a smoothed amplitude envelope from sound (R)
This module demonstrates how to extract a smoothed amplitude envelope from a sound file.
By Wim Pouw
Motion tracking analysis: Kinematic feature extraction (Python)
This module provides an example of how to analyze motion tracking data using kinematic feature extraction.
By James Trujillo
Feature extraction for machine classification & practice dataset SAGA (R)
This module introduces a practice dataset and provides R code for setting up a kinematic and speech acoustic feature dataset that can be used to train a machine classifier for gesture types.
By Wim Pouw
Cross-Wavelet Analysis of Speech-Gesture Synchrony (R)
This module introduces cross-wavelet analysis as a way to measure the temporal synchrony of speech and gesture (or other visual signals).
By James Trujillo