EnvisionObjectAnnotator (Python App)
A Python desktop app that uses SAM2 to track objects in videos automatically and to detect spatial overlaps between a specified target and other objects in the scene.
By Davide Ahmar, Babajide Owoyele & Wim Pouw
Audio Processing & Speech Analysis Suite (Python)
Comprehensive toolkit for audio analysis, speech transcription, and speaker identification using multiple Python libraries.
By Marianne de Heer Kloots
Audio Analysis with Parselmouth
Extract and visualize speech features using Parselmouth, a Python interface to Praat.
Speaker Diarization with pyannote
Identify who spoke when using the pyannote-audio toolkit.
Speech-to-Text with Whisper
Generate automatic transcriptions using OpenAI's Whisper model.
EnvisionHGdetector Package Suite (Python)
Complete gesture detection and analysis toolkit for research and real-time applications using machine learning.
By Wim Pouw, Bosco Yung, Sharjeel Shaikh, James Trujillo, Gerard de Melo & Babajide Owoyele
Automatic Gesture Analysis & Visualization
Automatically annotate hand gesture stroke events, analyze kinematics, and create dashboard visualizations.
Real-Time Gesture Detection
Detect gestures in real time from a webcam feed using LightGBM.
SPUDNIG-PYTHON: Motion-Detection-Assisted Gesture Annotation (Python)
Motion detection to aid automatic gesture annotation (based on SPUDNIG).
By James Trujillo (based on Ripperda, Drijvers & Holler)
Full‑body tracking, masking/blurring & movement tracing (Python)
Track face, hands, and body using MediaPipe, with masking/blur and tracing.
By Wim Pouw & Sho Akamine
Post‑synchronizing video/audio from separate devices (Python)
Multi‑purpose pipeline for post‑synchronizing video and audio recordings from different devices.
By Hamza Nalbantoğlu & Šárka Kadavá
Automatic Stimuli Creation with Blocked Facial Information (Python)
Scripts for systematically masking facial information at different intensities.
By Wim Pouw & Annika Schiefner
Quantifying Interpersonal Synchrony (Python)
Calculating interpersonal movement synchrony, time lags, and pseudo‑pairs.
By James Trujillo
Multi‑person tracking with YOLO & social proximity (Python)
YOLO pose tracking for multiple persons, plus interpersonal distance calculation.
By Wim Pouw, Arkadiusz Białek, and James Trujillo
ViCOM tutorial with exercises: kinematic feature pipeline (Python)
Kinematic feature extraction pipeline with exercises for communicative motion analysis.
By Wim Pouw
Behavioral Classification Using CNNs (Python)
Train a model to automatically annotate bodily gestures.
By Wim Pouw
Decision Tree‑Based Classification Algorithms (R)
Use decision trees to make sense of high‑dimensional data.
By Alexander Kilpatrick
Multimodal annotation distances (Python & R)
Compare overlap between ELAN annotations using the multimodal‑annotation‑distance tool.
By Camila Antônio Barros
Video‑embedded time series animations (Python)
Create movement‑sound time series animations embedded in a video.
By Wim Pouw
Turn‑Taking Dynamics and Entropy (Python)
Calculate gaps/overlaps and entropy in conversations with two or more speakers.
By James Trujillo
Gesture networks and DTW (Python)
Implement gesture networks and spaces using dynamic time warping.
By Wim Pouw
OpenPose with 3D tracking via Pose2Sim (Python)
Python pipeline for OpenPose tracking and 3D triangulation with Pose2Sim.
By Šárka Kadavá & Wim Pouw
Dynamic visualization dashboard (Python)
Example of a dynamic dashboard displaying audio‑visual and static data.
By Wim Pouw
Head rotation tracking (Python)
Track head directions using MediaPipe, alongside face/hand/body tracking.
By Wim Pouw
Running OpenPose in batches (Batch script)
Use batch scripting to run OpenPose on a set of videos.
By James Trujillo & Wim Pouw
Recording from multiple cameras synchronously while streaming to LSL
Record synchronously from multiple cameras, which is handy for DIY 3D motion tracking.
By Šárka Kadavá & Wim Pouw
3D tracking from 2D videos (Anipose + DeepLabCut, Python)
Set up a 3D motion tracking system with multiple 2D cameras.
By Wim Pouw
Aligning & pre‑processing multiple data streams (R)
Wrangle and preprocess multiple streams into a single time‑series dataset.
By Wim Pouw
Aligning & pre‑processing multiple data streams (Python)
Create a unified long time‑series dataset from multiple modalities.
By Wim Pouw
Extracting a smoothed amplitude envelope (R)
Extract a smoothed amplitude envelope from a sound file.
By Wim Pouw
Motion tracking analysis: Kinematic feature extraction (Python)
Example analysis of motion tracking data using kinematic features.
By James Trujillo
Feature extraction for classification & practice dataset SAGA (R)
Set up kinematic and speech‑acoustic features to train classifiers for gesture types.
By Wim Pouw
Cross‑Wavelet Analysis of Speech‑Gesture Synchrony (R)
Measure temporal synchrony of speech and gesture (or other visual signals).
By James Trujillo
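
To give a flavor of the kind of computation these modules cover, here is a minimal sketch of the gap/overlap and entropy measures named in the Turn-Taking Dynamics and Entropy entry. It is not the module's own code: the `Turn` type, the function names, and the choice of Shannon entropy over each speaker's share of turns are assumptions made for illustration, and the module itself may define these measures differently. A positive offset is a gap between speakers; a negative offset is an overlap.

```python
import math
from collections import Counter
from typing import List, NamedTuple, Tuple


class Turn(NamedTuple):
    """One speaking turn (hypothetical structure, times in seconds)."""
    speaker: str
    start: float
    end: float


def floor_transfer_offsets(turns: List[Turn]) -> List[Tuple[str, str, float]]:
    """Return (previous speaker, next speaker, offset) for each speaker change.

    offset = next.start - previous.end, so offset > 0 is a gap
    and offset < 0 is an overlap.
    """
    ordered = sorted(turns, key=lambda t: t.start)
    offsets = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if prev.speaker != nxt.speaker:
            offsets.append((prev.speaker, nxt.speaker, round(nxt.start - prev.end, 3)))
    return offsets


def speaker_entropy(turns: List[Turn]) -> float:
    """Shannon entropy (bits) of the distribution of turns over speakers."""
    counts = Counter(t.speaker for t in turns)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Made-up example conversation with three speakers.
turns = [
    Turn("A", 0.0, 2.0),
    Turn("B", 2.5, 4.0),  # starts 0.5 s after A ends: a gap
    Turn("A", 3.8, 6.0),  # starts 0.2 s before B ends: an overlap
    Turn("C", 6.0, 7.0),  # starts exactly when A ends
]
offsets = floor_transfer_offsets(turns)
entropy = speaker_entropy(turns)
```

With real data, the turns would typically come from annotation tiers or diarization output rather than being typed in by hand.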