New module on Phoneme-Level Speech-Text Synchronization using the Montreal Forced Aligner (for Python)
EnvisionBOX is a community-driven platform for sharing and learning solutions for analyzing and working with multimodal data streams in various programming languages. It serves as a centralized repository for software, code, algorithms, and tools used in multimodal analysis. EnvisionBOX aims to increase the transparency and reproducibility of research, speed up the development process, and, most importantly, lower the entry barrier for newcomers.
At EnvisionBOX, we believe that everyone should have the opportunity to learn and contribute to the field of multimodal data analysis. On this platform, learners and experts alike can share knowledge, collaborate on projects, and develop new solutions. You are invited to contribute your own coding module to this platform. Feel free to contact us with your idea for a contribution.
We are also looking for new people to join the team. If you have ideas to foster community building (e.g., a news bulletin on new tools coming out, or a lecture series that connects theory with methods), please feel free to contact us and join the team!
If you're interested in learning more about multimodal data analysis or contributing to the EnvisionBOX community, we encourage you to subscribe to our newsletter.







Donders Research Stimulation fund (EnvisionBoxBaby), Donders Institute, Radboud University (2025; PIs Wim Pouw, Babajide Owoyele, Yayun Zhang, Caroline Rowland, Sabine Hunnius)
NWO VENI (VI.Veni 0.201G.047), Dutch Research Council (2020-2025; PI Wim Pouw)