Chair of Big Data Linguistics

The Chair of Big Data Linguistics explores the intersection of language, cognition, and data science. Our research focuses on understanding linguistic patterns through large-scale, data-driven approaches.
We combine insights from cognitive linguistics, particularly Construction Grammar and co-occurrence phenomena such as collocations and collostructions, with methods from computational linguistics, corpus linguistics, and lexicography. Central to our work is the use of large multimodal datasets, encompassing text, image, and video data, to study language in its full communicative context.
By applying data science techniques and machine learning, we aim to advance both theoretical and applied research in linguistics. The chair is also committed to building research infrastructures and open datasets that foster transparent, reproducible, and collaborative scholarship in the digital humanities.
Selected projects:
- “MULTIDATA” (ERASMUS+), in collaboration with the University of Murcia (Spain), aims to create a previously unavailable platform for turning video collections into multimodal datasets and data visualizations.
- “World Futures” (AHRC and DFG), in collaboration with the University of Oxford, studies how the ways in which people imagine futures enable the construction and communication of multimodal disinformation.