FAIR-FI-LD

Description
Moving towards a national FAIR-compliant ecosystem of Federated Infrastructure for Language Data, FAIR-FI-LD for short, is a 12-month project (July 2024 to June 2025) funded by the swissuniversities ORD programme and hosted by the University of Zurich, with the participation of CLARIN-CH, LiRI, ZHAW and USI.
Over the last 5 to 10 years, Swiss higher education institutions (HEIs) have been building national services for language data. So far these include the Linguistic Research Infrastructure (UZH), the Swiss-AL Platform for Applied Sciences (ZHAW), LaRS@SWISSUbase (UNIL, UZH), a national repository for the publication and long-term preservation of language data, and various smaller tools and services. These units, however, are not all interoperable, which reduces the potential for collaboration and data reuse. In addition, fields such as interactional linguistics and second language acquisition lack adequate infrastructure.
With the foundation of the CLARIN-CH consortium in 2020 (9 HEIs and the SAGW), the HEIs' efforts took a new direction: working together to build a FAIR-compliant, sustainable and expandable CLARIN-CH ecosystem of federated infrastructure that answers the needs of researchers and professionals using language data in Switzerland and beyond. This ecosystem must be interoperable at the national and European levels.
Principal Investigators
- Dr. Cristina Grisot (CLARIN-CH)
- Prof. Dr. Noah Bubenhofer (LiRI)
- Prof. Dr. Julia Krasselt (ZHAW)
- Prof. Dr. Johanna Miecznikowski-Fuenfschilling (USI)
- Project coordinators: Dr. Letizia Volpin, Dr. Joanna Blochowiak
Main outcomes
The FAIR-FI-LD project advanced a federated, FAIR-compliant infrastructure for language data in Switzerland. Through close collaboration among key national partners, including CLARIN-CH, LiRI, ZHAW, and USI, the project delivered both technical and organizational outcomes that strengthened the foundation for sustainable language data services.
Creation of a corpus platform for multimodal interactional data
An instance of LiRI LCP-Videoscope was set up at USI with functionalities specific to multimodal interactional data. It includes a viewer for video files, a timeline where time-aligned transcribed text and annotations are displayed in tiers, an area for writing queries, an area for displaying query results, and an area showing a visual schema of the data model: https://lcp.usi.ch/
Adaptation of CLARIN technologies to Swiss platforms
Federated Content Search (FCS) was implemented, enabling researchers to query distributed corpora while the data remains securely hosted at the institutions that hold it. Each participating corpus infrastructure now provides an endpoint that bridges the central search portal and its local search engine, such as the Swiss-AL workbench or the Language Corpus Platform (LCP).
- Creation of a multilingual FCS landing page
- API deployment by LiRI and Swiss-AL for platform interoperability
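The bridging role of the endpoints can be illustrated with a minimal sketch. It is not the project's implementation: the corpus names, the `Hit` type and the in-memory "local engines" are hypothetical stand-ins, and no network calls are made. It only shows the general pattern in which a central portal sends one query to every endpoint and merges the hits, while each endpoint searches only the data it hosts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str  # name of the endpoint that produced the hit
    text: str    # matching snippet returned by the local engine


def make_local_engine(name: str, documents: List[str]) -> Callable[[str], List[Hit]]:
    """Hypothetical stand-in for a local search engine over locally hosted data."""
    def search(term: str) -> List[Hit]:
        needle = term.lower()
        return [Hit(name, doc) for doc in documents if needle in doc.lower()]
    return search


def federated_search(term: str, endpoints: Dict[str, Callable[[str], List[Hit]]]) -> List[Hit]:
    """Send the same query to every endpoint and merge the hits."""
    merged: List[Hit] = []
    for search in endpoints.values():
        try:
            merged.extend(search(term))
        except Exception:
            # One unavailable endpoint should not fail the whole search.
            continue
    return merged


# Two illustrative endpoints, each holding its own documents.
endpoints = {
    "corpus-a": make_local_engine("corpus-a", ["Guten Morgen", "Good morning"]),
    "corpus-b": make_local_engine("corpus-b", ["morning news", "evening news"]),
}
hits = federated_search("morning", endpoints)
```

Only the matching snippets travel back to the portal; the documents themselves never leave the engine that holds them.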
Data Management & Metadata
- Working solution for metadata-only registration in SWISSUbase
- Increased discoverability of Swiss language resources
- Focus on interactional and multimodal language data
Capacity building & User support
- Creation of comprehensive documentation
- Distribution of training materials across institutions
- Enabling researchers to navigate tools and standards
Community Engagement & Collaboration
- Regular alignment meetings across the project
- Collaborative planning for future developments
- Final stakeholder workshop (June 23, 2025, USI)
FCS Technology
To enable researchers to search for specific patterns across collections of data, CLARIN offers a search engine that connects to the local text collections that are available in the centres. The data itself stays at the centre where it is hosted – which is why the underlying technique is called federated content search.
The search engine summarises and displays what is available; no login is required. An easy next step is to go to the centre's specialised search interface to perform a more sophisticated query.
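CLARIN's federated content search builds on the SRU (Search/Retrieve via URL) protocol, so a query to a centre's endpoint is essentially an HTTP GET request with a few standard parameters. The sketch below only constructs such a request URL. The endpoint address is a placeholder, not one of the project's real endpoints, and the choice of SRU version 1.2 and of `maximumRecords` is an assumption made for illustration; actual endpoints may expect different values.

```python
from urllib.parse import urlencode


def build_fcs_search_url(endpoint: str, query: str, maximum_records: int = 10) -> str:
    """Build an SRU searchRetrieve URL for a simple term query."""
    params = {
        "operation": "searchRetrieve",   # standard SRU search operation
        "version": "1.2",                # assumed protocol version
        "query": query,                  # CQL query, e.g. a quoted term
        "maximumRecords": str(maximum_records),
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint; replace with the address published by a centre.
url = build_fcs_search_url("https://fcs.example.org/sru", '"Sprache"')
print(url)
```

The response to such a request is an XML document listing the matching records, which the central search portal then summarises for the user.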
Dissemination
- Grisot C., Craevschi A., Futter C., Vukovic T., Zehr J., Krasselt J., Dreesen P. (2025) The Swiss FAIR-compliant ecosystem of infrastructures 2.0. Extended abstract accepted for CLARIN Annual Conference.
- Grisot C., Craevschi A. (2025) CLARIN-CH: supporting Open Research Data management. Poster at Swiss NLP Expo.
- Miecznikowski-Fuenfschilling J., Vukovic T., Zehr J. (2025) Modelling of Multi-Modal Data in LiRI Corpus Platform and Beyond. Training session, CLARIN-CH. https://clarin-ch.ch/event/lcp-modelling-of-multi-modal-data/
- Miecznikowski-Fuenfschilling J. (2024) Project presentation at CLARIN-CH Day 2024, University of Neuchâtel (September 9, 2024). https://clarin-ch.ch/first-clarin-ch-day-in-the-books/