Upgrading the linguistic ORD-ecosystem UpLORD

Description

UpLORD is a two-year (2023-2024), swissuniversities ORD-funded project hosted by the University of Zurich, with the support of the Zurich University Library and the CLARIN-CH Consortium. Since 2018, a consortium of partners has been building a national ecosystem of infrastructures that covers the whole linguistic data lifecycle in line with ORD requirements (FAIR principles: Findable, Accessible, Interoperable, Reusable), from data generation, processing and analysis to data sharing and archiving. This ecosystem includes the national technology platform LiRI and the national repository for publishing and archiving linguistic data (SWISSUbase) as service providers, a database of Swiss media texts, and a platform for hosting and searching large text and audio/video corpora.

The project focuses on upgrading the workflows and interoperability of existing infrastructure services, establishing working groups at the national level, documenting and promoting best practices, raising awareness of and providing training in ORD practices in the context of teaching, research and publishing, and building a robust practice of data curation. In the long term, this project will contribute significantly to a strong foundation for a sustainable ORD strategy for linguistic data in Switzerland.

Details about the Steering Committee and the governance of the project are available here (PDF, 172 KB).

Principal Investigators

Prof. Dr. Noah Bubenhofer (LiRI)

Dr. Andrea Malits (Universitätsbibliothek Zürich)

Dr. Cristina Grisot (CLARIN-CH)

Project coordinators: Dr. Letizia Volpin, Dr. Joanna Blochowiak

Main outcomes

Given the requirements of Open Science and the FAIR principles on the one hand, and the difficulties posed by more challenging data sets (for example, sensitive data or data with copyright issues) on the other, we identified several gaps in the current situation in Switzerland that this ORD project addresses. Below is an overview of the project's main outcomes:

Technical implementations:   

  • Software development: the LiRI Corpus Platform with three interfaces for text, audio and video data 
  • Building of data converters: read documentation and access code

  • Creation of complex annotation representations: read documentation

  • Development of a corpus query language, DQD, to allow complex and powerful queries in text, audio and video data

  • Formulation of a concept for registering complex metadata for specific types of language data, such as sign language and interactional linguistics data: reach out to the LaRS team

  • Creation of the Swiss CMDI profile, useful for metadata interoperability with CLARIN

  • Implementation of the CLARIN federated authentication system 

  • Implementation of the harvesting of the SWISSUbase repository by the CLARIN VLO  

  • Construction of an API to push data from LiRI to SWISSUbase: read documentation

Community engagement, standardization and workflows:   

Training and documentation:  

  • Formulation of data curation and version control workflows: read documentation

  • Organisation of webinars about the management of sensitive data, copyright and intellectual property issues to inform the scientific community  

  • Organisation of a series of online training sessions to inform and train the scientific community on how to use the Swiss FAIR-compliant ecosystem of infrastructures 2.0 to enhance their research

  • Publication in the CLARIN-CH Zenodo community of open educational resources based on the webinar series and the training series

  • Creation of the CLARIN-CH Documentation Platform about management of open research data  

Certifications

  • Acquisition of the CoreTrustSeal certification for the Language Repository of Switzerland LaRS 
  • Submission of the application of the Linguistic Research Infrastructure LiRI and the Language Repository of Switzerland LaRS for certification as a CLARIN B-center (results expected in early 2026)

Dissemination

  1. LCP workshop at SwissText 2025: Bring your own data! 
  2. Grisot C., Craevschi A., Futter C., Vukovic T., Zehr J., Krasselt J., Dreesen P. (2025) The Swiss FAIR-compliant ecosystem of infrastructures 2.0.  Extended abstract accepted for CLARIN Annual Conference.
  3. Grisot C., Craevschi A. (2025) CLARIN-CH: supporting Open Research Data management. Poster at Swiss NLP Expo.
  4. Grisot C. (2024) Presentation of the CLARIN-CH ecosystem in the Tour de CLARIN.
  5. Blochowiak J., Grisot C. (2025) Building up the CLARIN-CH Training Programme. Extended abstract accepted for CLARIN Annual Conference.
  6. Bubenhofer, N., Malits, A., Strebel, S., Graën, J., Buerli, S., & Grisot, C. (2023, December). Building and consolidating a FAIR-compliant ecosystem of infrastructures. In CLARIN Annual Conference Proceedings (pp. 95-99).
  7. Schaber, J., Graën, J., McDonald, D., Mustac, I., Rajovic, N., Schneider, G., ... & Kontino, T. (2023, October). The LiRI Corpus Platform. In CLARIN Annual Conference Proceedings (pp. 145-149).
  8. Schaber, J., Graën, J., Mustač, I., Rajović, N., Schneider, G., Zehr, J., & Bubenhofer, N. Swissdox@LiRI – a large database of media articles made accessible to researchers. In CLARIN Annual Conference Proceedings (pp. 111-115).
  9. Poster at Open Access Week 2023 at UZH 

  10. Presentation at 2023 SWISSUbase Annual event at UZH (November 2023)

  11. Presentation of UpLORD project (PDF, 1 MB) at 2024 CLARIN-CH Day at University of Neuchâtel (September 9, 2024) 

  12. Swissuniversities P5 Open Science closing event (November 18, 2024) 

Additional Information

Common Language Resources and Technology Infrastructure

CLARIN is a pan-European research infrastructure aiming to render accessible all digital language resources and tools from all over Europe through a single sign-on online environment. Swiss academic institutions founded the CLARIN-CH consortium in 2020. 

SWISSUbase is a national repository that facilitates access to research data and projects across different disciplines and provides Swiss research institutions with a reliable data infrastructure.

Call for participation

Are you a member of the Swiss scientific community working with language resources, and are you interested in the topics addressed in this project?

Would you like to get involved?

Please send an email to Cristina Grisot.