Gerold Schneider
8050 Zürich
Campus Oerlikon
Gerold Schneider is Titular Professor of Computational Linguistics and co-coordinator of LiRI's service area "Natural Language Processing". His doctoral thesis is on large-scale dependency parsing, and his habilitation is on using computational models for corpus linguistics. His research interests include corpus linguistics, cognitive linguistics, statistical approaches, Digital Humanities, learner language, text mining, automated content analysis and language modeling. He has published over 130 articles on these topics, as well as a book on statistics for linguists.
He also works with NLP methods and hate speech detection for the URPP Digital Religion(s) project. Find out more about Gerold's work on his Google Scholar page or his personal webpage.
Publications
ZORA Publication List
- Combining Collocation Measures and Distributional Semantics to Detect Idioms. In M. Laitinen & P. Rautionaho (Eds.), Data-Intensive Investigations of English (pp. 104–135). Cambridge University Press. https://doi.org/10.1017/9781009415682.005
- SpiritRAG: A Q&A System for Religion and Spirituality in the United Nations Archive (I. Habernal, P. Schulam, & J. Tiedemann, Eds.; pp. 26–41). Association for Computational Linguistics. https://doi.org/10.18653/v1/2025.emnlp-demos.3
- ESCMID workshop: Artificial intelligence and machine learning in medical microbiology diagnostics. Microbes and Infection, 27, 105562. https://doi.org/10.1016/j.micinf.2025.105562
- Measuring language complexity about European politics in Swiss parliamentary debates. In A. Pawłowski, S. Embleton, J. Mačutek, & A. Xanthos (Eds.), Mathematical Modelling in Linguistics and Text Analysis (Vol. 370, pp. 191–206). John Benjamins Publishing. https://doi.org/10.1075/cilt.370
- Entropy as a Lens: Exploring Visual Behavior Patterns in Architects. Journal of Eye Movement Research, 18, 43. https://doi.org/10.3390/jemr18050043
- Evaluating a transparent and interpretable approach to stance detection using linguistic markers in social media data. International Journal of Corpus Linguistics, 30, 195–233. https://doi.org/10.1075/ijcl.24132.rev
- PreClinIE: An Annotated Corpus for Information Extraction in Preclinical Studies (D. Demner-Fushman, S. Ananiadou, M. Miwa, & J. Tsujii, Eds.; pp. 74–87). Association for Computational Linguistics. https://doi.org/10.18653/v1/2025.bionlp-1.8
- In patients’ words: natural language processing of reports from patients experiencing orofacial pain and dysfunction. Journal of Headache and Pain, 26, 172. https://doi.org/10.1186/s10194-025-02095-z
- Detecting and Mapping Hate in Religious Contexts. In T. Schlag & K. Yadav (Eds.), Religious Communication, Interaction and Transformation in a Culture of Digitality: Insights into the Zurich University Research Priority Program “Digital Religion(s)” (pp. 153–183). De Gruyter. https://doi.org/10.1515/9783111721729
- How stable are multivariate findings about register variation across varieties of English? On the replicability of Geometric Multivariate Analysis. ICAME Journal, 49, 23–45. https://doi.org/10.2478/icame-2025-0003
- The ‘Spiritual’ and the ‘Religious’ in the Twittersphere: A Topic Model and Semantic Map. Journal of Religion, Media & Digital Culture, 14, 1–22. https://doi.org/10.1163/21659214-bja10123
- Investigating Linguistic Abilities of LLMs for Native Language Identification (No. 14). 81. https://hdl.handle.net/10062/107173
- Refining Established Practices for Research Question Definition to Foster Interdisciplinary Research Skills in a Digital Age: Consensus Study With Nominal Group Technique. JMIR Medical Education, 11, e56369. https://doi.org/10.2196/56369
- Linguistic Features Extracted by GPT-4 Improve Alzheimer’s Disease Detection based on Spontaneous Speech (O. Rambow, L. Wanner, M. Apidianaki, H. Al-Khalifa, B. Di Eugenio, & S. Schockaert, Eds.; pp. 1850–1864). Association for Computational Linguistics. https://aclanthology.org/2025.coling-main.126/
- Robust Native Language Identification through Agentic Decomposition. In C. Christodoulopoulos, Tanmoy Chakraborty, C. Rose, & Violet Peng (Eds.), Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (pp. 8398–8414). Association for Computational Linguistics. https://aclanthology.org/2025.emnlp-main.423/
- Digital Dickens: An automated content analysis of Charles Dickens’ novels. In S. Buschfeld, P. Ronan, T. Neumaier, A. Wellinghoff, & L. Westermayer (Eds.), Crossing Boundaries through Corpora: Innovative corpus approaches within and beyond linguistics (pp. 62–98). John Benjamins Publishing. https://doi.org/10.1075/scl.119
- Automatically detecting directives with SPICE Ireland. In M. Schweinberger & P. Ronan (Eds.), Socio-Pragmatic Variation in Ireland: Using Pragmatic Variation to Construct Social Identities (No. 378; pp. 205–234). De Gruyter. https://doi.org/10.1515/9783110791457-011
- The LiRI Corpus Platform. Linköping Electronic Conference Proceedings, 62–75. https://doi.org/10.3384/ecp210010
- Evaluating Transformers on the Ethical Question of Euthanasia. 241–246. https://aclanthology.org/2024.swisstext-1.55.pdf