Gerold Schneider

8050 Zürich
Campus Oerlikon
Gerold Schneider is Titular Professor of Computational Linguistics and co-coordinator of LiRI's service area "Natural Language Processing". His doctoral thesis was on large-scale dependency parsing, and his habilitation was on using computational models for corpus linguistics. His research interests include corpus linguistics, cognitive linguistics, statistical approaches, Digital Humanities, learner language, text mining, automated content analysis and language modeling. He has published over 130 articles on these topics, as well as a book on statistics for linguists.
He also works with NLP methods and hate speech detection for the URPP Digital Religion(s) project. Find out more about Gerold's work on his Google Scholar page or his personal webpage.
Publications
ZORA Publication List
- Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review. PLOS Digital Health, 2(10):e0000347.
- Differences in syntactic annotation affect retrieval. International Journal of Corpus Linguistics, 28(3):378-406.
- Evaluating the Effectiveness of Natural Language Inference for Hate Speech Detection in Languages with Limited Labeled Data. In: The 7th Workshop on Online Abuse and Harms (WOAH), Toronto, Canada, 13 July 2023. Association for Computational Linguistics, 187-201.
- Detecting and Analysing Learner Difficulties Using a Learner Corpus Without Error Tagging. In: Harrington, Kieran; Ronan, Patricia. Demystifying Corpus Linguistics for English Language Teaching. Cham: Palgrave Macmillan, 229-257.
- Replicable semi-supervised approaches to state-of-the-art stance detection of tweets. Information Processing & Management, 60(2):103199.
- Do Non-native Speakers Read Differently? Predicting Reading Times with Surprisal and Language Models of Native and Non-native Eye Tracking Data. In: Busse, Beatrix; Dumrukcic, Nina; Kleiber, Ingo. Language and Linguistics in a Complex World. Berlin: De Gruyter, 153-188.
- Scaling Native Language Identification with Transformer Adapters. In: 5th International Conference on Natural Language and Speech Processing (ICNLSP), Trento, 16 December 2022 - 17 December 2022, Cornell University.
- Complementing Kernel Density Estimation and Topic Modelling to Visualise Political Discourse. In: Digital Research Data and Human Sciences DRDHum Conference 2022, Jyväskylä, Finland, 1 December 2022 - 3 December 2022. University of Jyväskylä, 12-27.
- Assessing How Attitudes to Migration in Social Media Complement Public Attitudes Found in Opinion Surveys. SPELL: Swiss Papers in English Language and Literature, 41:119-153.
- Systematically Detecting Patterns of Social, Historical and Linguistic Change: The Framing of Poverty in Times of Poverty. Transactions of the Philological Society, 120(3):447-473.
- Hypothesis Engineering for Zero-Shot Hate Speech Detection. In: Proceedings of the Third Workshop on Threat, Aggression and Cyberbullying (TRAC 2022), Gyeongju, Republic of Korea, 12 October 2022 - 17 October 2022. ACL, 75-90.
- Comparing the coverage of the “marriage for all” vote on Twitter and in the newspapers. In: 2nd Workshop on Computational Linguistics for Political Text Analysis (CPSS-2022), Potsdam, Germany, 12 September 2022. CPSS, 55-62.
- Correlations and predictions of reading times using language models and surprisal. In: Krug, Manfred; Schützler, Ole; Vetter, Fabian; Werner, Valentin. Perspectives on Contemporary English: Structure, Variation, Cognition. Berlin, Bern, Bruxelles, New York, Oxford, Warszawa, Wien: Peter Lang, 209-243.
- Medical topics and style from 1500 to 2018. In: Hiltunen, Turo; Taavitsainen, Irma. Corpus pragmatic studies on the history of medical discourse. Amsterdam: Benjamins, 49-78.
- Recent changes in spoken British English according to spoken BNC2014. In: Flach, Susanne; Hilpert, Martin. Broadening the spectrum of corpus linguistics: New approaches to variability and change. Amsterdam: John Benjamins Publishing, 173-195.
- Measuring Attitudes to Migration in the Media automatically with Complementary Data Sources and Methods. In: Ronan, Patricia; Ziegler, Evelyn. Approaches to Migration and Language Identity. Oxford, Bern, Berlin, Bruxelles, New York, Wien: Peter Lang, 207-252.
- Comparing data-driven to corpus-based approaches for diachronic variation: document-classification and overuse metrics. In: Schlüter, Julia; Schützler, Ole. Data and Methods in Corpus Linguistics: Comparative Approaches. Cambridge: Cambridge University Press, 291-322.
- Syntactic changes in verbal clauses and noun phrases from 1500 onwards. In: Los, Bettelou; Cowie, Claire; Honeybone, Patrick. English Historical Linguistics: Change in Structure and Meaning. Amsterdam: John Benjamins Publishing, 163-200.
- Challenges and best practices for digital unstructured data enrichment in health research: a systematic narrative review. medRxiv 22278137, Cold Spring Harbor Laboratory.
- With a little help from familiar interlocutors: real-world language use in young and older adults. Aging & Mental Health, 25(12):2310-2319.