20 February 2024: Both of our submissions to LREC-COLING have been accepted! At the meeting in Torino (20-25 May), Fangru will present her paper on “Probing Large Language Models for Scalar Adjective Lexical Semantics and Scalar Diversity Pragmatics” and Isabelle Lorge will present “STEntConv: Predicting Disagreement between Reddit Users with Stance Detection and a Signed Graph Convolutional Network”.


6-10 Dec 2023: Our group made a good showing at EMNLP 2023 (6-10 December in Singapore). Isabelle presented her paper on “Not wacky vs. definitely wacky..” at the BlackBox NLP workshop. Valentin was a coauthor on “Counting Bugs in ChatGPT’s Wugs …”. Lab alum Paul Röttger was also represented with a paper on “The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values”.

28 Nov 2023: Fangru, Felix, Dragos, and Janet all participated in a full day workshop at the Alan Turing Institute on the evaluation of foundation models.

12 Oct 2023: Valentin is a coauthor of a paper in Frontiers in Artificial Intelligence, “Explaining pretrained language models’ understanding of linguistic structures using construction grammar”.

1 Oct 2023: A warm welcome to Hazel Kim, an ELLIS fellow in CS who is co-supervised by Yarin Gal.

1 Oct 2023: Congratulations to Valentin for starting his postdoc at the Allen Institute for Artificial Intelligence in Seattle.

28-29 June 2023: The NLP SoDaS Conference (Natural Language Processing for Social Data Science) featured a half-day tutorial on theory and coding, by Felix, Ved, and Dragos, as well as an invited talk by Janet on “Natural language processing versus the heterogenous world of natural language”.

17 June 2023: Congratulations to Fernando and Becky on successfully completing their 4YP projects, and all the best for their new jobs in fintech.

16 June 2023: Fernando’s 4YP project on “UPSTREAMQA: Unanswerable and Paraphrased Questions for StreamingQA” has won a departmental best-poster award in the category of Business Innovation. Look for it on the display monitor in the OeRC front lobby!

May 2023: Congratulations to Paul on completing his DPhil! Paul is moving on to a postdoc position with Dirk Hovy at the MilaNLP Lab of Bocconi University.

May 2023: Paul’s HateCheck project won the Stanford AI Audit Challenge in the category of “Best Holistic Evaluation and Benchmarking”!


9-11 December 2022: Paul presented his paper “Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages” at EMNLP 2022. A paper co-authored by Valentin Hofmann, “The Better Your Syntax, the Better Your Semantics? Probing Pretrained Language Models for the English Comparative Correlative” was presented by his collaborator Leonie Weissweiler.

7 December 2022: Paul was a panelist at the EMNLP 2022 birds-of-a-feather session on “Hate Speech Detection in Low-Resource Languages”.

7 December 2022: Congratulations to Valentin Hofmann for passing his confirmation of status!

14-15 November 2022: Valentin participated in the Rising Stars in Data Science Workshop at the University of Chicago Data Science Institute (DSI).

17 October 2022: The OxNLP Reading Group was launched with a presentation by Dragos Gorduza on a landmark paper by Yi R Fung et al., “InfoSurgeon: Cross-Media Fine-grained Information Consistency Checking for Fake News Detection”.

13 October 2022: Paul gave a talk on “Is Personalised Content Moderation a Good Idea” at the Conference on Truth and Trust Online (TTO 2022) in Boston.

10 October 2022: Welcome to new DPhil students Alba Su and Ved Mathai!

30 June 2022: Congratulations to Paul Röttger on passing his confirmation of status!

23 June 2022: Janet gave a talk on “Categorization and generalization in morphology” at the Oxford Berlin Workshop on Morphology. She also gave a flash talk about NLP at the WiE event celebrating International Women in Engineering Day.

22-23 June 2022: Felix and Janet were both involved in the OMI Conference on NLP in Economics and Finance. Felix co-taught a tutorial on “Using state-of-the-art language models for economic and financial modelling”, and Janet opened the meeting with a talk on “The Successes and Failures of NLP in forecasting”. A great conference that also included speakers from Bloomberg, DeepMind, Amazon, and numerous universities.

20 June 2022: Welcome to Li Zhang! Li finally got her visa and has arrived from China to join Isabelle as a PDRA on the "Exaggeration and Fragmentation..." project.

14-16 June 2022: Janet participated in a symposium held in York in honour of a grande dame of language acquisition research, Marilyn Vihman. Her talk was entitled “On the foundations of systematicity in the lexicon”.

31 May 2022: Congratulations to Alex and Mia for submitting excellent 4YP papers and doing a fine job in their vivas.

8 April 2022: 100% success with NAACL submissions!

  • Our members have had the following papers accepted for the meeting of the North American Chapter of the Association for Computational Linguistics (NAACL) in Seattle.
    • Drinkall, Zohren & Pierrehumbert. “Forecasting COVID-19 Caseloads Using Unsupervised Embedding Clusters of Social Media Posts” (main session).
    • Röttger, Vidgen, Hovy & Pierrehumbert. “Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks” (main session).
    • Kirk, Vidgen, Röttger, Thrush & Hale. “Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate” (main session).
    • Hofmann, Pierrehumbert & Schütze. “Modeling Ideological Agenda Setting and Framing in Polarized Online Groups with Graph Neural Networks and Structured Sparsity” (Findings).
    • Röttger, Seelawi, Nozza, Talat & Vidgen. “Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models”.

4 April 2022: Welcome to Isabelle

  • Isabelle Lorge has joined us as a PDRA in text-mining and experimental semantics. She has a PhD in Linguistics from Cambridge and experience in the high-tech sector, working for Arabesque AI on analyzing news and social media for stock market prediction.

1 April 2022: Paul is off to Milan

  • Paul starts an extended visit to Dirk Hovy’s group at Bocconi University in Milan.

16 March 2022: ICWSM success.

  • Hofmann, Schütze and Pierrehumbert, “The Reddit Politosphere: A Large-Scale Text and Network Resource of Online Political Discourse” has been accepted for the International AAAI Conference on Web and Social Media (ICWSM 2022).

4 March 2022: Congratulations to Alex

  • Alex Goldie has been offered and has accepted a DPhil place in the highly competitive Oxford AIMS (Autonomous Intelligent Machines and Systems) CDT Program.

1 March 2022: Valentin to DeepMind

  • Valentin Hofmann starts his internship in DeepMind’s Language Team.

1 March 2022: Janet joins PNAS Editorial Board

  • Janet joins the Editorial Board of PNAS (Proceedings of the National Academy of Sciences), where she will handle submissions in cognitive science and natural language processing. Together with Nature and Science, PNAS is one of the very top interdisciplinary scientific journals.

24 February 2022: Two papers accepted for ACL 2022

  • Hofmann, Schütze and Pierrehumbert, “An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers”
  • Weissweiler, Hofmann, Sabet & Schütze, “CaMEL: Case marker extraction without labels”

9 February 2022: MPP 2022

  • Janet delivered a plenary address entitled “Using orthographic data to explore the mental lexicon” at MPP 2022 (Morphology in Phonetics and Perception), held virtually over Gather in Düsseldorf.


November 2021: Paper at EMNLP 2021 (Findings)

  • Temporal adaptation of BERT and performance on downstream document classification: Insights from social media

August 2021: Three papers at ACL 2021

  • Dynamic contextualized word embeddings
  • HateCheck: Functional tests for hate speech detection models
  • Superbizarre is not superb: Derivational morphology improves BERT’s interpretation of complex words

June 2021: Paper at SwissText 2021

  • Predicting COVID-19 cases using Reddit posts and other online resources


November 2020: Paper at EMNLP 2020

  • DagoBERT: Generating derivational morphology with a pretrained language model

July 2020: Two papers at ACL 2020

  • A graph auto-encoder model of derivational morphology
  • Predicting the growth of morphological families from social and linguistic factors