Publications

Derivational morphology reveals analogical generalization in large language models

Hofmann V, Weissweiler L, Mortensen DR, Schütze H & Pierrehumbert JB (2025), Proceedings of the National Academy of Sciences, 122(19)

BibTeX

@article{derivationalmor-2025/5,
  title={Derivational morphology reveals analogical generalization in large language models},
  author={Hofmann, V and Weissweiler, L and Mortensen, DR and Schütze, H and Pierrehumbert, JB},
  journal={Proceedings of the National Academy of Sciences},
  volume={122},
  number={19},
  pages={e2423232122},
  publisher={National Academy of Sciences},
  year = "2025"
}

Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups

Drinkall F, Zohren S, McMahon M & Pierrehumbert JB (2025)

BibTeX

@misc{storiesthatarem-2025/2,
  title={Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups},
  author={Drinkall, F and Zohren, S and McMahon, M and Pierrehumbert, JB},
  year = "2025"
}

When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks

Drinkall F, Pierrehumbert JB & Zohren S (2025)

BibTeX

@misc{whendimensional-2025/2,
  title={When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks},
  author={Drinkall, F and Pierrehumbert, JB and Zohren, S},
  year = "2025"
}

Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs

Drinkall F, Pierrehumbert JB & Zohren S (2025), Proceedings - International Conference on Computational Linguistics, COLING, 118-133

BibTeX

@inproceedings{forecastingcred-2025/1,
  title={Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs},
  booktitle={Proceedings - International Conference on Computational Linguistics, COLING},
  author={Drinkall, F and Pierrehumbert, JB and Zohren, S},
  pages={118-133},
  year = "2025"
}

One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks

Lin F, Mao S, La Malfa E, Hofmann V, de Wynter A et al. (2024)

BibTeX

@misc{onelanguagemany-2024/10,
  title={One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks},
  author={Lin, F and Mao, S and La Malfa, E and Hofmann, V and de Wynter, A and others},
  year = "2024"
}

Conspiracy detection beyond text: exploring the feasibility of adding psycho-linguistic features to enhance conspiracy detection models

George AR, Ahrens M, Pierrehumbert JB & McMahon M (2024), 32-45

BibTeX

@misc{conspiracydetec-2024/8,
  title={Conspiracy detection beyond text: exploring the feasibility of adding psycho-linguistic features to enhance conspiracy detection models},
  author={George, AR and Ahrens, M and Pierrehumbert, JB and McMahon, M},
  year = "2024"
}

Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs

Drinkall F, Pierrehumbert JB & Zohren S (2024)

BibTeX

@misc{forecastingcred-2024/7,
  title={Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs},
  author={Drinkall, F and Pierrehumbert, JB and Zohren, S},
  year = "2024"
}

Time machine GPT

Drinkall F, Rahimikia E, Pierrehumbert J & Zohren S (2024)

BibTeX

@inproceedings{timemachinegpt-2024/6,
  title={Time machine GPT},
  author={Drinkall, F and Rahimikia, E and Pierrehumbert, J and Zohren, S},
  booktitle={2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
  year = "2024"
}

Probing large language models for scalar adjective lexical semantics and scalar diversity pragmatics

Lin F, Altshuler D & Pierrehumbert J (2024), Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 13033-13049

BibTeX

@inproceedings{probinglargelan-2024/5,
  title={Probing large language models for scalar adjective lexical semantics and scalar diversity pragmatics},
  author={Lin, F and Altshuler, D and Pierrehumbert, J},
  booktitle={Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
  pages={13033-13049},
  year = "2024"
}

STEntConv: predicting disagreement between Reddit users with stance detection and a signed graph convolutional network

Lorge I, Zhang L, Dong X & Pierrehumbert J (2024), Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 15273-15284

BibTeX

@inproceedings{stentconvpredic-2024/5,
  title={STEntConv: predicting disagreement between Reddit users with stance detection and a signed graph convolutional network},
  author={Lorge, I and Zhang, L and Dong, X and Pierrehumbert, J},
  booktitle={2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings},
  pages={15273-15284},
  year = "2024"
}

Graph-enhanced large language models in asynchronous plan reasoning

Lin F, La Malfa E, Hofmann V, Yang R, Cohn AG et al. (2024), Proceedings of the 41st International Conference on Machine Learning (ICML 2024), 30108-30134

BibTeX

@inproceedings{graphenhancedla-2024/5,
  title={Graph-enhanced large language models in asynchronous plan reasoning},
  author={Lin, F and La Malfa, E and Hofmann, V and Yang, R and Cohn, AG and others},
  booktitle={41st International Conference on Machine Learning (ICML 2024)},
  pages={30108-30134},
  year = "2024"
}

Geographic adaptation of pretrained language models

Hofmann V, Glavaš G, Ljubešić N, Pierrehumbert JB & Schütze H (2024), Transactions of the Association for Computational Linguistics, 12, 411-431

BibTeX

@article{geographicadapt-2024/4,
  title={Geographic adaptation of pretrained language models},
  author={Hofmann, V and Glavaš, G and Ljubešić, N and Pierrehumbert, JB and Schütze, H},
  journal={Transactions of the Association for Computational Linguistics},
  volume={12},
  pages={411-431},
  publisher={Massachusetts Institute of Technology Press},
  year = "2024"
}

Time Machine GPT

Drinkall F, Rahimikia E, Pierrehumbert JB & Zohren S (2024)

BibTeX

@misc{timemachinegpt-2024/4,
  title={Time Machine GPT},
  author={Drinkall, F and Rahimikia, E and Pierrehumbert, JB and Zohren, S},
  year = "2024"
}

Graph-enhanced Large Language Models in Asynchronous Plan Reasoning

Lin F, La Malfa E, Hofmann V, Yang EM, Cohn A et al. (2024)

BibTeX

@misc{graphenhancedla-2024/2,
  title={Graph-enhanced Large Language Models in Asynchronous Plan Reasoning},
  author={Lin, F and La Malfa, E and Hofmann, V and Yang, EM and Cohn, A and others},
  year = "2024"
}

Not wacky vs. definitely wacky: a study of scalar adverbs in pretrained language models

Lorge I & Pierrehumbert JB (2023), Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, 296-316

BibTeX

@inproceedings{notwackyvsdefin-2023/12,
  title={Not wacky vs. definitely wacky: a study of scalar adverbs in pretrained language models},
  author={Lorge, I and Pierrehumbert, JB},
  booktitle={6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP},
  pages={296-316},
  year = "2023"
}

Unsupervised detection of contextualized embedding bias with application to ideology

Hofmann V, Pierrehumbert J & Schütze H (2022), Proceedings of the 39th International Conference on Machine Learning (ICML 2022), 162, 8796-8810

BibTeX

@inproceedings{unsuperviseddet-2022/7,
  title={Unsupervised detection of contextualized embedding bias with application to ideology},
  author={Hofmann, V and Pierrehumbert, J and Schütze, H},
  booktitle={39th International Conference on Machine Learning (ICML 2022)},
  pages={8796-8810},
  year = "2022"
}

Modeling ideological salience and framing in polarized online groups with graph neural networks and structured sparsity

Hofmann V, Dong X, Pierrehumbert J & Schuetze H (2022), Findings of the Association for Computational Linguistics: NAACL 2022, 536-550

BibTeX

@inproceedings{modelingideolog-2022/7,
  title={Modeling ideological salience and framing in polarized online groups with graph neural networks and structured sparsity},
  author={Hofmann, V and Dong, X and Pierrehumbert, J and Schuetze, H},
  booktitle={2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022)},
  pages={536-550},
  year = "2022"
}

Two contrasting data annotation paradigms for subjective NLP tasks

Röttger P, Vidgen B, Hovy D & Pierrehumbert JB (2022), Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 175-184

BibTeX

@inproceedings{twocontrastingd-2022/7,
  title={Two contrasting data annotation paradigms for subjective NLP tasks},
  author={Röttger, P and Vidgen, B and Hovy, D and Pierrehumbert, JB},
  booktitle={2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022)},
  pages={175-184},
  year = "2022"
}

Forecasting COVID-19 caseloads using unsupervised embedding clusters of social media posts

Drinkall F, Zohren S & Pierrehumbert JB (2022), Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1471-1484

BibTeX

@inproceedings{forecastingcovi-2022/7,
  title={Forecasting COVID-19 caseloads using unsupervised embedding clusters of social media posts},
  author={Drinkall, F and Zohren, S and Pierrehumbert, JB},
  booktitle={2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022)},
  pages={1471-1484},
  year = "2022"
}

An embarrassingly simple method to mitigate undesirable properties of pretrained language model tokenizers

Hofmann V, Schuetze H & Pierrehumbert J (2022), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022)

BibTeX

@inproceedings{anembarrassingl-2022/6,
  title={An embarrassingly simple method to mitigate undesirable properties of pretrained language model tokenizers},
  author={Hofmann, V and Schuetze, H and Pierrehumbert, J},
  booktitle={60th Annual Meeting of the Association for Computational Linguistics (ACL 2022)},
  year = "2022"
}

Phonotactic and morphological effects in the acceptability of pseudowords

Needle JM, Pierrehumbert JB & Hay JB (2022), 79-112

BibTeX

@misc{phonotacticandm-2022/5,
  title={Phonotactic and morphological effects in the acceptability of pseudowords},
  author={Needle, JM and Pierrehumbert, JB and Hay, JB},
  year = "2022"
}

The Reddit Politosphere: a large-scale text and network resource of online political discourse

Hofmann V, Schütze H & Pierrehumbert J (2022), Proceedings of the International AAAI Conference on Web and Social Media, 16(1), 1259-1267

BibTeX

@inproceedings{theredditpolito-2022/5,
  title={The Reddit Politosphere: a large-scale text and network resource of online political discourse},
  author={Hofmann, V and Schütze, H and Pierrehumbert, J},
  booktitle={International AAAI Conference on Web and Social Media},
  pages={1259-1267},
  year = "2022"
}

Forecasting COVID-19 Caseloads Using Unsupervised Embedding Clusters of Social Media Posts

Drinkall F, Zohren S & Pierrehumbert JB (2022)

BibTeX

@misc{forecastingcovi-2022/5,
  title={Forecasting COVID-19 Caseloads Using Unsupervised Embedding Clusters of Social Media Posts},
  author={Drinkall, F and Zohren, S and Pierrehumbert, JB},
  year = "2022"
}

Commentary on Chapter 11: Comparing the PENTA model to autosegmental-metrical phonology

Pierrehumbert J (2022), 408-424

BibTeX

@misc{commentaryoncha-2022/2,
  title={Commentary on Chapter 11: Comparing the PENTA model to autosegmental-metrical phonology},
  author={Pierrehumbert, J},
  year = "2022"
}

Temporal adaptation of BERT and performance on downstream document classification: insights from social media

Röttger P & Pierrehumbert JB (2021), Findings of the Association for Computational Linguistics: EMNLP 2021, 2400-2412

BibTeX

@inproceedings{temporaladaptat-2021/12,
  title={Temporal adaptation of BERT and performance on downstream document classification: insights from social media},
  author={Röttger, P and Pierrehumbert, JB},
  booktitle={2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)},
  pages={2400-2412},
  year = "2021"
}

Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks

Röttger P, Vidgen B, Hovy D & Pierrehumbert JB (2021)

BibTeX

@misc{twocontrastingd-2021/12,
  title={Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks},
  author={Röttger, P and Vidgen, B and Hovy, D and Pierrehumbert, JB},
  year = "2021"
}

Predicting COVID-19 cases using Reddit posts and other online resources

Drinkall F & Pierrehumbert JB (2021), Proceedings of the Swiss Text Analytics Conference 2021

BibTeX

@inproceedings{predictingcovid-2021/9,
  title={Predicting COVID-19 cases using Reddit posts and other online resources},
  author={Drinkall, F and Pierrehumbert, JB},
  booktitle={Swiss Text Analytics Conference 2021},
  year = "2021"
}

Superbizarre is not superb: derivational morphology improves BERT’s interpretation of complex words

Hofmann V, Pierrehumbert JB & Schuetze H (2021), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 3594-3608

BibTeX

@inproceedings{superbizarreisn-2021/8,
  title={Superbizarre is not superb: derivational morphology improves BERT’s interpretation of complex words},
  author={Hofmann, V and Pierrehumbert, JB and Schuetze, H},
  booktitle={59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing},
  pages={3594-3608},
  year = "2021"
}

Dynamic contextualized word embeddings

Hofmann V, Pierrehumbert JB & Schütze H (2021), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 6970-6984

BibTeX

@inproceedings{dynamiccontextu-2021/8,
  title={Dynamic contextualized word embeddings},
  author={Hofmann, V and Pierrehumbert, JB and Schütze, H},
  booktitle={Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)},
  pages={6970-6984},
  year = "2021"
}

Familiarity, consistency, and systematizing in morphology

Schumacher RA & Pierrehumbert JB (2021), Cognition, 212, 104512

BibTeX

@article{familiaritycons-2021/7,
  title={Familiarity, consistency, and systematizing in morphology},
  author={Schumacher, RA and Pierrehumbert, JB},
  journal={Cognition},
  volume={212},
  pages={104512},
  year = "2021"
}

HateCheck: functional tests for hate speech detection models

Röttger P, Vidgen B, Dong N, Waseem Z, Margetts H et al. (2021), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 41-58

BibTeX

@inproceedings{hatecheckfuncti-2021/7,
  title={HateCheck: functional tests for hate speech detection models},
  author={Röttger, P and Vidgen, B and Dong, N and Waseem, Z and Margetts, H and others},
  booktitle={59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)},
  pages={41-58},
  year = "2021"
}

Familiarity, consistency, and systematizing in morphology

Schumacher RA & Pierrehumbert J (2021), Cognition, 212

BibTeX

@article{familiaritycons-2021/4,
  title={Familiarity, consistency, and systematizing in morphology},
  author={Schumacher, RA and Pierrehumbert, J},
  journal={Cognition},
  volume={212},
  pages={104512},
  publisher={Elsevier},
  year = "2021"
}

Morphological convergence as on-line lexical analogy

Pierrehumbert J, Racz P, Beckner C & Hay JB (2020), Language (Washington), 96(4), 735-770

BibTeX

@article{morphologicalco-2020/12,
  title={Morphological convergence as on-line lexical analogy},
  author={Pierrehumbert, J and Racz, P and Beckner, C and Hay, JB},
  journal={Language (Washington)},
  volume={96},
  pages={735-770},
  publisher={Linguistic Society of America},
  year = "2020"
}

HateCheck: Functional Tests for Hate Speech Detection Models

Röttger P, Vidgen B, Nguyen D, Waseem Z, Margetts H et al. (2020)

BibTeX

@misc{hatecheckfuncti-2020/12,
  title={HateCheck: Functional Tests for Hate Speech Detection Models},
  author={Röttger, P and Vidgen, B and Nguyen, D and Waseem, Z and Margetts, H and others},
  year = "2020"
}

DagoBERT: generating derivational morphology with a pretrained language model

Hofmann V, Pierrehumbert J & Schütze H (2020), Proceedings of the Conference on Empirical Methods in Natural Language Processing (and forerunners) (EMNLP), 3848-3861

BibTeX

@inproceedings{dagobertgenerat-2020/11,
  title={DagoBERT: generating derivational morphology with a pretrained language model},
  author={Hofmann, V and Pierrehumbert, J and Schütze, H},
  booktitle={The 2020 Conference on Empirical Methods in Natural Language Processing},
  pages={3848-3861},
  year = "2020"
}

Not All Indexical Cues Are Equal: Differential Sensitivity to Dimensions of Indexical Meaning in an Artificial Language

Rácz P, Hay JB & Pierrehumbert JB (2020), Language Learning, 70(3), 848-885

BibTeX

@article{notallindexical-2020/9,
  title={Not All Indexical Cues Are Equal: Differential Sensitivity to Dimensions of Indexical Meaning in an Artificial Language},
  author={Rácz, P and Hay, JB and Pierrehumbert, JB},
  journal={Language Learning},
  volume={70},
  pages={848-885},
  publisher={Wiley},
  year = "2020"
}

Predicting the growth of morphological families from social and linguistic factors

Hofmann V, Pierrehumbert J & Schütze H (2020), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 7273-7283

BibTeX

@inproceedings{predictingthegr-2020/7,
  title={Predicting the growth of morphological families from social and linguistic factors},
  author={Hofmann, V and Pierrehumbert, J and Schütze, H},
  booktitle={2020 Annual Conference of the Association for Computational Linguistics},
  pages={7273-7283},
  year = "2020"
}

A graph auto-encoder model of derivational morphology

Hofmann V, Schütze H & Pierrehumbert JB (2020), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 1127-1138

BibTeX

@inproceedings{agraphautoencod-2020/7,
  title={A graph auto-encoder model of derivational morphology},
  author={Hofmann, V and Schütze, H and Pierrehumbert, JB},
  booktitle={2020 Annual Conference of the Association for Computational Linguistics},
  pages={1127-1138},
  year = "2020"
}

Morphological convergence as on-line lexical analogy

Rácz P, Beckner C, Hay JB & Pierrehumbert JB (2020), Language, 96(4), 735-770

BibTeX

@article{morphologicalco-2020/,
  title={Morphological convergence as on-line lexical analogy},
  author={Rácz, P and Beckner, C and Hay, JB and Pierrehumbert, JB},
  journal={Language},
  volume={96},
  pages={735-770},
  publisher={Johns Hopkins University Press},
  year = "2020"
}

Word frequency effects in sound change as a consequence of perceptual asymmetries: An exemplar-based model

Todd S, Pierrehumbert JB & Hay J (2019), Cognition, 185, 1-20

BibTeX

@article{wordfrequencyef-2019/4,
  title={Word frequency effects in sound change as a consequence of perceptual asymmetries: An exemplar-based model},
  author={Todd, S and Pierrehumbert, JB and Hay, J},
  journal={Cognition},
  volume={185},
  pages={1-20},
  year = "2019"
}

On hapax legomena and morphological productivity

Pierrehumbert J & Granell R (2018), Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology, 125-130

BibTeX

@inproceedings{onhapaxlegomena-2018/10,
  title={On hapax legomena and morphological productivity},
  booktitle={Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology},
  author={Pierrehumbert, J and Granell, R},
  pages={125-130},
  year = "2018"
}

Gendered Associations of English Morphology

Needle JM & Pierrehumbert JB (2018), Laboratory Phonology, 9(1)

BibTeX

@article{genderedassocia-2018/9,
  title={Gendered Associations of English Morphology},
  author={Needle, JM and Pierrehumbert, JB},
  journal={Laboratory Phonology},
  volume={9},
  number={1},
  pages={14},
  year = "2018"
}

On Hapax Legomena and Morphological Productivity

Pierrehumbert J & Granell R (2018), 125-130

BibTeX

@inproceedings{onhapaxlegomena-2018/1,
  title={On Hapax Legomena and Morphological Productivity},
  author={Pierrehumbert, J and Granell, R},
  booktitle={Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology},
  pages={125-130},
  year = "2018"
}

The emergence of linguistic structure in an online iterated learning task

Beckner C, Pierrehumbert JB & Hay JB (2017), Journal of Language Evolution, 2(2), 160-176

BibTeX

@article{theemergenceofl-2017/4,
  title={The emergence of linguistic structure in an online iterated learning task},
  author={Beckner, C and Pierrehumbert, JB and Hay, JB},
  journal={Journal of Language Evolution},
  volume={2},
  pages={160-176},
  publisher={Oxford University Press},
  year = "2017"
}

Prior expectations in linguistic learning: a stochastic model of individual differences

Schumacher RA & Pierrehumbert JB (2017), Proceedings of the 39th Meeting of the Cognitive Science Society

BibTeX

@inproceedings{priorexpectatio-2017/1,
  title={Prior expectations in linguistic learning: a stochastic model of individual differences},
  booktitle={Proceedings of the 39th Meeting of the Cognitive Science Society},
  author={Schumacher, RA and Pierrehumbert, JB},
  year = "2017"
}

Social salience discriminates learnability of contextual cues in an artificial language

Rácz P, Hay JB & Pierrehumbert JB (2017), Frontiers in Psychology, 8, 51

BibTeX

@article{socialsalienced-2017/1,
  title={Social salience discriminates learnability of contextual cues in an artificial language},
  author={Rácz, P and Hay, JB and Pierrehumbert, JB},
  journal={Frontiers in Psychology},
  volume={8},
  pages={51},
  publisher={Frontiers Media},
  year = "2017"
}

Gradient Maori phonotactics

Rácz P, Hay JB, Needle J, King J & Pierrehumbert JB (2016), Te Reo, 59, 3-21

BibTeX

@article{gradientmaoriph-2016/11,
  title={Gradient Maori phonotactics},
  author={Rácz, P and Hay, JB and Needle, J and King, J and Pierrehumbert, JB},
  journal={Te Reo},
  volume={59},
  pages={3-21},
  publisher={Linguistic Society of New Zealand},
  year = "2016"
}

Variation in the strength of lexical encoding across dialects

Clopper CG, Tamati TN & Pierrehumbert JB (2016), Journal of Phonetics, 58, 87-103

BibTeX

@article{variationinthes-2016/7,
  title={Variation in the strength of lexical encoding across dialects},
  author={Clopper, CG and Tamati, TN and Pierrehumbert, JB},
  journal={Journal of Phonetics},
  volume={58},
  pages={87-103},
  publisher={Elsevier},
  year = "2016"
}

Phonological representation: beyond abstract versus episodic

Pierrehumbert J (2016), Annual Review of Linguistics, 2(1), 33-52

BibTeX

@article{phonologicalrep-2016/1,
  title={Phonological representation: beyond abstract versus episodic},
  author={Pierrehumbert, J},
  journal={Annual Review of Linguistics},
  volume={2},
  pages={33-52},
  publisher={Annual Reviews},
  year = "2016"
}

Using pronunciation-based morphological subword units to improve OOV handling in keyword search

He Y, Baumann P, Fang H, Hutchinson B, Jaech A et al. (2015), IEEE/ACM Transactions on Audio, Speech and Language Processing, 24(1), 79-92

BibTeX

@article{usingpronunciat-2015/10,
  title={Using pronunciation-based morphological subword units to improve OOV handling in keyword search},
  author={He, Y and Baumann, P and Fang, H and Hutchinson, B and Jaech, A and others},
  journal={IEEE/ACM Transactions on Audio, Speech and Language Processing},
  volume={24},
  pages={79-92},
  publisher={Institute of Electrical and Electronics Engineers},
  year = "2015"
}

Showing 50 publications by Janet B. Pierrehumbert
