With the rise of large language models (LLMs), computational methods are becoming increasingly popular in psycholinguistics. This post gathers key resources on using LLM-based surprisal for syntactic ambiguity research, as well as tutorials on training and evaluating these models.
Word surprisal, the negative log probability of a word given its preceding context, is a commonly used indicator of processing difficulty. Check out a unified interface for computing surprisal from language models, as well as tools for calculating psycholinguistically relevant metrics of language statistics using transformer language models.
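To make the definition concrete, here is a minimal sketch of the surprisal formula, S(w | context) = -log2 P(w | context), using a toy, made-up conditional probability table (the sentence fragments and probabilities are purely illustrative; in practice these probabilities would come from a language model):

```python
import math

# Toy conditional probabilities P(word | context), invented for illustration.
cond_probs = {
    ("the horse raced past the", "barn"): 0.02,
    ("the horse raced past the", "fence"): 0.10,
}

def surprisal(context, word):
    """Surprisal in bits: -log2 P(word | context).
    Higher values mean the word is less expected given its context."""
    return -math.log2(cond_probs[(context, word)])

# The less probable continuation carries more surprisal,
# which is why it is taken to index greater processing difficulty:
print(surprisal("the horse raced past the", "barn"))   # about 5.64 bits
print(surprisal("the horse raced past the", "fence"))  # about 3.32 bits
```

With an actual language model, the same formula applies, except that P(word | context) is read off the model's next-token probability distribution.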
For a large-scale dataset on using LLM surprisal to explain syntactic disambiguation difficulty, check out this paper: Syntactic Ambiguity Processing Benchmark.
These tutorials and resources on LLM training and evaluation from Professor Suhas Arehalli can be helpful for model training and manipulation.
In psycholinguistic research, various paradigms are used to investigate the mechanisms underlying language processing. Once a paradigm is chosen, the next step is to design materials and conduct experiments.
Paradigms (continually updated)
When materials involve images, this can be helpful: Pictures for experimental tasks. For audio stimuli, these tutorials and Praat scripts from Professor Danielle Daidone can be helpful for sound synthesis and manipulation.
When stimuli are ready to go, Pavlovia.org is a good platform for data collection.