Turning Text into Data

We will explore three methods to generate text-based variables, ranging from simple dictionary approaches to more complex, fine-tuned few-shot learning models using pre-trained LLMs. All examples in this post are implemented using FewShotX, a Python package for dictionary scoring, zero-shot, and few-shot learning in text classification. Explore the documentation and tutorials for hands-on notebooks.

1. Dictionary Methods

Dictionary methods have been widely used in economics to transform textual data into quantitative indicators. A notable example is the Economic Policy Uncertainty (EPU) index developed by Baker, Bloom, and Davis (2016), where the frequency of specific terms in newspaper articles is used to capture policy-related uncertainty over time. ...
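To make the dictionary idea concrete, here is a minimal sketch in plain Python. It does not use the FewShotX API. The term lists, the function names, and the two sample articles are all hypothetical and chosen only for illustration; they are not the dictionaries used by Baker, Bloom, and Davis. The sketch counts dictionary hits per category and flags an article when every category is matched at least once.

```python
import re

# Hypothetical term lists for illustration only; not the authors' dictionaries.
TERMS = {
    "economy": {"economy", "economic"},
    "policy": {"congress", "regulation", "legislation", "deficit"},
    "uncertainty": {"uncertain", "uncertainty"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def category_counts(text):
    """Count how many tokens fall into each dictionary category."""
    tokens = tokenize(text)
    return {
        name: sum(token in terms for token in tokens)
        for name, terms in TERMS.items()
    }


def is_flagged(text):
    """An article is flagged if it matches every category at least once."""
    return all(count > 0 for count in category_counts(text).values())


def flagged_share(articles):
    """Share of articles flagged by the dictionary (0.0 for an empty list)."""
    if not articles:
        return 0.0
    return sum(is_flagged(article) for article in articles) / len(articles)


# Made-up sample articles, not real data.
articles = [
    "Economic uncertainty rises as Congress debates new regulation.",
    "Local team wins the championship.",
]
share = flagged_share(articles)
```

Computing this share per period (for example, per month) would give a simple time series in the spirit of a dictionary-based index, though a real application would also need normalization across sources and a carefully validated term list.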

June 4, 2025 · 5 min · Renato Vassallo