
Forecasting and Nowcasting with Text as Data

Master's in Data Science for Decision Making · Barcelona School of Economics

2025-2026

We will learn how to transform unstructured text into usable signals, evaluate models in applied settings, and connect these tools to questions in the social sciences and policy.

The course covers a range of methods—from simple dictionary-based approaches to supervised models and modern LLM-based techniques—highlighting their strengths, limitations, and trade-offs. A central focus is on building text-based indicators to nowcast real-world events and forecast risks in applied contexts.
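
As a rough illustration of the simplest end of that range, the sketch below builds a dictionary-based daily indicator in Python. The term list, function names, and toy headlines are all hypothetical and not taken from the course materials. It assumes exact token matching, so inflected forms such as "protests" would be missed.

```python
import re
from collections import defaultdict

# Hypothetical term list for illustration only; a real dictionary would be
# curated and validated for the application at hand.
CONFLICT_TERMS = {"protest", "strike", "clash", "riot", "attack"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def dictionary_share(text: str, terms: set[str]) -> float:
    """Return the share of tokens in `text` that appear in `terms`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in terms)
    return hits / len(tokens)


def daily_indicator(docs: list[tuple[str, str]], terms: set[str]) -> dict[str, float]:
    """Average the dictionary share over all documents sharing a date label."""
    shares_by_date: dict[str, list[float]] = defaultdict(list)
    for date, text in docs:
        shares_by_date[date].append(dictionary_share(text, terms))
    return {date: sum(v) / len(v) for date, v in shares_by_date.items()}


# Toy (date, headline) pairs, made up for this example.
toy_docs = [
    ("day-1", "Workers strike as protest grows"),
    ("day-1", "Markets calm after quiet session"),
    ("day-2", "Riot police clash with crowd"),
]
indicator = daily_indicator(toy_docs, CONFLICT_TERMS)
for date, value in sorted(indicator.items()):
    print(date, round(value, 3))
```

The resulting series is transparent and cheap to compute, but it ignores context and negation, which is one of the trade-offs that motivates supervised and LLM-based alternatives.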

Particular emphasis is placed on fine-tuning, model evaluation, and threshold selection, with decisions guided by policy-relevant trade-offs (e.g., false positives vs. missed events). We also discuss why frequency mismatches in data matter, and introduce mixed-frequency methods to better integrate information from different sources.
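
To make the false-positive versus missed-event trade-off concrete, here is a minimal sketch of cost-based threshold selection. It assumes a classifier that outputs a risk score per observation and that the relative costs of each error type are set by the decision maker. The labels, scores, costs, and function names below are hypothetical.

```python
def misclassification_cost(y_true, scores, threshold, cost_fp, cost_fn):
    """Total cost of flagging every observation with score >= threshold."""
    false_pos = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    false_neg = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    return cost_fp * false_pos + cost_fn * false_neg


def best_threshold(y_true, scores, cost_fp, cost_fn):
    """Pick the lowest-cost threshold among the observed scores.

    float("inf") is added as a candidate so that "never flag" is an option.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: misclassification_cost(y_true, scores, t, cost_fp, cost_fn),
    )


# Toy validation data, made up for this example: 1 = event occurred.
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]

# Missing an event is five times as costly as a false alarm.
cautious = best_threshold(y_true, scores, cost_fp=1, cost_fn=5)
# A false alarm is five times as costly as a missed event.
conservative = best_threshold(y_true, scores, cost_fp=5, cost_fn=1)
print(cautious, conservative)
```

With these toy numbers, the threshold drops when missed events are expensive and rises when false alarms are expensive, so the same model supports different decisions depending on the policy objective. In practice the threshold should be chosen on held-out data rather than on the sample used to fit the model.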

By the end of the course, students will be able to construct text-based indicators using state-of-the-art methods that capture semantic and contextual information, and deploy them in decision-oriented applications.

Class Schedule

Office hours: Available by prior request.

Session 1

Thursday 30 April

15:00-17:00

24.009 (Ciutadella)

Session 2

Wednesday 06 May

08:30-12:30

24.112 (Ciutadella)

Session 3

Friday 08 May

12:30-14:30

24.009 (Ciutadella)

Environment Setup

Please complete this setup before Session 1.

We will use the materials in the GitHub repository below. Before class, please make sure you have Python 3.11, Git, and Visual Studio Code with the Python, Pylance, and Jupyter extensions installed.

Repository BSE-ForecastNLP

Step 01
Verify Python 3.11
Check that Python 3.11 is available on your machine.
python3.11 --version
If Python 3.11 is not installed, download it from the official Python website.
Step 02
Clone the course repository
git clone https://github.com/RenatoVassallo/BSE-ForecastNLP.git
cd BSE-ForecastNLP
Step 03
Create and activate the virtual environment
macOS / Linux
python3.11 -m venv .venv
source .venv/bin/activate
Windows
python -m venv .venv
.venv\Scripts\activate
Step 04
Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
This step may take a few minutes depending on your system.
Step 05
Select the environment in VS Code
Open a notebook in VS Code and select the .venv environment as the kernel.
You can then open session1/test.ipynb and run it to confirm that the installation was successful.
Optional
uv setup
If you are already familiar with uv, you can use it instead.
uv sync
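
As an optional sanity check alongside the steps above, the short Python sketch below reports whether the running interpreter is version 3.11 and whether it is running inside a virtual environment. It is not the content of session1/test.ipynb, and the function names are hypothetical. It uses only the standard library.

```python
import sys


def version_matches(version_info, required=(3, 11)) -> bool:
    """Return True if the major and minor version equal `required`."""
    return tuple(version_info[:2]) == tuple(required)


def in_virtualenv() -> bool:
    """Return True if the interpreter runs inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


print("Python version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Is Python 3.11:", version_matches(sys.version_info))
print("Inside a virtual environment:", in_virtualenv())
```

Run it from the activated environment or from a notebook cell that uses the .venv kernel; both checks should report True if the setup worked.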

Detailed installation guide for macOS and Windows (PDF)

Course Materials

Slides, code, and additional resources will be posted below as the course progresses.

Introduction

Course framing, objectives, and workflow

Slides

Session 01

From text to signals

Dictionaries, model-based approaches and LLMs

Slides Data

Session 02

From signals to decisions

Fine-tuning and policy-oriented evaluation

Slides Data

Session 03

Mixed-frequency methods

Classic MIDAS and machine learning extensions

Slides Data
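
For orientation on the Session 03 topic, the sketch below shows the exponential Almon weighting commonly used in classic MIDAS to collapse a high-frequency series (for example, a daily text signal) into one low-frequency regressor. The parameter values and the toy series are hypothetical. In an actual MIDAS regression the weight parameters are estimated jointly with the regression coefficients, whereas here they are fixed by hand.

```python
import math


def exp_almon_weights(n_lags: int, theta1: float, theta2: float) -> list[float]:
    """Exponential Almon lag weights, normalized to sum to one.

    The weight for lag k is proportional to exp(theta1 * k + theta2 * k**2).
    """
    raw = [math.exp(theta1 * k + theta2 * k**2) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [value / total for value in raw]


def midas_aggregate(high_freq: list[float], weights: list[float]) -> float:
    """Weighted sum of the most recent high-frequency observations.

    `high_freq` is ordered oldest to newest; weights[0] applies to the
    most recent observation.
    """
    if len(high_freq) < len(weights):
        raise ValueError("Need at least as many observations as weights.")
    recent_first = high_freq[::-1][: len(weights)]
    return sum(w * x for w, x in zip(weights, recent_first))


# Toy daily signal for one period, made up for this example.
daily_signal = [0.10, 0.20, 0.15, 0.40, 0.55]

flat = exp_almon_weights(5, theta1=0.0, theta2=0.0)       # equal weights
decaying = exp_almon_weights(5, theta1=-0.5, theta2=0.0)  # recent days count more

print(round(midas_aggregate(daily_signal, flat), 4))
print(round(midas_aggregate(daily_signal, decaying), 4))
```

Setting both parameters to zero reproduces a simple average, which is the implicit choice when high-frequency data are averaged before entering a low-frequency model. A negative theta1 shifts weight toward the most recent observations.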

In-class assignment

20% of overall grade

Construct and evaluate text-based signals

Wednesday 06 May

Work in groups of up to 4 members.
Duration: 45-60 minutes, followed by a brief 5-minute presentation per group.
You will have access to three text corpora.
Select one corpus, construct a text-based signal using methods from Sessions 1-2, and apply it to a specific task (event detection, classification, monitoring, or forecasting).

Slides Data
