BBC Media Dataset for Research

A data resource enabling University of Manchester researchers to carry out qualitative, quantitative and computational media analyses over a large collection of BBC TV programmes.

Overview

The BBC-AVS 10K Dataset was recently shared with The University of Manchester (UoM) through the BBC Data Science Research Partnership (DSRP).

It consists of more than 10,000 TV programmes broadcast by the BBC between 2007 and 2017.

Each programme includes audio, video and subtitle data, along with associated metadata.

Any student or researcher affiliated with UoM can access and use the dataset, subject to approval. Each request is reviewed by a representative of the BBC, and the review is coordinated by Dr Riza Batista-Navarro, the UoM BBC DSRP Lead and Data Manager.

What's Included in the Dataset

The dataset consists of a total of 10,160 TV programmes broadcast by the BBC between 2007 and 2017 in the UK. The contents of these programmes were originally recorded between 1962 and 2017.

Due to compliance policies and editorial restrictions, some programmes were excluded from the dataset. A spreadsheet with more information about the specific programmes is available to UoM students and staff.

For information on the programme metadata and file formats included in the dataset, please read the detailed description prepared by the BBC.

To illustrate how the dataset is organised, the BBC have provided downloadable sample data (accessible only to UoM students and staff).

A document summarising the coverage of the BBC-AVS 10K Dataset is available for UoM students and staff.

Contact Information

If you have any further questions, please contact the UoM BBC DSRP Lead and Data Manager, Dr Riza Batista-Navarro.