Agenda-setting by the European Commission: Using Computer-Assisted Classification Methods to Classify Policy Documents


The European Commission is often considered to be the main agenda-setter in EU policymaking. The Commission produces hundreds of policy documents every year. However, reliable quantitative accounts of the agenda-setting activity of the Commission over extended periods of time and across several policy areas are not available. We need such accounts if we want to be able to adjudicate between competing theories of the role of the Commission in the European integration process. While information on the Commission’s agenda-setting activity is now in principle available in online databases such as PreLex, a major problem is the absence of a comprehensive policy classification scheme with mutually exclusive categories. For a considerable number of documents in PreLex, policy labels are either missing or assigned in multiples to a single document, making the classification ambiguous. In this study, I conduct classification experiments to investigate how well computer-assisted document classification methods generate correct policy labels for Commission documents based solely on the words in their titles. The findings indicate that the support vector machine classifier is able to classify about 75 percent of the documents into the correct policy category, regardless of which combination of pre-processing options is chosen to generate the input matrix for the analysis.
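To make the idea of an input matrix built from title words more concrete, the following is a minimal sketch in Python of how a document-term matrix might be generated under two pre-processing options. It is an illustration only, not the procedure used in the study: the function names, the stop-word list, and the two example titles are all hypothetical, and no classifier is fitted here.

```python
from collections import Counter

# Hypothetical stop-word list; the list used in the study is not stated here.
STOP_WORDS = {"a", "an", "and", "for", "of", "on", "the", "to"}


def tokenize(title, lowercase=True, remove_stop_words=True):
    """Split a title into word tokens under the chosen pre-processing options."""
    words = []
    for raw in title.split():
        word = raw.strip(".,;:()\"'")  # drop surrounding punctuation
        if lowercase:
            word = word.lower()
        if not word:
            continue
        if remove_stop_words and word.lower() in STOP_WORDS:
            continue
        words.append(word)
    return words


def document_term_matrix(titles, **options):
    """Return (vocabulary, matrix) with one row of word counts per title."""
    tokenized = [tokenize(title, **options) for title in titles]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Invented example titles for illustration; not taken from PreLex.
titles = [
    "Proposal for a Regulation on the common organisation of the market in milk",
    "Proposal for a Directive on the protection of the environment",
]

vocab, X = document_term_matrix(titles)
vocab_all, X_all = document_term_matrix(titles, remove_stop_words=False)
vocab_case, X_case = document_term_matrix(titles, lowercase=False)
```

Each combination of options yields a different vocabulary and therefore a different matrix, which is the kind of variation the classification experiments compare. In a full analysis, the rows of such a matrix, together with known policy labels, would be passed to a classifier such as a support vector machine.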

Prepared for the workshop Quantifying Europe, 13 - 14 December, University of Mannheim.
Frank M. Häge
Political Scientist

Senior Lecturer at the University of Limerick. Interested in Legislative Politics, European Union Politics, and Historical Political Economy.