In this tutorial, we will show how to individually recode and refine the MARPOR coding scheme for quasi-sentences accessed via the Manifesto Corpus. This is useful for research questions that need a more differentiated category scheme or which look at larger time periods, in which manifestos have been coded with different versions of the coding scheme.
We assume that you have already read First steps with manifestoR as well as Using the Manifesto Corpus with quanteda and that you are familiar with the pipe %>% operator.
Motivation for the tutorial
The newest version of the Manifesto Coding scheme contains 56 substantial categories. Over the years, the categories have changed and new sub-categories have been introduced. An overview of important changes between the different versions as well as the explicit Coding Instructions can be found here and here.
However, some research questions call for a more fine-grained category scheme, or draw on MARPOR data from a wider time frame that spans different versions of the coding scheme. For cases like these, this tutorial shows how you can modify the coding according to your research interest.
We will perform these steps:
1. Create a corpus
2. Create a shiny app to be used as a user interface for recoding
3. Keep track of the recoding by saving progress to a file
4. Once finished, read the codes in the file and update the corpus with them
manifestoR and shiny
We first use the usual “header” of a manifestoR script: loading packages, setting the API key and fixing the corpus version (to ensure reproducibility).
library(manifestoR)
library(dplyr)
library(tidyr)
library(stringr)
library(shiny)
mp_setapikey(key.file = "manifesto_apikey.txt")
mp_use_corpus_version("2019-1")
This R code presumes that you have downloaded the API key and stored it in a file named manifesto_apikey.txt in your current R working directory. Note that it is a security risk to store the API key file, or a script containing the key, in a public repository.
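One way to keep the key out of a shared script is to read it from an environment variable and only fall back to the key file. The following is a minimal sketch using base R only; the helper `read_api_key()` and the variable name `MANIFESTO_APIKEY` are assumptions made for illustration and are not defined by manifestoR.

```r
# Hypothetical helper: look up the API key in an environment variable first,
# then fall back to a local key file (neither name is defined by manifestoR).
read_api_key <- function(env_var = "MANIFESTO_APIKEY",
                         key_file = "manifesto_apikey.txt") {
  key <- Sys.getenv(env_var, unset = "")
  if (nzchar(key)) {
    return(key)
  }
  if (file.exists(key_file)) {
    # the key file is assumed to hold the key on its first line
    return(trimws(readLines(key_file, n = 1, warn = FALSE)))
  }
  stop("No API key found in ", env_var, " or ", key_file)
}

# Usage (requires manifestoR):
# mp_setapikey(key = read_api_key())
```

With this pattern the script itself never contains the key, so it can be shared as long as the key file is excluded from the repository.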
Example 1
Recode from one main category to one specific sub-category
From the tutorial Subcategories in the Manifesto Coding Scheme you know that since version 5 of the coding instructions there exist subcategories for 12 main categories.
In this first example we will recode quasi-sentences from the manifesto of the Social Democratic Party of Germany (SPD) for the 2013 election in Germany that have been coded with category 202 (Democracy). In the newest version of the category scheme (version 5), category 202 has four subcategories, which allow for a more fine-grained analysis. So, if one of these sentences concerns direct democracy, it would be coded with 202.4 under the newest coding scheme, which has been applied to the 2017 manifesto. The goal of this first example is to ensure comparability of two manifestos that were originally coded with different coding schemes.
In a first step we construct a corpus for the SPD manifestos from 2013 and 2017.
corpus_spd <- manifestoR::mp_corpus(countryname == "Germany" &
  year >= "2013" &
  party == "41320")
## Connecting to Manifesto Project DB API...
## Connecting to Manifesto Project DB API... corpus version: 2019-1
## Connecting to Manifesto Project DB API...
## Connecting to Manifesto Project DB API... corpus version: 2019-1
## Connecting to Manifesto Project DB API... corpus version: 2019-1
## Connecting to Manifesto Project DB API... corpus version: 2019-1
print(corpus_spd)
## <<ManifestoCorpus>>
## Metadata: corpus specific: 0, document level (indexed): 0
## Content: documents: 2
table(codes(corpus_spd))
##
## 000 101 103 104 105 106 107 108 109 110 201 201.1 201.2
## 37 4 4 24 46 68 216 194 6 12 75 22 37
## 202 202.1 202.3 202.4 203 204 301 302 303 304 305 305.1 305.3
## 129 83 1 6 13 6 95 37 51 21 78 31 2
## 401 402 403 404 405 406 408 409 410 411 412 413 414
## 28 159 323 4 56 2 25 20 90 408 173 22 40
## 416 416.2 501 502 503 504 505 506 601 601.1 601.2 602 602.2
## 30 77 204 137 517 483 7 286 29 24 16 3 24
## 603 604 605 605.1 605.2 606 606.1 607 607.1 607.2 607.3 608 608.1
## 26 61 67 151 4 94 65 34 7 29 1 7 2
## 608.2 701 703 703.1 704 705 706 H
## 2 392 15 18 2 12 69 107
We see that in the two manifestos 129 quasi-sentences have been coded with 202 and 6 with 202.4. The latter stem from the 2017 manifesto, which has already been coded according to the newest coding scheme.
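To check which manifesto a given code comes from, the codes can be tabulated per document. The sketch below uses a small made-up data frame (`toy`) in place of the real corpus content, so the counts are illustrative only; it relies on base R alone.

```r
# Toy stand-in for coded quasi-sentences from two manifestos (made-up data)
toy <- data.frame(
  doc  = c("2013", "2013", "2013", "2017", "2017", "2017"),
  code = c("202", "202", "503", "202.4", "202.1", "503"),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per document, one column per code
counts <- table(toy$doc, toy$code)
print(counts)

# Number of sub-category codes (those containing a dot) per document
sub_counts <- tapply(grepl(".", toy$code, fixed = TRUE), toy$doc, sum)
print(sub_counts)
```

In the toy data only the 2017 document contains sub-category codes, which is the pattern described above for the two SPD manifestos.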
In the next step we create a data frame with the quasi-sentences that have been allocated to the code(s) of interest. In this case we want to select quasi-sentences coded with 202.
to_be_recoded <- corpus_spd[["41320_201309"]]$content %>%
  mutate(pos = row_number()) %>%
  rename(code = cmp_code) %>%
  select(text, code, pos) %>%
  filter(code %in% c("202"))
text | code | pos |
---|---|---|
Demokratie | 202 | 11 |
Die SPD ist und bleibt die große politische Kraft für Demokratie und Emanzipation in Deutschland. | 202 | 19 |
Die Ablehnung des Ermächtigungsgesetzes der Nazis vor 80 Jahren durch die SPD ist bis heute ein beispielloser Ausweis für unsere demokratische Grundhaltung und Überzeugung. | 202 | 20 |
Zu dieser großen sozialdemokratischen Geschichte gehört auch die Gründung der SDP oder Ost-SPD im Oktober 1989, mit der Sozialdemokratinnen und Sozialdemokraten ihren Beitrag zur friedlichen Revolution in Deutschland geleistet haben. | 202 | 23 |
Wir leben Demokratie und werden dies weiter tun. | 202 | 24 |
Die Politik muss dem Gemeinwohl verpflichtet sein und nicht wirtschaftlichen Einzelinteressen. | 202 | 54 |
Die stärkste Lobby in Deutschland müssen endlich wieder die Bürgerinnen und Bürger sein. | 202 | 55 |
Wir werden die Probleme und Sorgen der Bürgerinnen und Bürger wieder in den Mittelpunkt der Politik stellen – und nicht die Interessen anonymer Finanzmärkte. | 202 | 103 |
Deshalb haben wir als erste Partei in Deutschland in einem breit angelegten Bürgerdialog die Menschen in Deutschland gefragt, was in unserem Land besser werden muss. | 202 | 104 |
Die Antworten und Projekte aus diesem Bürgerdialog sind in dieses Regierungsprogramm eingeflossen. | 202 | 105 |
Wir wollen das Gemeinwohl in den Mittelpunkt unserer Politik stellen. | 202 | 111 |
Und von einer Politik des Gemeinwohls, nicht einer des Egoismus und der Lobby- und der Sonderinteressen. | 202 | 119 |
Wir leben heute in einer radikal veränderten Welt. | 202 | 209 |
Deshalb wollen wir die Demokratie stärken und das Vertrauen daraus zurückgewinnen, dass demokratisches Engagement und demokratische Politik unser Zusammenleben besser und gerechter machen können. | 202 | 210 |
Deshalb sind vor allem wir Sozialdemokratinnen und Sozialdemokraten gefordert, auf neuen Wegen, die sozial und ökologisch ausgerichtet sind, unser historisches Projekt der Emanzipation neu zu begründen und zu verwirklichen. | 202 | 211 |
Mehr Demokratie, | 202 | 219 |
Wir werden deshalb nachweisen, wie hoch die zusätzlichen Einnahmen durch die genannten Steuererhöhungen sind | 202 | 249 |
brauchen wir eine stärkere Demokratisierung Europas: | 202 | 275 |
Europa gehört den Bürgerinnen und Bürgern. | 202 | 276 |
Das gilt auch für die Eurozone. | 202 | 277 |
Here we can see the first 20 of the 129 quasi-sentences that have been coded with 202. Now we want to change the code to 202.4 wherever a quasi-sentence concerns direct democracy. For this we use a shiny app.
A shiny-app is a web app written in R code. It consists of a UI function for the visible content and a server function containing the R code. For more information see the website.
We need to create the basic parts of a shiny app - the ui and server functions. See here for more info.
First, we create a function called createUIFunction, which returns the UI (user interface) function for the app. The key shiny code is contained within the basicPage function call.
Note that this includes some custom javascript, which is not within the scope of this tutorial to explain. You can take it as given.
# create the UI
createUIFunction <- function() {
  function() {
    customJS <- '
      //called from within shiny R code
      Shiny.addCustomMessageHandler("addListenersCallbackHandler", function(message) {
        //when selector is changed
        $("#df").on("change", "select", function(event) {
          //get row number from first column
          var row = parseInt($(event.target).closest("tr").children("td").first().text());
          var code = $(this).find(":selected").val();
          //pass code and row to shiny via input$selectedCode
          Shiny.onInputChange("selectedCode", [row, code]);
        });
      });
    '
    basicPage(
      # insert the javascript that attaches a listener to the html table,
      # to listen for when a code is selected
      tags$script(HTML(customJS)),
      div(style = "margin: 50px", actionButton("save", label = "Save Recoded")),
      div(dataTableOutput("df")),
      uiOutput("test")
    )
  }
}
Next we create a function called createServerFunction, which returns the server function for the app. The server function contains the “logic” for the app, written in R, which in our case includes modifying the codes as we select them and saving the changes to a working file. We can also export the progress to a CSV file called “recoded.csv”.
The function takes two arguments: codeOptions, which is a list of the possible codes we want to assign, and codedDF, which is the coded data frame we want to edit.
createServerFunction <- function(codeOptions, codedDF) {
  function(input, output, session) {
    workingFilePath <- "working_save.rds"

    proxy <- dataTableProxy("df") # used to update the datatable on the client side (in the html)

    if (is.null(codedDF)) {
      if (!file.exists(workingFilePath)) {
        # no working file found
        stop("Please provide a coded dataframe!")
      } else {
        # read from the working file
        df <- readRDS(workingFilePath)
      }
    } else {
      if (!("code" %in% names(codedDF))) {
        stop("Please provide a dataframe with a column named 'code'")
      }
      df <- data.frame(row = 1:nrow(codedDF), codedDF) # add a column called "row"
    }

    # generate the html for the input selectors
    selectorHTML <- function(i, selected = "") {
      as.character(selectInput(paste0("selectCode_", i),
        label = NULL,
        choices = c("", codeOptions), selected = selected
      ))
    }
    df$selector <- sapply(1:nrow(df), selectorHTML) # add the selectors as a column

    # create a reactive values object so that we can keep a copy of the data frame in R
    vals <- reactiveValues(df = df)

    # observe when an input is changed (passed in from javascript declared in ui function)
    observeEvent(input$selectedCode, {
      row <- as.integer(input$selectedCode[1])
      code <- input$selectedCode[2]
      df <- vals$df
      df$code[row] <- code # update code
      df$selector[row] <- selectorHTML(row, code) # update selector
      # push the updated dataframe to the page
      DT::replaceData(proxy, df, resetPaging = FALSE, rownames = FALSE)
      vals$df <- df # save the updated dataframe in R
    })

    output$df <- renderDataTable({
      # isolated so that it doesn't refresh when we change the table
      isolate({
        DT::datatable(df,
          escape = names(df) != "selector",
          selection = "none", rownames = FALSE
        )
      })
    })

    # every time the data frame is changed, save a working copy as an rds file
    observeEvent(vals$df, {
      saveRDS(vals$df, workingFilePath)
    })

    # if the save button is clicked,
    # save a csv of the data without the input selector HTML or "row" column
    observeEvent(input$save, {
      write.csv(vals$df[-which(names(vals$df) %in% c("selector", "row"))],
        "recoded.csv",
        row.names = FALSE, fileEncoding = "UTF-8"
      )
    })

    # send a message to the client side to attach the table listeners in javascript,
    # as the HTML table now exists
    observe({
      session$sendCustomMessage(type = "addListenersCallbackHandler", "")
    })
  }
}
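The core of the observer in the server function is a plain data frame update, which can be tried out without starting shiny. The following sketch isolates that step; `apply_recode()` and the toy data are hypothetical and only illustrate the logic, they are not part of the app.

```r
# Pure-function version of the update step (no shiny required).
# `apply_recode` is a hypothetical name used only for this sketch.
apply_recode <- function(df, row, code) {
  stopifnot(row >= 1, row <= nrow(df))
  df$code[row] <- code # same assignment as in the observer
  df
}

# Toy data frame with the columns the app expects (made-up rows)
toy <- data.frame(
  row  = 1:3,
  text = c("sentence one", "sentence two", "sentence three"),
  code = c("202", "202", "202"),
  stringsAsFactors = FALSE
)

# Mimic a (row, code) message; the observer converts the row with as.integer()
message_from_js <- c("2", "202.4")
updated <- apply_recode(toy,
  row  = as.integer(message_from_js[1]),
  code = message_from_js[2]
)
print(updated$code)
```

Because the function returns a modified copy, the original `toy` data frame is left unchanged, just as `vals$df` is only replaced once the update has succeeded.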
Now we create a function called launchApp, which combines the ui and server functions into a shiny app and runs it. When called, it will launch the app in a web browser.
launchApp <- function(codeOptions, codedDF = NULL) {
  # to resume an earlier recoding session from the working file,
  # leave codedDF = NULL
  library(shiny)
  library(DT)
  shinyApp(ui = createUIFunction(), server = createServerFunction(codeOptions, codedDF))
}
Now we launch the app with our coded data frame and possible codes.
launchApp(c("202", "202.4"), to_be_recoded)
As we code, the app automatically saves progress to a working file. If we want to close the app and resume later, we call launchApp without specifying the second argument codedDF, and it will load what we have done from the working file.
launchApp(c("202", "202.4"))
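The resume behaviour boils down to a saveRDS/readRDS round trip. The sketch below mirrors the start-up branch of the server function in base R; `load_or_resume()` is a hypothetical helper, and a temporary file stands in for working_save.rds.

```r
# Hypothetical helper mirroring the app's start-up logic:
# use the supplied data frame if there is one, otherwise resume from the file.
load_or_resume <- function(codedDF = NULL, path = "working_save.rds") {
  if (!is.null(codedDF)) {
    return(codedDF)
  }
  if (!file.exists(path)) {
    stop("Please provide a coded dataframe!")
  }
  readRDS(path)
}

path <- tempfile(fileext = ".rds")
first_session <- data.frame(
  text = c("a", "b"),
  code = c("202", "202.4"),
  stringsAsFactors = FALSE
)
saveRDS(first_session, path)            # what the app does after a change
resumed <- load_or_resume(path = path)  # what a later session would load
```

Since the app writes to the same working file on every change, starting it with a new data frame will most likely overwrite earlier progress, so it seems safest to export finished work with the save button before beginning a new recoding task.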
When you are done recoding, click the Save Recoded button to export the progress to a CSV file in your working directory. This is the file we will use to update the corpus. Once we have done this, we can read it in:
recoded_spd <- read.csv("recoded.csv", stringsAsFactors = FALSE, fileEncoding = "UTF-8")
text | code | pos |
---|---|---|
Demokratie | 202.0 | 11 |
Die SPD ist und bleibt die große politische Kraft für Demokratie und Emanzipation in Deutschland. | 202.0 | 19 |
Die Ablehnung des Ermächtigungsgesetzes der Nazis vor 80 Jahren durch die SPD ist bis heute ein beispielloser Ausweis für unsere demokratische Grundhaltung und Überzeugung. | 202.0 | 20 |
Zu dieser großen sozialdemokratischen Geschichte gehört auch die Gründung der SDP oder Ost-SPD im Oktober 1989, mit der Sozialdemokratinnen und Sozialdemokraten ihren Beitrag zur friedlichen Revolution in Deutschland geleistet haben. | 202.0 | 23 |
Wir leben Demokratie und werden dies weiter tun. | 202.0 | 24 |
Die Politik muss dem Gemeinwohl verpflichtet sein und nicht wirtschaftlichen Einzelinteressen. | 202.0 | 54 |
Die stärkste Lobby in Deutschland müssen endlich wieder die Bürgerinnen und Bürger sein. | 202.0 | 55 |
Wir werden die Probleme und Sorgen der Bürgerinnen und Bürger wieder in den Mittelpunkt der Politik stellen und nicht die Interessen anonymer Finanzmärkte. | 202.0 | 103 |
Deshalb haben wir als erste Partei in Deutschland in einem breit angelegten Bürgerdialog die Menschen in Deutschland gefragt, was in unserem Land besser werden muss. | 202.4 | 104 |
Die Antworten und Projekte aus diesem Bürgerdialog sind in dieses Regierungsprogramm eingeflossen. | 202.4 | 105 |
Wir wollen das Gemeinwohl in den Mittelpunkt unserer Politik stellen. | 202.0 | 111 |
Und von einer Politik des Gemeinwohls, nicht einer des Egoismus und der Lobby- und der Sonderinteressen. | 202.0 | 119 |
Wir leben heute in einer radikal veränderten Welt. | 202.0 | 209 |
Deshalb wollen wir die Demokratie stärken und das Vertrauen daraus zurückgewinnen, dass demokratisches Engagement und demokratische Politik unser Zusammenleben besser und gerechter machen können. | 202.0 | 210 |
Deshalb sind vor allem wir Sozialdemokratinnen und Sozialdemokraten gefordert, auf neuen Wegen, die sozial und ökologisch ausgerichtet sind, unser historisches Projekt der Emanzipation neu zu begründen und zu verwirklichen. | 202.0 | 211 |
Mehr Demokratie, | 202.0 | 219 |
Wir werden deshalb nachweisen, wie hoch die zusätzlichen Einnahmen durch die genannten Steuererhöhungen sind | 202.0 | 249 |
brauchen wir eine stärkere Demokratisierung Europas: | 202.0 | 275 |
Europa gehört den Bürgerinnen und Bürgern. | 202.0 | 276 |
Das gilt auch für die Eurozone. | 202.0 | 277 |
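The `202.0` values in the table above suggest that `read.csv` parsed the `code` column as numeric. If you prefer to keep codes as character strings, for example because you also use non-numeric codes, one option is the `colClasses` argument. The following sketch demonstrates the difference on a made-up temporary file.

```r
# Write a small made-up CSV in the same layout as recoded.csv
tmp <- tempfile(fileext = ".csv")
toy <- data.frame(
  text = c("a", "b"),
  code = c("202", "202.4"),
  pos  = c(11L, 104L),
  stringsAsFactors = FALSE
)
write.csv(toy, tmp, row.names = FALSE, fileEncoding = "UTF-8")

# Default behaviour: the code column is converted to numeric
as_numeric <- read.csv(tmp, stringsAsFactors = FALSE, fileEncoding = "UTF-8")

# With colClasses, the code column stays character
as_text <- read.csv(tmp,
  stringsAsFactors = FALSE, fileEncoding = "UTF-8",
  colClasses = c(code = "character")
)

class(as_numeric$code)
class(as_text$code)
```

Either variant works for the purely numeric codes used in this example; the character variant simply avoids relying on implicit conversion later on.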
Then we replace the codes that have been changed in the original corpus.
corpus_spd[["41320_201309"]]$content$cmp_code[recoded_spd$pos] <- recoded_spd$code
If we have another look at the codes in use, we see that there are now only 109 quasi-sentences coded with 202, whereas the number of quasi-sentences coded with 202.4 has risen from 6 to 26.
table(codes(corpus_spd))
##
## 000 101 103 104 105 106 107 108 109 110 201 201.1 201.2
## 37 4 4 24 46 68 216 194 6 12 75 22 37
## 202 202.1 202.3 202.4 203 204 301 302 303 304 305 305.1 305.3
## 109 83 1 26 13 6 95 37 51 21 78 31 2
## 401 402 403 404 405 406 408 409 410 411 412 413 414
## 28 159 323 4 56 2 25 20 90 408 173 22 40
## 416 416.2 501 502 503 504 505 506 601 601.1 601.2 602 602.2
## 30 77 204 137 517 483 7 286 29 24 16 3 24
## 603 604 605 605.1 605.2 606 606.1 607 607.1 607.2 607.3 608 608.1
## 26 61 67 151 4 94 65 34 7 29 1 7 2
## 608.2 701 703 703.1 704 705 706 H
## 2 392 15 18 2 12 69 107
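The write-back step in Example 1 relies on the `pos` column, which records each quasi-sentence's row position in the original document. The following base R sketch reproduces that assignment on made-up vectors, including the conversion of numeric codes back to character strings.

```r
# Toy stand-in for the cmp_code column of one document (made-up values)
cmp_code <- c("305", "202", "503", "202", "202")

# Toy stand-in for the recoded CSV: `pos` is the row position in the document,
# `code` is numeric, as it would be after a default read.csv()
recoded <- data.frame(pos = c(2, 4, 5), code = c(202.0, 202.4, 202.0))

# Assigning numeric values into a character vector converts them to strings
cmp_code[recoded$pos] <- recoded$code
print(cmp_code)
```

This only works as long as the document's row order is the same as when `pos` was created, so the corpus should not be filtered or reordered between the two steps.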
Example 2
Recode from one main category to all sub-categories
In the first example we were only looking for quasi-sentences that concern direct democracy. However, the main category 202 actually has four sub-categories:
- 202.1 General: Positive
- 202.2 General: Negative
- 202.3 Representative Democracy: Positive
- 202.4 Direct Democracy: Positive
In order to select between all four sub-categories, the code for the app only has to be adapted by changing the first argument in the launchApp-function:
launchApp(c("202.1", "202.2", "202.3", "202.4"))
Example 3
Make your own categories
It might also be the case that you do not want to use MARPOR categories, but your own codes instead. For this, construct a data frame with the quasi-sentences of interest and pass your own codes to the app. Please do not reuse numeric codes that already exist in the coding scheme, but create new ones.
launchApp(c("a", "b", "tiger", "kitchen"))
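Before launching the app with custom codes, it can help to check that none of them clash with codes already in use. The sketch below is one possible check; `find_collisions()` is a hypothetical helper, and `existing_codes` lists only a few codes copied from the frequency table earlier in this tutorial, not the full scheme.

```r
# A few existing MARPOR codes taken from the frequency table in this tutorial
# (deliberately incomplete, for illustration only)
existing_codes <- c("000", "101", "202", "202.1", "202.3", "202.4", "503", "H")

# Hypothetical helper: return the custom codes that clash with existing ones
find_collisions <- function(custom, existing) {
  intersect(as.character(custom), existing)
}

ok_codes  <- c("a", "b", "tiger", "kitchen")
bad_codes <- c("a", 202, "my_code") # 202 is already a MARPOR category

find_collisions(ok_codes, existing_codes)
find_collisions(bad_codes, existing_codes)
```

In practice the vector of existing codes could be taken from `names(table(codes(corpus)))` for the corpus you are working with, assuming that call returns the codes as shown in the tables above.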
Session Info
Tested with:
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.3 (2020-10-10)
## date 2021-06-15
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date lib source
## assertthat 0.2.0 2017-04-11 [NA] CRAN (R 4.0.3)
## base64enc 0.1-3 2015-07-28 [NA] CRAN (R 4.0.2)
## bookdown 0.22 2021-04-22 [NA] CRAN (R 4.0.2)
## cli 1.1.0 2019-03-19 [NA] CRAN (R 4.0.3)
## crayon 1.3.4 2017-09-16 [NA] CRAN (R 4.0.2)
## curl 3.2 2018-03-28 [NA] CRAN (R 4.0.3)
## digest 0.6.21 2019-09-20 [NA] CRAN (R 4.0.3)
## dplyr * 1.0.6 2021-05-05 [NA] CRAN (R 4.0.2)
## DT 0.7 2019-06-11 [NA] CRAN (R 4.0.3)
## ellipsis 0.3.2 2021-04-29 [NA] CRAN (R 4.0.3)
## evaluate 0.14 2019-05-28 [NA] CRAN (R 4.0.1)
## fansi 0.4.0 2018-10-05 [NA] CRAN (R 4.0.3)
## fastmap 1.0.0 2019-07-28 [NA] CRAN (R 4.0.3)
## foreign 0.8-70 2018-04-23 [NA] CRAN (R 4.0.3)
## functional 0.6 2014-07-16 [NA] CRAN (R 4.0.2)
## generics 0.0.2 2018-11-29 [NA] CRAN (R 4.0.2)
## glue 1.4.2 2020-08-27 [NA] CRAN (R 4.0.2)
## highr 0.6 2016-05-09 [NA] CRAN (R 4.0.3)
## hms 0.4.2 2018-03-10 [NA] CRAN (R 4.0.3)
## htmltools 0.4.0 2019-10-04 [NA] CRAN (R 4.0.3)
## htmlwidgets 1.5.3 2020-12-10 [NA] CRAN (R 4.0.2)
## httpuv 1.5.2 2019-09-11 [NA] CRAN (R 4.0.3)
## httr 1.3.1 2017-08-20 [NA] CRAN (R 4.0.3)
## jsonlite 1.6 2018-12-07 [NA] CRAN (R 4.0.3)
## knitr 1.33 2021-04-24 [NA] CRAN (R 4.0.2)
## later 1.0.0 2019-10-04 [NA] CRAN (R 4.0.3)
## lattice 0.20-35 2017-03-25 [NA] CRAN (R 4.0.3)
## lifecycle 1.0.0 2021-02-15 [NA] CRAN (R 4.0.2)
## magrittr 2.0.1 2020-11-17 [NA] CRAN (R 4.0.2)
## manifestoR * 1.5.0 2020-11-29 [NA] CRAN (R 4.0.2)
## mime 0.5 2016-07-07 [NA] CRAN (R 4.0.3)
## mnormt 1.5-5 2016-10-15 [NA] CRAN (R 4.0.3)
## nlme 3.1-131 2017-02-06 [NA] CRAN (R 4.0.3)
## NLP * 0.1-9 2016-02-18 [NA] CRAN (R 4.0.3)
## pillar 1.6.1 2021-05-16 [NA] CRAN (R 4.0.2)
## pkgconfig 2.0.2 2018-08-16 [NA] CRAN (R 4.0.3)
## promises 1.1.0 2019-10-04 [NA] CRAN (R 4.0.3)
## psych 1.8.3.3 2018-03-30 [NA] CRAN (R 4.0.3)
## purrr 0.3.2 2019-03-15 [NA] CRAN (R 4.0.3)
## R6 2.2.2 2017-06-17 [NA] CRAN (R 4.0.3)
## Rcpp 1.0.0 2018-11-07 [NA] CRAN (R 4.0.3)
## readr 1.3.1 2018-12-21 [NA] CRAN (R 4.0.3)
## rlang 0.4.10 2020-12-30 [NA] CRAN (R 4.0.2)
## rmarkdown 2.8 2021-05-07 [NA] CRAN (R 4.0.2)
## rmdformats 1.0.2 2021-04-19 [NA] CRAN (R 4.0.2)
## sessioninfo 1.1.1 2018-11-05 [NA] CRAN (R 4.0.2)
## shiny * 1.4.0 2019-10-10 [NA] CRAN (R 4.0.3)
## slam 0.1-40 2016-12-01 [NA] CRAN (R 4.0.3)
## stringi 1.1.7 2018-03-12 [NA] CRAN (R 4.0.3)
## stringr * 1.3.0 2018-02-19 [NA] CRAN (R 4.0.3)
## tibble 3.1.2 2021-05-16 [NA] CRAN (R 4.0.2)
## tidyr * 0.8.0 2018-01-29 [NA] CRAN (R 4.0.3)
## tidyselect 1.1.1 2021-04-30 [NA] CRAN (R 4.0.3)
## tm * 0.7-5 2018-07-29 [NA] CRAN (R 4.0.3)
## utf8 1.1.3 2018-01-03 [NA] CRAN (R 4.0.3)
## vctrs 0.3.8 2021-04-29 [NA] CRAN (R 4.0.3)
## withr 2.1.2 2018-03-15 [NA] CRAN (R 4.0.3)
## xfun 0.23 2021-05-15 [NA] CRAN (R 4.0.2)
## xml2 1.2.0 2018-01-24 [NA] CRAN (R 4.0.3)
## xtable 1.8-2 2016-02-05 [NA] CRAN (R 4.0.3)
## yaml 2.2.0 2018-07-25 [NA] CRAN (R 4.0.3)
## zoo 1.7-13 2016-05-03 [NA] CRAN (R 4.0.3)