In this post we delve into the get_mp() function and the parameters you can specify to query for more specific information.
The get_mp() function has the following arguments:
selection: options are “current”, “former” or “date”
date_at: required when selection is “date”; give the date as “YYYY-MM-DD”
fact: options are “bio”, “career”, “education”, “presences_commissions”, “presences_plenary”, “political_info”, and “raw”
use_parallel: Boolean, TRUE by default; set it to FALSE if you do not have multiple workers available to speed up the API calls
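To make the argument rules above concrete, here is a minimal sketch of a checker that accepts only the listed options and insists on date_at when selection is "date". The helper build_mp_args() is hypothetical and not part of the package; it only assembles a named list that could be handed to get_mp().

```r
# Hypothetical helper (NOT part of the package): validates the argument
# combinations described above and returns them as a named list.
build_mp_args <- function(selection = "current", fact = "bio",
                          date_at = NULL, use_parallel = TRUE) {
  selection <- match.arg(selection, c("current", "former", "date"))
  fact <- match.arg(fact, c("bio", "career", "education",
                            "presences_commissions", "presences_plenary",
                            "political_info", "raw"))
  stopifnot(is.logical(use_parallel), length(use_parallel) == 1)

  mp_args <- list(selection = selection, fact = fact,
                  use_parallel = use_parallel)

  if (selection == "date") {
    # date_at is only needed for selection = "date" and must parse as a date
    if (is.null(date_at) || is.na(as.Date(date_at, format = "%Y-%m-%d"))) {
      stop('selection = "date" needs date_at as "YYYY-MM-DD"')
    }
    mp_args$date_at <- date_at
  }
  mp_args
}

mp_args <- build_mp_args(selection = "date", date_at = "2019-12-01")

# With the package loaded, the query itself could then be run as:
# mp_bio_2019 <- do.call(get_mp, mp_args)
```

The actual API call is left commented out so that the sketch runs without a network connection.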
To retrieve the demographics of the MPs currently in parliament (the default query), we specify selection="current" and fact="bio". Note that you can store the result in your R environment each time by assigning it to a name with <-. This gives you a data frame, a table of rows and columns that you can use for further data manipulation and analysis.
The result of our query is a data frame with 124 observations, the exact number of MPs currently in office, and their demographics. You will see that the data frame is successfully stored under the tab ‘Environment’. This is what the data frame looks like:
You can, for instance, check the distribution of men and women in parliament. We make use of the pipe operator %>% (which lets you pass an interim result on to the next function) and the data manipulation options from the dplyr package. To count observations by group, we use group_by() and tally(). You can find more information on how to use dplyr here.
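As a rough illustration of the group_by() and tally() step, the sketch below counts MPs by gender on a small stand-in data frame. The column name gender and the values are assumptions made for this example; the data frame returned by the API may use different names.

```r
library(dplyr)

# Toy stand-in for the stored data frame. The column name `gender` and the
# values are assumptions for illustration, not the API's actual output.
mp_bio_toy <- tibble(
  id_mp  = 1:6,
  gender = c("F", "M", "F", "M", "M", "F")
)

gender_counts <- mp_bio_toy %>%
  group_by(gender) %>%  # one group per gender value
  tally()               # adds a column n holding the size of each group

gender_counts
```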
Example 2 - Extracting demographics at specific dates
So let’s take this a step further and analyze whether the number of women in parliament has grown throughout the years. To do so, we will extract the demographics of MPs from past legislatures (from 1995 onwards). We make use of the get_legislatures() function to identify the starting dates of each legislature.
Next, we specify the exact date for which we wish to collect the MPs’ demographics. Here, we opt for 1 December of the starting year of each legislature, to make sure all MPs are sworn in. Instead of selection="current", we now fill in selection="date" and add the date_at="YYYY-MM-DD" argument in the get_mp() function.
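Building the date_at values can be sketched as follows. The start dates below are made-up placeholders; in practice they would be taken from the output of get_legislatures(), whose column names are not shown here.

```r
# Placeholder start dates for illustration only; the real values come from
# get_legislatures().
legislature_start <- as.Date(c("1995-06-13", "1999-07-06", "2004-07-06"))

# Keep the starting year and fix the day to 1 December of that year
start_year <- format(legislature_start, "%Y")
date_at <- paste0(start_year, "-12-01")

date_at

# Each element can then be passed on to the query, for example:
# mp_1995 <- get_mp(selection = "date", fact = "bio", date_at = date_at[1])
```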
Now, we will select the relevant columns to answer our question and add a column for year. We again make use of the pipe operator, together with select() from the dplyr package and add_column() from the tibble package.
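A minimal sketch of this step on two toy data frames, one per date. The column names gender and extra are assumptions; the final bind_rows() and count() calls are one possible way to stack the years and count by gender, not necessarily the approach used in the original analysis.

```r
library(dplyr)
library(tibble)

# Toy stand-ins for two date-specific queries (column names are assumptions)
mp_1995_toy <- tibble(id_mp = 1:3, gender = c("M", "M", "F"),
                      extra = c("a", "b", "c"))
mp_2019_toy <- tibble(id_mp = 4:6, gender = c("F", "M", "F"),
                      extra = c("d", "e", "f"))

# Keep only the columns of interest and record the year of the query
sel_1995 <- mp_1995_toy %>%
  select(id_mp, gender) %>%
  add_column(year = 1995)

sel_2019 <- mp_2019_toy %>%
  select(id_mp, gender) %>%
  add_column(year = 2019)

# Stack the years and count MPs per year and gender
gender_by_year <- bind_rows(sel_1995, sel_2019) %>%
  count(year, gender)

gender_by_year
```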
This data frame then allows us to answer our research question: has the number of women MPs grown throughout the years? We opt to plot the results in a histogram, making use of the ggplot2 package. A good place to start learning about ggplot2 is here.
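One way such a plot could be set up is sketched below, using a small made-up table of counts per year and gender. The data and column names are illustrative assumptions, and a grouped bar chart via geom_col() is only one possible choice of geometry.

```r
library(ggplot2)

# Made-up counts for illustration; not actual parliament data
gender_by_year_toy <- data.frame(
  year   = c(1995, 1995, 2019, 2019),
  gender = c("F", "M", "F", "M"),
  n      = c(1, 2, 2, 1)
)

# One bar per gender within each year
plot_gender <- ggplot(gender_by_year_toy,
                      aes(x = factor(year), y = n, fill = gender)) +
  geom_col(position = "dodge") +
  labs(x = "Year", y = "Number of MPs", fill = "Gender") +
  theme_classic()

plot_gender
```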
Normally, mp_bio is still saved in your R environment. If not, rerun the above line of code with fact="bio" and store the data frame. Next, we combine mp_bio with the mp_polinfo data frame by using left_join(). In most cases, you join two data frames horizontally (in other words, you paste them next to each other) by one or more common unique identifiers. In our case, the identifier is id_mp.
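The join can be sketched on two toy data frames that share id_mp. All other column names and values are assumptions for illustration.

```r
library(dplyr)

# Toy stand-ins; only id_mp is taken from the text above
mp_bio_toy     <- tibble(id_mp = c(1, 2, 3), name = c("A", "B", "C"))
mp_polinfo_toy <- tibble(id_mp = c(1, 3), party = c("X", "Y"))

# left_join() keeps every row of the first data frame and adds the matching
# columns of the second; rows without a match get NA
mp_joined <- left_join(mp_bio_toy, mp_polinfo_toy, by = "id_mp")

mp_joined
```

Because this is a left join, MP 2 stays in the result even though no political information is available for that row.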
The API makes a broad range of data available. As a result, we opted to return some data as nested lists (lists of lists) that store multiple types of values. In what follows, we extract the values from these nested lists and construct a data frame ready for analysis. A great resource to learn about (un)nesting is here.
For this example, we analyze MPs’ accumulation of local political mandates. To obtain this information, we start from the data frame mp_polinfo, for which you filled in fact="political_info" in the get_mp() function. Next, we unnest the column mandaat-andere, which contains the info about MPs’ other mandates. To do so, we use select() to keep only the variables/columns of interest and unnest() to dig up information from the nested columns.
The data frame cumul now includes two new columns, mandaatgroepnaam and parlmandaat. The latter column is again nested, so we still do not know the type of political mandate MPs hold at the local level. Are they a mayor, alderperson or council member? To find out, we first select the local mandates using filter() on the column mandaatgroepnaam and then unnest() the column parlmandaat.
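The first unnesting step can be sketched on a toy version of mp_polinfo. The column names mandaat-andere, mandaatgroepnaam, parlmandaat, mandaat and datumtot follow the text; the values, including the group labels "local" and "federal", are placeholders and not what the API actually returns.

```r
library(dplyr)
library(tidyr)

# Toy stand-in with one nested column per MP (values are placeholders)
mp_polinfo_toy <- tibble(
  id_mp = c(1, 2),
  party = c("X", "Y"),
  `mandaat-andere` = list(
    tibble(
      mandaatgroepnaam = c("local", "federal"),
      parlmandaat = list(
        tibble(mandaat = "Burgemeester", datumtot = NA_character_),
        tibble(mandaat = "Senator", datumtot = "2010-06-01")
      )
    ),
    tibble(
      mandaatgroepnaam = "local",
      parlmandaat = list(tibble(mandaat = "Schepen", datumtot = NA_character_))
    )
  )
)

# Keep the identifier and the nested column, then unnest one level:
# each mandate group becomes its own row
cumul <- mp_polinfo_toy %>%
  select(id_mp, `mandaat-andere`) %>%
  unnest(cols = `mandaat-andere`)

cumul
```

Note the backticks around mandaat-andere: they are needed because the column name contains a hyphen.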
Next, we keep only the mandates MPs currently hold, leaving out former local functions by dropping all observations for which an end date (datumtot) is registered. Additionally, we drop all other non-relevant columns using select().
Finally, we create a new column in the data frame to register whether an MP holds a local function as mayor, alderperson or council member. The column mandaat holds text (string values), so we recode this text into a new numerically coded categorical variable via an ifelse statement. To do so, we make use of the %like% operator (provided by, for example, the data.table package) to search through the text and identify who holds what local function.
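The filtering and second unnesting step might look as follows on a toy version of cumul. The group label "local" and all values are placeholders for this sketch; the labels returned by the API may differ.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for cumul (values, including the label "local", are placeholders)
cumul_toy <- tibble(
  id_mp = c(1, 1, 2),
  mandaatgroepnaam = c("local", "federal", "local"),
  parlmandaat = list(
    tibble(mandaat  = c("Burgemeester", "Gemeenteraadslid"),
           datumtot = c(NA, "2012-12-31")),
    tibble(mandaat = "Senator", datumtot = "2010-06-01"),
    tibble(mandaat = "Schepen", datumtot = NA_character_)
  )
)

local_current <- cumul_toy %>%
  filter(mandaatgroepnaam == "local") %>%  # keep local mandates only
  unnest(cols = parlmandaat) %>%           # one row per individual mandate
  filter(is.na(datumtot)) %>%              # no end date = still held
  select(id_mp, mandaat)                   # drop the other columns

local_current
```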
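The recoding can be sketched as below. To keep the example free of extra dependencies, base R's grepl() is swapped in for %like%; both test whether a pattern occurs in a string. The Dutch search terms and the toy values are assumptions, and the numeric codes follow the plot labels (1 = mayor, 2 = alderperson, 3 = council member, 4 = other).

```r
library(dplyr)

# Toy stand-in; the mandate labels are assumed, not taken from the API
local_mand_toy <- tibble(
  id_mp   = 1:4,
  mandaat = c("Burgemeester", "Schepen", "Gemeenteraadslid", "OCMW-raadslid")
)

# Nested ifelse(): the first matching pattern determines the code
local_mand_toy <- local_mand_toy %>%
  mutate(local_m = ifelse(
    grepl("burgemeester", mandaat, ignore.case = TRUE), 1,
    ifelse(grepl("schepen", mandaat, ignore.case = TRUE), 2,
           ifelse(grepl("gemeenteraad", mandaat, ignore.case = TRUE), 3, 4))
  ))

local_mand_toy
```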
plot3 <- ggplot(local_mand, aes(x = as.factor(local_m))) +
  geom_bar(stat = "count", fill = "lightgreen") +
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -1) +
  labs(x = "Type of local mandate", y = "Count") +
  theme_classic() +
  scale_x_discrete(labels = c("1" = "Mayor", "2" = "Alder", "3" = "Council", "4" = "Other"))
We can again join this data frame with another data frame. For example, to answer the question of whether mayors are more frequently absent in parliament than other MPs, we need to extract additional data about MPs’ presences in parliament (here: commissions). This can be done by filling in fact="presences_commissions" in the get_mp() function. Alternatively, you can choose to look at MPs’ presences in plenary sessions via fact="presences_plenary".
We can then quickly explore these data to see which MPs are most often present/absent in the parliamentary committee sessions of which they are an effective member. We opted to focus on the 10 most present/absent MPs using top_n(n, column), where n is the number of rows to keep and column is the variable to rank by.
To answer the question, we first select the relevant columns and then join the mp_commissions data frame with the local_mand data frame. We also exclude MPs who are not an effective member of any commission and were therefore assigned NA when their commission presences were registered.
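A small sketch of the top_n() step, keeping the two most absent MPs from a toy data frame. The column names name and absent and the values are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the commission presences (made-up values)
mp_comm_toy <- tibble(
  name   = c("A", "B", "C", "D"),
  absent = c(5, 1, 9, 3)
)

# Keep the 2 rows with the highest number of absences, sorted from high to low
most_absent <- mp_comm_toy %>%
  top_n(2, absent) %>%
  arrange(desc(absent))

most_absent
```

Note that top_n() keeps all tied rows, so the result can contain more than n rows when values are equal.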
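These steps can be sketched on toy data as follows. The column vast-lid-aanwezigheid_afwezig and the names local_m and mp_m_comm_omit are taken from the plotting code; the values are made up, and coding MPs without a local mandate as 0 is an assumption based on the "None" label in that plot.

```r
library(dplyr)

# Toy stand-ins (made-up values)
mp_commissions_toy <- tibble(
  id_mp = c(1, 2, 3, 4),
  `vast-lid-aanwezigheid_afwezig` = c(2, NA, 7, 4)
)
local_mand_toy <- tibble(id_mp = c(1, 3), local_m = c(1, 3))

mp_m_comm_omit <- mp_commissions_toy %>%
  select(id_mp, `vast-lid-aanwezigheid_afwezig`) %>%
  left_join(local_mand_toy, by = "id_mp") %>%
  # Assumption: MPs without a local mandate get code 0 ("None")
  mutate(local_m = ifelse(is.na(local_m), 0, local_m)) %>%
  # Drop MPs without registered absences as an effective member
  filter(!is.na(`vast-lid-aanwezigheid_afwezig`))

mp_m_comm_omit
```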
Finally, we can visualize the results and conclude that mayors are not significantly more absent in commission sessions than other MPs. The data thus refutes the often heard claim that combining a local political mandate with a mandate in the Flemish parliament hampers parliamentary work.
plot4 <- ggplot(mp_m_comm_omit, aes(x = factor(local_m), y = `vast-lid-aanwezigheid_afwezig`, group = factor(local_m), fill = factor(local_m))) +
  geom_boxplot(outlier.colour = "red", outlier.shape = 8, outlier.size = 4) +
  labs(x = "Type of local mandate", y = "Absences in committee meetings") +
  theme_classic() +
  scale_x_discrete(labels = c("0" = "None", "1" = "Mayor", "2" = "Alder", "3" = "Council", "4" = "Other")) +
  theme(legend.position = "none")