Disclaimer: The purpose of the Open Case Studies project is to demonstrate the use of various data science methods, tools, and software in the context of messy, real-world data. A given case study does not cover all aspects of the research process, is not claiming to be the most appropriate way to analyze a given dataset, and should not be used in the context of making policy decisions without external consultation from scientific experts.

This work is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0) United States License.

To cite this case study please use:

Wright, Carrie and Ontiveros, Michael and Jager, Leah and Taub, Margaret and Hicks, Stephanie. (2020). https://github.com/opencasestudies/ocs-bp-co2-emissions. Exploring CO2 emissions across time (Version v1.0.0).

To access the GitHub repository for this case study see here: https://github.com/opencasestudies/ocs-bp-co2-emissions.

You may also access and download the data using our OCSdata package. To learn more about this package including examples, see this link. Here is how you would install this package:

install.packages("OCSdata")

This case study is part of a series of public health case studies for the Bloomberg American Health Initiative.


The total reading time for this case study is calculated via koRpus and shown below:

Reading Time    Method
70 minutes      koRpus

Readability Score:

A readability index estimates the reading difficulty level of a particular text. Flesch-Kincaid, FORCAST, and SMOG are three common readability indices that were calculated for this case study via koRpus. These indices provide an estimation of the minimum reading level required to comprehend this case study by grade and age.

Text language: en
index            grade   age
Flesch-Kincaid   9       14
FORCAST          10      15
SMOG             11      16

Please help us by filling out our survey.

Motivation


This case study explores how different countries have contributed to carbon dioxide (CO2) emissions over time and how CO2 emission rates may relate to increasing global temperatures and increased rates of natural disasters and storms. We used this report from the EPA as the basis for motivating this case study, as it provides background information about how CO2 emissions and other greenhouse gases have influenced the climate and weather patterns.

CO2 makes up the largest proportion of greenhouse gas emissions in the United States:

[source]

A variety of sources and sectors contribute to greenhouse gas emissions:

[source]

Transportation and Electricity contribute the most metric tons of CO2:

[source]

So why should we pay attention to greenhouse gases?

According to the US Environmental Protection Agency (EPA) Inventory of U.S. Greenhouse Gas Emissions and Sinks 2020 Report:

Greenhouse gases absorb infrared radiation, thereby trapping heat in the atmosphere and making the planet warmer. The most important greenhouse gases directly emitted by humans include carbon dioxide (CO2), methane (CH4), nitrous oxide (N2O), and several fluorine-containing halogenated substances. Although CO2, CH4, and N2O occur naturally in the atmosphere, human activities have changed their atmospheric concentrations. From the pre-industrial era (i.e., ending about 1750) to 2018, concentrations of these greenhouse gases have increased globally by 46, 165, and 23 percent, respectively (IPCC 2013; NOAA/ESRL 2019a, 2019b, 2019c).

* IPCC stands for the Intergovernmental Panel on Climate Change

In fact, there are many signs that our planet is experiencing warmer temperatures:

[source]

The connection between greenhouse gas levels and global temperatures, and the influence of rising global temperatures on human health, are described in reports such as the following:

The National Climate Assessment Report states that:

Heat-trapping gases already in the atmosphere have committed us to a hotter future with more climate-related impacts over the next few decades. The magnitude of climate change beyond the next few decades depends primarily on the amount of heat-trapping gases that human activities emit globally, now and in the future.

See the following links for more information about how greenhouse gases have influenced global temperatures:

  1. The EPA report on greenhouse gases
  2. The National Climate Assessment (NCA) summary from 2014
  3. The World101 website about how countries are adapting to climate change

Main Questions


Our main questions:

  1. How have global CO2 emission rates changed over time? In particular, how have they changed for the US, and how does the US compare to other countries?
  2. Are CO2 emissions in the US, global temperatures, and natural disaster rates in the US associated?

Learning Objectives


In this case study, we will explore CO2 emission data from around the world. We will also focus on the US specifically to evaluate patterns of temperatures and natural disaster activity.

This case study will particularly focus on how to use different datasets that span different ranges of time, as well as how to create visualizations of patterns over time. We will especially focus on using packages and functions from the tidyverse, such as dplyr, tidyr, and ggplot2.

The tidyverse is a collection of packages created by RStudio. While some students may be familiar with other R programming packages, the tidyverse packages make data science in R especially legible and intuitive.

The skills, methods, and concepts that students will be familiar with by the end of this case study are:

Data Science Learning Objectives:

  1. How to import data from various types of Excel files and CSV files
  2. How to apply action verbs in dplyr for data wrangling
  3. How to pivot between “long” and “wide” datasets
  4. How to join multiple datasets using dplyr
  5. How to create effective longitudinal data visualizations with ggplot2
  6. How to add text, color, and labels to ggplot2 plots
  7. How to create faceted ggplot2 plots

Statistical Learning Objectives:

  1. Introduction to correlation coefficient as a summary statistic
  2. Relationship between correlation and linear regression
  3. Correlation is not causation


We will begin by loading the packages that we will need:

library(here)
library(readxl)
library(readr)
library(dplyr)
library(magrittr)
library(stringr)
library(purrr)
library(tidyr)
library(forcats)
library(ggplot2)
library(directlabels)
library(ggrepel)
library(broom)
library(patchwork)
library(OCSdata)

Packages used in this case study:

Package       Use in this case study
here          to easily load and save data
readxl        to import the Excel file data
readr         to import the CSV file data
dplyr         to view and wrangle the data, by modifying variables, renaming variables, selecting variables, creating variables, and arranging values within a variable
magrittr      to use and reassign data objects using the %<>% pipe operator
stringr       to select only the first 4 characters of date data
purrr         to apply a function on a list of tibbles (tibbles are the tidyverse version of a data frame)
tidyr         to drop rows with NA values from a tibble
forcats       to reorder the levels of a factor
ggplot2       to make visualizations
directlabels  to add labels to plots easily
ggrepel       to add labels that don’t overlap to plots
broom         to make the output from statistical tests easier to work with
patchwork     to combine plots
OCSdata       to access and download OCS data files

The first time we use a function, we will use the :: operator to indicate which package it comes from. This is not necessary unless function names overlap across packages, but we include it here to make clear where each function comes from.
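As a minimal sketch of the package::function notation, the example below calls two functions that share the name filter() but live in different packages. The toy data frame and its values are made up for illustration and are not part of the case study data.

```r
# Toy data frame; the values are made up for illustration only
toy <- data.frame(country = c("A", "B", "C"), emissions = c(10, 25, 40))

# dplyr::filter() keeps the rows that match a condition
high <- dplyr::filter(toy, emissions > 20)

# stats::filter() has the same name but does something different:
# it applies a linear filter (here a 2-point moving average) to a series
smoothed <- stats::filter(toy$emissions, rep(1 / 2, 2))

high
smoothed
```

Writing the package name explicitly means the result does not depend on the order in which packages were loaded.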

Context


We now provide more background on greenhouse gas emissions and their potential influence on public health.

Greenhouse gas emissions are due to both natural processes and anthropogenic (human-derived) activities.

These emissions are one of the contributing factors to rising global temperatures, which can have a great influence on public health as illustrated in the following image:

[source]

According to the US Environmental Protection Agency (EPA) Inventory of U.S. Greenhouse Gas Emissions and Sinks 2020 Report:

Gases in the atmosphere can contribute to climate change both directly and indirectly. Direct effects occur when the gas itself absorbs radiation. Indirect radiative forcing occurs when chemical transformations of the substance produce other greenhouse gases, when a gas influences the atmospheric lifetimes of other gases, and/or when a gas affects atmospheric processes that alter the radiative balance of the earth (e.g., affect cloud formation or albedo).

The Global Warming Potential (GWP) compares the ability of a greenhouse gas to trap heat in the atmosphere relative to another gas.

The GWP of a greenhouse gas is defined as the ratio of the accumulated radiative forcing within a specific time horizon caused by emitting 1 kilogram of the gas, relative to that of the reference gas CO2 (IPCC 2013). Therefore GWP-weighted emissions are provided in million metric tons of CO2 equivalent (MMT CO2 Eq.)

[source]
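To make the GWP weighting concrete, here is a minimal sketch of how emitted masses could be converted to CO2 equivalents. The GWP values are the 100-year values from the IPCC Fourth Assessment Report, which we assume here for illustration. The emitted masses are hypothetical and are not taken from the EPA inventory.

```r
# 100-year GWP values relative to CO2 (assumption: IPCC AR4 values)
gwp <- c(CO2 = 1, CH4 = 25, N2O = 298)

# Hypothetical emitted masses in million metric tons (MMT) of each gas
emitted_mmt <- c(CO2 = 5000, CH4 = 25, N2O = 1.5)

# GWP-weighted emissions in MMT CO2 Eq.: mass of each gas times its GWP
co2_eq <- emitted_mmt * gwp[names(emitted_mmt)]

co2_eq
sum(co2_eq)
```

In this sketch a comparatively small mass of CH4 or N2O still contributes noticeably to the total, because each ton is weighted by its GWP.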

Per unit of mass, CO2 actually traps less heat than the other major greenhouse gases:

[source]

However, because CO2 is so much more abundant and stays in the atmosphere so much longer than other greenhouse gases, it has been the largest contributor to global warming. See here for more details.

It is also important to keep in mind that there is a lag between greenhouse gas emissions and temperature changes that we experience because much of Earth’s thermal energy (and CO2) gets stored in the ocean.

Due to a property called thermal inertia, the heat stored in the ocean is eventually transferred to the surface of the Earth, long after the gases that caused the ocean to warm were emitted.

See here for more explanation.

Furthermore, rising CO2 levels in the ocean also influence ocean acidity:

[source]

As CO2 levels rise in the ocean, the pH drops and the water becomes more acidic. This makes it difficult for organisms to maintain shells or skeletons made of calcium carbonate, which in turn makes it harder for these organisms to survive and affects their role in the ecosystem and food chain.
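Because pH is a logarithmic scale, a small change in pH corresponds to a large relative change in hydrogen ion concentration. The sketch below works through that arithmetic. The two pH values are hypothetical, chosen only for illustration, and the helper ph_to_h() is our own name rather than a function from any package.

```r
# pH is the negative base-10 logarithm of the hydrogen ion concentration,
# so the concentration (in mol/L) is 10 raised to the power of -pH
ph_to_h <- function(ph) 10^(-ph)

# Hypothetical "before" and "after" pH values, for illustration only
ph_before <- 8.2
ph_after  <- 8.1

# Relative increase in hydrogen ion concentration between the two values
relative_increase <- ph_to_h(ph_after) / ph_to_h(ph_before) - 1

# Express the increase as a percentage
round(100 * relative_increase, 1)
```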

Furthermore, greenhouse gas emissions are believed to influence weather patterns as shown in this report.

Indeed, events with high levels of precipitation, which can cause flooding and property damage, are generally increasing around the country: