Motivation

According to the United States Geological Survey, scientists predict that increasing global surface temperatures will increase the frequency of droughts and intensity of storms globally. Recent studies examining sea-level rise patterns and projections (Hauer, 2017; Kulp & Strauss, 2019) have suggested that sea-level rise could induce large-scale migration in the United States away from unprotected coastlines and towards inland areas, redistributing population density across the country. The potential impacts of climate change are far-reaching, ranging from human migration and displacement to increased pressure on inland areas and land disturbance.

Initial Questions

We first want to know if the total count of natural disasters are increasing every five years between 1953-2018 in the United States. We also want to know if average, minimum, and maximum land temperatures are increasing with certain natural disasters over time, examining the states where the frequency of natural disasters does significantly increase with temperature in greater detail. We will fit a multivariable Poisson regression model to address our question of whether the United States overall is seeing more natural disasters over time, then create multiple data visualizations to assess state-by-state differences.

Data Source

Our team used a publicly available dataset called the “Disaster Declarations Summary” (Version 2) from the Federal Emergency Management Agency (FEMA). This dataset contains summary level data on disaster declarations from 1953 to March 2019 by state and includes all three types of disaster declarations: major disaster, emergency, and fire management assistance.

For the climate change indicators, we will use publicly available data from the National Oceanic and Atmospheric Administration (NOAA). We will pull data on average, maximum and minimum temperature, and total precipitation by month by state from 1895. These values came from NOAA’s Climate at a Glance data. Please note that we chose the appropriate “parameter” on this webpage and then selected the “Download All Months/Years” option under the map for all four climate indicators.

Data Cleaning

disasters = read_csv("./data/DisasterDeclarationsSummaries2.csv") %>% 
  janitor::clean_names() %>% 
  mutate(disaster = factor(incident_type)) %>% 
  rename(
     year = fy_declared) %>% 
  filter(year >= 1953 & year <= 2018) %>% 
  count(state, year, disaster) %>%
  group_by(year, state) %>% 
  mutate(
    n_state = sum(n)) %>% 
  group_by(year) %>% 
  mutate(
    n_total = sum(n)) %>% 
  group_by(year, disaster) %>% 
  mutate(
    n_type = sum(n)) %>% 
  rename("n_disaster_state" = "n") 

Disaster Declarations Summary: This dataset was downloaded as a .csv file and imported into R. After cleaning the variable names, we renamed several variables, filtered to the appropriate year range, and kept information on state, disaster year, and disaster type. From there, we created several counts of disasters: by year, disaster type by year, per state by year, and disaster type by state by year. Some visualizations rely on count of disasters by region which was created as needed.

ave_temp = read_csv("https://www.ncdc.noaa.gov/cag/statewide/mapping/110-tavg.csv", skip = 3) %>% 
  janitor::clean_names() %>% 
  separate(date, into = c("year", "month"), sep = 4) %>%
  mutate(
    year = as.numeric(year), 
    state_code = setNames(state.abb, state.name)[location]) %>% 
  filter(year >= 1953 & year <= 2018) %>% 
  group_by(location, year) %>% 
  mutate(
    mean_temp = mean(value)) %>% 
  dplyr::select(state_code, location, year, mean_temp) %>% 
  distinct() %>% 
  ungroup()

tmin = read_csv("https://www.ncdc.noaa.gov/cag/statewide/mapping/110-tmin.csv", skip = 3)  %>% 
  janitor::clean_names() %>% 
  separate(date, into = c("year", "month"), sep = 4) %>%
 mutate(
    year = as.numeric(year), 
    state_code = setNames(state.abb, state.name)[location]) %>% 
  filter(year >= 1953 & year <= 2018) %>% 
  group_by(location, year) %>% 
  mutate(
    min_temp = min(value)) %>% 
  dplyr::select(state_code, location, year, min_temp) %>% 
  distinct() %>% 
  ungroup()

tmax = read_csv("https://www.ncdc.noaa.gov/cag/statewide/mapping/110-tmax.csv", skip = 3) %>% 
  janitor::clean_names() %>% 
  separate(date, into = c("year", "month"), sep = 4) %>%
 mutate(
    year = as.numeric(year), 
    state_code = setNames(state.abb, state.name)[location]) %>% 
  filter(year >= 1953 & year <= 2018) %>% 
  group_by(location, year) %>% 
  mutate(
    max_temp = max(value)) %>% 
  dplyr::select(state_code, location, year, max_temp) %>% 
  distinct() %>% 
  ungroup()

precip = read_csv("https://www.ncdc.noaa.gov/cag/statewide/mapping/110-pcp.csv", skip = 3) %>% 
  janitor::clean_names() %>% 
  separate(date, into = c("year", "month"), sep = 4) %>%
 mutate(
    year = as.numeric(year), 
    state_code = setNames(state.abb, state.name)[location]) %>% 
  filter(year >= 1953 & year <= 2018) %>% 
  group_by(location, year) %>% 
  mutate(
    total_precip = sum(value)) %>% 
  dplyr::select(state_code, location, year, total_precip) %>% 
  distinct() %>% 
  ungroup()

Climate Indicators: These datasets were downloaded from NOAA’s website directly and read as .csv files. Data on average temperature, minimum temperature, maximum temperature, and total precipitation were recorded monthly by state since 1953 (excluding Hawaii and Alaska). After pulling the data and cleaning variable names, we separated the year variable from their “yyyymm” formatted date variable. Next, data were grouped by state and year and summarized as follows:

  • Average Temperature The average temperature, in degrees Fahrenheit, by state across all 12 months per year was summarized.

  • Minimum Temperature: The minimum temperature, in degrees Fahrenheit, by state across all 12 months per year was summarized.

  • Maximum Temperature: The maximum temperature, in degrees Fahrenheit, by state across all 12 months per year was summarized.

  • Total Precipitation: The total precipitation, in inches, by state across all 12 months per year was totaled.

Final Dataset

final_data = 
  inner_join(ave_temp, precip, by = c("location" = "location", "year" = "year", "state_code" = "state_code")) %>% 
  inner_join(tmax, by = c("location" = "location", "year" = "year", "state_code" = "state_code")) %>% 
  inner_join(tmin, by = c("location" = "location", "year" = "year", "state_code" = "state_code")) %>% 
  full_join(disasters, by = c("year" = "year", "state_code" = "state"))

Data on climate change indicators across four datasets were inner-joined by state and year. This created a dataset with average temperature, maximum temperature, minimum temperature, and total precipitation by state and year. Then, this dataset was full-joined to the disaster dataset which had rows for each disaster by year and state.

Data Challenges

There were a couple data challenges along the way. First, our climate data had full state names (“New York”) while our disaster data had FIPS codes and state abbreviations (“NY”). The data was joined after converting the full state name to state codes in the climate data. Additionally, we did have a major issue with data availability. From 11/25/2019-12/1/2019, the NOAA data was unavailable due to scheduled maintenance. Because we pulled all of our climate data directly from the NOAA website, we were unable to work on this final project for the entire Thanksgiving break. While this created some last minute stress, it did not ultimately affect our work product.

Exploratory Analysis

The Natural Disasters & Climate Change Tool is an interactive dashboard that allows users to examine patterns of natural disasters and average temperatures in a certain US state between the years 1953-2018. Average temperature is used as a proxy for climate change in this instance.

The Interactive Map allows users to select different years (1953-2018) and variables, to demonstrate the changes in trends in disasters or climate indicators over time and to visualize differences across the country. Variable definitions can be seen above in the “Data Cleaning” section.

  • Darker areas represent higher counts or temperatures in that state across all 12 months of the year that was summarized.

  • Hovering over each state will display the statistic of the chosen variable for that state.

  • Blank states means there is no data available in that state for that year.

Summary Visualizations were generated to look at trends by region for disasters, average temperature, and total precipitation from 1953 to 2018. Specific regions had greater numbers of certain disasters and a greater frequency of all disasters. Overall, the number of disasters, average temperature, and total precipitation increased from 1953 to 2018. By hovering over the plots, we can see the number of disasters, average temperature, or precipitation amount for the year and region.

Statistical Analysis

Justification for Poisson Regression

After inspecting the summary visualizations and plotting the distribution of the data, we decided to run a Poisson regression model to formally test the hypothesis that the count of natural disasters by US region has increased from 1953 to 2018. The distribution illustrating the number of disasters per year indicated that our data was not normally distributed. Rather, our data followed a Poisson distribution, thereby justifying a Poisson model with disaster count data. Motivated by our exploratory visualizations, we also decided it was necessary to control for average temperature and precipitation.

Final Model

Our final model was:

\[ log(\lambda Count \ of \ Disasters) = \beta_0 + \beta_1 Year + \beta_2 Region + \beta_3 Average \ temp + \beta_4 Precipitation \]

We ran cross validation to confirm that a Poisson Model fit the data better than a Negative Binomial Model. While a visual comparison of the residual mean squared errors did not prove helpful, we were able to choose a model using the Goodness of Fit value, AIC.

Main Findings

Fitting the Poisson regression model, the results were significant at the 5% level of significance. After adjusting for region in the US, average temperature, and precipitation, we found that for every increase in year, the expected count of natural disasters increased by 1.034 times. This supported our hypothesis that the count of natural disasters has increased from 1953 to 2018.

Discussion

Exploratory Analysis:

Through our various visualizations, we found that both count of disasters and mean temperature increased from 1953-2018. It is also apparent that type of disaster and climate indicators vary significantly be region. Overall, the Southeast had much higher temperatures and precipitation, and a greater frequency of disasters compared to the West and Northeast. Looking at each region independently, overall, the frequency of disasters also increased.

Statistical Analysis:

We found that average temperature, region, total precipitation, and year were associated with the log of expected count of natural disasters. Overall, there was a positive association between count of natural disasters and year, which supports our hypothesis that the number of natural disasters is increasing. We found significant differences between region and count of natural disasters, most likely due to the large number of severe storms in the Midwest. Additionally, there was a positive association between precipitation and count natural disasters, but a negative association between temperature and count of natural disasters. Reflecting on the different types of natural disasters, this is justified. For example, a decrease in temperature and increase in precipitation can be associated with an increase in count of severe storms.

Some large limitations are important to consider:

  • Year: We assumed a linear relationship with the continuous variable Year (1953-2018) and log of expected count of natural disasters. This is most likely not correct and further research must consider how to best use a time related variable.

  • Type of Natural Disaster: We grouped all natural disasters together. It is important for future researchers to consider how changing temperatures and precipitation could be associated differently with different types of disasters (drought vs blizzard).

  • FEMA Reporting: We used data from FEMA reported disaster declarations which included county level reporting. For this reason, one large blizzard could be reported by 10 counties and over counted in our data.

  • Missing Data: NOAA did not have climate indicator variables for the states of Hawaii and Alaska.

  • Other Variables: Our team considered only a subset of possible climate indicator variables. Future researchers should consider how else to represent climate change.

Climate Change and Natural Disasters

We did confirm that natural disasters are increasing between 1953-2018; however, much more needs to be done to confirm that climate change indicators (temperature, precipitation, and more) are associated with this trend.

Contributors: Holly Finertie | Matt Curran | Jana Lee | Ashley Tseng | Lulu Zhang