Q & A, Discussion, Integrated Assignments, and Working with Your Own Data
In this section we provide some additional exercises covering a range of topics to reinforce concepts and topics throughout this course series. We encourage students to attempt to do these exercises on their own. We have provided hints and an answer for each exercise however these should be used only as a last resort, students should first try searching for solutions throughout this course and other available resources throughout the web.
In 1854 there was cholera epedemic in the Soho district of London kown as the Golden square outbreak. Ultimately a particularly virulent strain of the disease caused the deaths of 616 individuals. At this time there were two competing theories as to the cause of the outbreak. The commonly held miasma theory postulated that foul air from decaying organic matter was the cause of the disease. A physician by the name of John Snow had published years earlier the competing germ theory, specifically postulating that cholera was caused by the presence of as yet unknown germ cells which contaminated water. The Golden square outbreak allowed John Snow with the help of Henry Whitehead to map the deaths of the outbreak in relation to public water pumps around the area. Eventually this work led to the debunking of miasma theory. In this exercise try and recreate the famous map originally created by John Snow to support his theory, an example of which is shown below. You’ll need to install the package cholera and use the data frames specified below.
- topics covered: ggplot2, basic R
- difficulty: 3/5
install.packages(cholera) data(roads) data(pumps) data(fatalities.address) data(pump.case)
You shouldn't need to alter the roads dataframe to plot it with ggplot, take a look at the group aesthetic!
you need to merge the fatalities.address and pump.case data frames but first you'll need to convert pump.case to a data frame, look at the stack() function!
Download an Rscript with the answer Here.
- roads: Data frame providing the x/y coordinates for road start and end points grouped by street.
- pumps: Data frame providing coordinates and names for water pumps.
- fatalities.address: Data frame providing coordinates for each anchor case address for a case of cholera (i.e. address of the first cholera case at an address)
- pump.case: list of vectors associating each anchor case with a water pump id.
Module 6 Lecture