# Install the packages once if needed (uncomment the install lines to run them)
# install.packages("naniar")
library(naniar)
# install.packages("Hmisc")
library(Hmisc)
Missing data is an advanced topic, and this document does not provide a comprehensive treatment of how to handle missing data in statistical modeling. These are just two tools that I’m fond of. For this document, I’ll be using the built-in airquality dataset. Before you proceed, make sure to install the naniar and Hmisc R packages.
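The naniar package provides many tools for understanding the nature of the missingness. Its vis_miss() function draws every cell of the data frame as either missing or present, so you can see at a glance which variables have gaps and where in the data they fall.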
data("airquality")
vis_miss(airquality)
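To put numbers next to the plot, here is a minimal sketch in base R that counts the missing values in each column. The names na_counts, na_pct, and miss_summary are my own choices for illustration, not part of naniar; naniar's miss_var_summary() produces a similar per-variable table if you prefer to stay within the package.

```r
data("airquality")

# Count of missing values in each column
na_counts <- colSums(is.na(airquality))

# Percentage of missing values in each column, rounded to one decimal place
na_pct <- round(100 * colMeans(is.na(airquality)), 1)

# Combine into one table (illustrative names), most-missing variable first
miss_summary <- data.frame(
  variable = names(na_counts),
  n_miss   = as.integer(na_counts),
  pct_miss = as.numeric(na_pct),
  row.names = NULL
)
miss_summary <- miss_summary[order(-miss_summary$n_miss), ]

print(miss_summary)
```

Only Ozone and Solar.R should show any missing values; the other four columns are complete.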
naniar also includes the upset plot to explore the patterns of missingness. We can clearly see there are only two observations that are missing both Ozone and Solar.R.
gg_miss_upset(airquality)
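If you want to double-check what the upset plot shows, the following base R sketch cross-tabulates the missingness of the two variables. The object names (miss_ozone, miss_solar, pattern_counts, both_missing) are assumptions of mine for illustration only.

```r
data("airquality")

# Logical indicators: TRUE where the value is missing
miss_ozone <- is.na(airquality$Ozone)
miss_solar <- is.na(airquality$Solar.R)

# Cross-tabulate the two indicators to count each missingness pattern
pattern_counts <- table(Ozone_missing = miss_ozone,
                        Solar.R_missing = miss_solar)
print(pattern_counts)

# Number of rows missing both variables, and which rows they are
both_missing <- sum(miss_ozone & miss_solar)
both_rows <- which(miss_ozone & miss_solar)
both_missing
both_rows
```

The TRUE/TRUE cell of the table corresponds to the intersection bar in the upset plot, and the FALSE/FALSE cell counts the rows with both variables observed.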