Introducing the Furniture R Package!
01 Aug 2017
This has been updated to work with the most recent version of the furniture package.
Introducing furniture
This R package provides “furniture” for quantitative researchers.
Furniture is meant to be used and enjoyed. - Natalie Morales
Natalie Morales is right. Furniture is meant for our enjoyment. This package provides functions that are just like furniture: they give you something to look at, but they are also there to make your life better.
I know there are over 9,000 packages on CRAN alone, but there are good reasons to pay some attention to this one. furniture contains functions that are particularly useful for both exploratory data analysis and publishing your results. In conjunction with the tidy tools that Hadley Wickham and the RStudio team have developed, furniture becomes a valuable tool for understanding your data and communicating it.
I’ll demonstrate, using data from the 2011-2012 NHANES release, how we can explore the relationship between demographic characteristics and dietary and other health behaviors in children and adolescents. I have provided the data here.
Before we start, note that you can install the package in two ways. The first, the stable version (1.0.1) on CRAN, can be installed via:
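Assuming a standard CRAN setup, this is the usual one-liner (the guard around it is only there so the package is not reinstalled if it is already present):

```r
# Install the stable release from CRAN only if it is not already installed
if (!requireNamespace("furniture", quietly = TRUE)) {
  install.packages("furniture")
}
```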
The second option is the development version (1.1.0), which can be installed from GitHub via:
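A minimal sketch, assuming the devtools package is available and that the repository lives at tysonstanley/furniture (the repository path is an assumption; adjust it if the package is hosted elsewhere):

```r
# install.packages("devtools")  # uncomment if devtools is not yet installed

# Repository path is assumed, not taken from this post
devtools::install_github("tysonstanley/furniture")
```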
Example
Using NHANES, we are going to explore the relationship between asthma and activity level in children and adolescents. Activity level will be measured by hours spent watching TV each day and by the number of times a week the child is active for 60 minutes during the day.
We will start by setting the working directory (wherever you downloaded the data to…) and loading some packages that will be useful for us here.
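A minimal sketch of that setup. The path is a hypothetical placeholder, and loading the whole tidyverse is one reasonable choice rather than a requirement:

```r
# Hypothetical path: replace with the folder where you saved the data
# setwd("path/to/your/data")

library(furniture)  # washer() and table1()
library(tidyverse)  # dplyr verbs, the %>% operator, ggplot2, readr, ...
```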
After that, we are going to import the data. Below, I show the code using the tidyverse framework and the %>% operator. You can learn more about that using this cheat sheet. There are many other resources to look into; Hadley Wickham and RStudio are the developers to follow.
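The import-and-filter pattern might look roughly like the sketch below. Everything in it is illustrative: the toy data frame stands in for the NHANES file, the column names are borrowed from the table later in this post, the `income` column is invented, and the age cutoff of 18 is an assumption.

```r
library(dplyr)

# Toy stand-in for the NHANES file; values are made up for illustration.
# In practice `raw` would come from something like read.csv("your_file.csv").
raw <- data.frame(
  asthma = c("No", "Yes", "No", "No", "Yes", "No"),
  act60  = c(7, 5, 99, 6, 7, 4),
  tv_hrs = c(2, 3, 1, 77, 2, 0),
  gender = c("male", "female", "female", "male", "male", "female"),
  age    = c(8, 12, 35, 16, 5, 41),
  income = c(1, 2, 3, 4, 5, 6)    # an extra column we do not need
)

d <- raw %>%
  filter(age < 18) %>%                        # assumed cutoff for children and adolescents
  select(asthma, act60, tv_hrs, gender, age)  # keep only the variables of interest

d
```

Each step reads left to right: take `raw`, keep the rows we want, then keep the columns we want.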
Now we have a data frame d that has our variables and only contains the children and adolescents in the data. We are going to demonstrate two of the functions in furniture:
- washer
- table1
First, washer takes a variable and several values and changes those values to another value (the default is NA). Here we are replacing placeholder values in the data with NA.
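A small sketch of washer on a made-up vector. The codes 77 and 99 are used here only as illustrative placeholder values, not as a statement about how this particular data set is coded:

```r
library(furniture)

# Made-up values; 77 and 99 play the role of placeholder codes
tv_hrs <- c(2, 3, 77, 1, 99, 0)

# Default behavior: every listed value becomes NA
cleaned <- washer(tv_hrs, 77, 99)

# The replacement can be changed through the `value` argument
recoded <- washer(tv_hrs, 77, 99, value = 0)

cleaned
recoded
```

Inside a pipeline, the same call can be wrapped in dplyr's mutate() to clean a column in place.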
Second, we are showing table1. This is a powerful function that takes a data frame and creates a table of descriptive statistics. We gave it 4 variables to get descriptives on, stratified by our asthma variable. We also set test = TRUE, which provides bivariate tests of significance. This is a great function for getting an early idea of the relationships in the data. I recommend doing this early on to get a good idea of how the data look. (In fact, I recommend doing exploratory data analysis early in any project. This can be done using the ggplot2 package for visualization and by using table1().)
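A sketch of such a call, run on simulated data so that it is self-contained. The variable names mirror the ones used in this post, but the values are random, so the numbers it prints will not match the NHANES results:

```r
library(furniture)

set.seed(2017)
n <- 200

# Simulated stand-in for the cleaned data frame
toy <- data.frame(
  asthma = factor(sample(c("No", "Yes"), n, replace = TRUE, prob = c(0.8, 0.2))),
  act60  = sample(0:7, n, replace = TRUE),
  tv_hrs = sample(0:5, n, replace = TRUE),
  gender = factor(sample(c("male", "female"), n, replace = TRUE)),
  age    = sample(2:17, n, replace = TRUE)
)

# Four variables, stratified by asthma, with bivariate tests
tab <- table1(toy, act60, tv_hrs, gender, age,
              splitby = ~asthma,
              test = TRUE)
tab
```

Numeric variables are summarized as mean (SD) and factors as counts with percentages, which is why gender is stored as a factor here.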
We also model the data using a Poisson distribution and a log link (our outcomes are counts, so this type of model generally fits the data well). We get the average marginal effects (based on the derivative) and then rescale the reported average marginal effect to reflect minutes instead of hours.
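One way to sketch this idea in base R, using simulated data. The covariates, the coefficients used to simulate the data, and the choice of tv_hrs as the outcome are all assumptions for illustration. With a log link, the derivative of the expected count with respect to a predictor is its coefficient times the fitted mean, so averaging that over the sample gives a derivative-based average marginal effect:

```r
set.seed(1)
n <- 500

# Simulated data; not the NHANES values
toy <- data.frame(
  asthma = rbinom(n, 1, 0.2),
  age    = sample(2:17, n, replace = TRUE)
)
toy$tv_hrs <- rpois(n, lambda = exp(0.5 + 0.1 * toy$asthma + 0.01 * toy$age))

# Poisson regression with a log link
fit <- glm(tv_hrs ~ asthma + age, family = poisson(link = "log"), data = toy)

# Derivative-based average marginal effect of asthma:
# d E[y | x] / d asthma = beta_asthma * exp(x'beta), averaged over observations
ame_hours <- unname(coef(fit)["asthma"]) * mean(fitted(fit))

# Rescale from hours to minutes
ame_minutes <- ame_hours * 60

c(hours = ame_hours, minutes = ame_minutes)
```

For a binary predictor such as asthma, a discrete difference in predicted counts is a common alternative to the derivative; the two usually give similar answers when the coefficient is small.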
Output of table1, stratified by asthma (No / Yes):

              No             Yes            Test                P-Value
Observations  3160           638
act60                                       T-Test: -0.97       0.33
              6.2 (1.7)      6.29 (1.58)
tv_hrs                                      T-Test: -2.56       0.011
              2.04 (1.48)    2.22 (1.55)
gender                                      Chi Square: 14.91   0.00
   male       1550 (49.1%)   367 (57.5%)
   female     1610 (50.9%)   271 (42.5%)
age                                         T-Test: -6.73       0.00
              8.75 (5.39)    10.28 (5.21)
I hope this demonstrated, to a small degree, the benefits the furniture package offers. I will certainly post on more in-depth uses of each of the functions in furniture (e.g., the many ways to use the table1 function, including producing a publication-ready table).
To review:
- There is a simple data cleaning tool in furniture (i.e., washer).
- There is a great tool for exploratory data analysis and communication in table1. It provides a simple way to get important information about the means and counts of the variables of interest and an understanding of the relationships in the data. Further, it is well formatted for easy reporting, potentially in a publishable report.
If you have suggestions, or find a bug, please comment below or email me: t.barrett@aggiemail.usu.edu.