About Me

I am an #rstats and open science enthusiast. I currently work at Highmark Health, managing a team of researchers. Outside of my day job, I serve as a methodologist on several research teams and develop accessible data tools. This site contains my blog and other open resources for learning data analysis.


Recent Posts

`dtplyr` and `tidyfast` are teaming up (well, at least in this blog post)

With the advent of a more cohesive and complete dtplyr, I’ve been wanting to write about how it can be combined with tidyfast to keep the syntax of the tidyverse while relying on the speed and efficiency of data.table. This workflow is already being adopted by some, including Ivan Leung, who posted:

Six Things I Learned While Making `tidyfast`

This post highlights six major themes of what I learned while creating the tidyfast R package. This process taught me about the tidyverse, data.table, R, and data science in general.

Guest Post: Interpreting Interactions in Multilevel Models

This is a guest post by Jeremy Haynes, a doctoral student at Utah State University.

Fast and Readable 'If Else' in R

As I’ve spent time learning about different approaches to working with data, I’ve seen several subtle, but important, differences in how to do things. This very short post presents how one can perform vectorized “if else” operations in R. The idea of “if else” basically is:

Data Joins: Speed and Efficiency of `dplyr` and `data.table`

This short post looks at data joins in both dplyr and data.table. There are a lot of moving parts when assessing these things, so the results here apply only to this situation and may differ in others. Even so, they are quite instructive.

Comparing Efficiency and Speed of `data.table`: Adding variables, filtering rows, and summarizing by group

Lately, I have used the data.table package to do some of my data wrangling. It has been a fun adventure (the nerd type of fun). This was made more meaningful with the renewed development of the dtplyr package by Hadley Wickham and co. Here I introduce some of the distinctive behavior of data.table.