reproducible research

Income Assistance in Alberta

Graphing the trends of Assured Income for the Severely Handicapped (AISH)

Florida Suicides (2) - Youth Trends

Graphing the trends of suicides in Florida from 2006 to 2017 among youth between 10 and 24 years of age.

Florida Suicides (1) - General Trends

Graphing the trends of suicides in Florida from 2006 to 2017, exploring the differences in age, gender, and race among persons 10 years and older

Florida Demographic Growth

This blog post shows how to extract population estimates reported by the Florida Department of Health and prepare them for analysis, specifically for exploring the trends in demographic growth between 2006 and 2020

Managing Data Analysis with RStudio

A recent example of 1) interpreting models through graphs rather than parameters and 2) using a self-contained RMarkdown notebook vs. an .R + .Rmd split

Managing Data Analysis with RStudio

The workshop introduces R and RStudio and makes the case for project-oriented workflows for applied data analysis. Using logistic regression on Titanic data as an example, the participants will learn to communicate statistical findings more effectively, and will evaluate the advantages of using computational notebooks in RStudio to disseminate the results

Implementing Reproducible Visualizations

Visualising results of statistical modeling is a key component of data science workflow. Statistical graphs are often the best means to explain and promote research findings. However, in order to find that one graph that tells the story worth sharing, we sometimes have to try out and sift through many data visualizations. How should we approach such a task? What can we do to make it easier from both production and evaluation perspectives?

What Lies Beyond Acute Care Data

Using service utilization data of 4,067 residents of Vancouver Island with severe alcohol addiction, we demonstrate the cross-continuum terrain of health services in Vancouver Island Health Authority.

Visualizing Logistic Regression

Visualising results of statistical modeling is a key component of data science workflow. Statistical graphs are often the best means to explain and promote research findings. However, in order to find that one graph that tells the story worth sharing, we sometimes have to try out and sift through many data visualizations. How should we approach such a task? What can we do to make it easier from both production and evaluation perspectives?

When notebooks are not enough

While computational notebooks offer scientists and engineers many helpful features, the limitations of this medium make it but a starting point in creating software, the practical goal of data science. Where do we go from computational notebooks if our projects require multiple interconnected scripts and dynamic documents? How do we ensure reproducibility amid the growing complexity of analyses and operations? I will use a concrete analytical example to demonstrate how constructing workflows for reproducible analyses can serve as the next step from computational notebooks towards creating analytical software.