Inference for High-dimensional Data

The Jackson Laboratory

Online

July 11, 12, 14, 18, 19 & 21, 2022

1-3pm Eastern Daylight Time

Instructors: Mitch Kostich, Dan Gatti, Sue McClatchy, Fernando Cervantes Sanchez

Helpers: Olaitan Awe, Ahmed Sadeque, Neil Kindlon


General Information

High-throughput technologies have changed basic biology and the biomedical sciences from data-poor disciplines to data-intensive ones. A specific example comes from research fields interested in understanding gene expression. In the 1990s, the analysis of gene expression data amounted to spotting black dots on a piece of paper or extracting a few numbers from standard curves. With high-throughput technologies, this suddenly changed to sifting through tens of thousands of numbers. Biologists went from using their eyes or simple summaries to categorize results to having thousands (and now millions) of measurements per sample to analyze. In this lesson we will focus on statistical inference in the context of high-throughput measurements. Specifically, we focus on the problem of detecting differences between groups using statistical tests and quantifying uncertainty in a meaningful way. We also introduce exploratory data analysis techniques that should be used in conjunction with inference when analyzing high-throughput data. Lesson material is derived from the HarvardX Biomedical Data Science series, part of which is published as the book Data Analysis for the Life Sciences (Irizarry & Love, 2016).

Who: The course is aimed at researchers at the Jackson Laboratory and elsewhere who want to analyze large-scale data. You need R programming experience to get the most from this workshop. Statistical Inference for Biology is recommended as well.

Where: This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.

When: July 11, 12, 14, 18, 19 & 21, 2022. Add to your Google Calendar.

Requirements: Participants must have access to a computer with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.). They should have a few specific software packages installed (listed below).

Accessibility: We are dedicated to providing a positive and accessible learning environment for all. Please notify the instructors in advance of the workshop if you require any accommodations or if there is anything we can do to make this workshop more accessible to you.

Contact: Please email susan.mcclatchy@jax.org for more information.

Roles: To learn more about the roles at the workshop (who will be doing what), refer to our Workshop FAQ.


Collaborative Notes

We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.


Surveys

Please be sure to complete this survey after the workshop.

Post-workshop Survey


Schedule

Monday, July 11

1:00 Overview and introductions
1:15 Example Gene Expression Datasets
2:00 Coffee/tea break
2:10 Basic inference for high-throughput data
2:55 Wrap-up
3:00 END

Tuesday, July 12

1:00 Procedures for Multiple Comparisons
1:50 Error Rates
2:00 Coffee/tea break
2:10 The Bonferroni Correction
2:55 Wrap-up
3:00 END

Thursday, July 14

1:00 False Discovery Rate
2:00 Coffee/tea break
2:10 Direct Approach to FDR and q-values
2:55 Wrap-up
3:00 END

Monday, July 18

1:00 Basic EDA for high-throughput data
2:00 Coffee/tea break
2:10 Principal Components Analysis
2:55 Wrap-up
3:00 END

Tuesday, July 19

1:00 Statistical Models
2:00 Coffee/tea break
2:10 Statistical Models (continued)
2:45 Wrap-up
2:50 Post-workshop Survey
3:00 END

Setup

To participate in a workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may be useful to participants as well.

Install the videoconferencing client

If you haven't used Zoom before, go to the official website to download and install the Zoom client for your computer.

Set up your workspace

Like other Carpentries workshops, you will be learning by "coding along" with the Instructors. To do this, you will need to have both the window for the tool you will be learning about (a terminal, RStudio, your web browser, etc.) and the window for the Zoom video conference client open. In order to see both at once, we recommend using one of the following setup options:

This blog post includes detailed information on how to set up your screen to follow along during the workshop.