Machine Learning with Python

The Jackson Laboratory for Genomic Medicine

Feb 13, 14, 20 & 21, 2020

10:00 am - 3:00 pm

Instructors: Asli Uyar, Asa Thibodeau

Helpers:

General Information

Machine learning (ML) extracts knowledge from data and focuses on prediction: it learns from existing data how to make decisions about future observations. It is common in everyday interactions with Facebook, Google, Amazon, or your favorite automated teller machine (ATM). Equally important are its applications in science, such as personalized cancer treatment, medical diagnosis, and drug discovery. ML is essential in data-driven sciences. This workshop series consists of four sessions of hands-on practice with the Python scikit-learn library.

At the end of this course, participants will be able to:

Who: This workshop is aimed at graduate students and other researchers who would like to learn more about machine learning for biomedical data. This workshop is open to those from neighboring institutions. Prerequisite: Competence with Python and the basics of the Pandas and NumPy libraries.

Where: Holt Conference Room, 10 Discovery Drive, Farmington, Connecticut. Get directions with OpenStreetMap or Google Maps.

When: Feb 13, 14, 20 & 21, 2020. Add to your Google Calendar.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.). They should have a recent version of Python 3 installed (see below).

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide what you need.

Contact: Please email susan.mcclatchy@jax.org for more information.


Code of Conduct

Everyone who participates in Carpentries activities is required to conform to the Code of Conduct.


Collaborative Notes

We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.


Please complete this brief anonymous post-workshop survey.


Schedule

All lesson materials, including notebooks and data, can be found in this GitHub repository.

Thursday, Feb 13

1:00 Python review
3:30 Break
3:45 Introduction to Machine Learning and Classification
5:00 END

Friday, Feb 14

10:00 Data pre-processing
12:00 Lunch
1:00 Model performance
3:00 END

Thursday, Feb 20

10:00 Classification algorithms
12:00 Lunch
1:00 Unsupervised learning
3:00 END

Friday, Feb 21

10:00 Neural networks
12:00 Lunch
1:00 Deep learning
3:00 END

Course Materials

Mouse protein 2f2c data

Course presentations, notebooks, and documentation


Setup

To participate in a workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.

We maintain a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may be useful to participants as well.

Python

Python is a popular language for research computing, and great for general-purpose programming as well. Installing all of its research packages individually can be a bit difficult, so we recommend Anaconda, an all-in-one installer.

Regardless of how you choose to install it, please make sure you install Python version 3.x (e.g., 3.6 is fine).
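Once installation is finished, one way to confirm the setup is a short check like the sketch below. It is only an illustration, not part of the official workshop materials: the function name `check_environment` and the package list are assumptions based on the libraries named on this page (NumPy, Pandas, and scikit-learn, whose import name is `sklearn`).

```python
import importlib.util
import sys

# Packages named on this page; "sklearn" is the import name of scikit-learn.
# This list is an assumption, not an official requirements file.
REQUIRED_PACKAGES = ("numpy", "pandas", "sklearn")


def check_environment(packages=REQUIRED_PACKAGES):
    """Return a dict mapping each check to True (ok) or False (problem)."""
    report = {"python3": sys.version_info.major == 3}
    for name in packages:
        # find_spec returns None when a top-level package cannot be found,
        # so nothing is actually imported here.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for check, ok in check_environment().items():
        print(f"{check:10s} {'ok' if ok else 'MISSING'}")
```

Saving this as a file and running it with `python`, or pasting it into a Jupyter Notebook cell, should print one line per check. Any line marked MISSING suggests the corresponding package is not visible to the Python interpreter being used.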

We will teach Python using the Jupyter Notebook, a programming environment that runs in a web browser (Jupyter Notebook will be installed by Anaconda). For this to work you will need a reasonably up-to-date browser. The current versions of the Chrome, Safari and Firefox browsers are all supported (some older browsers, including Internet Explorer version 9 and below, are not).

Windows

Video Tutorial
  1. Open https://www.anaconda.com/distribution/#download-section with your web browser.
  2. Download the Anaconda for Windows installer with Python 3. (If you are not sure which version to choose, you probably want the 64-bit Graphical Installer Anaconda3-...-Windows-x86_64.exe)
  3. Install Python 3 by running the Anaconda Installer. Use all of the installation defaults, except make sure to check Add Anaconda to my PATH environment variable.

Linux

  1. Open https://www.anaconda.com/distribution/#download-section with your web browser.
  2. Download the Anaconda Installer with Python 3 for Linux.
    (The installation requires using the shell. If you aren't comfortable doing the installation yourself, stop here and request help at the workshop.)
  3. Open a terminal window and navigate to the directory where the installer was downloaded (e.g., `cd ~/Downloads`).
  4. Type
    bash Anaconda3-
    and then press Tab to autocomplete the full file name. The name of the file you just downloaded should appear.
  5. Press Enter. You will follow the text-only prompts. To move through the text, press Spacebar. Type yes and press Enter to approve the license. Press Enter to approve the default location for the files. Type yes and press Enter to prepend Anaconda to your PATH (this makes the Anaconda distribution the default Python).
  6. Close the terminal window.