This lesson is still being designed and assembled (Pre-Alpha version)

Machine Learning for Biomedical Science: Glossary

Key Points

Introduction
  • Machine learning predicts outcomes from data.

  • For example, machine learning can be used to discover new kinds of cancer or to predict drug response in biomedical studies.

Clustering
  • We have used the Euclidean distance to define the dissimilarity between samples; however, other metrics can be used, depending on the prior knowledge we have about our data.
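
As an illustration of how the choice of metric changes which samples count as similar, the sketch below compares the Euclidean distance with a correlation-based distance (one minus the Pearson correlation). The two samples and the function names are hypothetical, chosen only for this example; the correlation distance is one possible alternative, not one the lesson prescribes.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def correlation_distance(a, b):
    """One minus the Pearson correlation: small when the profiles
    rise and fall together, regardless of their overall scale."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

# Hypothetical samples: same pattern, measured on very different scales.
sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [10.0, 20.0, 30.0, 40.0]

d_euc = euclidean_distance(sample_a, sample_b)    # large: the scales differ
d_cor = correlation_distance(sample_a, sample_b)  # about 0: the patterns match

print(f"Euclidean: {d_euc:.2f}, correlation-based: {d_cor:.2f}")
```

If prior knowledge says that only the shape of a profile matters (and not its magnitude), a correlation-based distance would group these two samples together, whereas the Euclidean distance would keep them far apart.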

Conditional Probabilities and Expectations
  • For categorical or discrete variables we have used strict conditions (i.e. X=x); however, conditioning can also be applied to continuous variables by using ranges instead (e.g. X>=x, X<=x, or a<X<b).
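
The two kinds of condition can be sketched with the same helper: a conditional expectation is estimated by averaging Y over the observations whose X satisfies the condition. The data and the `conditional_mean` helper below are hypothetical, made up only to illustrate the idea.

```python
def conditional_mean(pairs, condition):
    """Estimate E[Y | condition on X] as the mean of y over the
    (x, y) pairs whose x satisfies the condition."""
    selected = [y for x, y in pairs if condition(x)]
    if not selected:
        raise ValueError("no observations satisfy the condition")
    return sum(selected) / len(selected)

# Hypothetical (x, y) observations.
data = [(1.0, 2.0), (2.0, 4.0), (2.0, 6.0),
        (3.5, 7.0), (4.0, 9.0), (5.0, 10.0)]

# Strict condition, as used for discrete variables: E[Y | X = 2]
mean_exact = conditional_mean(data, lambda x: x == 2.0)       # (4 + 6) / 2 = 5.0

# Range condition, as used for continuous variables: E[Y | 3 < X < 5]
mean_range = conditional_mean(data, lambda x: 3.0 < x < 5.0)  # (7 + 9) / 2 = 8.0

print(mean_exact, mean_range)
```

With truly continuous measurements an exact match such as X=x would rarely select any observation, which is why a range is the practical choice there.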

Smoothing
  • Smoothing methods work well inside the range of predictor values seen in the training set; however, they are not suitable for extrapolating predictions outside that range.
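
A minimal sketch of why extrapolation fails, assuming a simple bin smoother (the average of the training responses whose predictor falls within a fixed window around the query point). The training data, the window width, and the `bin_smooth` function are hypothetical; other smoothers differ in detail but depend on nearby training points in a similar way.

```python
def bin_smooth(train_x, train_y, x0, half_width):
    """Average the training responses whose predictor lies within
    half_width of x0. Returns None when the window is empty."""
    window = [y for x, y in zip(train_x, train_y) if abs(x - x0) <= half_width]
    if not window:
        return None
    return sum(window) / len(window)

# Hypothetical training set following y = x^2 on the range [0, 5].
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]

# Inside the training range: close to the true value 2.5^2 = 6.25.
inside = bin_smooth(train_x, train_y, 2.5, 1.0)   # (4 + 9) / 2 = 6.5

# Just past the edge: only one neighbour is left (true value is 30.25).
edge = bin_smooth(train_x, train_y, 5.5, 1.0)     # 25.0

# Far outside the range: no training points in the window at all.
outside = bin_smooth(train_x, train_y, 8.0, 1.0)  # None

print(inside, edge, outside)
```

The estimate degrades as soon as the query point leaves the range covered by the training set, because the smoother has fewer and fewer (and eventually no) neighbouring observations to average.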

Class Prediction
  • Data quality matters. Garbage in, garbage out!

Cross-validation
  • The mean validation error obtained from cross-validation is a better approximation of the test error (real-world data) than the training error itself.
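
The gap between training error and validation error can be sketched with a 1-nearest-neighbour classifier, which always scores a training error of zero because every training point is its own nearest neighbour. The data set, the fold assignment, and all function names below are hypothetical and serve only as an illustration under those assumptions.

```python
def predict_1nn(train, x0):
    """Return the label of the training point closest to x0."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x0))
    return nearest[1]

def error_rate(train, evaluation):
    """Fraction of evaluation points misclassified by 1-NN fitted on train."""
    wrong = sum(1 for x, y in evaluation if predict_1nn(train, x) != y)
    return wrong / len(evaluation)

def k_fold_error(data, k):
    """Mean validation error over k folds (fold i takes every k-th point)."""
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        validation = folds[i]
        training = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(error_rate(training, validation))
    return sum(errors) / k

# Hypothetical noisy one-dimensional data: (predictor, class label).
data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 1), (2.5, 0),
        (3.0, 1), (3.5, 1), (4.0, 0), (4.5, 1), (5.0, 1)]

training_error = error_rate(data, data)  # 0.0: each point matches itself
cv_error = k_fold_error(data, 5)         # held-out points reveal the noise

print(f"training error: {training_error:.2f}, CV error: {cv_error:.2f}")
```

Here the training error suggests a perfect classifier, while the cross-validation error, computed only on points the model has not seen, gives a more realistic picture of how it would behave on new data.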

Glossary

FIXME