640:338:01 Discrete and Probabilistic Models in Biology
Spring 2002

IMPORTANT! This is the NEW Math 338.

General Information

See the Main course page for general information about this course.

The Spring 2002 semester will focus on probabilistic and dynamic programming methods in the analysis of biological sequences. This is the first time the course is being offered, and the syllabus is an experiment. It may contain more than we can actually cover, so we may end up editing it as we go along, but I hope we can hit all the high points.

The first part of the course is an introduction; we study elementary probability models arising in coverage analysis of DNA sequencing strategies, and we introduce the basic statistical notions of hypothesis testing and maximum likelihood estimation. The probability models will give an opportunity to review the use of important models and techniques from probability and to introduce the Poisson process. This part of the course will rely on class lecture notes and handouts.
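As a small illustration of the kind of coverage calculation mentioned above, here is a minimal sketch that assumes read start positions follow a Poisson process along the genome. Under that assumption, with coverage c = (number of reads x read length) / genome length, the expected fraction of bases missed by every read is approximately exp(-c). The function name and parameters are hypothetical, chosen only for this example, and the course notes may set up the model differently.

```python
import math

def expected_uncovered_fraction(num_reads, read_length, genome_length):
    """Approximate fraction of bases covered by no read.

    Assumes read starts form a Poisson process along the genome and
    ignores edge effects at the ends of the genome.
    """
    coverage = num_reads * read_length / genome_length
    return math.exp(-coverage)

# Example: 1000 reads of length 500 on a 100,000-base genome gives coverage 5.
fraction = expected_uncovered_fraction(1000, 500, 100_000)
print(f"coverage 5 leaves about {fraction:.4%} of bases uncovered")
```

Doubling the number of reads squares the uncovered fraction, which is why the last few gaps in a sequencing project are expensive to close by random sampling alone.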

The second part of the course will study alignment of biological sequences using Markov chains and dynamic programming. Dynamic programming will be introduced as a technique for solving the combinatorial optimization problem of optimal alignment. Markov chains and hidden Markov chains will be defined at an elementary level and explored as models for sequences and for the study of alignment. This part of the course will be based on chapters 2-6 of the text.
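To give a feel for how dynamic programming solves the optimal alignment problem, here is a minimal sketch of global alignment scoring in the style of the Needleman-Wunsch algorithm. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name are assumptions made for this example, not the scoring scheme used in the course or the text.

```python
def global_alignment_score(x, y, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences x and y.

    score[i][j] holds the best score for aligning x[:i] with y[:j].
    The scoring parameters are illustrative assumptions.
    """
    n, m = len(x), len(y)
    score = [[0] * (m + 1) for _ in range(n + 1)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = match if x[i - 1] == y[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align x[i-1] with y[j-1]
                score[i - 1][j] + gap,               # x[i-1] aligned to a gap
                score[i][j - 1] + gap,               # y[j-1] aligned to a gap
            )
    return score[n][m]

print(global_alignment_score("ACGT", "AGT"))
```

The table has (n+1)(m+1) entries and each is filled in constant time, so the optimal score is found in time proportional to the product of the sequence lengths, even though the number of possible alignments grows exponentially.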

The final part of the course will feature more advanced Markov chain theory; possible applications include models for the evolution of sequences, the BLAST search method, and some elementary population genetics models.

This course will be more lecture-driven than text-driven, and much of it will develop material outside the text. The text has much interesting material, but it does not cover all the topics of interest and lacks some of the mathematical development needed.

  1. Class time and place: Tuesday/Thursday 4:30-5:50 PM (6th period), in SEC-218
  2. Text: Durbin, Eddy, Krogh, and Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998; ISBN: 0 521 62971 3.
  3. Instructor: Daniel Ocone - 518 Hill Center - ocone@math. Office hours: T5, Th2, or by appointment.
  4. Teaching Assistant: Pieter Blue - 624 Hill Center. Office hours: M6, Th4.

Information and Resources

Syllabus and homework assignments

Course notes and handouts

Problem solutions and lecture slides; additional problem solutions for Problem Sets 1-4.