# Discrete and Probabilistic Models in Biology [MATH 338] (Section 01), Spring 2019

## General Information

• Instructor: James Greene
• Email: j{dot}c{dot}greene{at}rutgers{dot}edu
• Office: 216 Hill Center
• Office hours: M: 11-12, Th: 12-1, or by appointment
• Class location: HLL-009 (Busch campus)
• Class times: Tuesday and Thursday, 5-6:20 pm
• Textbook: Professor Ocone's free notes, which can be found in Chapter form in Resources. Note that I will probably deviate from the notes a bit as the class progresses.

## Exam Dates (tentative)

• Exam 1: Thursday, February 28th, in class

Note that the review problems are not all-encompassing, and you are responsible for all material covered in class. These are just a sample of the types of problems to expect. Also, the review set is much longer than the actual exam will be, so do not interpret it as a practice exam. If you find any issues with any of the problems (for example, if you think one is incorrect as stated), please let me know. This document was written in haste.

• Exam 2: April 11th, in class

The notes given for Exam 1 above also apply to this set of review problems.

• Exam 3 (Final): May 15th, 4-7 pm, HLL-009

Most (but not all) problems are related to material since Exam 2. Please see the previous two review sheets for more problems related to earlier material.

## Catalog Description

Models for biological processes based on discrete mathematics (graphs, combinatorics) and probabilistic and optimization methods, such as Markov chains and Markov fields, Monte-Carlo simulation, maximum-likelihood estimation, entropy and information. Applications selected from epidemiology, inheritance and genetic drift, combinatorics and sequence alignment of nucleic acids, energy optimization in protein structure prediction, topology of biological molecules.

## Prerequisites

MATH 250, MATH 251, AND (MATH 477 OR CS 206 OR STAT 381). In layman’s terms, you should have a good understanding of the standard engineering calculus sequence through multivariable calculus, as well as linear algebra. More importantly, you should have a solid foundation in probability theory, as most of the analysis and models in this course will be based on stochastic processes. Of course, I will review theory as we encounter it, but that review most likely won’t be sufficient as a stand-alone treatment. For a good self-check, see the material (and more importantly, the exercises) appearing in Chapter 2 of the notes. If you have concerns about the prerequisites, please feel free to contact me.

## Grading

Formally, grades will be determined by homework, quizzes, and three exams. The breakdown is as follows:

| Component | Weight |
| --- | --- |
| Homework | 25% |
| Quizzes | 10% |
| Midterm I | 20% |
| Midterm II | 20% |
| Final Exam | 25% |

Standard grading cutoffs apply (A ≥ 90%, 80% ≤ B < 90%, etc.), but will possibly be adjusted (in the student’s favor only) for “borderline cases” and if I deem a curve necessary. Note that any curving will be calculated at the end of the semester, but distributions for exams and homework will be provided.
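As a rough illustration of how the weights combine, here is a minimal Python sketch. It assumes every component is recorded as a percentage out of 100, that homework assignments count equally, and that the two lowest homework grades are dropped before averaging. The function names and sample scores are hypothetical, and the sketch ignores any curve or borderline adjustment.

```python
# Weights copied from the grading breakdown; they sum to 100%.
WEIGHTS = {
    "homework": 0.25,
    "quizzes": 0.10,
    "midterm1": 0.20,
    "midterm2": 0.20,
    "final": 0.25,
}


def homework_average(scores, n_dropped=2):
    """Average homework percentage after dropping the lowest n_dropped scores.

    Assumption: all assignments are weighted equally.
    """
    kept = sorted(scores)[n_dropped:]
    if not kept:
        raise ValueError("need more scores than the number dropped")
    return sum(kept) / len(kept)


def course_percentage(components):
    """Weighted course percentage from per-component percentages (0-100)."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    hw = homework_average([95, 80, 60, 100, 88, 72])
    total = course_percentage(
        {"homework": hw, "quizzes": 85, "midterm1": 78, "midterm2": 90, "final": 82}
    )
    print(f"homework average: {hw:.1f}%, course percentage: {total:.1f}%")
```

The uncurved letter grade then follows from comparing the course percentage against the standard cutoffs.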

## Homework

Homework will be assigned weekly, and will typically be due one week after assignment. All assignments will be distributed on the webpage, and should be turned in by the end of class on the due date. The two lowest grades will be dropped. Late homework will not be graded.

Assignments will consist of a combination of theoretical and computational problems. Some assignments may require the use of computer software to simulate random processes. I recommend using MATLAB, although any programming software is acceptable. You will be expected to write your own code, but I will provide samples, and am very happy to talk about MATLAB programming in general. For a basic tutorial, see the Resources section below. Based on past experience, I strongly recommend learning MATLAB immediately, so as to avoid becoming overwhelmed later in the course; in general, programming expectations will increase as the course progresses. For accessing MATLAB, see Resources below. Problems requiring MATLAB will be explicitly denoted by an M; otherwise the problem is expected to be worked by hand.

## Note on collaboration

You are encouraged to work together on HW assignments, but problem sets should be both written up and turned in individually. Furthermore, only submit work that is your own. Plagiarism is a serious offense, and will result in a score of zero for the assignment, as well as possible punishment from the University. You must also cite any reference you use and clearly mark any quotation or close paraphrase that you include. Such citation will not lower your grade, although extensive quotation might.

## Quizzes

Quizzes will be given periodically throughout the course. In general, they will be in-class and shouldn't last more than 20 minutes. I will use them to assess the group's understanding of the topics covered, which will help me set both the pace and the material. You will be expected to interpret and analyze different models, and the material will be chosen from lecture and assigned readings. Their difficulty should be below that of the homework.

## Exams

The Final Exam will take place during exam week; it appears we are scheduled for May 15th, from 4-7 pm, in our normal classroom (HLL-009). Note that it will be cumulative, but possibly more heavily weighted toward the material since Midterm II. The other two exams will take place in class, and will be announced at least two weeks prior to their administration, both in class and on the webpage (Course Calendar). In general they will be closed book, and no calculators or other electronic devices will be permitted. You will be allowed one sheet of notes (back and front), where you can summarize anything you want. Material there has to be written by you; you are not permitted to paste material directly from the course notes.

## Attendance and Participation

Students are expected to attend class regularly, as well as actively participate in discussions. Although not defined as a fixed portion of your final grade, participation will influence your overall performance in the class, especially in "borderline cases." If a class period is missed, you are expected to read and understand the material on your own; see the Course Calendar for detailed information and readings covered that day. For information on absence reporting and missing exams, see Absences and Make-ups below.

## Tentative Schedule

This class will follow Prof. Ocone’s notes fairly closely, although material will at times be supplemented from other sources. I will keep an updated class schedule on the webpage (Course Calendar) during the semester, which will include sections covered, homework postings with due dates, and the exam schedule. Below you will find a tentative listing of model frameworks for the course, along with the approximate number of weeks spent on each subject:

| Topic | Approximate duration |
| --- | --- |
| Course introduction and basic modeling examples | 1-2 weeks |
| Population genetics | 2-3 weeks |
| Markov chain models | 2-3 weeks |
| Epidemiology | 2 weeks |
| Applications to cancer and drug resistance | 2-3 weeks |
| Chemical reaction networks | 2 weeks |

Models will be taken from a variety of life science fields, and I am very open to student input for topic selection. No prior biological knowledge will be assumed, but it is recommended to read Chapter 1 of the notes for a primer on genetics, which will be extremely useful for the biology initially covered. I would also like to make connections with deterministic models possibly encountered in Math 336, but I will NOT assume that you have taken that course.

## Absences and Make-ups

Students are expected to attend all classes; if you expect to miss one or two classes, please use the University absence reporting website to indicate the date and reason for your absence. An email is automatically sent to me. Please note that there will be no make-ups for quizzes or exams. If you have a major medical or personal problem and plan to miss an exam, please contact the instructor by email, with a note from the Dean's office to authenticate an absence that is supported by appropriate documentation.

## Academic Integrity

All students in the course are expected to be familiar with and abide by the academic integrity policy. Violations of this policy are taken very seriously.

## Students with Disabilities

Full disability policies and procedures are indicated here. Students with disabilities requesting accommodations must present a Letter of Accommodations to the instructor as early in the term as possible.

## Sakai

I will use Sakai for email contact, as well as to post homework solutions. All enrolled students should have automatic access to the site after logging in to Sakai. Make sure to check the email associated with your Sakai account frequently.

## Resources

• Chapter 1 of Professor Ocone's notes. A primer on genetics and basic cellular biology.
• Chapter 2 of Professor Ocone's notes. A quick review of probability theory.
• Chapter 3 of Professor Ocone's notes. Mathematical analysis of genetics in a large population.
• Chapter 4 of Professor Ocone's notes. Discrete-time Markov chains with applications to population genetics.
• Introduction to Luria-Delbrück distribution. Gentle introduction to experiment and stochastic analysis.
• Mathematical details of Luria-Delbrück distribution. More rigorous analysis of the previous resource.
• Original 1943 paper. Very readable. Eventually led to the 1969 Nobel Prize in Physiology or Medicine.
• Notes on stochastic chemical kinetics. Dr. Eduardo Sontag's notes on modeling reaction networks stochastically.
• You are able to use MATLAB in the computer labs. See here for information about the hours and locations of the various Rutgers computer labs.
• Alternatively, each student at the University is able to log on to a virtual Linux machine here, which has MATLAB installed. Just click on the "Connect" button in the top right-hand corner, and use your NetID. Note: You must activate your NetID before accessing the server, which you can do here. Click "Service Activation" on the left, and after logging in, you should see a checkbox for "Apps.rutgers.edu Cloud Service." (For me it's the last box, but it may vary from student to student.) After clicking "Activate Services," you should be able to log in to the Linux machine.
• For some accessible MATLAB tutorials, see below. I suggest opening MATLAB and typing all of the commands into your own machine as the best way of becoming familiar with the language. Also, Google is probably your best resource for information on any of the MATLAB commands that you encounter.
• MIT intro lectures. Some really nice introductory video lectures from MIT, covering the basics of MATLAB. If you are completely new to the software, I suggest you start there.
• The official Primer. A bit long, but it has a lot, and some of it you can ignore (e.g. Chapter 3). Another great place to start.
• MathWorks tutorials. Lots of official tutorials from the makers of MATLAB. MathWorks is also the website where all documentation can be found.
• A short course that covers the basics of MATLAB as a programming language (arithmetic, plotting, matrix calculations, scripts and functions).
• A document on solving ODEs in MATLAB. Focus mainly on the numerical section (i.e. where ode45 is discussed).
• Here is an introduction to simulating random variables in MATLAB.
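
For a sense of what "simulating a random process" can look like in practice, here is a minimal sketch of a genetic-drift simulation written as a discrete-time Markov chain. It is only an illustration: the Wright-Fisher resampling rule used here is a standard textbook model, and whether the notes formulate drift in exactly this way is an assumption. The function name and parameters are hypothetical, and Python is used purely for the example; the same steps translate directly to MATLAB.

```python
import random


def wright_fisher(n_alleles, initial_count, generations, seed=None):
    """Simulate the number of copies of one allele over successive generations.

    Assumed model (Wright-Fisher resampling): each of the n_alleles gene
    copies in the next generation independently carries the allele with
    probability equal to its current frequency.
    """
    rng = random.Random(seed)  # seeded generator so runs can be reproduced
    counts = [initial_count]
    for _ in range(generations):
        p = counts[-1] / n_alleles
        # One Binomial(n_alleles, p) draw, built from n_alleles Bernoulli(p) trials.
        next_count = sum(rng.random() < p for _ in range(n_alleles))
        counts.append(next_count)
    return counts


if __name__ == "__main__":
    # Hypothetical parameters: 20 gene copies, 10 of them carrying the allele.
    print(wright_fisher(n_alleles=20, initial_count=10, generations=15, seed=1))
```

Counts of 0 and n_alleles are absorbing states of this chain: once the allele is lost or fixed, the count no longer changes.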