Skip navigation.

Contact Us

Department of Computer Science

Janssen Engineering
Room 236
PO Box 441010
Moscow, Idaho
83844-1010

phone: 208-885-6592
fax: 208-885-9052

e-mail:
dept chair
graduate info
undergrad info
sys admin
webmaster

CS Department Banner Graphic

CS 415 Computational Biology: Sequence Analysis

Total Credits: 3 cr

Course Coordinator: Robert Heckendorn

URL: http://marvin.ibest.uidaho.edu/~heckendo/CS515/

Current Catalog Description: Design and analyze algorithms that address the computational problems posed by biological sequence data, such as DNA or protein sequences. Topics may include: comparing sequences (from genes to genomes), database searching, multiple sequence alignment, phylogenetic inferencing, gene discovery and annotation, and genome assembly.

Textbook: Jones and Pevzner, Introductions to Bioinformatic Algorithms, MIT Press.

References: None.

Course Goals: The goal is to understand the major biological problems related to sequence analysis and algorithms/data structures behind the major bioinformatic tools used to solve them. It is not a course in how to use the tools.

Prerequisites by Topic:

The prerequisites are:

  • basic programming skills in any high level language
  • solid intuition of basic math and statistics
  • a love of digging into algorithms and studying processe.

Major Topics Covered in the Course:

  • Introduction to the biology of biological sequences (6hrs)
  • Analysis of algorithms, brute force techniques and search trees (3hrs) (AL1)
  • Branch and bound and Motif finding (3hrs) (IS2)
  • Pairwise sequence alignment (3hrs)
  • Dynamic programming (3hrs) (IS2)
  • Sequence search with approximate matching and BLAST (3hrs)(IM10)
  • Shotgun sequencing (3hrs) (IM10)
  • Markov chains and hidden Markov models (6hrs) (IM3)
  • Identification of sequence families (3hrs) (IM3)
  • Multiple sequence alignment (1hrs) (IS2)
  • Sequence generation from models (2hrs) (IM3)
  • Deterministic phylogenetic analysis (2hrs)
  • Bootstrapping phylogenies (1hrs)
  • Probabilistic phylogenetic analysis including maximum likelihood (3hrs) (IM10)
  • Evolutionary computation applied to phylogenetic analysis (3hrs)

If time allows

  • Parallel processing techniques and bioinformatics (CN4)
  • Transformational grammars

Laboratory projects (specify number of weeks on each): Four to five other assignments are given as homework. These are a combination of mathematically oriented computer science problems, programming exercises, or problems involving biological knowledge in combination with bioinformatic tools.

  • Algorithms, Dynamic programming, Search techniques (2wks)
  • Motif finding (2wks)
  • Sequence Alignment (2wks)
  • Markov Models and protein families (2wks)
  • Final project (4wks)

Final presentations at the end of class include both a 20 minute presentation in front of the class with slides and an online tutorial.

Estimated Curriculum Category Content:

Area Core Advanced Area Core Advanced
Algorithms 1 1 Data Structures    
Software Design     Prog. Languages    
Computer Arch     Other   1

Oral and Written Communications: Every student is required to submit at least five written reports (not including exams, tests, quizzes, or commented programs) of typically four pages and to make one oral presentation of typically 20 minutes in duration. Include only material that is graded for grammar, spelling, style, and so forth, as well as for technical content, completeness, and accuracy. (None are graded for spelling or grammar unless painfully sloppy). Style, sound science, teaching skill, ability to communicate is taken into account.

Social and Ethical Issues: None.

Theoretical Content: This course is all about algorithms and the math to back them up, consequently the course is almost entirely theoretical. Major concepts involve analysis and design of algorithms, probability theory, and Bayesian analysis. This is supported by practical motivation to achieve efficiency of design.

Problem Analysis: For each algorithm discussed in class we discuss in detail its computational complexity and may rewrite the outline of it on the board a half a dozen times in the process of making it more efficient. We also compare algorithm approaches, strengths, and weaknesses. Homework is designed to support discovery of algorithms and analysis of algorithms.

Solution Design: Students are generally given outlines for algorithms or general techniques for solution. They then are asked both in class and on homework to design algorithms either with particular constraints to the design to experience different implementations or given interesting data to analyze.

Course Outcomes: The following list documents the course outcomes and crossreferences them to the BSCS program outcomes. The letter at the beginning of each reference identifies the program outcome supported. The numbers sequentially identify the course outcome for this course. After completing CS 415 a student should know or be able to: