Information Theoretic Approaches to the Life Sciences
Short course NRM 6002, Section 002, 1 credit
Offered Spring 2009, Texas Tech University
Presented by Dr. David Anderson
What is offered?
The Department of Natural Resources Management is hosting a short course on information-theoretic approaches to the life sciences on March 7 and 8, 2009. This course will focus on the practical application of information-theoretic approaches and is based on Kullback-Leibler information and Akaike's Information Criterion (AIC). The material follows Dr. Anderson's new book: Anderson DR. 2008. Model based inference in the life sciences: a primer on evidence. Springer, New York, NY. 180 pp. A copy of this book is provided with enrollment. This text stresses science and science philosophy as much as "statistical methods." The focus is on the quantification and qualification of formal evidence.
Who can take the course?
Anyone can take the course, including graduate students, faculty, agency personnel, and professionals.
How do I enroll?
The course will be offered for 1 hour of graduate credit (NRM 6002, section 002) for those who want to take it for credit. For those taking it for credit, there will be a meeting early in the semester to discuss course logistics. If you would like to enroll but do not want to take it for credit, please contact Dr. Kerry Griffis-Kyle. You must contact Dr. Griffis-Kyle to attend the course.
Who should take this course?
The new information-theoretic approach covers essentially all of the empirical sciences, including economics, social sciences, life sciences, physical sciences, and medicine.
What do I get from the course?
Why should I take this course (by David Anderson)?
Over the past 50 years the various life sciences have made huge advances in our understanding of natural and managed systems. This has been a golden age of discovery and technology. To some considerable degree, the easier problems have been addressed by our predecessors; we are now left with the harder, more complex problems. Other fields have made similar advances.
Statistical science has also made incredible discoveries, leading to superior data analysis methods and robust inference procedures. In 1908, 100 years ago, “Student” published on the first “null hypothesis test.” This initial paper quickly led to other world-class methods such as ANOVA and “regression.” Statistical geniuses included R. A. Fisher, E. Pearson, J. Neyman, and a host of others who were active pre-WWII. By the 1960s a general paradigm-level understanding was in place. Much of this theory was based on testing null hypotheses, and inference was based primarily on P-values. Least squares and maximum likelihood methods were well developed and taught in application courses.
During the 1970s new and powerful statistical theories began to emerge. Two lines of this emergence were based in “information theory.” This broad theory was developed during the 1940s by Kullback and Leibler, working to decode enemy messages related to the war effort, and by Shannon, working on communication issues. A third line of emergence is a class of Bayesian methods. These new approaches have a large number of advantages over the traditional methods (i.e., null hypothesis tests).
The proposed course covers virtually all aspects of the theory and application of methods based on Kullback-Leibler information. The course is presented in a lecture format combined with lively discussion. Application is stressed and a number of real-world examples are given. The textbook and handouts are provided to minimize the need for note-taking.
There is an initial focus on hypothesizing alternatives and developing models to clearly represent them. This science strategy goes back to Chamberlin’s famous papers in the 1890s and to Bacon’s work long before that. Consider a science question where 5 alternative hypotheses seem relevant: H1, H2, …, H5. Each of these has a corresponding model: g1, g2, …, g5. Now, the science question asks, “What is the evidence for hypothesis i?” P-values do not constitute evidence or a measure of strength of evidence. New methods, based on information theory, provide such things as a quantitative measure of the information lost by each of the 5 hypotheses/models. Clearly, one would like to select the hypothesis that loses the least amount of information. Other measures of evidence include the likelihood of model i given the data, L(gi|data); the probability of model i given the data, Prob{gi|data}; and evidence ratios. These are formal measures of the strength of evidence for each of the 5 hypotheses/models.
Then the emphasis of the course shifts to making formal inference from all the models in an a priori set. This is 21st-century science and offers a number of technical advantages. In addition, the computations are trivial, allowing one to focus on the science, not the data analysis method.
Just to highlight some of the advantages of these new approaches, let me note some limitations of the traditional theory for null hypothesis testing. Relying on null hypothesis testing leaves the scientist without general ways to cope with observational studies, ways to rank hypotheses/models, ways to deal with non-nested models, ways to incorporate model selection uncertainty into estimates of precision, ways to model average, ways to reduce model selection bias, ways to deal with large, complex systems and data sets, and ways to set confidence sets on models. Traditional methods do not allow the computation of quantities such as L(gi|data), Prob{gi|data}, and evidence ratios.
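As a rough illustration of why these computations are described as trivial, the sketch below computes the quantities named above for 5 models. It assumes that information loss is measured by AIC differences (AIC is named earlier in this announcement) and uses the standard formulas: the difference for model i is its AIC minus the smallest AIC in the set, L(gi|data) is proportional to exp(-difference/2), Prob{gi|data} is that likelihood normalized over the set, and an evidence ratio is the ratio of two such probabilities. The AIC values and function names are hypothetical, chosen only for illustration; they do not come from the course material.

```python
import math


def model_evidence(aic_values):
    """Return (deltas, likelihoods, probabilities) for a set of AIC values.

    deltas[i]        = AIC_i - min(AIC): information lost relative to the best model
    likelihoods[i]   = exp(-deltas[i] / 2), proportional to L(g_i | data)
    probabilities[i] = likelihoods[i] / sum(likelihoods), i.e. Prob{g_i | data}
    """
    best = min(aic_values)
    deltas = [a - best for a in aic_values]
    likelihoods = [math.exp(-0.5 * d) for d in deltas]
    total = sum(likelihoods)
    probabilities = [like / total for like in likelihoods]
    return deltas, likelihoods, probabilities


def evidence_ratio(probabilities, i, j):
    """Strength of evidence for model i relative to model j."""
    return probabilities[i] / probabilities[j]


# Hypothetical AIC values for models g1..g5 (illustrative only).
aic = [102.3, 100.0, 104.6, 110.0, 101.2]
deltas, likelihoods, probabilities = model_evidence(aic)

for k, (d, like, p) in enumerate(zip(deltas, likelihoods, probabilities), start=1):
    print(f"g{k}: delta={d:6.2f}  L(g|data)={like:6.3f}  Prob={p:6.3f}")

# Evidence for the best model (g2, index 1) relative to g5 (index 4).
print(f"evidence ratio g2 vs g5: {evidence_ratio(probabilities, 1, 4):.2f}")
```

The model with a difference of zero loses the least information. The probabilities sum to 1 across the a priori set, so they can be used to rank all 5 hypotheses rather than to accept or reject a single one.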
The new methods are not a “test” in any sense: there is no asymptotic distribution of a test statistic (e.g., t, z, F, or χ²), no arbitrary α level, and no P-value. The definition of a P-value is odd: it is Prob{data|null}, yet people want to twist this to be Prob{null|data}. The focus is on the null, which is nearly always trivial and uninteresting and is false on a priori grounds. Finding little or no support for the null does little to focus information on the alternative. A curse of null hypothesis testing is the “multiple testing problem,” and this is avoided with the new approaches. People wanting more information about the limitations of traditional testing can consult the website www.cnr.colostate.edu/~anderson/thompson1.html
Who is presenting this course?
Dr. David Anderson (from http://aicanderson1.home.comcast.net/~aicanderson1/ )
David Anderson spent most of his professional life as a research scientist with the U.S. Department of the Interior. He holds a PhD in Theoretical Ecology from the University of Maryland and has worked in a wide variety of quantitative areas in the biological sciences. He has worked intensively on model selection and related subjects since 1990, beginning with joint work with Drs. Jean-Dominique Lebreton and Jean Clobert (France) and Kenneth Burnham (USA). During this time he has published 18 journal papers and two editions of the Springer-Verlag book on model selection, multimodel inference, and closely related topics (see below). Much of this work has been done in close collaboration with Drs. Kenneth P. Burnham and Gary C. White.