Biostatistics Teaching
The Section is responsible for teaching biostatistics within all of the programmes of the Faculty: pre-graduate teaching for students of medicine, dentistry, public health and human biology; teaching within the programmes of Master of Public Health and Master of International Health; and courses in the PhD programme of the Faculty, both for medical researchers and for PhD students in biostatistics (courses as well as advising). The Section is actively involved in the Graduate Programme in Biostatistics and Bioinformatics.
Link to classroom assignments (Syllabus schedule)
On this page, you will find an overview of project proposals and internship opportunities that the Department of Public Health offers to bachelor's, master’s, and research year students.
PhD biostatistics courses open for registration at the Faculty of Health and Medical Sciences
PhD Courses
To sign up, you must use the course website of the PhD school.
Note that a course typically appears on the PhD school page only a few months before it starts, and that the exact ECTS credits may vary from year to year. Course secretary for all courses: Susanne Kragskov Laupstad, e-mail: skl@sund.ku.dk
Here is an overview of the courses we regularly offer:
Autumn
Basic
Statistics for experimental medical researchers
Course director: Thomas Gerds
ECTS: 3 – Language: English
Description
- Day 1: Data, descriptive statistics, statistical inference
- Day 2: Testing statistical hypotheses, sample size and power calculation
- Day 3: ANOVA and regression
- Day 4: Repeated measurements
- Day 5: Case study
Learning objectives
This five-day course on biostatistics is aimed at Ph.D. students in biomedical research who work in a laboratory or similar setting, performing experiments on, e.g., cells, tissues, mice, or humans. By participating in this course, you will gain a working knowledge of statistical concepts, methods of analysis, and adequate ways of presenting statistical results, as well as hands-on experience in analysing experimental data with the R statistical software. We will also explain some of the most common pitfalls in the statistical analysis of biomedical research data. In summary, we aim to teach you high-quality biostatistics suitable for research publications.
A student who has met the objectives of the course will be able to:
- Have a qualified discussion with a statistical consultant, e.g., on how to plan the analyses for a research project or how to answer the concerns raised by a reviewer.
- Interpret statistical information from research papers and discuss assumptions and limitations.
- Distinguish between descriptive statistics and statistical inference (effect estimates, confidence intervals and p-values).
- Apply basic statistical analyses to experimental data using the statistical software R.
- Present statistical results in figures, tables, and words.
Recommended academic qualifications
Familiarity with R programming is necessary for taking part in the exercise classes. We recommend the PhD course: Introduction to R for basic statistics which runs just before this course starts.
Course webpage: NA
Introduction to R for Basic Statistics
Course director: Alessandra Meddis
ECTS: 1.7 – Language: English
Description
We will explain basic concepts of the statistical software R (installing R and the RStudio interface, loading packages, loading/writing data), the use of functions in R with the help pages, and simple mathematical calculations. The course also covers basic tools for data manipulation (data structures in R, data frame creation, defining/selecting variables), descriptive statistics in R, and creation of graphics in base R (scatterplot, box plot and histogram). Half of the course will consist of exercises.
Learning objectives
The course aims to give an introduction to the statistical software R through the RStudio user interface. The course is designed for health science researchers who want to become more familiar with R for simple calculations, data management, data exploration and analysis. In particular, the course provides basic functionalities matching the needs of the courses “Basic Statistics for Health Science Researchers” and “Statistics for Experimental Researchers”.
A student who has met the objectives of the course should be able to:
- Import/load data into R
- Use the RStudio interface
- Perform basic calculations in R
- Manipulate data in R
- Create descriptive analyses in R
- Plot graphics in R
Recommended academic qualifications
The course is for people who have little or no prior knowledge of R.
Course webpage: NA
Basic statistics for health science researchers (Danish)
Course director: Julie Forman
ECTS: 7.5 – Language: Danish
Description
Basic statistical concepts (datatypes, distributions, estimation, confidence intervals). Significance tests (power and sample size calculation, adjustments for multiple testing). Planning and interpretation (exploratory vs confirmatory analyses, randomized vs observational studies, confounding, mediation, effect modification, estimation vs prediction). Analysis of quantitative outcomes (t-tests, ANOVA, linear regression, correlation, ANCOVA, multiple linear regression). Analysis of binary and categorical outcomes (association in two-way tables, logistic regression). Introduction to survival analysis (Kaplan-Meier curves, log-rank test, Cox regression). Introduction to analysis of repeated measurements and clustered data (linear mixed models, simplification).
Learning objectives
This course will teach you how to use statistics in a research context by giving you a thorough repetition of basic statistical concepts and models illustrated with case studies from health science.
A student who has met the objectives of the course will be able to:
- Interpret basic statistical information from research papers: descriptive statistics, sample size calculations, estimates of effect or association, confidence intervals, and p-values.
- Understand the basic statistical analyses most commonly used in health science: two-sample and paired t-test, linear regression, correlation, analysis of variance (ANOVA), analysis of covariance (ANCOVA), linear models, risk difference, relative risk, odds ratio, chi-square test, logistic regression, survival analysis and linear mixed models.
- Carry out the most commonly used basic statistical analyses using R statistical software, interpret the results, and present them in appropriate tables and figures.
- Recognize the limitations and potential misinterpretations of statistical analyses related to e.g. model violations, confounding, missing data, lack of power, and multiple testing.
- Follow advanced statistics courses from the PhD school at the Faculty of Health and Medical Sciences.
- Take advice from a statistician, e.g. in the advisory service at the Section of Biostatistics.
Recommended academic qualifications
- ESSENTIAL: Familiarity with basic R functionality and data management, e.g. from the course “Introduction to R for basic statistics” (taking place in the last week of August).
- RECOMMENDED: Familiarity with statistical concepts e.g., from completing a statistics course during previous education and from reading research papers.
Course webpage: NA
Advanced
Introduction to causal inference
Course director: Erin Gabriel
ECTS: 2.8 – Language: English
Description
- Day 1: Causal language, counterfactuals, DAGs and other causal structures
- Day 2: Identification, partial identification and sensitivity analysis
- Day 3: Point estimation of the ATE, g-computation, IPW and doubly robust estimation
- Day 4: Mediation, IV analysis
Learning objectives
This four-day intensive course is aimed at Ph.D. students in Biostatistics, Epidemiology, Health Data Science, or Statistics who would like an introduction to causal inference, and in particular to statistical methods for causal inference. By participating in this course, you will gain a working knowledge of the conceptual roots of causal inference, as well as hands-on experience with the most common methods used in causal inference with observational and clinical trial data.
A student who has met the objectives of the course will be able to:
- Be aware of, and distinguish between, the differing schools of thought in causal inference.
- Understand the basic principles of defining and identifying a causal estimand.
- Display information about your assumptions via a DAG or SWIG.
- Understand and know how to perform a basic IV analysis and a basic mediation analysis.
- Be aware of the different types of sensitivity analysis, including nonparametric causal bounds.
- Perform and present the results of a basic causal analysis in a meaningful and convincing manner that conveys clear causal reasoning.
- Understand and present a novel statistical method for causal inference using the concepts and basic methods you have learned about in the course.
Recommended academic qualifications
Familiarity with R programming is necessary for taking part in the exercise classes and for completing the homework problems. If you are not familiar with R programming, we recommend that you complete the free access e-learning course at http://r.sund.ku.dk/ before starting on this course.
Statistical methods: some basic statistical background will be assumed, such as a basic understanding of regression analysis.
Mathematical theory: some basic understanding of calculus.
Course webpage: NA
Advanced topics in causal inference
Course director: Erin Gabriel
ECTS: 2.8 – Language: English
Description
- Day 1: Basics of TMLE, basics of event-time causal inference, basics of longitudinal causal inference
- Day 2: Special topics TBD
- Day 3: Special topics TBD: causal discovery
- Day 4: Special topics TBD
Other topics may be included.
Learning objectives
This four-day intensive course is aimed at Ph.D. students in Biostatistics, Epidemiology, Health Data Science, or Statistics who already work in causal inference, and in particular with statistical methods for causal inference, and who want information about advanced topics. By participating in this course, you will gain a working knowledge of the conceptual roots of a set of special and advanced topics in causal inference.
A student who has met the objectives of the course will be able to:
- Be aware of basic longitudinal causal inference, TMLE, and time-to-event causal inference topics.
- Be aware of, and be able to discuss and use, concepts from the special topics that will rotate each time the course is given.
- Special topics for fall 2025 will include, but are not limited to, dynamic treatment regimes and Bayesian causal inference.
Recommended academic qualifications
Familiarity with R programming is necessary for taking part in the exercise classes and for completing the homework problems. If you are not familiar with R programming, we recommend that you complete the free access e-learning course at http://r.sund.ku.dk/ before starting on this course.
Prior courses: the course “Introduction to causal inference” or a similar course. Mathematical theory: some basic understanding of calculus.
Course webpage: NA
Biostatistics for the design and analysis of clinical trials
Course director: Paul Blanche
ECTS: 2.1 – Language: English
Description
The course introduces statistical concepts and methods to design and analyze clinical trials. An important aim of the course is to relate key concepts in clinical trials and common practice to sound statistical thinking.
We will cover “basic” topics such as randomization, blinding, and power and sample size calculation, as well as more advanced topics such as covariate adjustment, robustness, missing data, sensitivity analysis and interim analyses. We will also discuss the rationale for common practices such as “intention-to-treat” and “per-protocol” analyses, as well as the modern way of thinking about them: the estimand framework. For the most basic concepts and methods, our aim is to provide the students with the necessary knowledge to use them autonomously. For the more advanced topics, we will present the necessary concepts to prepare the students to take advice from statistical experts. References to consensus recommendations and guidelines will be provided (e.g., CONSORT, EMA, FDA). Several examples will be taken from the teacher’s own experience, to facilitate thorough discussions of specific case studies. Recommendations about the writing of a Statistical Analysis Plan will also be presented.
Learning objectives
A student who has met the objectives of the course will be able to:
- Relate key concepts in clinical trials to statistical thinking.
- Take advice from a statistician to design a clinical trial and write a relevant statistical analysis plan.
- Describe consensus recommendations for the design and analysis of clinical trials and restate their rationale.
- Exemplify the use of different statistical methods to design and analyze clinical trials.
- Carry out commonly used basic computation using the R software.
Recommended academic qualifications
The course is tailored for Ph.D.-students in health sciences who already have taken the Ph.D.-course “Basic Statistics for Health Researchers” or “Statistics for experimental medical researchers” or have a similar knowledge about both statistical thinking and R coding.
Students with a more limited background are expected to benefit from the course too, but to a lesser extent.
Course webpage: NA
Introduction to advanced Bayesian adaptive trials: design and analysis
Course director: Anders Granholm
ECTS: 1.2 – Language: English
Description
The aim of the course is to provide an introduction to advanced Bayesian adaptive trials, i.e., trials using adaptive stopping, adaptive arm dropping, and response-adaptive randomisation.
The first day will focus on trial design and cover:
- An introduction to advanced adaptive trial designs
- An introduction to Bayesian statistical methods
- Key methodological decisions for advanced adaptive trials
- Trial design specification and evaluation of performance metrics using statistical simulation
The second day will focus on analysis and cover:
- Bayesian analyses, model specification including covariates and priors
- Bayesian model fitting and evaluation using Markov chain Monte Carlo methods and appropriate model diagnostics
- Calculation of average treatment effect and posterior probabilities using G-computation
- Adaptive (interim) analysis and evaluation of stopping rules and updating of allocation profiles
The course focuses mostly on the practical application of the introduced methods and less on the theoretical/mathematical underpinnings. The course will switch between lectures and hands-on computer exercises. Of note, although the course focuses on stand-alone Bayesian adaptive trials, all topics covered are also particularly useful in connection with adaptive platform trials.
Learning objectives
The course will provide students with an introduction to advanced Bayesian adaptive trials, enabling them to participate in the design and analysis of such trials.
A student who has met the objectives of the course will be able to:
- Understand the most important methodological considerations in advanced Bayesian adaptive trials using adaptive stopping, arm dropping, and response-adaptive randomisation
- Evaluate and compare selected performance metrics applicable to advanced Bayesian adaptive trial designs using statistical simulation
- Understand how to specify, conduct, and evaluate simple Bayesian regression analyses of clinical trial data
- Understand how to calculate sample-average treatment effects using G-computation
- Evaluate adaptation rules for adaptive stopping, arm dropping, and response-adaptive randomisation
Recommended academic qualifications
The course is targeted towards participants working with the design or analysis of clinical trials. To be able to follow the course, participants are expected to:
- Have basic knowledge of statistics/data science (e.g., at a level corresponding to the Basic Statistics for Health Researchers PhD course or similar)
- Have basic knowledge of using the R statistical software package
- Have basic knowledge of clinical trials (design or interpretation)
Participants are not expected to have previous experience with advanced adaptive trial designs or Bayesian statistical methods.
Course webpage: NA
Statistical analysis of survival data
Course director: Thomas Scheike
ECTS: 5 – Language: English
Description
The course introduces statistical concepts and methods for analyzing time-to-event (survival) data obtained from following individuals until a particular event occurs, or they are lost to follow-up. We will illustrate the use of modern tools for time-to-event analysis and discuss interpretation and communication of results. The course provides practical experience with health science data using the statistical software R through computer labs and home assignments.
Learning objectives
After completing the course the student is expected to:
- Distinguish methods for analysis of time-to-event data from other types of measurements
- Understand the concepts of censoring and truncation.
- Explain central survival analysis concepts such as hazard, survival and cumulative incidence and their relationships.
- Describe and compare models for time-to-event data. Illustrate how the models can be applied to epidemiological or health data.
- Understand and recognize common pitfalls specific to time-to-event data.
- Determine the proper statistical method to address a specific scientific question from a given time-to-event data set. This includes understanding the underlying assumptions of the method and identifying violations of these.
- Perform time-to-event analysis using the statistical software R. Assess the fit of the model.
- Interpret the results reported by statistical software. Communicate the results and conclusions of a time-to-event analysis in a clear and precise way.
- Take active part in collaborations where decisions are based on the statistical analysis of time-to-event data.
Recommended academic qualifications
The course is tailored for Ph.D.-students in health sciences who already have taken the Ph.D.-course “Basic Statistics for Health Researchers” or have a similar knowledge about statistics, and who wish to have more knowledge and deeper understanding of statistical methods for censored time-to-event data. In terms of mathematical theory, basic exposure to calculus is expected.
If you are not familiar with R programming, we recommend that you complete the free access e-learning course at http://r.sund.ku.dk/ before starting on this course.
Course webpage: NA
Programming and statistical modelling in R
Course director: Michael Sachs
ECTS: 2.4 – Language: English
Description
The course covers use of the statistical software package R. The aim is to take the intermediate R user to the next level, and make use of programming techniques for more efficient use of R. A key focus is on introducing core programming principles such as loops and functions. The course will have four days of lectures and exercises. This will give the students a chance to use and work with different aspects of R and apply the principles to their own research.
Learning objectives
A student who has met the objectives of the course will be able to:
- use programming principles (loops and functions) to handle repetitive tasks
- use functions in R
- use loops in R
- do efficient data manipulation, visualization, and aggregation
Recommended academic qualifications
Ph.D.-students and health researchers with a basic knowledge of statistics corresponding to the course on basic statistics for health researchers and with a good working knowledge of R, e.g., as obtained by having already followed an introductory course on R. Participants are expected to bring their own laptop with R installed for the exercises.
Course webpage: https://sachsmc.github.io/r-programming
Advanced statistical analysis of epidemiological studies
Course director: Per Kragh Andersen
ECTS: 4.2 – Language: English
Description
Repetition of logistic regression, Poisson regression, and Cox regression. Time-dependent exposure variables. Conditional logistic regression for matched case-control studies. Alternative designs of cohort studies: nested case-control and case-cohort studies. The case-crossover and case-time-control designs. Competing risks. Recurrent events. Introduction to causal inference.
Learning objectives
The course builds on the Ph.D.-course in Epidemiological methods in medical research. The purpose is to give an introduction to more advanced statistical methods frequently applied in epidemiological studies. After completing the course the participants will:
- be able to analyse data from classical cohort studies using Poisson or Cox regression and data from case-control studies using ordinary or conditional logistic regression
- know about the advantages of using cohort data sampled as a nested case-control study or a case-cohort study
- know about methods to account for competing risks and recurrent events in follow-up studies
- know about the basic concepts for causal inference
Recommended academic qualifications
Ph.D.-students with a background corresponding to the course “Epidemiological methods in medical research”
Course webpage: NA
Spring
Basic
Basic statistics for health science researchers (Danish)
Course director: Aksel Karl Georg Jensen
ECTS: 7.5 – Language: Danish
Description
Basic statistical concepts (datatypes, distributions, estimation, confidence intervals). Significance tests (power and sample size calculation, adjustments for multiple testing). Planning and interpretation (exploratory vs confirmatory analyses, randomized vs observational studies, confounding, mediation, effect modification, estimation vs prediction). Analysis of quantitative outcomes (t-tests, ANOVA, linear regression, correlation, ANCOVA, multiple linear regression). Analysis of binary and categorical outcomes (association in two-way tables, logistic regression). Introduction to survival analysis (Kaplan-Meier curves, log-rank test, Cox regression). Introduction to analysis of repeated measurements and clustered data (linear mixed models, simplification).
Learning objectives
This course will teach you how to use statistics in a research context by giving you a thorough repetition of basic statistical concepts and models illustrated with case studies from health science.
A student who has met the objectives of the course will be able to:
- Interpret basic statistical information from research papers: descriptive statistics, sample size calculations, estimates of effect or association, confidence intervals, and p-values.
- Understand the basic statistical analyses most commonly used in health science: two-sample and paired t-test, linear regression, correlation, analysis of variance (ANOVA), analysis of covariance (ANCOVA), linear models, risk difference, relative risk, odds ratio, chi-square test, logistic regression, survival analysis and linear mixed models.
- Carry out the most commonly used basic statistical analyses using R statistical software, interpret the results, and present them in appropriate tables and figures.
- Recognize the limitations and potential misinterpretations of statistical analyses related to e.g. model violations, confounding, missing data, lack of power, and multiple testing.
- Follow advanced statistics courses from the PhD school at the Faculty of Health and Medical Sciences.
- Take advice from a statistician, e.g. in the advisory service at the Section of Biostatistics.
Recommended academic qualifications
- ESSENTIAL: Familiarity with basic R functionality and data management, e.g. from the course “Introduction to R for basic statistics” (taking place in the last week of August).
- RECOMMENDED: Familiarity with statistical concepts e.g., from completing a statistics course during previous education and from reading research papers.
Course webpage: NA
Basic statistics for health researchers (English)
Course director: Paul Blanche
ECTS: 7.5 – Language: English
Description
Basic statistical concepts (datatypes, distributions, estimation, confidence intervals). Significance tests (power and sample size calculation, adjustments for multiple testing). Planning and interpretation (exploratory vs confirmatory analyses, randomized vs observational studies, confounding, mediation, effect modification, estimation vs prediction). Analysis of quantitative outcomes (t-tests, ANOVA, linear regression, correlation, ANCOVA, multiple linear regression). Analysis of binary and categorical outcomes (association in two-way tables, logistic regression). Introduction to survival analysis (Kaplan-Meier curves, log-rank test, Cox regression). Introduction to analysis of repeated measurements and clustered data (linear mixed models, simplification).
Learning objectives
This course will teach you how to use statistics in a research context by giving you a thorough repetition of basic statistical concepts and models illustrated with case studies from health science.
A student who has met the objectives of the course will be able to:
- Interpret basic statistical information from research papers: descriptive statistics, sample size calculations, estimates of effect or association, confidence intervals, and p-values.
- Understand the basic statistical analyses most commonly used in health science: two-sample and paired t-test, linear regression, correlation, analysis of variance (ANOVA), analysis of covariance (ANCOVA), linear models, risk difference, relative risk, odds ratio, chi-square test, logistic regression, survival analysis and linear mixed models.
- Carry out the most commonly used basic statistical analyses using R statistical software, interpret the results, and present them in appropriate tables and figures.
- Recognize the limitations and potential misinterpretations of statistical analyses related to e.g. model violations, confounding, missing data, lack of power, and multiple testing.
- Follow advanced statistics courses from the PhD school at the Faculty of Health and Medical Sciences.
- Take advice from a statistician, e.g. in the advisory service at the Section of Biostatistics.
Recommended academic qualifications
Introduction to R for Basic Statistics (NB: A minimum level of familiarity with basic R is essential, corresponding to that obtained after completing the course “Introduction to R for basic statistics” or the online introduction at https://biostat.ku.dk/r/. The online introduction is estimated to take 10 to 15 hours to complete, depending on your R and technical skills.)
Course webpage: NA
Introduction to R for Basic Statistics
Course director: Alessandra Meddis
ECTS: 1.7 – Language: English
Description
We will explain basic concepts of the statistical software R (installing R and the RStudio interface, loading packages, loading/writing data), the use of functions in R with the help pages, and simple mathematical calculations. The course also covers basic tools for data manipulation (data structures in R, data frame creation, defining/selecting variables), descriptive statistics in R, and creation of graphics in base R (scatterplot, box plot and histogram). Half of the course will consist of exercises.
Learning objectives
The course aims to give an introduction to the statistical software R through the RStudio user interface. The course is designed for health science researchers who want to become more familiar with R for simple calculations, data management, data exploration and analysis. In particular, the course provides basic functionalities matching the needs of the courses “Basic Statistics for Health Science Researchers” and “Statistics for Experimental Researchers”.
A student who has met the objectives of the course should be able to:
- Import/load data into R
- Use the RStudio interface
- Perform basic calculations in R
- Manipulate data in R
- Create descriptive analyses in R
- Plot graphics in R
Recommended academic qualifications
The course is for people who have little or no prior knowledge of R.
Course webpage: NA
Introduction to validation of patient reported outcome measures
Course director: Karl Bang Christensen
ECTS: 2.5 – Language: English
Description
The course introduces simple methods for the validation of index scales that summarize information from several items. The course covers classical psychometrics, confirmatory factor analysis, and methods for detection of differential item functioning. The computer exercises use SAS or R, but most of the methods discussed are relatively simple and can be done using SPSS or Stata. The course consists of ten hours of classroom teaching supplemented by online elements.
Illustrative examples are drawn from existing PROMs used in clinical research.
Learning objectives
A student who has met the objectives of the course will be able to:
- Know the basic principles for scale validation.
- Compute simple indicators of the validity of patient reported outcome measures (PROMs).
- Do a simple confirmatory factor analysis to evaluate the quality of PROMs.
Recommended academic qualifications
Ph.D.-students and researchers within medicine, public health, epidemiology, sociology, and psychology. A very basic knowledge of statistics will be assumed.
Course webpage: NA
Advanced
Advanced Statistical Topics in Health Research
Course director: Claus Ekstrøm
ECTS: 2.8 – Language: English
Description
Many modern research projects collect data and use experimental designs that require advanced statistical methods beyond what is taught as part of the curriculum in introductory statistical courses. This course covers some of the more general statistical models based on ideas from Bayesian statistics. The methods covered, such as supervised and unsupervised machine learning methods, principal component analysis and partial least squares, support-vector machines, network analysis, and causal learning, are suitable for analyzing complex data and experimental designs encountered in health research.
The course will contain equal parts theory and applications and consists of four full days of teaching and computer lab exercises. The intention is that participants will gain a good understanding of the statistical methods presented and be able to apply them in practice after having followed the course. This course is aimed at health researchers with previous knowledge of statistics and the computer language R who need an overview of appropriate analytical methods, and discussions with statisticians, to be able to solve their problem.
- Penalized regression and regularization methods, and variable selection
  - Generalized linear models refresher
  - Penalized regression (lasso and elastic net and extended variants)
  - Parametric and non-parametric bootstrap
  - Cross-validation
- Network analysis
  - Introduction to graphs and graph theory
  - Visualizing graphs
  - Identifying communities
  - Latent variable models
- Random forests
  - Modeling cultures
  - Decision trees
  - Random forests
  - Variable importance
- Causal Structure Learning
  - Introduction to directed acyclic graphs (DAGs)
  - Causal structure learning
  - Algorithms and assumptions for causal learning
- Statistical analysis with missing data
  - Concepts in missing data analysis: MCAR, MAR, MNAR
  - Pitfalls in naive imputation
  - Rubin’s rules
  - Multiple imputations by chained equations
Learning objectives
A student who has met the objectives of the course will be able to:
- Analyze data using the methods presented and be able to draw valid conclusions based on the results obtained.
- Understand the advantages/disadvantages of the methods presented and be able to discuss potential pitfalls from using these methods.
Recommended academic qualifications
The course is tailored for Ph.D.-students in health sciences who already have taken the Ph.D.-course “Basic Statistics for Health Researchers” or have a similar knowledge about statistics, and who wish to have more knowledge about the statistical methods underlying the approaches presented in the course.
A basic knowledge of statistics and previous experience with the software program R are expected. However, little or no previous exposure to the topics covered is expected.
Course webpage: NA
Epidemiological methods in medical research
Course director: Brice Ozenne
ECTS: 7 – Language: English
Description
Epidemiological investigations have made critical contributions to public health. Historical examples include establishing adverse effects of tobacco use on health, describing the spread of diseases and infectious etiology of HIV, or assessing the safety of vaccines in large populations. They have also addressed medical controversies using strict design of studies and careful methodological considerations. However, epidemiologic studies have often shown conflicting results, which has given space for criticism of epidemiology. This course aims at providing the methodological foundations of epidemiology, thereby rationalizing decisions about the formulation of the research question, study design, statistical methods, and communication of the results. This should promote scientifically sound epidemiological studies and critical assessment of epidemiological evidence.
This course is spread over 10 full days, during which you will be introduced to key concepts in epidemiology and statistical methods in epidemiological research. You will apply them to analyse historical datasets and reflect upon their usefulness and limitations. Toward the end of the course, you will be asked to make a short presentation either illustrating the use of concepts/methods seen during the course (e.g. on data from your Ph.D.) or discussing extensions of these concepts/methods based on suggested literature.
The course covers the following topics:
- Purpose and role of epidemiology
- Quantification of disease frequency and its association with an exposure
- Introduction to various study designs: cohort, case-control, nested case-control, case-cohort
- Introduction to causal inference: causality, confounding, colliders, DAGs (directed acyclic graphs). Reasoning, illustrated using common fallacies in epidemiology: Simpson’s paradox, Berkson’s paradox, ecological fallacy, immortal time bias.
- Statistical methods for handling confounding (stratification, adjustment, standardisation, matching)
- Statistical models for binary and time-to-event outcomes (logistic regression, Cox regression, Poisson regression). Handling interactions and performing hypothesis testing.
- Introduction to registry data analysis & common challenges.
- Communication of epidemiologic results
Learning objectives
On conclusion of the course, participants should be able to conduct a ‘standard’ epidemiology study:
- reformulate a “typical” epidemiology research question in terms of prevalence, rate, or risk.
- define a parameter of interest answering the research question.
- propose a study design relevant for the estimation of the parameter of interest.
- argue about the strengths and weaknesses of a study design.
- argue about the variables to consider in the subsequent statistical analysis.
- propose a statistical method relevant for the estimation of the parameter of interest.
- interpret the results: their plausibility and how they answer the research question
- communicate the methods used and the results obtained
They should also be able to critically assess epidemiology articles:
- describe the methodology used by a study based on the ‘materials and methods’ section of an article and make its implications/assumptions explicit.
- summarize the results of a study based on the ‘results’ section of an article and discuss to what extent they provide evidence to answer the research question.
Data management is not part of the learning objectives for this course.
Acquisition of programming skills is not the focus of the course and will mostly be left to self-study.
Recommended academic qualifications
The course is tailored for Ph.D.-students in health sciences with an interest in epidemiologic research. Students are expected to have a basic knowledge of epidemiology, statistics and programming. Having completed the courses in basic statistics and introduction to R is advantageous but not mandatory.
Course webpage: NA
Statistical methods in bioinformatics
Course director: Claus Ekstrøm
ECTS: 3.5 – Language: English
Description
-
Brief overview of molecular data. Introduction to statistical methods for high-dimensional data, linear models and regularization methods
- Big-p small-n problems
- Multiple testing techniques (inference correction, false discovery rates, q-values)
- The correlation vs. causation and prediction vs. hypothesis differences
- Penalized regression approaches, principal component regression
-
Analysis of mapped reads from mRNA data
- General assembly
- Dynamic programming of pairwise alignment
- Alignment methods for mRNA data
- Poisson methods for expression quantification and transcript distribution
-
Genome-wide association studies
- Multiple testing problems
- Imputation
- Common variants vs rare variants. Sequence Kernel Association Test
- Regularization methods, SVM
- Enrichment approaches, gene-set analyses
-
Network biology
- Quality assessment and heterogeneous data integration
- Biomedical text mining (named entity recognition & co-occurrence analysis)
- Network analysis with STRING and Cytoscape
-
Integrative data analysis
- Zero-inflated and hurdle models (microbiome data and RNA-seq revisited)
- Compositional data analysis
- Gene expression analyses
- Combining data and making inference from multiple platforms and experiments
Learning objectives
Bioinformatics is concerned with the study of inherent structure of biological information and statistical methods are the workhorses in many of these studies. Some of this inherent structure is very obvious and can be observed directly through correlations of patterns in high-dimensional data, while other patterns arise through more complicated underlying relationships. This course covers some of the basic and novel statistical models and methods suitable for analysing high-dimensional data, in particular high-dimensional data that rely heavily on statistical methods. The course will contain equal parts theory and applications and consists of five full days of teaching and computer lab exercises. The intention is that participants will gain a thorough understanding of the statistical methods and be able to apply them in practice after having followed this course.
A student who has met the objectives of the course will be able to:
- Analyse data from a bioinformatics experiment using the methods presented in the course and draw valid conclusions based on the results obtained.
- Understand the advantages/disadvantages of the methods presented and be able to discuss potential pitfalls from using these methods.
- Develop new methods that can be used to analyse novel types of bioinformatics data.
Recommended academic qualifications
The course is tailored for Ph.D.-students with experience in mathematics, statistics, or bioinformatics, who wish to have more knowledge about the statistical methods underlying the approaches used for common problems in bioinformatics.
A basic knowledge of statistics including a little exposure to calculus is expected. However, little or no previous exposure to the topics covered is expected. Students from applied fields are welcome on the course but should expect extra focus on the statistical methodology.
Course webpage: NA
Statistical analysis of survival data
Course director: Frank Eriksson
ECTS: 5 – Language: English
Description
The course introduces statistical concepts and methods for analyzing time-to-event (survival) data obtained from following individuals until a particular event occurs, or they are lost to follow-up. We will illustrate the use of modern tools for time-to-event analysis and discuss interpretation and communication of results. The course provides practical experience with health science data using the statistical software R through computer labs and home assignments.
Learning objectives
After completing the course the student is expected to:
- Distinguish methods for analysis of time-to-event data from other types of measurements
- Understand the concepts of censoring and truncation.
- Explain central survival analysis concepts such as hazard, survival and cumulative incidence and their relationships.
- Describe and compare models for time-to-event data. Illustrate how the models can be applied to epidemiological or health data.
- Understand and recognize common pitfalls specific to time-to-event data.
- Determine the proper statistical method to address a specific scientific question from a given time-to-event data set. This includes understanding the underlying assumptions of the method and identifying violations of these.
- Perform time-to-event analysis using the statistical software R. Assess the fit of the model.
- Interpret the results reported by statistical software. Communicate the results and conclusions of a time-to-event analysis in a clear and precise way.
- Take active part in collaborations where decisions are based on the statistical analysis of time-to-event data.
Recommended academic qualifications
The course is tailored for Ph.D.-students in health sciences who already have taken the Ph.D.-course “Basic Statistics for Health Researchers” or have a similar knowledge about statistics, and who wish to have more knowledge and deeper understanding of statistical methods for censored time-to-event data. In terms of mathematical theory, basic exposure to calculus is expected.
If you are not familiar with R programming, we recommend that you complete the free access e-learning course at http://r.sund.ku.dk/ before starting on this course.
Course webpage: NA
Programming and statistical modelling in R
Course director: Michael Sachs
ECTS: 2.4 – Language: English
Description
The course covers the use of the statistical software package R. The aim is to take the intermediate R user to the next level by applying programming techniques for more efficient use of R. A key focus is on introducing core programming principles such as loops and functions. The course will have four days of lectures and exercises. This will give the students a chance to use and work with different aspects of R and apply the principles to their own research.
Learning objectives
A student who has met the objectives of the course will be able to:
- use programming principles (loops and functions) to handle repetitive tasks
- use functions in R
- use loops in R
- do efficient data manipulation, visualization, and aggregation
Recommended academic qualifications
Ph.D.-students and health researchers with a basic knowledge of statistics corresponding to the course on basic statistics for health researchers and with a good working knowledge of R, e.g., as obtained by having already followed an introductory course on R. Participants are expected to bring their own laptop with R installed for the exercises.
Course webpage: https://sachsmc.github.io/r-programming
Bayesian methods in biomedical research
Course director: Paul Blanche
ECTS: 2.4 – Language: English
Description
Bayesian analysis is a statistical tool that is becoming increasingly popular in biomedical sciences. Notably, Bayesian approaches have become commonly used in adaptive designs for Phase I/II clinical trials, in meta-analyses, and also in transcriptomics analysis. This course provides an introduction to Bayesian tools, with an emphasis on biostatistics applications, in order to familiarize students with such methods and their practical applications. A case study from drug development will be discussed to illustrate some of the methods.
Thanks to its rich and flexible modelling possibilities and intuitive interpretation, the Bayesian framework is appealing, especially when observations are scarce. It can adaptively incorporate information as it becomes available, an important feature for early phase clinical trials. For example, adaptive Bayesian designs for Phase I/II trials reduce the chances of unnecessarily exposing participants to inappropriate doses and have better decision-making properties compared to the standard rule-based dose-escalation designs. The use of a Bayesian approach is also very appealing in meta-analyses because of: i) the often relatively small number of studies available, ii) its flexibility, and iii) its better handling of heterogeneity from aggregated results, especially in network meta-analyses. Finally, Bayesian power provides an interesting opportunity to evaluate the probability of success of a trial or program.
Thanks to modern computing tools, practical Bayesian analysis has become relatively straightforward, which is contributing to its increasing popularity. JAGS is flexible software interfaced with R that makes it easy to specify a Bayesian model. It automatically performs inference on the posterior distributions of the parameters and produces graphical output for monitoring the quality of the analysis.
The aim of the course is to provide insights into Bayesian statistics in the context of medical studies. We will cover the following topics:
- Bayesian modeling (prior, posterior, likelihood, Bayes theorem);
- Bayesian estimation (Credibility Intervals, Maximum a Posteriori, Bayes factor);
- Bayesian applications to meta-analyses;
- Practical Bayesian analysis with the R and JAGS software;
- Critical reading of medical publications.
All concepts will be illustrated with real-life examples from the medical literature.
Learning objectives
A student who has met the objectives of the course will be able to:
- understand and assess a Bayesian modelling strategy, and discuss its underlying assumptions
- rigorously describe expert knowledge by a quantitative prior distribution
- perform a Bayesian regression using R, applied to meta-analysis
- put into perspective the results from a Bayesian analysis described in a scientific article
- evaluate the probability of success of a trial or set of trials
Recommended academic qualifications
This course is targeted towards students in graduate programmes at the Faculty of Health and Medical Sciences. To be able to follow this course, participants need both:
- some knowledge of statistics (most notably some familiarity with the usual probability distributions, probability density functions, confidence intervals and Maximum Likelihood Estimation), and
- a practical knowledge of R programming (especially functional programming, for loops and “if” statements, vector allocation, linear regression).
An online technical introduction will be provided, briefly covering these notions so that students can check that they meet the above requirements. Estimated completion time for this introduction is 3h +/- 1h (depending on your R skills and familiarity with those concepts). Completing this online introduction ahead of the course is mandatory. Advanced mathematical training is not required as we will explain the methods on an elementary mathematical level, but some familiarity with function integration could be helpful.
During the practical labs on their laptops, the students will learn how to apply the Bayesian tools to real data, and should be able to perform a Bayesian regression by the end of the course. Note that several statistical software packages can be used for Bayesian analysis. However, practical lab solutions will only be provided for R and JAGS (alternatives such as SAS, WinBUGS or Stan will not be covered).
Course webpage: NA
Psychometric validation of patient reported outcome measures
Course director: Karl Bang Christensen
ECTS: 2.5 – Language: English
Description
The course introduces psychometric models for validation of index scales summarizing information from several items. The course covers confirmatory factor analysis (CFA) models, item response theory (IRT) models, and Rasch measurement models. Detection and modelling of differential item functioning and local dependence are discussed. The computer exercises use R. The course consists of ten hours of classroom teaching supplemented by online elements.
Learning objectives
A student who has met the objectives of the course will be able to:
- Know the basic principles for validation of patient reported outcome measures (PROMs) using item response theory (IRT) models and Rasch models.
- Do simple analyses for PROM validation studies using state-of-the-art methods.
- Evaluate the quality of published PROM validation studies.
Recommended academic qualifications
Ph.D.-students and researchers within medicine, public health, epidemiology, sociology, and psychology. A basic knowledge of statistics will be assumed, as will knowledge of simple methods for scale validation corresponding to the contents of the Ph.D. course ‘Introduction to validation of patient reported outcome measures’.
Course webpage: NA
Statistical analysis of repeated measurements and clustered data
Course director: Julie Forman
ECTS: 4.2 – Language: English
Description
This course is concerned with the analysis of correlated quantitative data arising e.g. when collecting data repeatedly on the same persons, animals, or tissue over time or on different locations of the body, or when observations are clustered, as with patients in a multi-center study, siblings, or pups belonging to the same litter. Appropriate statistical models for analysis will be exemplified and statistical errors arising with other frequently employed analyses will be discussed. Topics include analysis of baseline follow-up studies, longitudinal data analysis, multi-level and variance component models, analysis of cross-over trials, and reproducibility of measurement methods. We will further discuss the potential biases that occur due to missing data and statistical methods for handling these. A thorough introduction to linear mixed models for quantitative outcomes will be given, while generalized linear mixed models and marginal models (aka generalized estimating equations) for the analysis of binary, ordinal, and count data are touched upon more briefly towards the end of the course. Computer exercises with R statistical software will be given.
Learning objectives
This advanced statistics course will give you an introduction to the most common repeated measurement designs used in medical research. The aim of the course is to teach you to:
- understand and interpret the analyses of various repeated measurement designs including baseline follow-up studies, cross-over trials, and reproducibility of measurement methods, as well as analyses of clustered designs (e.g. multi-level models), and of mixed type.
- perform your own analyses using R statistical software.
- use model diagnostics to assess the validity of your analyses.
- make suitable presentations of the results from your analyses.
- understand the statistical consequences of different kinds of study designs.
Recommended academic qualifications
Ph.D.-students with a basic knowledge of statistics, e.g. corresponding to the course “Basic statistics for health researchers”, and R programming at beginner level.
Course webpage: https://absalon.ku.dk/courses/47665
Advanced, for statisticians
Targeted Minimum Loss-based Estimation (TMLE) for Causal Inference
Course director: Helene Rytgaard
ECTS: 2.8 – Language: English
Description
Targeted minimum loss-based estimation (TMLE) is a general framework for estimation of causal effects that combines semiparametric efficiency theory and machine learning in a two-step procedure. The main focus of the course is to understand the overall concept, the theory, and the application of TMLE. Topics covered include:
- The roadmap of targeted learning.
- Basics of causal inference, including counterfactual notation, hypothetical interventions, the g-formula, and the average treatment effect (ATE).
- Causal effect estimation in nonparametric models: target parameters, nuisance parameters, efficient influence functions, asymptotic linearity, and statistical inference based on the efficient influence function.
- TMLE as a two-step procedure involving initial estimation followed by a targeting step.
- Super learning: combining multiple machine learning algorithms via loss-based cross-validation.
- Extensions to more complex data settings: survival outcome, time-dependent confounding, dynamic treatment regimes.
- Basic usage of existing software in R.
Learning objectives
A student who has met the objectives of the course will be able to:
- Explain the fundamental principles of statistical inference using targeted minimum loss-based estimation (TMLE) and its application as a general framework for estimation of causal effects.
- Implement TMLE using R software to estimate average treatment effects and time-varying treatment effects based on simulated data, and assess the accuracy and efficiency of the estimators.
- Compare the assumptions and performance of TMLE to related causal inference tools such as inverse probability weighting and standardization, and discuss the strengths and limitations of each approach.
- Evaluate the suitability of super learning and its application in TMLE, and implement the algorithm to improve estimation accuracy.
- Discuss and evaluate the challenges and opportunities in time-varying settings in causal inference, including time-varying treatments and time-dependent confounding, and how TMLE can be used to address these challenges.
Recommended academic qualifications
The course is relevant for Ph.D.-students with sufficient background in mathematics and statistics. To participate in the practicals, the participants should have knowledge of the statistical software R.
Course webpage: NA
Advanced survival analysis
Course director: Thomas Scheike
ECTS: 5.6 – Language: English
Description
This is a course aimed at Ph.D.-students in biostatistics/statistics.
The course will describe advanced topics for survival data. The first 4 days give a brief introduction and consider regression models for survival data, including Cox’s regression model and alternative models like the additive intensity model. Goodness-of-fit for these models will be discussed. We will also discuss how to deal with multivariate survival data including frailty models and marginal models. The last 4 days will consider competing risks, multistate models and recurrent events. The course will consist of lectures and computer sessions (using R/SAS) illustrating how the various models can be applied, with focus on the practical implementation and interpretation of the methods. Passing the course requires a satisfactory response to a take-home exam. We expect students to bring their own laptops.
Learning objectives
The aim of the course is to enable the participants to
- do practical survival analyses using R
- understand the theoretical arguments behind the key methods
- theoretically analyse simple extensions of survival models
- understand how to deal with competing risks and multistate models.
Recommended academic qualifications
The course is targeted at Ph.D.-students with a background in biostatistics or mathematical statistics.
Course webpage: NA