Biostatistics Teaching
The Section is responsible for teaching biostatistics within all of the programmes of the Faculty: pre-graduate teaching for students of medicine, dentistry, public health and human biology; teaching within the programmes of Master of Public Health and Master of International Health; and courses in the PhD programme of the Faculty, both for medical researchers and (courses as well as advising) for PhD students in biostatistics. The Section is actively involved in the Graduate Programme in Biostatistics and Bioinformatics.
Link to class room assignments (Syllabus skema)
On this page, you will find an overview of project proposals and internship opportunities that the Department of Public Health offers to bachelor's, master’s, and research year students.
PhD biostatistics courses open for registration at the Faculty of Health and Medical Sciences
PhD Courses
To sign up, you must use the course website of the Ph.D. School.
Note that a course typically appears on the Ph.D. School page only a few months before it starts, and that the exact ECTS credits may vary from year to year. Course secretary for all courses: Susanne Kragskov Laupstad, email: skl@sund.ku.dk
Here is an overview of the courses we regularly offer:
Autumn
Basic
Statistics for experimental medical researchers
Course director: Erin Gabriel
ECTS: 4 – Language: English
Description
This six-day intensive course is aimed at Ph.D. students in biomedical research who work in a laboratory or similar setting, performing experiments on e.g. cells, tissues, mice, or human volunteers. When participating in this course, you will get a working knowledge of statistical concepts, methods of analysis, and adequate ways of presenting statistical results, as well as hands-on experience in analysing experimental data with R statistical software. We will also explain some of the most common errors biomedical researchers make in their statistical analyses. In summary, we aim to teach you high-quality statistics suitable for research publications.
Learning objectives
A student who has met the objectives of the course will be able to:
 Have a qualified discussion with a statistical consultant, e.g. on how to plan the analyses for a research project or how to answer the concerns raised by a reviewer.
 Interpret basic statistical information from research papers, e.g. descriptive statistics, effect estimates, confidence intervals and p-values.
 Apply the most frequently used statistical analyses to real-life experimental data using the statistical software R (see contents section for the specific analyses taught in this course).
 Present statistical results in suitable figures, tables, and words.
 Critically assess the validity of the most frequently used statistical analyses by being aware of their modelling assumptions and limitations.
Recommended academic qualifications
Introduction to R for Basic Statistics (NB: A minimum level of familiarity with basic R is essential, corresponding to that obtained after completing the course “Introduction to R for basic statistics” or the online introduction at https://biostat.ku.dk/r/. The estimated number of hours to complete the online introduction is 10 to 15 hours, depending on your R and technical skills)
Course webpage: NA
Introduction to R for Basic Statistics
Course director: Alessandra Meddis
ECTS: 1.4 – Language: English
Description
We will explain basic concepts of the statistical software R (installing R and the RStudio interface, loading packages, loading/writing data). Use of functions in R with the help page and simple mathematical calculations. Basic tools for data manipulation (data structures in R, data frame creation, defining/selecting variables), descriptive statistics in R and creation of graphics in base R (scatterplot, boxplot and histogram). Half of the course will consist of exercises.
Learning objectives
The course aims to give an introduction to the statistical software R through the user interface RStudio. The course is designed for health science researchers who want to become more familiar with R for simple calculations, data management, data exploration and analysis. In particular, the course provides basic functionalities matching the needs of the courses “Basic Statistics for Health Science Researchers” and “Statistics for Experimental Researchers”.
A student who has met the objectives of the course should be able to:
 Import/load data into R
 Use the interface RStudio
 Perform basic calculations in R
 Manipulate data in R
 Create descriptive analyses in R
 Plot graphics in R
Recommended academic qualifications
The course is for people who have little or no prior knowledge of R.
Course webpage: NA
Basic statistics for health science researchers (Danish)
Course director: Julie Forman
ECTS: 7.5 – Language: Danish
Description
Basic statistical concepts (data types, distributions, estimation, confidence intervals). Significance tests (power and sample size calculation, adjustments for multiple testing). Planning and interpretation (exploratory vs confirmatory analyses, randomized vs observational studies, confounding, mediation, effect modification, estimation vs prediction). Analysis of quantitative outcomes (t-tests, ANOVA, linear regression, correlation, ANCOVA, multiple linear regression). Analysis of binary and categorical outcomes (association in two-way tables, logistic regression). Introduction to survival analysis (Kaplan-Meier curves, log-rank test, Cox regression). Introduction to analysis of repeated measurements and clustered data (linear mixed models, simplification).
Learning objectives
This course will teach you how to use statistics in a research context by giving you a thorough repetition of basic statistical concepts and models illustrated with case studies from health science.
A student who has met the objectives of the course will be able to:
 Interpret basic statistical information from research papers: descriptive statistics, sample size calculations, estimates of effect or association, confidence intervals, and p-values.
 Understand the basic statistical analyses most commonly used in health science: two-sample and paired t-test, linear regression, correlation, analysis of variance (ANOVA), analysis of covariance (ANCOVA), linear models, risk difference, relative risk, odds ratio, chi-square test, logistic regression, survival analysis and linear mixed models.
 Carry out the most commonly used basic statistical analyses using R statistical software, interpret the results, and present them in appropriate tables and figures.
 Recognize the limitations and potential misinterpretations of statistical analyses related to e.g. model violations, confounding, missing data, lack of power, and multiple testing.
 Follow advanced statistics courses from the Ph.D. School at the Faculty of Health and Medical Sciences.
 Take advice from a statistician, e.g. in the advisory service at the Section of Biostatistics.
Recommended academic qualifications
Introduction to R for Basic Statistics (NB: A minimum level of familiarity with basic R is essential, corresponding to that obtained after completing the course “Introduction to R for basic statistics” or the online introduction at https://biostat.ku.dk/r/. The estimated number of hours to complete the online introduction is 10 to 15 hours, depending on your R and technical skills)
Course webpage: NA
Basic statistics for health researchers (English)
Course director: Paul Blanche
ECTS: 7.5 – Language: English
Description
Basic statistical concepts (data types, distributions, estimation, confidence intervals). Significance tests (power and sample size calculation, adjustments for multiple testing). Planning and interpretation (exploratory vs confirmatory analyses, randomized vs observational studies, confounding, mediation, effect modification, estimation vs prediction). Analysis of quantitative outcomes (t-tests, ANOVA, linear regression, correlation, ANCOVA, multiple linear regression). Analysis of binary and categorical outcomes (association in two-way tables, logistic regression). Introduction to survival analysis (Kaplan-Meier curves, log-rank test, Cox regression). Introduction to analysis of repeated measurements and clustered data (linear mixed models, simplification).
Learning objectives
This course will teach you how to use statistics in a research context by giving you a thorough repetition of basic statistical concepts and models illustrated with case studies from health science.
A student who has met the objectives of the course will be able to:
 Interpret basic statistical information from research papers: descriptive statistics, sample size calculations, estimates of effect or association, confidence intervals, and p-values.
 Understand the basic statistical analyses most commonly used in health science: two-sample and paired t-test, linear regression, correlation, analysis of variance (ANOVA), analysis of covariance (ANCOVA), linear models, risk difference, relative risk, odds ratio, chi-square test, logistic regression, survival analysis and linear mixed models.
 Carry out the most commonly used basic statistical analyses using R statistical software, interpret the results, and present them in appropriate tables and figures.
 Recognize the limitations and potential misinterpretations of statistical analyses related to e.g. model violations, confounding, missing data, lack of power, and multiple testing.
 Follow advanced statistics courses from the Ph.D. School at the Faculty of Health and Medical Sciences.
 Take advice from a statistician, e.g. in the advisory service at the Section of Biostatistics.
Recommended academic qualifications
Introduction to R for Basic Statistics (NB: A minimum level of familiarity with basic R is essential, corresponding to that obtained after completing the course “Introduction to R for basic statistics” or the online introduction at https://biostat.ku.dk/r/. The estimated number of hours to complete the online introduction is 10 to 15 hours, depending on your R and technical skills)
Course webpage: NA
Advanced
Statistical analysis of survival data
Course director: Thomas Scheike
ECTS: 4.9 – Language: English
Description
Kaplan-Meier estimation, log-rank test, stratified analysis, Cox regression. Censoring and truncation. Competing risks. Practical implementation of the techniques through computer labs and home assignments.
Learning objectives
The aim of the course is to enable the participants to:
 do simple survival analyses
 critically read medical papers using survival analysis techniques
 understand and interpret the outcome of survival analyses
Recommended academic qualifications
The course is tailored for Ph.D. students in health sciences who have already taken the Ph.D. course “Basic Statistics for Health Researchers” or have similar knowledge of statistics, and who wish to have more knowledge about the statistical methods underlying the approaches presented in the course.
A basic knowledge of statistics and previous experience with the software program R is expected. However, little or no previous exposure to the topics covered is expected.
Course webpage: NA
Targeted Register Analysis
Course director: Thomas Gerds
ECTS: 2.8 – Language: English
Description
The course consists of 4 days where each day consists of lectures about methods and exercises with R:
Lectures: International experts give lectures about recent developments in statistical methods for register analyses. The aim is inspiration, and the lectures should be about methods that are as complex as they have to be to solve real-world problems; they should simplify neither the data nor the methods only for the sake of teaching success. The tentative list of topics is:
 Analysing Danish register data
 The roadmap of targeted statistical learning
 The transition from traditional epidemiological tools (cohort follow-up studies, case-control studies), which produce hazard ratios or odds ratios, to average treatment effects defined in a dynamic causal framework
 Machine learning (random forests/recursive neural networks)
 Longitudinal minimum loss estimation (LTMLE)
Exercises: Participants learn data management with R, especially with respect to working with data from Danish registers. During the computer exercises, participants will learn how to move a given data analysis project from the often encountered situation of a messy one-room apartment to a functional multi-room laboratory that invites collaborators to follow the workflow. All steps of the analysis, from the import of the raw data to the export of the tables and figures, are controlled by the R package targets.
Learning objectives
A student who has met the objectives of the course will be able to:
 understand the limitations of logistic regression and Cox regression
 know how to ask causal questions (target parameters) before looking into the register data
 define dynamic treatment regimens and analyse register data using the R package ltmle
 have knowledge of statistical (machine) learning algorithms for register data
 use the R package targets to set up and organize a reproducible analysis
Recommended academic qualifications
The course is tailored for Ph.D. students in health sciences who have already taken the Ph.D. course “Basic Statistics for Health Researchers” or have similar knowledge of statistics, and who wish to have more knowledge about the statistical methods underlying the approaches presented in the course.
A basic knowledge of programming with R is expected, and previous experience with register data analysis is a great advantage.
Course webpage: NA
Programming and statistical modelling in R
Course director: Michael Sachs
ECTS: 2.4 – Language: English
Description
The course covers use of the statistical software package R. The aim is to take the intermediate R user to the next level and to use programming techniques for more efficient use of R. A key focus is on introducing core programming principles such as loops and functions. The course will have four half-day lectures, after which the students will work on some exercises. This will give the students a chance to use and work with different aspects of R and apply the principles to their own research.
Learning objectives
A student who has met the objectives of the course will be able to:
 use programming principles (loops and functions) to handle repetitive tasks
 use functions in R
 use loops in R
 do efficient data manipulation, visualization, and aggregation
Recommended academic qualifications
Ph.D. students and health researchers with a basic knowledge of statistics corresponding to the course on basic statistics for health researchers and with a good working knowledge of R, e.g., as obtained by having already followed an introductory course on R.
Course webpage: https://sachsmc.github.io/rprogramming
Advanced statistical analysis of epidemiological studies
Course director: Per Kragh Andersen
ECTS: 4.2 – Language: English
Description
Repetition of logistic regression, Poisson regression, and Cox regression. Time-dependent exposure variables. Conditional logistic regression for matched case-control studies. Alternative designs of cohort studies: Nested case-control and case-cohort studies. The case-crossover and case-time-control designs. Competing risks. Recurrent events. Introduction to causal inference.
Learning objectives
The course builds on the Ph.D. course in Epidemiological methods in medical research. The purpose is to give an introduction to more advanced statistical methods frequently applied in epidemiological studies. After completing the course the participants will:
 be able to analyse data from classical cohort studies using Poisson or Cox regression and data from case-control studies using ordinary or conditional logistic regression
 know about the advantages of using cohort data sampled as a nested case-control study or a case-cohort study
 know about methods to account for competing risks and recurrent events in follow-up studies
 know about the basic concepts for causal inference
Recommended academic qualifications
Ph.D. students with a background corresponding to the course “Epidemiological methods in medical research”.
Course webpage: NA
Advanced Statistical Topics in Health Research A
Course director: Claus Ekstrøm
ECTS: 2.8 – Language: English
Description

Introduction to statistical methods for high-dimensional data, linear models, regularization methods, and variable selection
 Big-p small-n problems
 Multiple testing techniques (inference correction, false discovery rates)
 Regularization methods such as lasso, ridge regression, and elastic net
 The correlation vs. causation and prediction vs. hypothesis differences

Permutation testing, bootstrapping, and cross-validation
 Parametric and nonparametric bootstrap
 Cross-validation and the jackknife
 Randomization testing

Classification and regression trees
 Classification and regression trees
 Random forests
 Variable importance

Imputation techniques for handling missing data
 Imputation and Rubin’s rules
 Multiple Imputation by Chained Equations
Learning objectives
Many modern research projects collect data and use experimental designs that require advanced statistical methods beyond what is taught as part of the curriculum in introductory statistical courses. This course covers some of the more general statistical models and methods suitable for analyzing complex data and experimental designs encountered in health research such as methods for high-dimensional data, classification and regression trees, penalized regression, bootstrapping, cross-validation, imputation, and dimension reduction.
The course will contain equal parts theory and applications and consists of four full days of teaching and computer lab exercises. It is the intention that the participants will have a good understanding of the statistical methods presented and be able to apply them in practice after having followed the course. This course is aimed at health researchers with previous knowledge of statistics and the computer language R who need an overview of appropriate analytical methods and discussions with statisticians to be able to solve their problems.
Note that there are two courses entitled “Advanced Statistical Topics in Health Research” (denoted A and B). They have no overlap and can be taken independently of each other.
A student who has met the objectives of the course will be able to:
 Analyze data using the methods presented and be able to draw valid conclusions based on the results obtained.
 Understand the advantages/disadvantages of the methods presented and be able to discuss potential pitfalls from using these methods.
Recommended academic qualifications
The course is tailored for Ph.D. students in health sciences who have already taken the Ph.D. course “Basic Statistics for Health Researchers” or have similar knowledge of statistics, and who wish to have more knowledge about the statistical methods underlying the approaches presented in the course.
A basic knowledge of statistics and previous experience with the software program R is expected. However, little or no previous exposure to the topics covered is expected.
Course webpage: NA
Spring
Basic
Basic statistics for health science researchers (Danish)
Course director: Julie Forman
ECTS: 7.5 – Language: Danish
Description
Basic statistical concepts (data types, distributions, estimation, confidence intervals). Significance tests (power and sample size calculation, adjustments for multiple testing). Planning and interpretation (exploratory vs confirmatory analyses, randomized vs observational studies, confounding, mediation, effect modification, estimation vs prediction). Analysis of quantitative outcomes (t-tests, ANOVA, linear regression, correlation, ANCOVA, multiple linear regression). Analysis of binary and categorical outcomes (association in two-way tables, logistic regression). Introduction to survival analysis (Kaplan-Meier curves, log-rank test, Cox regression). Introduction to analysis of repeated measurements and clustered data (linear mixed models, simplification).
Learning objectives
This course will teach you how to use statistics in a research context by giving you a thorough repetition of basic statistical concepts and models illustrated with case studies from health science.
A student who has met the objectives of the course will be able to:
 Interpret basic statistical information from research papers: descriptive statistics, sample size calculations, estimates of effect or association, confidence intervals, and p-values.
 Understand the basic statistical analyses most commonly used in health science: two-sample and paired t-test, linear regression, correlation, analysis of variance (ANOVA), analysis of covariance (ANCOVA), linear models, risk difference, relative risk, odds ratio, chi-square test, logistic regression, survival analysis and linear mixed models.
 Carry out the most commonly used basic statistical analyses using R statistical software, interpret the results, and present them in appropriate tables and figures.
 Recognize the limitations and potential misinterpretations of statistical analyses related to e.g. model violations, confounding, missing data, lack of power, and multiple testing.
 Follow advanced statistics courses from the Ph.D. School at the Faculty of Health and Medical Sciences.
 Take advice from a statistician, e.g. in the advisory service at the Section of Biostatistics.
Recommended academic qualifications
Introduction to R for Basic Statistics (NB: A minimum level of familiarity with basic R is essential, corresponding to that obtained after completing the course “Introduction to R for basic statistics” or the online introduction at https://biostat.ku.dk/r/. The estimated number of hours to complete the online introduction is 10 to 15 hours, depending on your R and technical skills)
Course webpage: NA
Basic statistics for health researchers (English)
Course director: Paul Blanche
ECTS: 7.5 – Language: English
Description
Basic statistical concepts (data types, distributions, estimation, confidence intervals). Significance tests (power and sample size calculation, adjustments for multiple testing). Planning and interpretation (exploratory vs confirmatory analyses, randomized vs observational studies, confounding, mediation, effect modification, estimation vs prediction). Analysis of quantitative outcomes (t-tests, ANOVA, linear regression, correlation, ANCOVA, multiple linear regression). Analysis of binary and categorical outcomes (association in two-way tables, logistic regression). Introduction to survival analysis (Kaplan-Meier curves, log-rank test, Cox regression). Introduction to analysis of repeated measurements and clustered data (linear mixed models, simplification).
Learning objectives
This course will teach you how to use statistics in a research context by giving you a thorough repetition of basic statistical concepts and models illustrated with case studies from health science.
A student who has met the objectives of the course will be able to:
 Interpret basic statistical information from research papers: descriptive statistics, sample size calculations, estimates of effect or association, confidence intervals, and p-values.
 Understand the basic statistical analyses most commonly used in health science: two-sample and paired t-test, linear regression, correlation, analysis of variance (ANOVA), analysis of covariance (ANCOVA), linear models, risk difference, relative risk, odds ratio, chi-square test, logistic regression, survival analysis and linear mixed models.
 Carry out the most commonly used basic statistical analyses using R statistical software, interpret the results, and present them in appropriate tables and figures.
 Recognize the limitations and potential misinterpretations of statistical analyses related to e.g. model violations, confounding, missing data, lack of power, and multiple testing.
 Follow advanced statistics courses from the Ph.D. School at the Faculty of Health and Medical Sciences.
 Take advice from a statistician, e.g. in the advisory service at the Section of Biostatistics.
Recommended academic qualifications
Introduction to R for Basic Statistics (NB: A minimum level of familiarity with basic R is essential, corresponding to that obtained after completing the course “Introduction to R for basic statistics” or the online introduction at https://biostat.ku.dk/r/. The estimated number of hours to complete the online introduction is 10 to 15 hours, depending on your R and technical skills)
Course webpage: NA
Introduction to R for Basic Statistics
Course director: Alessandra Meddis
ECTS: 1.4 – Language: English
Description
We will explain basic concepts of the statistical software R (installing R and the RStudio interface, loading packages, loading/writing data). Use of functions in R with the help page and simple mathematical calculations. Basic tools for data manipulation (data structures in R, data frame creation, defining/selecting variables), descriptive statistics in R and creation of graphics in base R (scatterplot, boxplot and histogram). Half of the course will consist of exercises.
Learning objectives
The course aims to give an introduction to the statistical software R through the user interface RStudio. The course is designed for health science researchers who want to become more familiar with R for simple calculations, data management, data exploration and analysis. In particular, the course provides basic functionalities matching the needs of the courses “Basic Statistics for Health Science Researchers” and “Statistics for Experimental Researchers”.
A student who has met the objectives of the course should be able to:
 Import/load data into R
 Use the interface RStudio
 Perform basic calculations in R
 Manipulate data in R
 Create descriptive analyses in R
 Plot graphics in R
Recommended academic qualifications
The course is for people who have little or no prior knowledge of R.
Course webpage: NA
Epidemiological methods in medical research
Course director: Brice Ozenne
ECTS: 7 – Language: English
Description
Epidemiological investigations have made critical contributions to public health. Historical examples include establishing adverse effects of tobacco use on health, describing the spread of diseases and infectious etiology of HIV, or assessing the safety of vaccines in large populations. They have also addressed medical controversies using strict design of studies and careful methodological considerations. However, epidemiologic studies have often shown conflicting results, which has given space for criticism of epidemiology. This course aims to provide the methodological foundations of epidemiology and thereby to rationalize decisions about the formulation of the research question, study design, statistical methods, and communication of the results. This should promote scientifically sound epidemiological studies and critical assessment of epidemiological evidence.
This course is spread over 10 full days, during which you will be introduced to key concepts in epidemiology and statistical methods in epidemiological research. You will apply them to analyse historical datasets and reflect upon their usefulness and limitations. Toward the end of the course, you will be asked to make a short presentation either illustrating the use of concepts/methods seen during the course (e.g. on data from your Ph.D.) or discussing extensions of these concepts/methods based on suggested literature.
The course covers the following topics:
 Purpose and role of epidemiology
 Quantification of disease frequency and its association with an exposure
 Introduction to causal inference: causality, confounding, collider, directed acyclic graphs (DAGs)
 Introduction to various study designs: cohort, case-control, nested case-control, case-cohort
 Statistical methods for handling confounding (stratification, adjustment, standardisation, matching)
 Design and analysis of case-control studies
 Statistical models for binary and time-to-event outcomes (logistic regression, Cox regression, Poisson regression). Handling interactions and performing hypothesis testing.
 Reasoning, illustrated using common fallacies in epidemiology: Simpson’s paradox, Berkson’s paradox, ecological fallacy, immortal time bias.
 Communication of epidemiologic results
Learning objectives
On conclusion of the course, participants should be able to conduct a ‘standard’ epidemiology study:
 reformulate a “typical” epidemiology research question in terms of prevalence, rate, or risk.
 define a parameter of interest answering the research question.
 propose a study design relevant for the estimation of the parameter of interest.
 argue about the strengths and weaknesses of a study design.
 argue about the variables to consider in the subsequent statistical analysis.
 propose a statistical method relevant for the estimation of the parameter of interest.
 interpret the results: their plausibility and how they answer the research question
 communicate the methods used and the results obtained
They should also be able to critically assess epidemiology articles:
 describe the methodology used by a study based on the ‘materials and methods’ section of an article and make its implications/assumptions explicit.
 summarize the results of a study based on the ‘results’ section of an article and discuss to what extent they provide evidence to answer the research question.
Acquisition of programming skills:
 use a software program to provide a graphical representation of binary outcome and time-to-event data.
 use a software program to carry out planned analyses and visualize the results.
Data management is not part of the learning objectives for this course.
Recommended academic qualifications
The course is tailored for Ph.D. students in health sciences with an interest in epidemiologic research. Students are expected to have a basic knowledge of epidemiology, statistics and programming. Having completed the courses in Basic Statistics and Introduction to R is advantageous but not mandatory.
Course webpage: NA
Introduction to validation of patient-reported outcome measures
Course director: Karl Bang Christensen
ECTS: 2.2 – Language: English
Description
The course introduces simple methods for validation of index scales that summarize information from several items. The course covers classical psychometrics, confirmatory factor analysis, and methods for detection of differential item functioning. The computer exercises use SAS or R, but most of the methods discussed are relatively simple and can be done using SPSS or Stata. The course consists of ten hours of classroom teaching supplemented by online elements.
Illustrative examples are drawn from existing PROMs used in clinical research.
Learning objectives
A student who has met the objectives of the course will be able to:
 Know the basic principles for scale validation.
 Compute simple indicators of the validity of patient-reported outcome measures (PROMs).
 Do a simple confirmatory factor analysis to evaluate the quality of PROMs.
Recommended academic qualifications
Ph.D. students and researchers within medicine, public health, epidemiology, sociology, and psychology. A very basic knowledge of statistics will be assumed.
Course webpage: NA
Statistical data analysis using the computer program SAS
Course director: Karl Bang Christensen
ECTS: 2.5 – Language: English
Description
The course covers fundamental use of the statistical software package SAS, from data handling through descriptive statistics and standard methods to an introductory description of the regression procedures. Approximately half the time will be reserved for hands-on exercises. Some emphasis will be put on explaining the theoretical foundation and the applicability of the methods in example problems. There will be a take-home exam, which will be evaluated to determine whether you pass the course.
Learning objectives
A student who has met the objectives of the course will be able to:
 Use statistical methods for data analysis in SAS
 Use SAS for simple data management
 Generate tables and figures for publications
Recommended academic qualifications
PhD students. Some knowledge of basic statistics will be advantageous, but is not required.
Course webpage: NA
Advanced
Advanced Statistical Topics in Health Research B
Course director: Claus Ekstrøm
ECTS: 2.8 – Language: English
Description
Many modern research projects collect data and use experimental designs that require advanced statistical methods beyond what is taught as part of the curriculum in introductory statistical courses. This course covers some of the more general statistical models based on ideas from Bayesian statistics. These methods are suitable for analyzing complex data and experimental designs encountered in health research, and include supervised and unsupervised machine learning methods, principal component analysis and partial least squares, support-vector machines, network analysis, and causal learning.
The course will contain equal parts theory and applications and consists of four full days of teaching and computer lab exercises. It is the intention that the participants will have a good understanding of the statistical methods presented and are able to apply them in practice after having followed the course. This course is aimed at health researchers with previous knowledge of statistics and the computer language R who need an overview of appropriate analytical methods and discussions with statisticians to be able to solve their problem.
Note that there are two courses entitled “Advanced Statistical Topics in Health Research”. They have no overlap and can be taken independently of each other.

Introduction to Bayesian statistics and the difference between frequentist and Bayesian statistics.
 Credibility intervals, prior and posterior distributions
 Bayesian classifiers
 Markov chain Monte Carlo (MCMC) estimation
 Empirical Bayes estimators

Network analysis
 Introduction to graphs and graph theory
 Visualizing graphs
 Identifying communities
 Latent variable models

Principal component analysis, partial least squares, and support-vector machines
 Dimension reduction techniques
 PCA and PLS
 Sparse PCA and PLS
 Multiclass and nonlinear SVMs

Causal Structure Learning
 Introduction to directed acyclic graphs (DAGs)
 Causal structure learning
 Algorithms and assumptions for causal learning
Learning objectives
A student who has met the objectives of the course will be able to:
 Analyze data using the methods presented and be able to draw valid conclusions based on the results obtained.
 Understand the advantages/disadvantages of the methods presented and be able to discuss potential pitfalls from using these methods.
Recommended academic qualifications
The course is tailored for Ph.D. students in health sciences who already have taken the Ph.D. course “Basic Statistics for Health Researchers” or have a similar knowledge about statistics, and who wish to have more knowledge about the statistical methods underlying the approaches presented in the course.
A basic knowledge of statistics and previous experience with the software program R is expected. However, little or no previous exposure to the topics covered is expected.
Course webpage: NA
Statistical methods in bioinformatics
Course director: Claus Ekstrøm
ECTS: 3.5 – Language: English
Description
Bioinformatics is concerned with the study of the inherent structure of biological information, and statistical methods are the workhorses in many of these studies. Some of this inherent structure is very obvious and can be observed directly through correlations of patterns in high-dimensional data, while other patterns arise through more complicated underlying relationships. This course covers some of the basic and novel statistical models and methods suitable for analysing high-dimensional data, in particular high-dimensional data that rely heavily on statistical methods. The course will contain equal parts theory and applications and consists of five full days of teaching and computer lab exercises. It is the intention that the participants will have a thorough understanding of the statistical methods and are able to apply them in practice after having followed this course.

Penalized regression approaches, principal component regression
 Analysis of mapped reads from mRNA data
 General assembly
 Dynamic programming of pairwise alignment
 Alignment methods for mRNA data
 Poisson methods for expression quantification and transcript distribution

Genome-wide association studies
 Multiple testing problems
 Imputation
 Common variants vs rare variants. Sequence Kernel Association Test
 Regularization methods, SVM
 Enrichment approaches, gene-set analyses

Network biology
 Quality assessment and heterogeneous data integration
 Biomedical text mining (named entity recognition & co-occurrence analysis)
 Network analysis with STRING and Cytoscape

Integrative data analysis
 Zero-inflated and hurdle models (microbiome data and RNA-seq revisited)
 Compositional data analysis
 Gene expression analyses
 Combining data and making inference from multiple platforms and experiments
Learning objectives
A student who has met the objectives of the course will be able to:
 Analyse data from a bioinformatics experiment using the methods presented in the course and draw valid conclusions based on the results obtained.
 Understand the advantages/disadvantages of the methods presented and be able to discuss potential pitfalls from using these methods.
 Develop new methods that can be used to analyse novel types of bioinformatics data.
Recommended academic qualifications
The course is tailored for Ph.D. students with experience in mathematics, statistics, or bioinformatics, who wish to have more knowledge about the statistical methods underlying the approaches used for common problems in bioinformatics. A basic knowledge of statistics including a little exposure to calculus is expected. However, little or no previous exposure to the topics covered is expected. Students from applied fields are welcome on the course but should expect extra focus on the statistical methodology.
Course webpage: NA
Statistical analysis of survival data
Course director: Frank Eriksson
ECTS: 4.9 – Language: English
Description
Kaplan-Meier estimation, log-rank test, stratified analysis, Cox regression. Censoring and truncation. Competing risks. Practical implementation of the techniques through computer labs and home assignments.
Learning objectives
The aim of the course is to make the participants able to
 do simple survival analyses
 critically read medical papers using survival analysis techniques
 understand and interpret the outcome of survival analyses
Recommended academic qualifications
The course is tailored for Ph.D. students in health sciences who already have taken the Ph.D. course “Basic Statistics for Health Researchers” or have a similar knowledge about statistics, and who wish to have more knowledge about the statistical methods underlying the approaches presented in the course.
A basic knowledge of statistics and previous experience with the software program R is expected. However, little or no previous exposure to the topics covered is expected.
Course webpage: NA
Programming and statistical modelling in R
Course director: Michael Sachs
ECTS: 2.4 – Language: English
Description
The course covers use of the statistical software package R. The aim is to take the intermediate R user to the next level by using programming techniques for more efficient use of R. A key focus is on introducing core programming principles such as loops and functions. The course will have four half-day lectures, after which the students will work on exercises. This will give the students a chance to use and work with different aspects of R and apply the principles to their own research.
Learning objectives
A student who has met the objectives of the course will be able to:
 use programming principles (loops and functions) to handle repetitive tasks
 use functions in R
 use loops in R
 do efficient data manipulation, visualization, and aggregation
Recommended academic qualifications
Ph.D. students and health researchers with a basic knowledge of statistics corresponding to the course on basic statistics for health researchers and with a good working knowledge of R, e.g., as obtained by having already followed an introductory course on R.
Course webpage: https://sachsmc.github.io/rprogramming
Bayesian methods in biomedical research
Course director: Paul Blanche
ECTS: 2.4 – Language: English
Description
Bayesian analysis is a statistical tool that is becoming increasingly popular in biomedical sciences. Notably, Bayesian approaches have become commonly used in adaptive designs for Phase I/II clinical trials, in meta-analyses, and also in transcriptomics analysis. This course provides an introduction to Bayesian tools, with an emphasis on biostatistics applications, in order to familiarize students with such methods and their practical applications. A case study from drug development will be discussed to illustrate some of the methods.
Thanks to its rich and flexible modelling possibilities and intuitive interpretation, the Bayesian framework is appealing, especially when observations are scarce. It can adaptively incorporate information as it becomes available, an important feature for early phase clinical trials. For example, adaptive Bayesian designs for Phase I/II trials reduce the chances of unnecessarily exposing participants to inappropriate doses and have better decision-making properties compared to the standard rule-based dose-escalation designs. Besides, the use of a Bayesian approach is also very appealing in meta-analyses because of i) the often relatively small number of studies available, ii) its flexibility, and iii) its better handling of heterogeneity from aggregated results, especially in network meta-analyses. Finally, Bayesian power provides an interesting opportunity to evaluate the probability of success of a trial or program.
Thanks to modern computing tools, practical Bayesian analysis has become relatively straightforward, which is contributing to its increasing popularity. JAGS is flexible software interfaced with R that makes it easy to specify a Bayesian model, automatically performs inference on the posterior distributions of the parameters, and produces graphical output for monitoring the quality of the analysis.
The aim of the course is to provide insights into Bayesian statistics in the context of medical studies. We will cover the following topics:
 Bayesian modeling (prior, posterior, likelihood, Bayes theorem);
 Bayesian estimation (Credibility Intervals, Maximum a Posteriori, Bayes factor);
 Bayesian applications to meta-analyses;
 Practical Bayesian analysis with R and JAGS software;
 Critical reading of medical publications;
 Evaluating the probability of success of a trial or set of trials.
All concepts will be illustrated with real-life examples from the medical literature.
Learning objectives
A student who has met the objectives of the course will be able to:
 understand and assess a Bayesian modelling strategy, and discuss its underlying assumptions
 rigorously describe expert knowledge by a quantitative prior distribution
 perform a Bayesian regression using R, applied to meta-analysis
 put into perspective the results from a Bayesian analysis described in a scientific article
 evaluate the probability of success of a trial or set of trials
Recommended academic qualifications
This course is targeted towards students in graduate programmes at the Faculty of Health and Medical Sciences. To be able to follow this course, participants need both:
 some knowledge of statistics (most notably some familiarity with the usual probability distributions, probability density functions, confidence intervals and Maximum Likelihood Estimation), and
 a practical knowledge of R programming (especially functional programming, for loops and “if” statements, vector allocation, linear regression).
Course webpage: NA
Psychometric validation of patient reported outcome measures
Course director: Karl Bang Christensen
ECTS: 2.6 – Language: English
Description
The course introduces psychometric models for validation of index scales summarizing information from several items. The course covers confirmatory factor analysis (CFA) models, item response theory (IRT) models, and Rasch measurement models. Detection and modelling of differential item functioning and local dependence are discussed. The computer exercises use R. The course consists of ten hours of classroom teaching supplemented by online elements.
Learning objectives
A student who has met the objectives of the course will be able to:
 Know the basic principles for validation of patient reported outcome measures (PROMs) using item response theory (IRT) models and Rasch models.
 Do simple analyses for PROM validation studies using state-of-the-art methods.
 Evaluate the quality of published PROM validation studies.
Recommended academic qualifications
Ph.D. students and researchers within medicine, public health, epidemiology, sociology, and psychology. A basic knowledge of statistics will be assumed, as will knowledge of simple methods for scale validation corresponding to the contents of the Ph.D. course ‘Introduction to validation of patient reported outcome measures’.
Course webpage: NA
Statistical analysis of repeated measurements and clustered data
Course director: Julie Forman
ECTS: 4.2 – Language: English
Description
This course is concerned with the analysis of correlated quantitative data arising e.g. when collecting data repeatedly on the same persons, animals, or tissue over time or on different locations of the body, or when observations are clustered, as from patients in a multicenter study, siblings, or pups belonging to the same litter. Appropriate statistical models for analysis will be exemplified and statistical errors arising with other frequently employed analyses will be discussed. Topics include analysis of baseline follow-up studies, longitudinal data analysis, multilevel and variance component models, analysis of crossover trials, and reproducibility of measurement methods. We will further discuss the potential biases that occur due to missing data and statistical methods for handling these. A thorough introduction to linear mixed models for quantitative outcomes will be given, while generalized linear mixed models and marginal models (aka generalized estimating equations) for the analysis of binary, ordinal, and count data are touched upon more briefly at the end of the course. Computer exercises with R statistical software will be given.
Learning objectives
This advanced statistics course will give you an introduction to the most common repeated measurement designs used in medical research. The aim of the course is to teach you to:
 understand and interpret the analyses of various repeated measurement designs including baseline follow-up studies, crossover trials, and reproducibility of measurement methods, as well as analyses of clustered designs (e.g. multilevel models) and designs of mixed type.
 perform your own analyses using R statistical software.
 use model diagnostics to assess the validity of your analyses.
 make suitable presentations of the results from your analyses.
 understand the statistical consequences of different kinds of study designs.
Recommended academic qualifications
Ph.D. students with a basic knowledge of statistics, e.g. corresponding to the course “Basic statistics for health researchers” and R programming at beginner level.
Course webpage: https://absalon.ku.dk/courses/47665
Advanced, for statisticians
Targeted Minimum Loss-based Estimation (TMLE) for Causal Inference
Course director: Helene Rytgaard
ECTS: 2.8 – Language: English
Description
Targeted minimum loss-based estimation (TMLE) is a general framework for estimation of causal effects that combines semiparametric efficiency theory and machine learning in a two-step procedure. The main focus of the course is to understand the overall concept, the theory, and the application of TMLE. Topics covered include:
 The roadmap of targeted learning.
 Basics of causal inference, including counterfactual notation, hypothetical interventions, the g-formula, and the average treatment effect (ATE).
 Causal effect estimation in nonparametric models: target parameters, nuisance parameters, efficient influence functions, asymptotic linearity, and statistical inference based on the efficient influence function.
 TMLE as a two-step procedure involving initial estimation followed by a targeting step.
 Super learning: combining multiple machine learning algorithms via loss-based cross-validation.
 Extensions to more complex data settings: survival outcome, time-dependent confounding, dynamic treatment regimes.
 Basic usage of existing software in R.
Learning objectives
A student who has met the objectives of the course will be able to:
 Explain the fundamental principles of statistical inference using targeted minimum loss-based estimation (TMLE) and its application as a general framework for estimation of causal effects.
 Implement TMLE using R software to estimate average treatment effects and time-varying treatment effects based on simulated data, and assess the accuracy and efficiency of the estimators.
 Compare the assumptions and performance of TMLE to related causal inference tools such as inverse probability weighting and standardization, and discuss the strengths and limitations of each approach.
 Evaluate the suitability of super learning and its application in TMLE, and implement the algorithm to improve estimation accuracy.
 Discuss and evaluate the challenges and opportunities in time-varying settings in causal inference, including time-varying treatments and time-dependent confounding, and how TMLE can be used to address these challenges.
Recommended academic qualifications
The course is relevant for Ph.D. students with sufficient background in mathematics and statistics. To participate in the practicals, the participants should have knowledge of the statistical software R.
Course webpage: NA
Advanced survival analysis
Course director: Thomas Scheike
ECTS: 5.6 – Language: English
Description
This is a course aimed at Ph.D. students in biostatistics/statistics.
The course will describe advanced topics for survival data. The first 4 days give a brief introduction and consider regression models for survival data, including Cox’s regression model and alternative models like the additive intensity model. Goodness-of-fit for these models will be discussed. We will also discuss how to deal with multivariate survival data including frailty models and marginal models. The last 4 days will consider competing risks, multistate models and recurrent events. The course will consist of lectures and computer sessions (using R/SAS) illustrating how the various models can be applied, with focus on the practical implementation and interpretation of the methods. To pass the course, students must satisfactorily complete a take-home exam. We expect students to bring their own laptops.
Learning objectives
The aim of the course is to make the participants able to
 do practical survival analyses using R
 understand the theoretical arguments behind the key methods
 theoretically analyse simple extensions of survival models
 understand how to deal with competing risks and multistate models.
Recommended academic qualifications
The course is targeted at Ph.D. students with a background in biostatistics or mathematical statistics.
Course webpage: NA