SPIDA 2005: Summer Programme in Data Analysis

Description of Topics

SPIDA Programme --- May 25 - June 2, 2005
Dates	Topic	Instructor
Wednesday May 25th	Data Analysis using SAS for Windows	Mirka Ondrack, Ernest Kwan

Thursday May 26th	Review of Linear Models with SAS	Michael Friendly, Ernest Kwan

Friday May 27th	Logistic Regression	John Fox
Saturday May 28th	Generalized Linear Models	John Fox

Monday May 30th	Mixed Models & Models for Hierarchical Data I	Daniel Bauer
Tuesday May 31st	Mixed Models & Models for Hierarchical Data II	Daniel Bauer
Wednesday June 1st	Mixed Models for Longitudinal Data I	Daniel Bauer
Thursday June 2nd	Mixed Models for Longitudinal Data II	Daniel Bauer

Friday June 3rd	Symposium on Bootstrapping	Robert A. Stine

Data Analysis Using SAS for Windows

Date: May 25th
Instructors: Mirka Ondrack, York University
Ernest Kwan, York University

Should I register for this workshop?

First, examine the sample data sets and output, and try to complete the assignment. This will let you judge whether you know SAS well enough to reproduce the output. If so, you do not need to register.

This course is designed to provide basic knowledge of the Statistical Analysis System (SAS, version 8.2), with an emphasis on practical experience in the PC laboratory. Topics covered in this SAS session include: an overview of SAS and its underlying logic; using the SAS Data Step for reading data from various sources; storing data files; modifying data files using some basic SAS programming techniques; and using the SAS Proc step for basic statistical analysis. Throughout, SAS procedures are applied to the management and preliminary analysis of Statistics Canada data files.

Previous experience in using SAS is not required for the basic session: the Windows version of SAS will be introduced, emphasizing work with the SAS programming language as well as the use of dialog boxes. The course provides the basic elements of SAS for those without prior experience and is also of value to participants conversant with previous versions of SAS.

Review of Linear Models with SAS

Date: May 26th
Instructors: Michael Friendly, York University
Ernest Kwan, York University

The second part of these sessions concentrates on using the more advanced SAS programming language. Topics will include management of data sets, transformation and generation of variables, a more in-depth exploration of SAS Proc steps, and examples of using SAS for linear models.

Lecture and Workshop Materials

Logistic Regression

Date: May 27th
Instructor: John Fox, McMaster University
TA: Lisa Fiksenbaum

This workshop shows how linear regression analysis and linear models (with which participants are assumed to be familiar) can be extended to the analysis of categorical response variables. The workshop covers logistic regression models (also called linear-logit models) for dichotomous (two-category) responses; for unordered polytomous (several-category) responses; and for ordered categorical responses.
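For orientation, the dichotomous case can be written as follows. This is a standard textbook form of the linear-logit model, not taken from the workshop materials; the symbols (response y_i, explanatory variables x_{i1}, ..., x_{ik}, coefficients \alpha and \beta_j) are notation assumed here for illustration.

```latex
\operatorname{logit}(\pi_i) = \log\frac{\pi_i}{1 - \pi_i}
  = \alpha + \beta_1 x_{i1} + \cdots + \beta_k x_{ik},
\qquad
\pi_i = \Pr(y_i = 1 \mid x_{i1}, \ldots, x_{ik})
  = \frac{1}{1 + e^{-(\alpha + \beta_1 x_{i1} + \cdots + \beta_k x_{ik})}}
```

The right-hand side is the same linear predictor used in linear regression; only the scale on which it acts (the log-odds of the response) changes.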

Lecture and Workshop Materials

Generalized Linear Models

Date: May 28th
Instructor: John Fox, McMaster University
TAs: Lisa Fiksenbaum and Gigi Luk

Generalized linear models are a broad synthetic framework that encompasses the familiar linear models for quantitative, normally distributed responses, and linear-logit models for dichotomous data, along with others, such as Poisson-linear (log-linear) models for count data and gamma regression models for skewed quantitative data. This workshop offers an overview of generalized linear models and includes a discussion of diagnostic methods for such models. Familiarity with linear and logistic regression is assumed.
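The unifying structure mentioned above can be summarized in one line. This is the usual textbook statement of a generalized linear model; the notation (mean \mu_i, linear predictor \eta_i, link function g) is assumed here and is not drawn from the workshop materials.

```latex
y_i \sim \text{exponential-family distribution with mean } \mu_i,
\qquad
\eta_i = \alpha + \beta_1 x_{i1} + \cdots + \beta_k x_{ik},
\qquad
g(\mu_i) = \eta_i
```

Choosing a normal distribution with the identity link, g(\mu) = \mu, gives the linear model; a binomial distribution with the logit link gives the linear-logit model; a Poisson distribution with the log link gives the log-linear model for counts.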

Lecture and Workshop Materials

Mixed Models and Hierarchical Data I and II

Dates: May 30th and May 31st
Instructor: Daniel Bauer, University of North Carolina
TAs: Tao Sun and Nikolai Slobodianik

This workshop begins with a broad introduction to mixed models (multi-level and hierarchical data analysis, as well as longitudinal data analysis), emphasizing both the theoretical bases and practical utility of the models to be covered in the next three sessions.

Mixed models, including multilevel models and hierarchical linear models, are commonly used in educational, social science, and biomedical research. Unlike ordinary regression models, which assume that observations are independent, mixed models permit the analysis of observations that are correlated. Correlated observations may arise in several ways. For instance, the nesting of observations within groups may produce correlated observations. A classical example is that students within a classroom may perform more similarly on a test of academic ability than students from different classrooms. Likewise, students from the same school may perform more similarly than students from different schools. Correlations between observations may also be due to the use of a within-subjects or repeated measures design where the same individual contributes more than one data point to the study. Aside from these classic examples, it is possible to conceive of many other instances in which observations may be correlated.

In these situations, use of the standard regression model, which assumes independence of observations, will still yield unbiased estimates of many of the effects of interest, but the standard errors will be too small, resulting in an inflated Type I error rate. Mixed models not only provide ‘honest’ standard errors; they also capitalize on the dependence among observations to estimate the degree to which effects may vary in magnitude over the units studied. For example, a classroom-based prevention program may have greater effects in one classroom than another, and this may be partially explained by characteristics of the teacher (e.g., receptivity to the goals and methods of the prevention program).
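The claim about standard errors can be checked with a small simulation. The sketch below is illustrative only and is written in Python rather than SAS, which the workshop itself uses; the function names, the classroom sizes, and the variance values are all assumptions made for this example. It repeatedly draws students nested in classrooms, estimates the overall mean while ignoring the nesting, and compares the standard error computed under independence with the actual sampling variability of the estimate.

```python
import random
import statistics


def simulate_mean_and_naive_se(n_groups, group_size, tau, sigma, rng):
    """Draw one clustered sample; return its mean and the naive standard error.

    tau   : standard deviation of the classroom (group) effects
    sigma : standard deviation of the student-level residuals
    """
    values = []
    for _ in range(n_groups):
        group_effect = rng.gauss(0.0, tau)  # shared by every student in the classroom
        for _ in range(group_size):
            values.append(group_effect + rng.gauss(0.0, sigma))
    # Naive standard error: treats all observations as independent.
    naive_se = statistics.stdev(values) / len(values) ** 0.5
    return statistics.fmean(values), naive_se


def compare_standard_errors(n_reps=500, n_groups=20, group_size=25,
                            tau=1.0, sigma=2.0, seed=2005):
    """Repeat the sampling and summarize the estimates and their naive SEs."""
    rng = random.Random(seed)
    results = [
        simulate_mean_and_naive_se(n_groups, group_size, tau, sigma, rng)
        for _ in range(n_reps)
    ]
    means = [mean for mean, _ in results]
    naive_ses = [se for _, se in results]
    return {
        "average_estimate": statistics.fmean(means),    # true mean is 0
        "empirical_se": statistics.stdev(means),        # actual variability of the estimate
        "average_naive_se": statistics.fmean(naive_ses),
    }


if __name__ == "__main__":
    for name, value in compare_standard_errors().items():
        print(f"{name}: {value:.3f}")
```

With these assumed settings the intraclass correlation is 0.2, so the design effect 1 + (25 - 1)(0.2) = 5.8 implies that the true standard error is roughly 2.4 times the naive one, while the average estimate stays close to the true mean of zero. That is the pattern described above: unbiased estimates with standard errors that are too small.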

The workshop will focus on the conceptualization, estimation and interpretation of mixed models in SAS. The primary emphasis will be on linear mixed models for continuous outcomes; however, nonlinear mixed models for categorical or count outcomes will also be discussed.

Recommended reading:
Hox, J. J. Applied Multilevel Analysis, an earlier, free edition of Multilevel Analysis: Techniques and Applications

Lecture and Workshop Materials

Mixed Models for Longitudinal Data I and II

Dates: June 1st and June 2nd
Instructor: Daniel Bauer, University of North Carolina
TAs: Tao Sun and Nikolai Slobodianik

[See the Detailed Outline for topic sequences.]

The analysis of change has historically been fraught with difficulty. This course will begin with a brief overview of some classical modelling approaches for analyzing longitudinal data, including repeated measures ANOVA, residualized change analysis, and autoregressive models. Although these approaches are optimal for testing some hypotheses, none adequately captures intra-individual changes that follow continuous trajectories over time. In response, recent decades have seen the development of several distinct approaches to growth modeling, including simple regression methods, latent curve modeling, and mixed or multilevel models for longitudinal data.

After a short comparison of these different approaches, the course will focus primarily on the multilevel approach. We will begin by showing how growth models can be represented as multilevel models consisting of a within-person model for change nested within a between-person model of individual differences in change. We will then explore the potential of this model to address three important questions: (1) What is the shape of change over time (e.g., linear or nonlinear)? (2) Are there individual differences in change over time? and (3) Can we explain or predict individual patterns of change over time?

To address the first question, we will discuss ways of identifying patterns of change over time and different approaches for modelling linear versus nonlinear change. Addressing the second question, we will see that multilevel models provide a parsimonious way of characterizing individual differences in the change process. Finally, to address the third question, we will explore three different ways that predictors can be incorporated into the model; namely, as predictors of the trajectory parameters, as direct predictors of the repeated measures, or as part of a multivariate "parallel-process" growth model.
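As a concrete instance of the within-person and between-person structure described above, a linear growth model with one person-level predictor of the trajectory parameters can be sketched as follows. The notation is assumed here for illustration and is not taken from the course outline: y_{ti} is the outcome for person i at occasion t, a_{ti} is the time score, and x_i is a hypothetical person-level predictor.

```latex
\text{Within person:} \quad
y_{ti} = \pi_{0i} + \pi_{1i}\, a_{ti} + e_{ti}

\text{Between persons:} \quad
\pi_{0i} = \beta_{00} + \beta_{01} x_i + r_{0i},
\qquad
\pi_{1i} = \beta_{10} + \beta_{11} x_i + r_{1i}
```

In this form the three questions map onto parts of the model: the within-person equation sets the shape of change, the variances of r_{0i} and r_{1i} capture individual differences in starting level and rate of change, and \beta_{01} and \beta_{11} describe how x_i predicts those differences.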

Lecture and Workshop Materials

SPIDA 2005 Capstone Seminar: Bootstrapping

Date: June 3rd
Keynote Speaker: Robert A. Stine, University of Pennsylvania
