Math 5305
Detailed Syllabus
(Taught in summer)
1. Overview of Statistics
Deterministic vs. Statistical questions
Statistics process
Problem conceptualization
populations, samples, parameters
Data collection
Types of studies: experiments and observational studies
Sampling: simple random sampling (SRS), bias
Descriptive statistics
Formal inference
Why statistics works
(Tutorial on using the SAS/Windows system)
2. Descriptive Statistics for Univariate Data
Population Distribution of a response variable
Qualitative vs. Quantitative (discrete, continuous)
how we represent it
what it means
how to estimate it from data
Descriptive parameters of the population distribution
Location measures (mean and median)
Estimation based on raw and grouped data
Dispersion measures (variance, standard deviation)
The empirical rule for the normal shape
Measures of position (quantiles)
Estimation based on raw data
5-number summary, Box and whisker plot
(SAS tutorial 1. Basic SAS data management maneuvers, univariate
descriptive statistics from SAS)
(HW#1. Using SAS to do univariate descriptive statistics)
3. Descriptive Statistics for Bivariate Data
Population Joint Distribution of two response variables
How this distribution is described for qualitative, quantitative discrete
and continuous variables (bivariate frequency distributions, bivariate
histograms)
How this distribution is used to describe joint variation of the responses
Parameters to describe this distribution
Location and dispersion
Correlation
Conditional Distributions – studying relationships between a Dependent
variable and an Independent variable
How to estimate these conditional distributions for the different types of
response variables
Both variables qualitative: using bivariate frequency distributions
Dependent variable quantitative, independent variable qualitative: comparative
box and whisker plots
Dependent variable qualitative, independent variable quantitative: grouping the
independent variable
Both variables quantitative: regression line, slope, conditional variance
(SAS tutorial 2. Bivariate statistics from SAS)
(HW#2. Bivariate data analysis)
3. Statistical Reporting
Journal articles on writing technical reports, numeracy, displaying data in graphs
Exploratory descriptive reports
(Project 1: SENIC data analysis – exploratory descriptive report – requires extensive
use of SAS on a large data set for discovering and describing relationships with a
quantitative dependent variable, dealing with issues of causality (can we infer it?),
confounders, etc.)
4. Probability Pre-requisites for Inference
The Normal Distribution
Calculating probabilities
Obtaining the quantiles of a general normal distribution
Q-Q plots for determining if data are normal
Sampling distributions
Central Limit Theorem
5. Introduction to Statistical Inference
The main ideas, introduced assuming the data are N(mu, sigma^2) with sigma^2 known
point estimation of mu
bias and variance
optimality of sample mean
standard error of sample mean and its interpretation
confidence interval for mu
derivation using the properties of the normal distribution
factors affecting its length (n, sigma, etc.)
hypothesis testing (introduced in the context of a real problem)
motivation for decision rule
Type I and II errors
operational procedure of doing a test
discussion of the power function of the test
6. Inference for Normally distributed data (mu and sigma unknown)
One-sample problem
point estimation of mu and sigma
confidence intervals for mu and sigma
hypothesis tests about mu and sigma
Two independent samples
point estimates, confidence intervals and t-test for difference
in the means
test for equality of variances
large-sample test for means when variances unequal
Paired data
reduction to a single sample by differences
point estimation, confidence interval and paired t-test for mean difference
(HW#4.
7. Experimental Design
Introduction
problems with confounders – randomization
problems with variability – how to defeat it
Survey of basic treatment structures in completely randomized (CR) and randomized complete block (RCB) designs
Two-treatment experiments
In CR design
Sample size calculations (power)
How to randomize
How to analyze with normal data
In Matched-pairs Design
When is matched-pairs better than CR?
How to randomize the matched-pairs design
How to analyze the matched-pairs design with normal data
(Project 2. Design and analysis of a two-treatment experiment in both CR
and matched-pairs designs: power calculations for both under a
specified power constraint, data collection and analysis, dealing with
outliers, etc.)
One-way layout treatment structure
How to randomize the CR and RCB
ANOVA, multiple comparisons (LSD) for CR
ANOVA, multiple comparisons (LSD) for RCB
Two-factor factorial treatment structure
How to randomize in CR and RCB
ANOVA in CR
Analysis / interpretation of main effects and interaction
cell mean diagrams
ANOVA in RCB
(Project 3. Design and analysis of a two-factor factorial experiment, including
randomization of experimental units, cell mean diagrams, dot plots, outlier
assessment)
8. Inference for Dichotomous Data
One-sample problem (inference about p = probability of success)
Two independent samples
Paired data
One-way layout
In CR
In RCB
Textbook: The material in the course is not contained in any one textbook.
Much of the material is contained in a note set available from the
Recommended references are:
1. Introduction to the Practice of Statistics by D.S. Moore and G.P. McCabe
2. Statistics for Experimenters by Box, Hunter and Hunter