Math 5305

Detailed Syllabus

(Taught in summer)

1.     Overview of Statistics

Deterministic vs. Statistical questions

Statistics process

Problem conceptualization

populations, samples, parameters

Data collection

Types of studies:  experiments and observational studies

Sampling:  SRS, bias

Descriptive statistics

Formal inference

Why statistics works

(Tutorial in using the SAS/WINDOWS system)

2.     Descriptive Statistics for Univariate Data

Population Distribution of a response variable

Qualitative vs. Quantitative (discrete, continuous)

how we represent it

what it means

how to estimate it from data

Descriptive parameters of the population distribution

Location measures (mean and median)

Estimation based on raw, grouped data

Dispersion measures (variance, standard deviation)

The empirical rule for the normal shape

Measures of position (quantiles)

Estimation based on raw data

5-number summary, Box and whisker plot

(SAS tutorial 1.  Basic SAS data management maneuvers, univariate

descriptive statistics from SAS)

(HW#1.  Using SAS to do univariate descriptive statistics)

3.     Descriptive Statistics for Bivariate Data

Population Joint Distribution of two response variables

How this distribution is described for qualitative, quantitative discrete

and continuous variables (bivariate frequency distributions, bivariate

histograms)

How this distribution is used to describe joint variation of the responses

Parameters to describe this distribution

Location and dispersion

Correlation

Conditional Distributions – studying relationships between a Dependent

variable and an Independent variable

How to estimate these conditional distributions for the different types of

response variables

Both variables qualitative:  using bivariate frequency distributions

Dependent variable quantitative, independent variable qualitative: comparative

box and whisker plots

Dependent variable qualitative, independent variable quantitative: grouping the

independent variable

Both variables quantitative:  regression line, slope, conditional variance

(SAS tutorial 2.  Bivariate statistics from SAS)

(HW#2.  Bivariate data analysis)

3.     Statistical Reporting

Journal articles on writing technical reports, numeracy, displaying data in graphs

Exploratory descriptive reports

(Project 1:  SENIC data analysis – exploratory descriptive report – requires extensive

use of SAS on a large data set for discovering and describing relationships with a

quantitative dependent variable, dealing with issues of causality (can we infer it?),

confounders, etc.)

4.   Probability Pre-requisites for Inference

The Normal Distribution

Calculating Normal probabilities – the N(0,1) table and standardization

Obtaining the quantiles of a general normal distribution

Q-Q plots for determining if data are normal

Sampling distributions

Central Limit Theorem
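
As an illustration of the standardization and quantile topics in this unit, here is a minimal sketch in Python. It is not part of the course notes; the course itself works with the N(0,1) table and SAS. The function names `normal_prob_below` and `normal_quantile` are hypothetical, chosen only for this example.

```python
from statistics import NormalDist

# Hypothetical helper names; only the Python standard library is assumed.

def normal_prob_below(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), computed by standardization."""
    z = (x - mu) / sigma           # standardize: z = (x - mu) / sigma
    return NormalDist().cdf(z)     # probability from the N(0,1) distribution

def normal_quantile(p, mu, sigma):
    """p-th quantile of N(mu, sigma^2), i.e. mu + sigma * z_p."""
    z_p = NormalDist().inv_cdf(p)  # p-th quantile of N(0,1)
    return mu + sigma * z_p

if __name__ == "__main__":
    # Example values are made up for illustration.
    print(normal_prob_below(110, mu=100, sigma=10))
    print(normal_quantile(0.975, mu=100, sigma=10))
```

Both functions reduce a general normal problem to the standard normal, which mirrors the table-based procedure: standardize first, then look up.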

5.  Introduction to Statistical Inference

The main ideas introduced assuming data are N(mu,sig2), where sig2 is known

point estimation of mu

bias and variance

optimality of sample mean

standard error of sample mean and its interpretation

confidence interval for mu

derivation using the properties of the normal distribution

factors affecting its length (n, sig, etc.)

hypothesis testing (introduced in the context of a real problem)

motivation for decision rule

Type I and II errors

operational procedure of doing a test

discussion of the power function of the test
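
The confidence interval for mu with sig2 known, listed in this unit, can be sketched in Python as follows. This is an illustrative sketch only, assuming the usual interval xbar ± z * sig / sqrt(n); the function name `z_confidence_interval` and the sample data are hypothetical and do not come from the course notes.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_confidence_interval(data, sigma, level=0.95):
    """Confidence interval for mu when data are N(mu, sigma^2), sigma known.

    Hypothetical helper for illustration; returns (lower, upper).
    """
    n = len(data)
    xbar = mean(data)                              # point estimate of mu
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # upper alpha/2 point of N(0,1)
    se = sigma / sqrt(n)                           # standard error of the sample mean
    return xbar - z * se, xbar + z * se

if __name__ == "__main__":
    sample = [9.8, 10.2, 10.1, 9.9, 10.0]          # made-up data
    print(z_confidence_interval(sample, sigma=0.2))
```

The length of the interval is 2 * z * sig / sqrt(n), so it shrinks as n grows, widens as sig grows, and widens as the confidence level rises.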

6.   Inference for Normally distributed data (mu and sigma unknown)

One-sample problem

point estimation of mu and sigma

confidence intervals for mu and sigma

hypothesis tests about mu and sigma

Two independent samples

point estimates, confidence intervals and t-test for difference

in the means

test for equality of variances

large-sample test for means when variances are unequal

Paired data

reduction to a single sample by differences

point estimation, confidence interval and paired t-test for mean difference

(HW#4.  Normal inference – practice with the calculations)

7.  Experimental Design

Introduction

problems with confounders – randomization

problems with variability – how to defeat it

Survey of basic treatment structures in CR design and RCB design

Two-treatment experiments

In CR design

Sample size calculations (power)

How to randomize

How to analyze with normal data

In Matched-pairs Design

When is matched-pairs better than CR?

How to randomize the matched-pairs design

How to analyze the matched-pairs design with normal data

(Project 2.   Design and analysis of a two-treatment experiment in both CR

and Matched-pair designs, do power calculations for both with a

specified power constraint, collect data and analyze, dealing with

outliers, etc.)
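
As a rough illustration of the sample size (power) calculation listed for the two-treatment CR design, the sketch below uses the normal approximation with a known common standard deviation and a two-sided test. This is an assumption made for the example; the method taught in the course (for instance, one based on the t distribution) may differ. The function name `n_per_group` and its parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per treatment group in a two-treatment CR design.

    Assumes normal data with known common sigma and a two-sided level-alpha
    test; delta is the difference in means to be detected with the given power.
    Uses n = 2 * ((z_{alpha/2} + z_{beta}) * sigma / delta)^2, rounded up.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

if __name__ == "__main__":
    # Made-up inputs: detect a difference of one standard deviation.
    print(n_per_group(delta=1.0, sigma=1.0))
```

Halving the detectable difference delta roughly quadruples the required sample size, which is one reason a matched-pairs design that reduces variability can be attractive.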

One-way layout treatment structure

How to randomize the CR and RCB

Anova, multiple comparisons (LSD) for CR

Anova, multiple comparisons (LSD) for RCB

Two-factor factorial treatment structure

How to randomize in CR and RCB

Anova in CR

Analysis / interpretation of main effects and interaction

cell mean diagrams

Anova in RCB

(Project 3.  Design and analysis of a two-factor factorial experiment, including

randomization of experimental units (e.u.s), cell mean diagrams, dot plots, outlier

assessment)

8.  Inference for Dichotomous Data

One-sample problem (inference about p = probability of success)

Two Independent samples

Paired Data

One-way layout

In CR

In RCB

Textbook:    The material in the course is not contained in any one textbook.

Much of the material is contained in a note set available from the

Recommended references are:

1.      Introduction to the Practice of Statistics  by D.S. Moore

2.      Statistics for Experimenters by Box, Hunter and Hunter