CS395T - Computational Statistics With Application To Bioinformatics
Concepts: measures of central tendency, mean, median; normal (Gaussian), Student, Cauchy,
lognormal, exponential, gamma, chi-square; PDF, CDF, characteristic function; Central Limit
Theorem
Unit 3: Random Number Generators, Tests for Randomness, and Tail Tests Generally
Concepts: random number generator (RNG); multiplicative RNG; p-values, t-values; binomial
distribution; chi-square test; 1- vs. 2-point distribution; Xorshift RNG; combinations of
generators; p-value paradigm
Unit 4: Tail Test Perils and Pitfalls: Chi-Square Misuse, Multiple Hypotheses, Stopping
Criteria
Concepts: Xorshift generators; matrix powers by successive squaring; GCD and Gorilla
randomness tests; transformation method; rejection method; ratio of uniforms method;
squeezes; Leva's algorithm
Concepts: binned data; nonlinear least squares (NLS) fits; covariance matrix; goodness of fit;
linear propagation of errors; Jacobian matrix; sampling the posterior distribution; bootstrap
resampling
Concepts: phylogenetic trees; cladograms, additive trees, ultrametric trees; distance matrix,
neighbor joining; agglomerative method; vertebrate species; gene chip; Hamming distance;
rooted vs. unrooted; gene co-expression; Pearson r; TreeView
Concepts: data matrix, design matrix; standardize; Singular Value Decomposition (SVD);
orthogonal basis; low-rank approximation; Principal Component Analysis (PCA); main
effects; Gaussian random matrix; order statistic; dimensional reduction; eigengenes,
eigenarrays; non-negative matrix factorization (NMF)
Software: SVMlight