MST131-Probability Theory
Total Teaching Hours/Semester: 75 No. of Lecture Hours/Week: 5
Max Marks: 100 Credits: 5
Course objectives
To enable students to use measure-theoretic and analytical techniques for understanding probability
concepts.
Course Outcomes
CO1: Understand measure and measurable functions
CO2: Analyse probability concepts using measure-theoretic approach
CO3: Identify applications of different limit theorems in statistical problems
CO4: Apply Radon-Nikodym theorem in conditional probability
UNIT I Probability and Random variable Teaching Hours: 15
Algebra of sets, Fields, Sigma fields, Inverse function, Measurable functions, Random
variables, Lebesgue measure, Lebesgue-Stieltjes measure, Counting measure, Discrete
probability space, General probability space as normed measure space, Induced probability
space. Distribution function of a random variable, Distribution function of random vectors.
Independence of random variables.
UNIT II Expectation and Generating functions Teaching Hours: 15
Integration with respect to measure (Introduction only), Expectation and moments:
Definition and properties, Moment generating functions, Moment inequalities: Cr-, Hölder,
Jensen and basic inequalities, Product spaces and Fubini’s theorem, Characteristic function
and properties (idea and statement only).
UNIT III Convergence Teaching Hours: 15
Modes of convergence: Convergence in probability, in distribution, in rth mean, almost sure
convergence and their inter-relationships, Convergence theorem for expectation such as
Monotone convergence theorem, Fatou’s lemma, Dominated convergence theorem.
UNIT IV Limit Theorems Teaching Hours: 15
Law of large numbers, Convergence of series of independent random variables, Kolmogorov’s
inequality, Weak law of large numbers (Khinchine’s and Kolmogorov’s), Kolmogorov’s
strong law of large numbers, Central limit theorems for i.i.d. random variables, Lindeberg-Lévy
and Lyapunov’s CLT, Lindeberg-Feller CLT.
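Illustration (not part of the prescribed syllabus): the central limit theorem for i.i.d. random variables listed in this unit can be checked numerically. The sketch below is a minimal one, assuming Uniform(0, 1) summands; the function names `standardized_means` and `fraction_within` are hypothetical. It standardizes sample means and reports how many fall within a given band, which should be close to the normal value of about 0.95 for a band of 1.96.

```python
import math
import random


def standardized_means(n, reps, seed=0):
    """Return `reps` standardized sample means of n i.i.d. Uniform(0, 1) draws."""
    rng = random.Random(seed)
    mu = 0.5                        # mean of Uniform(0, 1)
    sigma = math.sqrt(1.0 / 12.0)   # standard deviation of Uniform(0, 1)
    out = []
    for _ in range(reps):
        xbar = sum(rng.random() for _ in range(n)) / n
        # sqrt(n) * (xbar - mu) / sigma is approximately N(0, 1) for large n
        out.append(math.sqrt(n) * (xbar - mu) / sigma)
    return out


def fraction_within(values, c):
    """Fraction of values lying in the interval [-c, c]."""
    return sum(1 for v in values if abs(v) <= c) / len(values)
```

For example, `fraction_within(standardized_means(30, 4000), 1.96)` can be compared with the nominal normal coverage.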
UNIT V Conditioning Teaching Hours: 15
Conditional expectation and its properties, Conditional probabilities, Radon-Nikodym
Theorem (Statement only) and its applications. Bayes’ theorem, Martingales, Submartingales,
Martingale convergence theorem, Decomposition of submartingales.
Textbooks
1. Billingsley, P. (2008) Probability and Measure, Second edition, John Wiley
2. Bhat, B.R. (2018) Modern Probability Theory, Second edition, Wiley Eastern
3. Rohatgi, V.K. and Saleh, A.K.Md.E. (2011) An Introduction to Probability and Statistics,
John Wiley & Sons.
Recommended Reading
1. Feller, W. (1976) An Introduction to Probability Theory and its Applications, Volume
I Wiley Eastern.
2. Feller, W. (1976) An Introduction to Probability Theory and its Applications, Volume
II Wiley Eastern.
3. Basu, A.K. (1999) Measure Theory and Probability, Prentice-Hall.
4. Durrett, Rick. Probability: Theory and Examples. 4th ed. Cambridge University Press,
2010
MST132-Distribution Theory
Total Teaching Hours/Semester: 75 No. of Lecture Hours/Week: 5
Max Marks: 100 Credits: 5
Course Objectives
To enable students to understand different probability distributions and to use them to model
real-life problems.
Course Outcomes
CO1: To understand different families of probability distributions.
CO2: Analyse well-known probability distributions as special cases of different families of
distributions
CO3: To identify different distributions arising from sampling from a normal distribution.
CO4: To apply probability distributions in various statistical problems.
UNIT I Discrete Distributions Teaching Hours: 15
Modified power series family and properties. Binomial, Negative binomial, Logarithmic
series and Lagrangian distributions and their properties as special cases of the results from
modified power series family, hypergeometric distribution and its properties.
UNIT II Continuous Distributions Teaching Hours: 15
Pearsonian system of distributions, Beta, Gamma, Pareto and Normal as special cases of the
Pearson family and their properties. Exponential family of distributions.
UNIT III Sampling distributions Teaching Hours: 15
Sampling distributions of the mean and variance from normal population, independence of
mean and variance, Chi-square, Student’s t and F distributions and their non-central forms.
Order statistics and their distributions.
UNIT IV Multivariate distributions Teaching Hours: 15
Bivariate Poisson, Multinomial distribution, Multivariate normal (definition only), bivariate
exponential distribution of Gumbel, Marshall and Olkin and Block and Basu, Dirichlet
distribution.
UNIT V Quadratic forms Teaching Hours: 15
Quadratic forms in normal variables: distribution and properties, Cochran’s theorem:
applications.
Textbooks
1. Rohatgi, V.K. and Saleh, A.K.Md.E. (2011) Introduction to Probability and Statistics, John
Wiley and Sons.
2. Arnold B.C, Balakrishnan N and Nagaraja H.N (2012). A first course in order
statistics.
3. Galambos, J. and Kotz, S. (1978). Characterization of Probability Distributions,
Springer-Verlag.
4. Ord J.K. (1972) Families of frequency distributions, Griffin
Recommended Reading
1. Johnson N.L, Kotz S and Kemp A.W (1992) Univariate discrete distributions, John
Wiley.
2. Johnson N.L, Kotz S and Balakrishnan N (1991) Continuous univariate distributions
I & II, John Wiley.
3. Johnson N.L, Kotz S and Balakrishnan N (1995) Multivariate Distribution, John
Wiley.
MST133-Matrix Theory and Linear Models
Total Teaching Hours/Semester: 75 No. of Lecture Hours/Week: 5
Max Marks: 100 Credits: 5
Course Objective
This course is offered to make students understand the critical aspects of matrix theory and
linear models which are used in different areas of statistics such as regression analysis,
multivariate analysis, design of experiments and stochastic processes.
Course Outcomes
CO1: Understand vector-space and different operations on it
CO2: Analyse system of linear equations using matrix theoretic approach
CO3: Identify applications of matrix theory in statistical problems
CO4: Apply matrix theory in linear models
UNIT I - Vector Space Teaching Hours: 15
Vectors, Operations on vector space, subspace, nullspace and column space, Linearly
independent sets, spanning set, bases, dimension, rank, change of basis.
UNIT II - System of linear equations Teaching Hours: 15
Matrix operations, Linear equations, row reduced and echelon forms, Homogeneous system of
equations, Linear dependence.
UNIT III - Linear transformations Teaching Hours: 15
Algebra of linear transformations, Matrix representations, rank nullity theorem, determinants,
eigenvalues and eigenvectors, Cayley-Hamilton theorem, Jordan canonical forms,
orthogonalisation process, orthonormal basis.
Unit IV - Quadratic forms and special matrices useful in statistics Teaching Hours: 15
Reduction and classification of quadratic forms, Special matrices: symmetric matrices,
positive definite matrices, idempotent and projection matrices, stochastic matrices, Gramian
matrices, dispersion matrices
Unit V - Linear models Teaching Hours: 15
Fitting the model, ordinary least squares, estimability of parametric functions, Gauss-Markov
theorem, applications: regression model, analysis of variance.
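Illustration (not part of the prescribed syllabus): for a model with an intercept and one regressor, the normal equations X'Xb = X'y of ordinary least squares have a closed-form solution. The sketch below is a minimal version of that special case; the function name `ols_simple` is hypothetical.

```python
def ols_simple(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Solves the normal equations for the one-regressor case in closed form.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)            # corrected sum of squares of x
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope
```

With several regressors the same normal equations would be solved as a linear system rather than by these scalar formulas.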
Textbooks
1. David C. Lay, Steven R. Lay, Judi J. McDonald (2016). Linear algebra and its
applications.
2. Gentle, J. E. (2017). Matrix algebra- Theory, Computations and Applications in
Statistics. Springer texts in statistics, Springer, New York.
3. Strang, G. (2006). Linear Algebra and its Applications. Thomson Brooks/Cole,
Belmont, CA, USA.
Recommended reading
1. Searle, S. R. (1982). Matrix Algebra useful for Statistics. John Wiley and Sons. Inc.
2. Graybill, F. A. (1983). Matrices with applications in statistics, 2nd Ed. Wadsworth
3. Rencher, A. C., & Schaalje, G. B. (2008). Linear models in statistics. John Wiley &
Sons.
4. Christensen, R. (2011). Plane answers to complex questions: the theory of linear
models. Springer Science & Business Media.
MST171-Sample survey Designs
Total Teaching Hours/Semester: 75 No. of Lecture Hours/Week: 4+2
Max Marks: 150 Credits: 5
Course Objective
To impart the knowledge of different sample survey designs useful in the collection of
scientific data.
Course Outcomes
CO1: Understand different steps in designing a sample survey.
CO2: Analyse different sample survey designs and find estimators.
CO3: Identify the use of different sample survey designs.
CO4: Apply suitable sample survey design in real-life problems.
UNIT I: Random sampling designs Teaching Hours: 15
Sampling vs census, simple random sampling: with (SRS) and without replacement
(SRSWOR) of units, estimators of mean, total and variance, determination of sample size,
sampling for proportions, Stratified sampling scheme: estimation and allocation of sample
size, comparison with simple random sampling schemes.
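Illustration (not part of the prescribed syllabus): under simple random sampling without replacement, the sample mean estimates the population mean, N times the sample mean estimates the total, and the estimated variance of the sample mean carries the finite population correction (1 - n/N). A minimal sketch follows; the function name `srswor_estimates` and the dictionary keys are hypothetical.

```python
def srswor_estimates(sample, N):
    """Estimators under SRSWOR for a sample of size n from a population of size N."""
    n = len(sample)
    ybar = sum(sample) / n
    # Sample variance with divisor n - 1
    s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)
    # Estimated variance of the sample mean with finite population correction
    var_mean = (1.0 - n / N) * s2 / n
    return {"mean": ybar, "total": N * ybar, "var_mean": var_mean}
```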
UNIT II: Ratio and regression estimators Teaching Hours: 15
Bias and mean square error, estimation of variance, confidence interval, comparison with
mean per unit estimator, optimum property of ratio estimator, unbiased ratio type estimator,
ratio estimator in stratified random sampling, Difference estimator and Regression estimator:-
Difference estimator, regression estimator, comparison of regression estimator with mean per
unit and ratio estimator, regression estimator in stratified random sampling.
UNIT III: Varying probability sampling designs Teaching Hours: 15
With and without replacement sampling schemes: pps and ppswr schemes, Selection of
samples, estimators: ordered and unordered estimators. πps sampling schemes.
UNIT IV: Other sampling designs Teaching Hours: 15
Systematic sampling scheme: estimation of population mean and variance, comparison of
systematic sampling with SRS and stratified random sampling, circular systematic sampling,
Cluster sampling: estimation of population mean, estimation of efficiency by a cluster
sample, variance function, determination of optimum cluster size, Multistage sampling:
estimation of population total with SRS at both stages, multiphase sampling (outline
only), quota sampling, network sampling; Adaptive sampling: introduction and estimators
under adaptive sampling. Introduction to small area estimation.
Unit V: Errors in Sample Survey Teaching Hours: 15
Sampling and non-sampling errors, the effect of unit nonresponse in the estimate, procedures
for unit nonresponse.
Textbooks
1. Arnab, R. (2017). Survey sampling: Theory and Applications. Academic Press.
2. Singh, S. (2003). Advanced Sampling: Theory and Practice. Kluwer.
3. Singh, D. and Chaudhary, F.S. (2014) Theory and Analysis of Sample Survey
Designs, Wiley Eastern.
Recommended reading
1. Cochran, W.G. (1999) Sampling Techniques, Third edition, John Wiley & Sons.
2. Des Raj (1976) Sampling Theory, McGraw Hill.
3. Murthy, M.N. (1977) Sampling Theory and Methods, Statistical Publishing Society,
Calcutta.
4. Mukhopadhyay, P. (2009) Theory and methods of survey sampling, Second edition,
PHI Learning Pvt Ltd., New Delhi.
5. Sampath, S. (2001) Sampling theory and methods, Alpha Science International Ltd.,
India.
MST172-Statistical Computing using R
Total Teaching Hours/Semester: 45 No. of Lecture Hours/Week: 2+2
Max Marks: 100 Credits: 3
Course objective
To equip students with knowledge of R programming to develop statistical models for real
world problems
Course Outcomes
CO1: To demonstrate data handling using statistical tool R
CO2: To perform graphical representation of data using R
CO3: To demonstrate the usage of R for data analysis.
UNIT I Introduction Teaching Hours: 9
Variables, Functions, Vectors, Expressions and assignments, Logical expressions, Matrices,
The workspace, R markdown.
UNIT II Basic Programming Teaching Hours: 9
Control structures: if, for, while, Program flow, Basic debugging, Good programming habits, Input and
output: Input from a file, Output to a file, Plotting.
UNIT III Programming with functions Teaching Hours: 9
Functions, Optional arguments and default values, Vector-based programming using
functions, Recursive programming, Debugging functions, Sophisticated data structures:
Factors, Dataframes, Lists, The apply family.
UNIT IV Graphics Teaching Hours: 9
Visualizing data, Graphical summaries of data-Bar chart, Pie chart, Histogram, Box-plot,
Stem and leaf plot, Frequency table, Plotting of probability distributions and sampling
distributions, P-P plot, Q-Q plot, ggplot2, lattice – 3D plots, Graphics parameters, par -
Graphical augmentation.
UNIT V Simulation Teaching Hours: 9
Numerical methods- Root-finding algorithms, Simulating iid uniform samples, Congruential
generators, Seeding, Simulating discrete random variables, Inversion method for continuous
random variables, Rejection method, generation of normal variates: Rejection with
exponential envelope, Box-Muller algorithm.
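Illustration (not part of the prescribed syllabus): the course itself uses R, but the three simulation ideas named in this unit can be sketched in a few lines of Python. The multiplier, increment and modulus below are one commonly cited choice for a 32-bit linear congruential generator and are an assumption here, as are the function names.

```python
import math


def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield pseudo-random numbers in [0, 1) from x_{k+1} = (a * x_k + c) mod m."""
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state / m


def exponential_by_inversion(u, rate):
    """Inversion method: solve F(x) = 1 - exp(-rate * x) = u for x, with u in [0, 1)."""
    return -math.log(1.0 - u) / rate


def box_muller(u1, u2):
    """Map two independent Uniform(0, 1) values to two independent N(0, 1) values."""
    # 1 - u1 lies in (0, 1] when u1 is in [0, 1), so the logarithm is defined
    r = math.sqrt(-2.0 * math.log(1.0 - u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)
```

Seeding corresponds to the `seed` argument: the same seed reproduces the same stream, so `gen = lcg(42)` followed by repeated `next(gen)` calls gives a repeatable sequence of uniforms to feed into the other two functions.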
Textbooks
1. Chambers, J. M. (2008). Software for Data Analysis-Programming with R. Springer-
Verlag, New York.
2. Matloff, N. (2016). The art of R programming: A tour of statistical software design.
No Starch Press.
3. Jones, O., Maillardet, R. and Robinson, A. (2014). Introduction to Scientific
Programming and Simulation Using R. Chapman & Hall/CRC, The R Series.
Recommended Reading
1. Crawley, M. J. (2012). The R Book, 2nd Edition. John Wiley & Sons.
2. Chambers, J. M. (2008). Software for Data Analysis-Programming with R. Springer-
Verlag, New York.
MST134-Research Methodology and Elements of LaTeX
Total Teaching Hours/Semester: 30 No. of Lecture Hours/Week: 2
Max Marks: 50 Credits: 2
Course objectives
To acquaint students with different methodologies in statistical research and to enable them to
prepare scientific articles using LaTeX.
Course Outcomes
CO1: To understand research problem
CO2: To identify suitable methodology for solving the research problem
CO3: To produce scientific articles using LaTeX.
UNIT I Fundamentals of research Teaching Hours: 15
Objectives, Motivation, Utility. Concept of theory, empiricism, deductive and inductive
theory. Characteristics of scientific method, Understanding the language of research:
Concept, Construct, Definition, Variable. Research Process: Problem Identification &
Formulation, Research Question, Investigation Question, Logic & Importance.
UNIT II Scientific writing Teaching Hours: 15
Principles of mathematical writing, LaTeX: writing a research paper, survey article, thesis
writing, Beamer: preparing presentations
Textbooks
1. Kothari, C. R. (2004). Research methodology: Methods and techniques. New Age
International.
2. Nicholas J. Higham, (2008) Handbook of Writing for the Mathematical Sciences,
Second Edition, SIAM.
3. L. Lamport (2014), LaTeX, a Document Preparation System, 2nd ed, Addison-Wesley.
MST231-Statistical Inference I
Total Teaching Hours/Semester: 60 No. of Lecture Hours/Week: 4
Max Marks: 100 Credits: 4
Course Objectives
To provide a strong mathematical and conceptual foundation in the methods of parametric
estimation and their properties.
Course outcomes
CO1: To understand the properties of estimators.
CO2: To identify the suitable estimation method.
CO3: To analyse likelihood function and apply different root solving methods to find
estimators
CO4: To construct confidence intervals for parameters involved in the model.
UNIT I Sufficiency Teaching Hours: 12
Sufficiency: factorization theorem, minimal sufficiency, exponential family and
completeness. Ancillary statistics and Basu's theorem
UNIT II Unbiasedness Teaching Hours: 12
UMVUE: Fisher Information and Cramer-Rao inequality, Chapman-Robbins and
Bhattacharyya bounds, Rao-Blackwell theorem, Lehmann-Scheffé theorem. Unbiased
estimation.
UNIT III Consistent estimators Teaching Hours: 12
Consistency, Weak and strong consistency, Marginal and joint consistent estimators, CAN
estimators, equivariance, Pitman estimators
UNIT IV Methods of point estimation Teaching Hours: 12
Methods of moments, Minimum chi square and its modification, Least square estimation,
Maximum likelihood, Properties of maximum likelihood estimators, Cramer-Huzurbazar
Theorem, Likelihood equation - multiple roots, Iterative methods, EM Algorithm.
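Illustration (not part of the prescribed syllabus): iterative solution of a likelihood equation can be sketched with Newton-Raphson. The example below assumes an exponential model with rate parameter, chosen only because its MLE (the reciprocal of the sample mean) is known in closed form and so the iteration can be checked; the function name is hypothetical.

```python
def newton_mle_exponential_rate(data, start, tol=1e-10, max_iter=100):
    """Newton-Raphson on the score of an exponential(rate) sample.

    Log-likelihood: n * log(rate) - rate * sum(data).
    """
    n = len(data)
    total = sum(data)
    lam = start
    for _ in range(max_iter):
        score = n / lam - total        # first derivative of the log-likelihood
        hessian = -n / lam ** 2        # second derivative of the log-likelihood
        lam_new = lam - score / hessian
        if lam_new <= 0:
            raise ValueError("iterate left the parameter space; try another start")
        if abs(lam_new - lam) < tol:
            return lam_new
        lam = lam_new
    raise RuntimeError("Newton-Raphson did not converge")
```

The starting value matters: for this model the iteration stays positive only when the start lies below twice the MLE, which is a small instance of the sensitivity to starting values that arises with multiple roots.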
UNIT V Interval estimation Teaching Hours: 12
Large sample confidence interval, shortest length confidence interval. Methods of finding
confidence interval: Inversion of test statistic, pivotal quantities, pivoting the CDF, evaluation
of confidence interval: size and coverage probability, loss function and test function
optimality.
Textbooks
1. Kale, B. K. (2005). A first course on parametric inference. Alpha Science Int. Ltd.
2. Lehmann, E. L., & Casella, G. (2006). Theory of point estimation. Springer Science
& Business Media.
3. Robert, C., & Casella, G. (2013). Monte Carlo statistical methods. Springer Science &
Business Media.
Recommended reading
1. Srivastava, A. K. , Khan, A. H. and Srivastava, N. (2014). Statistical Inference:
Theory of Estimation. PHI Learning Pvt. Ltd, New Delhi.
2. Casella, G., & Berger, R. L. (2002). Statistical inference . Pacific Grove, CA:
Duxbury.
3. Silvey, S. D. (2017). Statistical inference. Routledge.
4. Trosset, M. W. (2009). An introduction to statistical inference and its applications
with R. Chapman and Hall/CRC.
MST232- Stochastic Processes
Total Teaching Hours/Semester: 60 No. of Lecture Hours/Week: 4
Max Marks: 100 Credits: 4
Course Objectives
To equip the students with theoretical and practical knowledge of stochastic models which are
used in economics, life sciences, engineering etc.
Course outcomes
CO1: To understand stochastic processes.
CO2: To identify ergodic Markov chains
CO3: To analyse queueing models using continuous time Markov chains.
CO4: To apply Brownian motion in finance problems.
UNIT I Introduction Teaching Hours: 12
Sequence of random variables, definition and classification of stochastic process,
autoregressive processes and stationary processes.
UNIT II Discrete time Markov chains Teaching Hours: 12
Markov Chains: Definition, Examples, Transition probability matrix, Chapman-Kolmogorov
equation, classification of states, limiting and stationary distributions, ergodicity, discrete
renewal equation and basic limit theorem, Absorption probabilities, Criteria for recurrence.
Generic application: hidden Markov models
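Illustration (not part of the prescribed syllabus): for an ergodic finite Markov chain, the limiting distribution can be approximated by repeatedly multiplying a starting distribution by the transition probability matrix. The sketch below is a minimal power iteration, assuming the chain is irreducible and aperiodic; the function name is hypothetical.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10000):
    """Approximate pi with pi = pi P by power iteration from the uniform distribution.

    P is a row-stochastic matrix given as a list of lists.
    """
    k = len(P)
    pi = [1.0 / k] * k
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("no convergence; the chain may not be ergodic")
```

For the two-state matrix `[[0.9, 0.1], [0.5, 0.5]]` the balance equation 0.1 * pi_0 = 0.5 * pi_1 gives pi = (5/6, 1/6), which the iteration reproduces.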
Unit III Continuous time Markov chains and Poisson process Teaching Hours: 12
Transition probability function, Kolmogorov differential equations, Poisson process:
homogeneous process, interarrival distribution, compound process, Birth and death process.
Service applications: Queuing models- Markovian models.
Unit IV Branching process Teaching Hours: 12
Galton-Watson branching processes, Generating function, Extinction probabilities,
Continuous time branching processes, Extinction probabilities, Branching processes with
general variable life time.
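Illustration (not part of the prescribed syllabus): for a Galton-Watson process started from one individual, the extinction probability is the smallest non-negative root of s = f(s), where f is the offspring probability generating function, and it is the limit of the iterates f(0), f(f(0)), and so on. A minimal sketch follows; the function name and the list representation of the offspring distribution are assumptions.

```python
def extinction_probability(offspring_pmf, tol=1e-12, max_iter=100000):
    """Smallest fixed point of the offspring pgf, found by iterating from 0.

    offspring_pmf[k] is the probability of k offspring.
    """
    def pgf(s):
        return sum(p * s ** k for k, p in enumerate(offspring_pmf))

    q = 0.0
    for _ in range(max_iter):
        q_new = pgf(q)
        if abs(q_new - q) < tol:
            return q_new
        q = q_new
    return q  # best available approximation if the tolerance was not met
```

With offspring probabilities (0.25, 0.25, 0.5) the mean is 1.25 and the fixed-point equation 2s^2 - 3s + 1 = 0 has roots 0.5 and 1, so the extinction probability is 0.5; with (0.5, 0.5) the mean is below 1 and extinction is certain.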
Unit V Renewal process and Brownian Motion Teaching Hours: 12
Renewal equation, Renewal theorem, Applications, Generalizations and variations of renewal
processes, Applications of renewal theory, Brownian motion, Introduction to Markov renewal
processes.
Textbooks
1. Karlin, S. and Taylor, H.M. (2012). A first course in stochastic processes. Academic
press.
2. Cinlar, E. (2013). Introduction to stochastic processes. Courier Corporation.
3. S. M. Ross (2014). Introduction to Probability Models. Elsevier.
Recommended Reading
1. Feller, W. (1965, 1968), An Introduction to Probability Theory and its Applications,
Volume I and II, Wiley Eastern.
2. Medhi, J. (2009). Stochastic Processes, 3rd Edition, New Age International.
3. Dobrow, R.P. (2016), Introduction to Stochastic Processes with R, Wiley Eastern.
MST 233- Categorical Data Analysis
Total Teaching Hours/Semester: 60 No. of Lecture Hours/Week: 4
Max Marks: 100 Credits: 4
Course Objectives
To equip the students with the theory and methods to analyse categorical responses.
Course Outcomes
CO1: To understand the categorical response.
CO2: To identify tests for contingency tables.
CO3: To apply regression models for count data.
CO4: To analyse contingency tables using loglinear models.
Unit I – Introduction Teaching Hours: 12
Categorical response data, Probability distributions for categorical data, Statistical inference
for discrete data
Unit II – Contingency tables Teaching Hours: 12
Probability structure for contingency tables, Comparing proportions with 2x2 tables, The
odds ratio, Tests for independence, Exact inference, Extension to three-way and larger tables
Unit III – Generalized linear models Teaching Hours: 12
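Illustration (not part of the prescribed syllabus): for a 2x2 table with cell counts a, b in the first row and c, d in the second, the sample odds ratio is ad/(bc), and a large-sample Wald interval is built on the log scale with standard error sqrt(1/a + 1/b + 1/c + 1/d). A minimal sketch, assuming all four counts are positive; the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Sample odds ratio of a 2x2 table and its Wald confidence interval.

    z = 1.96 gives an approximate 95% interval; all counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```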
Components of a generalized linear model, GLM for binary and count data, Statistical
inference and model checking, Fitting GLMs
Unit IV Logistic regression Teaching Hours: 12
Interpreting the logistic regression model, Inference for logistic regression, Logistic
regression with categorical predictors, Multiple logistic regression, Summarizing effects,
Building and applying logistic regression models, Multicategory logit models
Unit V Loglinear models for contingency tables Teaching Hours: 12
Loglinear models for two-way and three-way tables , Inference for Loglinear models, the
loglinear-logistic connection, Independence graphs and collapsibility, Models for matched
pairs: Comparing dependent proportions, Logistic regression for matched pairs, Comparing
margins of square contingency tables, symmetry issues
Textbooks
1. Agresti, A. (2013). Categorical Data Analysis, 3rd Edition. New York: Wiley
2. Agresti, A. (2010). Analysis of ordinal categorical data (Vol. 656). John Wiley &
Sons.
Recommended reading
1. Le, C.T. (1998). Applied Categorical Data Analysis. New York: John Wiley and
Sons.
2. Stokes, M. E., Davis, C. S., & Koch, G. G. (2012). Categorical data analysis using
SAS. SAS institute.
3. Agresti, A. (2018). An introduction to categorical data analysis. John Wiley & Sons.
4. Bilder, C. R., & Loughin, T. M. (2014). Analysis of categorical data with R. Chapman
and Hall/CRC.
MST 271 – Regression Analysis
Total Teaching Hours/Semester: 75 No. of Lecture Hours/Week: 4+2
Max Marks: 150 Credits: 5
Course Objectives
To impart knowledge of statistical model building using regression techniques.
Course Outcomes
CO1: To understand and formulate simple and multiple regression models
CO2: To identify the correct regression model for the given problem
CO3: To apply non linear regression in real life problems.
CO4: To analyse robustness of the regression model.
Unit I- Linear regression model Teaching Hours: 15
Linear Regression Model: Simple and multiple, Least squares estimation, Properties of the
estimators, Maximum likelihood estimation, Estimation with linear restrictions, Hypothesis
testing, Confidence intervals.
Unit II Model adequacy Teaching Hours: 15
Residual analysis, Departures from underlying assumptions, Effect of outliers, Collinearity,
Non-constant variance and serial correlation, Departures from normality, Diagnostics and
remedies.
Unit III Model Selection Teaching Hours: 15
Selection of input variables and model selection, Methods of obtaining the best fit: Stepwise
regression, Forward selection and backward elimination.
Unit IV Nonlinear regression Teaching Hours: 15
Introduction to general non-linear regression, Least squares in the non-linear case, Estimating the
parameters of a non-linear system, Reparametrisation of the model, Non-linear growth models.
Unit V Robust regression Teaching Hours: 15
Least absolute deviation regression, M-estimators, Robust regression with rank residuals,
Resampling procedures for regression models: methods and their properties (without proof),
Jackknife techniques and least squares approach based on M-estimators.
Textbooks
1. Chatterjee, S., & Hadi, A. S. (2015). Regression analysis by example. John Wiley &
Sons.
2. Draper, N. R., & Smith, H. (1998). Applied regression analysis (Vol. 326). John
Wiley & Sons.
3. Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to linear
regression analysis (Vol. 821). John Wiley & Sons.
Recommended Readings
1. Seber, G. A., & Lee, A. J. (2012). Linear regression analysis (Vol. 329). John Wiley
& Sons.
2. Keith, T. Z. (2014). Multiple regression and beyond: An introduction to multiple
regression and structural equation modeling. Routledge.
3. Fox, J. (2015). Applied regression analysis and generalized linear models. Sage
Publications.
4. Fox, J., & Weisberg, S. (2018). An R companion to applied regression. Sage
publications.
MST272- Statistical computing using Python
Total Teaching Hours/Semester: 45 No. of Lecture Hours/Week: 2+2
Max Marks: 100 Credits: 3
Course Objectives
To equip the students with programming skills in Python and to apply them in data analysis.
Course outcomes
CO1: To understand python and basic syntax
CO2: To understand functions and data modeling
CO3: To analyze statistical datasets and visualize it.
Unit I- Introduction Teaching Hours: 15
Installing Python; basic syntax, interactive shell, editing, saving, and running a script, The
concept of data types; variables, assignments; immutable variables; numerical types;
arithmetic operators and expressions; comments in the program; understanding error
messages; Conditions, boolean logic, logical operators; ranges; Control statements: if-else,
loops
Unit II Design with functions Teaching Hours: 15
hiding redundancy, complexity; arguments and return values; formal vs actual arguments,
named arguments. Program structure and design. Recursive functions. Classes and OOP:
classes, objects, attributes and methods; defining classes; design with classes, data modeling
Unit III Statistical tools Teaching Hours: 15
Pandas, Statsmodels, Seaborn, displaying statistical data, distributions and hypothesis testing,
linear regression models.
Textbooks
1. Lambert, K. A. (2018). Fundamentals of Python: first programs. Cengage Learning.
2. Haslwanter, T. (2016). An Introduction to Statistics with Python. Springer
International Publishing.
Recommended Readings
1. Unpingco, J. (2016). Python for probability, statistics, and machine learning (Vol. 1).
Springer International Publishing.
2. Anthony, F. (2015). Mastering pandas. Packt Publishing Ltd.
MST241A- Principles of Data Science and Data Base Techniques
Total Teaching Hours/Semester: 60 No. of Lecture Hours/Week: 3+2
Max Marks: 100 Credits: 4
Course Objective
To provide strong foundation for data science and application area related to it and
understand the underlying core concepts and emerging technologies in data science.
Course Learning Outcomes
CO1: Explore the fundamental concepts of data science
CO2: Understand data analysis techniques for applications handling large data
CO3: Demonstrate various databases and Compose effective queries
Unit-1 Introduction to Data Science Teaching Hours: 15
Definition – Big Data and Data Science Hype – Why data science – Getting Past the Hype –
The Current Landscape – Who is Data Scientist? - Data Science Process Overview –
Defining goals – Retrieving data – Data preparation – Data exploration – Data modeling –
Presentation.
Unit-2 Big Data Teaching Hours: 15
Problems when handling large data – General techniques for handling large data – Case study
– Steps in big data – Distributing data storage and processing with Frameworks – Case study.
Unit-3 Introduction to DBMS Teaching Hours: 15
Concept & Overview of DBMS, Data Models, Database Languages, Database Administrator,
Database Users, Three Schema architecture of DBMS. Basic concepts, Design Issues,
Mapping Constraints, Keys, Entity-Relationship Diagram, Weak Entity Sets, Extended E-R
features
Unit-4 Relational Model and Database Design Teaching Hours: 15
SQL and Integrity Constraints, Concept of DDL, DML, DCL. Basic Structure, Set operations,
Aggregate Functions, Null Values, Domain Constraints, Referential Integrity Constraints,
assertions, views, Nested Subqueries, Functional Dependency, Different anomalies in
designing a Database, Normalization : using functional dependencies, Boyce-Codd Normal
Form, 4NF, 5NF
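Illustration (not part of the prescribed syllabus): the ideas of DDL, DML, aggregate functions, null values and referential integrity in this unit can be tried out with Python's standard `sqlite3` module and an in-memory database. The table and column names below are hypothetical, and SQLite is used only as a convenient example engine.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a referential integrity constraint."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on
    conn.execute("PRAGMA foreign_keys = ON")
    # DDL: define the schema
    conn.execute(
        "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE student ("
        " student_id INTEGER PRIMARY KEY,"
        " dept_id INTEGER NOT NULL REFERENCES department(dept_id),"
        " marks REAL)"
    )
    # DML: insert rows; one student has a NULL mark
    conn.executemany(
        "INSERT INTO department VALUES (?, ?)",
        [(1, "Statistics"), (2, "Mathematics")],
    )
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, 1, 80.0), (2, 1, 70.0), (3, 2, 90.0), (4, 2, None)],
    )
    return conn


def average_marks_by_department(conn):
    """Aggregate query: AVG skips NULL marks."""
    rows = conn.execute(
        "SELECT d.name, AVG(s.marks) FROM department d"
        " JOIN student s ON s.dept_id = d.dept_id"
        " GROUP BY d.name ORDER BY d.name"
    )
    return rows.fetchall()
```

Inserting a student whose `dept_id` has no matching department row is rejected with `sqlite3.IntegrityError`, which is the referential integrity constraint at work.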
Essential Readings
1. Introducing Data Science, Davy Cielen, Arno D. B. Meysman, Mohamed Ali,
Manning Publications Co., 1st edition, 2016.
2. Thomas Connolly and Carolyn Begg, “Database Systems, A Practical Approach to
Design, Implementation and Management”, 3rd Edition, Pearson Education, 2007.
Recommended Readings
1. An Introduction to Statistical Learning: with Applications in R, Gareth James,
Daniela Witten, Trevor Hastie, Robert Tibshirani, Springer, 1st edition, 2013
2. Ethics and Data Science, D J Patil, Hilary Mason, Mike Loukides, O’Reilly, 1st
edition, 2018
3. Lior Rokach and Oded Maimon, Data Mining and Knowledge Discovery Handbook,
Springer, 2nd edition, 2010.
MST241B- Survival Analysis
Total Teaching Hours/Semester: 60 No. of Lecture Hours/Week: 3+2
Max Marks: 100 Credits: 4
Course Objective
This course will provide an introduction to the principles and methods for the analysis of
time-to-event data. This type of data occurs extensively in both observational and
experimental biomedical and public health studies.
Unit I: Parametric Survival Models Teaching Hours: 15
The hazard and survival functions in continuous time. Parametric forms and the distribution
of log time. The exponential, Weibull, Gompertz, Gamma, Generalized Gamma, Coale-
McNeil, and generalized F distributions. The U.S. life table.
Approaches to modelling the effects of covariates. Parametric families. Proportional hazards
models (PH). Accelerated failure time models (AFT). The intersection of PH and AFT.
Proportional odds models (PO). The intersection of PO and AFT. Recidivism in the U.S.
Unit II: Non-Parametric Survival Models Teaching Hours: 15
One-sample estimation with censored data. The Kaplan-Meier estimator. Greenwood's
formula. The Nelson-Aalen estimator. Expectation of life. Comparison of several groups:
Mantel-Haenszel and the log-rank test.
Regression: Cox's model and partial likelihood. The score and information. The problem of
ties. Tests of hypotheses. Time-varying covariates. Estimating the baseline survival.
Martingale residuals.
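Illustration (not part of the prescribed syllabus): the Kaplan-Meier estimator multiplies, over the distinct event times, the factors (1 - d/n), where d is the number of events at that time and n the number still at risk just before it. The sketch below is a minimal version, assuming right-censored data and that a censoring tied with an event time is still counted as at risk at that time; the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 for an observed event and 0 for a right-censored time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all observations tied at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # events and censorings both leave the risk set
    return curve
```

For times (1, 2, 2, 3, 4) with event indicators (1, 1, 0, 1, 0) the estimate steps through 4/5, then 4/5 * 3/4, then that product times 1/2.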
Unit III: Models for Discrete Data and Extensions Teaching Hours: 15
Cox's discrete logistic model and logistic regression. Modeling grouped continuous data and
the complementary log-log transformation. Piece-wise constant hazards and Poisson
regression.
Current status data versus retrospective data. Open intervals and time since last event.
Backward recurrence times. Interval censoring.
Unit IV: Competing Risks Teaching Hours: 15
Modeling multiple causes of failure. Research questions of interest. Cause-specific hazards.
Overall survival. Cause-specific densities. Estimation: one-sample and the generalized
Kaplan-Meier and Nelson-Aalen estimators. The Incidence function.
Regression models. Weibull regression. Cox regression and the partial likelihood. Piece-wise
exponential survival and multinomial logits. The identification problem. Multivariate and
marginal survival. The Fine-Gray model.
Textbooks
1. Klein, J. P., & Moeschberger, M. L. (2006). Survival analysis: techniques for
censored and truncated data. Springer Science & Business Media.
2. Cleves, M.; W. G. Gould, and J. Marchenko (2016). An Introduction to Survival
Analysis Using Stata. Revised Third Edition. College Station, Texas: Stata Press.
3. Kalbfleisch, J. D., & Prentice, R. L. (2011). The statistical analysis of failure time
data (Vol. 360). John Wiley & Sons.
Recommended Readings
1. Cox, D. and D. Oakes (1984). Analysis of Survival Data. London: Chapman-Hall.
2. Singer, J.D and J. B. Willett (2003) Applied Longitudinal Data Analysis: Modeling
Change and Event Occurrence. Oxford, England: Oxford University Press.
3. Therneau, T. M. and P. M. Grambsch (2000). Modeling Survival Data: Extending the
Cox Model. New York: Springer.
4. Collett, D. (2015). Modelling survival data in medical research. Chapman and
Hall/CRC.
MST241C- Statistical Quality Control
Total Teaching Hours/Semester: 60 No. of Lecture Hours/Week: 3+2
Max Marks: 100 Credits: 4
Course Objective
This course provides an introduction to the application of statistical tools in an industrial
environment to study, analyze and control the quality of products.
Course Outcomes:
CO1: Demonstrate the concepts of control charts to improve the quality standards of the
process.
CO2: Apply the idea of Sampling Plans to control the quality of industrial outputs.
Unit – I: Statistical Process Control Teaching Hours: 15
Meaning and scope of statistical quality control - Causes of quality variation - Control charts
for variables and attributes - Rational subgroups - Construction and operation of X̄, σ, R, np, p,
c and u charts - Operating characteristic curves of control charts. Process capability analysis
using histogram, probability plotting and control chart - Process capability ratios and their
interpretations.
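Illustration (not part of the prescribed syllabus): trial control limits for the mean and range charts are usually computed from the grand mean and the average range using tabulated constants. The sketch below is a minimal version; the default constants are the commonly tabulated values for subgroups of size 5 and must be replaced for other subgroup sizes. The function name and return layout are hypothetical.

```python
def xbar_r_limits(subgroups, A2=0.577, D3=0.0, D4=2.114):
    """Trial limits (LCL, centre line, UCL) for the mean chart and the R chart.

    Default constants A2, D3, D4 correspond to subgroups of size 5.
    """
    means = [sum(g) / len(g) for g in subgroups]
    ranges = [max(g) - min(g) for g in subgroups]
    grand_mean = sum(means) / len(means)
    rbar = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2 * rbar, grand_mean, grand_mean + A2 * rbar),
        "R": (D3 * rbar, rbar, D4 * rbar),
    }
```

In practice many more than two subgroups would be used to set trial limits; the function itself does not depend on that number.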
Unit – II: Advanced Control Charts Teaching Hours: 15
Specification limits and tolerance limits - Modified control charts - Basic principles and
design of cumulative-sum control charts – Concept of V-mask procedure – Tabular CUSUM
charts. Construction of Moving range, moving-average and geometric moving-average
control charts.
Unit – III: Statistical Product Control Teaching Hours: 15
Acceptance sampling: Sampling inspection by attributes – single, double and multiple
sampling plans – Rectifying Inspection. Measures of performance: OC, ASN, ATI and AOQ
functions. Concepts of AQL, LTPD and IQL. Dodge – Romig and MIL-STD-105D tables.
Sampling inspection by variables - known and unknown sigma variables sampling plan -
Merits and limitations of variables sampling plan - Derivation of OC curve – determination of
plan parameters.
Unit – IV Continuous Sampling Plans Teaching Hours: 15
Continuous sampling plans by attributes - CSP-1 and its modifications - concept of AOQL in
CSPs - Multi-level continuous sampling plans - Operation of multi-level CSP of Lieberman
and Solomon - Wald-Wolfowitz continuous sampling plans. Sequential Sampling Plans by
attributes – Decision Lines - OC and ASN functions.
Essential Readings:
1. Montgomery, D. C. (2009). Introduction to Statistical Quality Control, Sixth Edition,
Wiley India, New Delhi.
2. Duncan, A. J. (2003.). Quality Control and Industrial Statistics, Irwin-Illinois, US.
Recommended Readings:
1. Juran, J.M., and De Feo, J.A. (2010). Juran’s Quality control Handbook – The
Complete Guide to Performance Excellence, Sixth Edition, Tata McGraw-Hill, New
Delhi.
2. Schilling, E. G., and Neubauer, D.V. (2009). Acceptance Sampling in Quality Control
Second Edition, CRC Press, New York.
3. Ross, S. M. (2009). Introduction to Probability Models, Tenth Edition, Academic
Press, MA, US.