Statistical Concepts - A Second Course
by Debbie L. Hahs-Vaughn, Richard G. Lomax
Statistical Concepts - A Second Course presents the last 10 chapters from An Introduction to Statistical Concepts, Fourth Edition. Designed for second and upper-level statistics courses, this book highlights how statistics work and how best to utilize them to aid students in the analysis of their own data and the interpretation of research results. In this new edition, Hahs-Vaughn and Lomax discuss sensitivity, specificity, false positive and false negative errors. Coverage of effect sizes has been expanded upon and more organizational features (to summarize key concepts) have been included. A final chapter on mediation and moderation has been added for a more complete presentation of regression models. This book acts as a clear and accessible instructional tool to help readers fully understand statistical concepts and how to apply them to data. It is an invaluable resource for students undertaking a course in statistics in any number of social science and behavioral science disciplines.
Statistical Confidentiality
by Mark Elliot, George T. Duncan, Gonzalez Juan Jose Salazar
Because statistical confidentiality embraces the responsibility for both protecting data and ensuring its beneficial use for statistical purposes, those working with personal and proprietary data can benefit from the principles and practices this book presents. Researchers can understand why an agency holding statistical data does not respond well to the demand, "Just give me the data; I'm only going to do good things with it." Statisticians can incorporate the requirements of statistical confidentiality into their methodologies for data collection and analysis. Data stewards, caught between those eager for data and those who worry about confidentiality, can use the tools of statistical confidentiality toward satisfying both groups. The eight chapters lay out the dilemma of data stewardship organizations (such as statistical agencies) in resolving the tension between protecting data from snoopers and providing data to legitimate users, explain disclosure risk and explore the types of attack that a data snooper might mount, present the methods of disclosure risk assessment, give techniques for statistical disclosure limitation of both tabular data and microdata, identify measures of the impact of disclosure limitation on data utility, provide restricted access methods as administrative procedures for disclosure control, and finally explore the future of statistical confidentiality.
Statistical Data Analysis and Entropy (Behaviormetrics: Quantitative Approaches to Human Behavior #3)
by Nobuoki Eshima
This book reconsiders statistical methods from the point of view of entropy, and introduces entropy-based approaches for data analysis. Further, it interprets basic statistical methods, such as the chi-square statistic, t-statistic, F-statistic and the maximum likelihood estimation in the context of entropy. In terms of categorical data analysis, the book discusses the entropy correlation coefficient (ECC) and the entropy coefficient of determination (ECD) for measuring association and/or predictive powers in association models, and generalized linear models (GLMs). Through association and GLM frameworks, it also describes ECC and ECD in correlation and regression analyses for continuous random variables. In multivariate statistical analysis, canonical correlation analysis, the T²-statistic, and discriminant analysis are discussed in terms of entropy. Moreover, the book explores the efficiency of test procedures in statistical tests of hypotheses using entropy. Lastly, it presents an entropy-based path analysis for structural GLMs, which is applied in factor analysis and latent structure models. Entropy is an important concept for dealing with the uncertainty of systems of random variables and can be applied in statistical methodologies. This book motivates readers, especially young researchers, to address the challenge of new approaches to statistical data analysis and behavior-metric studies.
Statistical Data Analysis for the Physical Sciences
by Adrian Bevan
Data analysis lies at the heart of every experimental science. Providing a modern introduction to statistics, this book is ideal for undergraduates in physics. It introduces the necessary tools required to analyse data from experiments across a range of areas, making it a valuable resource for students. In addition to covering the basic topics, the book also takes in advanced and modern subjects, such as neural networks, decision trees, fitting techniques, and issues concerning limit or interval setting. Worked examples and case studies illustrate the techniques presented, and end-of-chapter exercises help test the reader's understanding of the material.
Statistical Data Analysis Using SAS: Intermediate Statistical Methods (Springer Texts in Statistics)
by Mervyn G. Marasinghe, Kenneth J. Koehler
The aim of this textbook (previously titled SAS for Data Analytics) is to teach the use of SAS for statistical analysis of data for advanced undergraduate and graduate students in statistics, data science, and disciplines involving analyzing data.
The book begins with an introduction beyond the basics of SAS, illustrated with non-trivial, real-world, worked examples. It proceeds to SAS programming and applications, SAS graphics, statistical analysis of regression models, analysis of variance models, analysis of variance with random and mixed effects models, and then takes the discussion beyond regression and analysis of variance to conclude.
Pedagogically, the authors introduce theory and methodological basis topic by topic, present a problem as an application, followed by a SAS analysis of the data provided and a discussion of results. The text focuses on applied statistical problems and methods. Key features include: end of chapter exercises, downloadable SAS code and data sets, and advanced material suitable for a second course in applied statistics with every method explained using SAS analysis to illustrate a real-world problem.
New to this edition:
• Covers SAS v9.2 and incorporates new commands
• Uses SAS ODS (output delivery system) for reproduction of tables and graphics output
• Presents new commands needed to produce ODS output
• All chapters rewritten for clarity
• New and updated examples throughout
• All SAS outputs are new and updated, including graphics
• More exercises and problems
• Completely new chapter on analysis of nonlinear and generalized linear models
• Completely new appendix
Mervyn G. Marasinghe, PhD, is Associate Professor Emeritus of Statistics at Iowa State University, where he has taught courses in statistical methods and statistical computing.
Kenneth J. Koehler, PhD, is University Professor of Statistics at Iowa State University, where he teaches courses in statistical methodology at both graduate and undergraduate levels and primarily uses SAS to supplement his teaching.
Statistical Data Analytics: Foundations for Data Mining, Informatics, and Knowledge Discovery
by Walter W. Piegorsch
A comprehensive introduction to statistical methods for data mining and knowledge discovery. Applications of data mining and ‘big data’ increasingly take center stage in our modern, knowledge-driven society, supported by advances in computing power, automated data acquisition, social media development and interactive, linkable internet software. This book presents a coherent, technical introduction to modern statistical learning and analytics, starting from the core foundations of statistics and probability. It includes an overview of probability and statistical distributions, basics of data manipulation and visualization, and the central components of standard statistical inferences. The majority of the text extends beyond these introductory topics, however, to supervised learning in linear regression, generalized linear models, and classification analytics. Finally, unsupervised learning via dimension reduction, cluster analysis, and market basket analysis are introduced. Extensive examples using actual data (with sample R programming code) are provided, illustrating diverse informatic sources in genomics, biomedicine, ecological remote sensing, astronomy, socioeconomics, marketing, advertising and finance, among many others.
Statistical Data Analytics:
• Focuses on methods critically used in data mining and statistical informatics.
• Coherently describes the methods at an introductory level, with extensions to selected intermediate and advanced techniques.
• Provides informative, technical details for the highlighted methods.
• Employs the open-source R language as the computational vehicle – along with its burgeoning collection of online packages – to illustrate many of the analyses contained in the book.
• Concludes each chapter with a range of interesting and challenging homework exercises using actual data from a variety of informatic application areas.
This book will appeal as a classroom or training text to intermediate and advanced undergraduates, and to beginning graduate students, with sufficient background in calculus and matrix algebra. It will also serve as a source-book on the foundations of statistical informatics and data analytics to practitioners who regularly apply statistical learning to their modern data.
Statistical Data Analytics: Foundations for Data Mining, Informatics, and Knowledge Discovery, Solutions Manual
by Walter W. Piegorsch
Solutions Manual to accompany Statistical Data Analytics: Foundations for Data Mining, Informatics, and Knowledge Discovery.
A comprehensive introduction to statistical methods for data mining and knowledge discovery. Extensive solutions using actual data (with sample R programming code) are provided, illustrating diverse informatic sources in genomics, biomedicine, ecological remote sensing, astronomy, socioeconomics, marketing, advertising and finance, among many others.
Statistical Data Cleaning with Applications in R
by Mark van der Loo, Edwin De Jonge
A comprehensive guide to automated statistical data cleaning.
The production of clean data is a complex and time-consuming process that requires both technical know-how and statistical expertise. Statistical Data Cleaning with Applications in R brings together a wide range of techniques for cleaning textual, numeric or categorical data. This book examines technical data cleaning methods relating to data representation and data structure. A prominent role is given to statistical data validation, data cleaning based on predefined restrictions, and data cleaning strategy.
Statistical Data Mining Using SAS Applications (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series)
by George Fernandez
Statistical Data Mining Using SAS Applications, Second Edition describes statistical data mining concepts and demonstrates the features of user-friendly data mining SAS tools. Integrating the statistical and graphical analysis tools available in SAS systems, the book provides complete statistical data mining solutions without writing SAS program co
Statistical Decision Problems
by Michael Zabarankin, Stan Uryasev
Statistical Decision Problems presents a quick and concise introduction to the theory of risk, deviation and error measures that play a key role in statistical decision problems. It introduces state-of-the-art practical decision making through twenty-one case studies from real-life applications. The case studies cover a broad area of topics and the authors include links with source code and data, a very helpful tool for the reader. At its core, the text demonstrates how to use different factors to formulate statistical decision problems arising in various risk management applications, such as optimal hedging, portfolio optimization, cash flow matching, classification, and more. The presentation is organized into three parts: selected concepts of statistical decision theory, statistical decision problems, and case studies with Portfolio Safeguard. The text is primarily aimed at practitioners in the areas of risk management, decision making, and statistics. However, the inclusion of a fair bit of mathematical rigor renders this monograph an excellent introduction to the theory of general error, deviation, and risk measures for graduate students. It can be used as supplementary reading for graduate courses including statistical analysis, data mining, stochastic programming, and financial engineering, to name a few. The high level of detail may prove useful to applied mathematicians, engineers, and statisticians interested in modeling and managing risk in various applications.
Statistical Design and Analysis of Biological Experiments (Statistics for Biology and Health)
by Hans-Michael Kaltenbach
This richly illustrated book provides an overview of the design and analysis of experiments with a focus on non-clinical experiments in the life sciences, including animal research. It covers the most common aspects of experimental design such as handling multiple treatment factors and improving precision. In addition, it addresses experiments with large numbers of treatment factors and response surface methods for optimizing experimental conditions or biotechnological yields.
The book emphasizes the estimation of effect sizes and the principled use of statistical arguments in the broader scientific context. It gradually transitions from classical analysis of variance to modern linear mixed models, and provides detailed information on power analysis and sample size determination, including ‘portable power’ formulas for making quick approximate calculations. In turn, detailed discussions of several real-life examples illustrate the complexities and aberrations that can arise in practice.
Chiefly intended for students, teachers and researchers in the fields of experimental biology and biomedicine, the book is largely self-contained and starts with the necessary background on basic statistical concepts. The underlying ideas and necessary mathematics are gradually introduced in increasingly complex variants of a single example. Hasse diagrams serve as a powerful method for visualizing and comparing experimental designs and deriving appropriate models for their analysis. Manual calculations are provided for early examples, allowing the reader to follow the analyses in detail. More complex calculations rely on the statistical software R, but are easily transferable to other software. Though there are few prerequisites for effectively using the book, previous exposure to basic statistical ideas and the software R would be advisable.
Statistical Design, Monitoring, and Analysis of Clinical Trials: Principles and Methods (Chapman & Hall/CRC Biostatistics Series)
by Weichung Joe Shih, Joseph Aisner
Statistical Design, Monitoring, and Analysis of Clinical Trials, Second Edition concentrates on the biostatistics component of clinical trials. This new edition is updated throughout and includes five new chapters.
Developed from the authors’ courses taught to public health and medical students, residents, and fellows during the past 20 years, the text shows how biostatistics in clinical trials is an integration of many fundamental scientific principles and statistical methods. The book begins with ethical and safety principles, core trial design concepts, the principles and methods of sample size and power calculation, and analysis of covariance and stratified analysis. It then focuses on sequential designs and methods for two-stage Phase II cancer trials to Phase III group sequential trials, covering monitoring safety, futility, and efficacy. The authors also discuss the development of sample size reestimation and adaptive group sequential procedures, phase 2/3 seamless design and trials with predictive biomarkers, exploit multiple testing procedures, explain the concept of estimand, intercurrent events, and different missing data processes, and describe how to analyze incomplete data by proper multiple imputations. This text reflects the academic research, commercial development, and public health aspects of clinical trials. It gives students and practitioners a multidisciplinary understanding of the concepts and techniques involved in designing, monitoring, and analyzing various types of trials. The book’s balanced set of homework assignments and in-class exercises is appropriate for students and researchers in (bio)statistics, epidemiology, medicine, pharmacy, and public health.
Statistical Design of Experiments with Engineering Applications
by Kamel Rekab, Muzaffar Shaikh
In today's high-technology world, with flourishing e-business and intense competition at a global level, the search for the competitive advantage has become a crucial task of corporate executives. Quality, formerly considered a secondary expense, is now universally recognized as a necessary tool. Although many statistical methods are available for
Statistical Diagnostics for Cancer: Analyzing High-Dimensional Data
by Frank Emmert-Streib, Matthias Dehmer
This ready reference discusses different methods for statistically analyzing and validating data created with high-throughput methods. As opposed to other titles, this book focusses on systems approaches, meaning that no single gene or protein forms the basis of the analysis but rather a more or less complex biological network. From a methodological point of view, the well-balanced contributions describe a variety of modern supervised and unsupervised statistical methods applied to various large-scale datasets from genomics and genetics experiments. Furthermore, since the availability of sufficient computer power in recent years has shifted attention from parametric to nonparametric methods, the methods presented here make use of such computer-intensive approaches as Bootstrap, Markov Chain Monte Carlo or general resampling methods. Finally, due to the large amount of information available in public databases, a chapter on Bayesian methods is included, which also provides a systematic means to integrate this information. A welcome guide for mathematicians and the medical and basic research communities.
Statistical Disclosure Control
by Eric Schulte Nordholt, Sarah Giessing, Luisa Franconi, Keith Spicer, Josep Domingo-Ferrer, Anco Hundepool, Peter-Paul de Wolf
A reference to answer all your statistical confidentiality questions.
This handbook provides technical guidance on statistical disclosure control and on how to approach the problem of balancing the need to provide users with statistical outputs and the need to protect the confidentiality of respondents. Statistical disclosure control is combined with other tools, such as administrative, legal and IT measures, in order to define a proper data dissemination strategy based on a risk management approach.
The key concepts of statistical disclosure control are presented, along with the methodology and software that can be used to apply various methods of statistical disclosure control. Numerous examples and guidelines are also featured to illustrate the topics covered.
Statistical Disclosure Control:
• Presents a combination of both theoretical and practical solutions.
• Introduces all the key concepts and definitions involved with statistical disclosure control.
• Provides a high-level overview of how to approach problems associated with confidentiality.
• Provides a broad-ranging review of the methods available to control disclosure.
• Explains the subtleties of group disclosure control.
• Features examples throughout the book along with case studies demonstrating how particular methods are used.
• Discusses microdata, magnitude and frequency tabular data, and remote access issues.
• Written by experts within leading National Statistical Institutes.
Official statisticians, academics and market researchers who need to be informed and make decisions on disclosure limitation will benefit from this book.
Statistical Distributions
by Merran Evans, Catherine Forbes, Nicholas Hastings, Brian Peacock
A new edition of the trusted guide on commonly used statistical distributions.
Fully updated to reflect the latest developments on the topic, Statistical Distributions, Fourth Edition continues to serve as an authoritative guide on the application of statistical methods to research across various disciplines. The book provides a concise presentation of popular statistical distributions along with the necessary knowledge for their successful use in data modeling and analysis.
Following a basic introduction, forty popular distributions are outlined in individual chapters that are complete with related facts and formulas. Reflecting the latest changes and trends in statistical distribution theory, the Fourth Edition features:
• A new chapter on queuing formulas that discusses standard formulas that often arise from simple queuing systems
• Methods for extending independent modeling schemes to the dependent case, covering techniques for generating complex distributions from simple distributions
• New coverage of conditional probability, including conditional expectations and joint and marginal distributions
• Commonly used tables associated with the normal (Gaussian), Student's t, F and chi-square distributions
• Additional reviewing methods for the estimation of unknown parameters, such as the method of percentiles, the method of moments, maximum likelihood inference, and Bayesian inference
Statistical Distributions, Fourth Edition is an excellent supplement for upper-undergraduate and graduate level courses on the topic. It is also a valuable reference for researchers and practitioners in the fields of engineering, economics, operations research, and the social sciences who conduct statistical analyses.
Statistical Epidemiology
by Graham Law, Shane Pascoe
Statistics is a vital skill for epidemiologists and forms an essential part of clinical medicine and public health. This textbook introduces students to statistical epidemiology methods in a carefully structured and accessible format. With clearly defined learning outcomes, the suggested chapter orders can be tailored to the needs of students at both undergraduate and graduate level from a range of academic backgrounds. The book covers study design, measuring disease, bias, error, analysis and modelling and is illustrated with figures, focus boxes, study questions and examples applicable to everyday clinical problems. Drawing on the authors' extensive teaching experience, the text provides an introduction to core statistical epidemiology that will be a valuable resource for students and lecturers in health and medical sciences and applied statistics, health staff, clinical researchers and data managers.
Statistical Evidence: A Likelihood Paradigm (Chapman and Hall/CRC Monographs on Statistics and Applied Probability Ser. #71)
by Richard Royall
Interpreting statistical data as evidence, Statistical Evidence: A Likelihood Paradigm focuses on the law of likelihood, fundamental to solving many of the problems associated with interpreting data in this way. Statistics has long neglected this principle, resulting in a seriously defective methodology. This book redresses the balance, explaining why science has clung to a defective methodology despite its well-known defects. After examining the strengths and weaknesses of the work of Neyman and Pearson and the Fisher paradigm, the author proposes an alternative paradigm which provides, in the law of likelihood, the explicit concept of evidence missing from the other paradigms. At the same time, this new paradigm retains the elements of objective measurement and control of the frequency of misleading results, features which made the old paradigms so important to science. The likelihood paradigm leads to statistical methods that have a compelling rationale and an elegant simplicity, no longer forcing the reader to choose between frequentist and Bayesian statistics.
Statistical Foundations of Data Science (Chapman & Hall/CRC Data Science Series)
by Jianqing Fan, Runze Li, Cun-Hui Zhang, Hui Zou
Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies and empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account of sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, and hazards regression, among others. High-dimensional inference is also thoroughly addressed and so is feature screening. The book also provides a comprehensive account of high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also thoroughly introduces statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.
Statistical Fundamentals: Using Microsoft Excel for Univariate and Bivariate Analysis
by Alfred P. Rovai
The purpose of this book is to provide a working background of descriptive and inferential statistics and step-by-step examples of how to perform various statistical procedures using Microsoft Excel's native operators and functions. Automated procedures are also described using Excel's Analysis ToolPak and AnalystSoft StatPlus.
Statistical Genomics: Linkage, Mapping, and QTL Analysis
by Ben Hui Liu
Genomics, the mapping of the entire genetic complement of an organism, is the new frontier in biology. This handbook on the statistical issues of genomics covers current methods and the tried-and-true classical approaches.
Statistical Hypothesis Testing in Context: Reproducibility, Inference, and Science
by Michael P. Fay, Erica H. Brittain
Fay and Brittain present statistical hypothesis testing and compatible confidence intervals, focusing on application and proper interpretation. The emphasis is on equipping applied statisticians with enough tools - and advice on choosing among them - to find reasonable methods for almost any problem and enough theory to tackle new problems by modifying existing methods. After covering the basic mathematical theory and scientific principles, tests and confidence intervals are developed for specific types of data. Essential methods for applications are covered, such as general procedures for creating tests (e.g., likelihood ratio, bootstrap, permutation, testing from models), adjustments for multiple testing, clustering, stratification, causality, censoring, missing data, group sequential tests, and non-inferiority tests. New methods developed by the authors are included throughout, such as melded confidence intervals for comparing two samples and confidence intervals associated with Wilcoxon-Mann-Whitney tests and Kaplan-Meier estimates. Examples, exercises, and the R package asht support practical use.
Statistical Hypothesis Testing with Microsoft® Office Excel® (Synthesis Lectures on Mathematics & Statistics)
by Robert Hirsch
This book provides a comprehensive treatment of the logic behind hypothesis testing. Readers will learn to understand statistical hypothesis testing and how to interpret P-values under a variety of conditions including a single hypothesis test, a collection of hypothesis tests, and tests performed on accumulating data. The author explains how a hypothesis test can be interpreted to draw conclusions, and descriptions of the logic behind frequentist (classical) and Bayesian approaches to interpret the results of a statistical hypothesis test are provided. Both approaches have their own strengths and challenges, and a special challenge presents itself when hypothesis tests are repeatedly performed on accumulating data. Possible pitfalls and methods to interpret hypothesis tests when accumulating data are also analyzed. This book will be of interest to researchers, graduate students, and anyone who has to interpret the results of statistical analyses.
Statistical Hypothesis Testing with SAS and R
by Sonja Kuhnt, Dirk Taeger
A comprehensive guide to statistical hypothesis testing with examples in SAS and R.
When analyzing datasets the following questions often arise:
* Is there a short hand procedure for a statistical test available in SAS or R?
* If so, how do I use it?
* If not, how do I program the test myself?
This book answers these questions and provides an overview of the most common statistical test problems in a comprehensive way, making it easy to find and perform an appropriate statistical test.
A general summary of statistical test theory is presented, along with a basic description for each test, including the necessary prerequisites, assumptions, the formal test problem and the test statistic. Examples in both SAS and R are provided, along with program code to perform the test, resulting output and remarks explaining the necessary program parameters.
Key features:
* Provides examples in both SAS and R for each test presented.
* Looks at the most common statistical tests, displayed in a clear and easy to follow way.
* Supported by a supplementary website http://www.d-taeger.de featuring example program code.
Academics, practitioners and SAS and R programmers will find this book a valuable resource. Students using SAS and R will also find it an excellent choice for reference and data analysis.
Statistical Implications of Turing's Formula
by Zhiyi Zhang
Features a broad introduction to recent research on Turing's formula and presents modern applications in statistics, probability, information theory, and other areas of modern data science.
Turing's formula is, perhaps, the only known method for estimating the underlying distributional characteristics beyond the range of observed data without making any parametric or semiparametric assumptions. This book presents a clear introduction to Turing's formula and its connections to statistics. Topics with relevance to a variety of different fields of study are included such as information theory; statistics; probability; computer science inclusive of artificial intelligence and machine learning; big data; biology; ecology; and genetics. The author provides examinations of many core statistical issues within modern data science from Turing's perspective. A systematic approach to long-standing problems such as entropy and mutual information estimation, diversity index estimation, domains of attraction on general alphabets, and tail probability estimation is presented in light of the most up-to-date understanding of Turing's formula. Featuring numerous exercises and examples throughout, the author provides a summary of the known properties of Turing's formula and explains how and when it works well; discusses the approach derived from Turing's formula in order to estimate a variety of quantities, all of which mainly come from information theory, but are also important for machine learning and for ecological applications; and uses Turing's formula to estimate certain heavy-tailed distributions.
In summary, this book:
* Features a unified and broad presentation of Turing's formula, including its connections to statistics, probability, information theory, and other areas of modern data science
* Provides a presentation on the statistical estimation of information theoretic quantities
* Demonstrates the estimation problems of several statistical functions from Turing's perspective such as Simpson's indices, Shannon's entropy, general diversity indices, mutual information, and Kullback-Leibler divergence
* Includes numerous exercises and examples throughout with a fundamental perspective on the key results of Turing's formula
Statistical Implications of Turing's Formula is an ideal reference for researchers and practitioners who need a review of the many critical statistical issues of modern data science. This book is also an appropriate learning resource for biologists, ecologists, and geneticists who are involved with the concept of diversity and its estimation and can be used as a textbook for graduate courses in mathematics, probability, statistics, computer science, artificial intelligence, machine learning, big data, and information theory.
Zhiyi Zhang, PhD, is Professor of Mathematics and Statistics at The University of North Carolina at Charlotte. He is an active consultant in both industry and government on a wide range of statistical issues, and his current research interests include Turing's formula and its statistical implications; probability and statistics on countable alphabets; nonparametric estimation of entropy and mutual information; tail probability and biodiversity indices; and applications involving extracting statistical information from low-frequency data space. He earned his PhD in Statistics from Rutgers University.