Data Science: The Hard Parts
by Daniel Vaughan
This practical guide provides a collection of techniques and best practices that are generally overlooked in most data engineering and data science pedagogy. A common misconception is that great data scientists are experts in the "big themes" of the discipline: machine learning and programming. But most of the time, these tools can only take us so far. In practice, the smaller tools and skills really separate a great data scientist from a not-so-great one.
Taken as a whole, the lessons in this book make the difference between an average data scientist candidate and a qualified data scientist working in the field. Author Daniel Vaughan has collected, extended, and used these skills to create value and train data scientists from different companies and industries.
With this book, you will:
- Understand how data science creates value
- Deliver compelling narratives to sell your data science project
- Build a business case using unit economics principles
- Create new features for an ML model using storytelling
- Learn how to decompose KPIs
- Perform growth decompositions to find root causes for changes in a metric
Daniel Vaughan is head of data at Clip, the leading paytech company in Mexico. He's the author of Analytical Skills for AI and Data Science (O'Reilly).
Data Science in Agriculture and Natural Resource Management (Studies in Big Data #96)
by G. P. Obi Reddy, Mehul S. Raval, J. Adinarayana, Sanjay Chaudhary
This book aims to address emerging challenges in the field of agriculture and natural resource management using the principles and applications of data science (DS). The book is organized in three sections, and it has fourteen chapters dealing with specialized areas. The chapters are written by experts sharing their experiences very lucidly through case studies, suitable illustrations and tables. The contents have been designed to fulfil the needs of geospatial, data science, agricultural, natural resources and environmental sciences of traditional universities, agricultural universities, technological universities, research institutes and academic colleges worldwide. It will help the planners, policymakers and extension scientists in planning and sustainable management of agriculture and natural resources. The authors believe that with its uniqueness the book is one of the important efforts in the contemporary cyber-physical systems.
Data Science in Applications (Studies in Computational Intelligence #1084)
by Gintautas Dzemyda, Jolita Bernatavičienė, Janusz Kacprzyk
This book provides an overview of a wide range of relevant applications and reveals how to solve them. Many of the latest applications in finance, technology, education, medicine and other important and relevant fields are data-driven. The volumes of data are enormous. Specific methods need to be developed or adapted to solve a particular problem. It illustrates data science in applications. These applications have in common the discovery of knowledge in data and the use of this knowledge to make real decisions. The set of examples presented serves as a recipe book for their direct application to similar problems or as a guide for the development of new, more sophisticated approaches. The intended readership is data scientists looking for appropriate solutions to their problems. In addition, the examples provided serve as material for lectures at universities.
Data Science in Context: Foundations, Challenges, Opportunities
by Alfred Z. Spector, Peter Norvig, Chris Wiggins, Jeannette M. Wing
Data science is the foundation of our modern world. It underlies applications used by billions of people every day, providing new tools, forms of entertainment, economic growth, and potential solutions to difficult, complex problems. These opportunities come with significant societal consequences, raising fundamental questions about issues such as data quality, fairness, privacy, and causation. In this book, four leading experts convey the excitement and promise of data science and examine the major challenges in gaining its benefits and mitigating its harms. They offer frameworks for critically evaluating the ingredients and the ethical considerations needed to apply data science productively, illustrated by extensive application examples. The authors' far-ranging exploration of these complex issues will stimulate data science practitioners and students, as well as humanists, social scientists, scientists, and policy makers, to study and debate how data science can be used more effectively and more ethically to better our world.
Data Science in Cybersecurity and Cyberthreat Intelligence (Intelligent Systems Reference Library #177)
by Leslie F. Sikos, Kim-Kwang Raymond Choo
This book presents a collection of state-of-the-art approaches to utilizing machine learning, formal knowledge bases and rule sets, and semantic reasoning to detect attacks on communication networks, including IoT infrastructures, to automate malicious code detection, to efficiently predict cyberattacks in enterprises, to identify malicious URLs and DGA-generated domain names, and to improve the security of mHealth wearables. This book details how analyzing the likelihood of vulnerability exploitation using machine learning classifiers can offer an alternative to traditional penetration testing solutions. In addition, the book describes a range of techniques that support data aggregation and data fusion to automate data-driven analytics in cyberthreat intelligence, allowing complex and previously unknown cyberthreats to be identified and classified, and countermeasures to be incorporated in novel incident response and intrusion detection mechanisms.
Data Science in Engineering Vol. 10: Proceedings of the 42nd IMAC, A Conference and Exposition on Structural Dynamics 2024 (Conference Proceedings of the Society for Experimental Mechanics Series)
by Thomas Matarazzo, François Hemez, Eleonora Maria Tronci, Austin Downey
Data Science in Engineering, Volume 10: Proceedings of the 42nd IMAC, A Conference and Exposition on Structural Dynamics, 2024, the tenth volume of ten from the Conference, brings together contributions to this important area of research and engineering. The collection presents early findings and case studies on fundamental and applied aspects of Data Science in Engineering, including papers on:
- Novel Data-driven Analysis Methods
- Deep Learning Gaussian Process Analysis
- Real-time Video-based Analysis
- Applications to Nonlinear Dynamics and Damage Detection
- Data-driven System Prognostics
Data Science in Engineering, Volume 10: Proceedings of the 41st IMAC, A Conference and Exposition on Structural Dynamics 2023 (Conference Proceedings of the Society for Experimental Mechanics Series)
by Ramin Madarshahian, François Hemez
Data Science in Engineering, Volume 10: Proceedings of the 41st IMAC, A Conference and Exposition on Structural Dynamics, 2023, the tenth volume of ten from the Conference, brings together contributions to this important area of research and engineering. The collection presents early findings and case studies on fundamental and applied aspects of Data Science in Engineering, including papers on:
- Novel Data-driven Analysis Methods
- Deep Learning Gaussian Process Analysis
- Real-time Video-based Analysis
- Applications to Nonlinear Dynamics and Damage Detection
- High-rate Structural Monitoring and Prognostics
Data Science in Engineering, Volume 9: Proceedings of the 39th IMAC, A Conference and Exposition on Structural Dynamics 2021 (Conference Proceedings of the Society for Experimental Mechanics Series)
by Ramin Madarshahian, Francois Hemez
Data Science in Engineering, Volume 9: Proceedings of the 39th IMAC, A Conference and Exposition on Structural Dynamics, 2021, the ninth volume of nine from the Conference, brings together contributions to this important area of research and engineering. The collection presents early findings and case studies on fundamental and applied aspects of Data Science in Engineering, including papers on:
- Data Science in Engineering Applications
- Engineering Mathematics
- Computational Methods in Engineering
Data Science in Engineering, Volume 9: Proceedings of the 40th IMAC, A Conference and Exposition on Structural Dynamics 2022 (Conference Proceedings of the Society for Experimental Mechanics Series)
by Ramin Madarshahian, Francois Hemez
Data Science in Engineering, Volume 9: Proceedings of the 40th IMAC, A Conference and Exposition on Structural Dynamics, 2022, the ninth volume of nine from the Conference, brings together contributions to this important area of research and engineering. The collection presents early findings and case studies on fundamental and applied aspects of Data Science in Engineering, including papers on:
- Novel Data-driven Analysis Methods
- Deep Learning Gaussian Process Analysis
- Real-time Video-based Analysis
- Applications to Nonlinear Dynamics and Damage Detection
- High-rate Structural Monitoring and Prognostics
Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving (Chapman And Hall/crc The R Ser. #26)
by Deborah Nolan, Duncan Temple Lang
Effectively Access, Transform, Manipulate, Visualize, and Reason about Data and Computation
Data Science in R: A Case Studies Approach to Computational Reasoning and Problem Solving illustrates the details involved in solving real computational problems encountered in data analysis. It reveals the dynamic and iterative process by which data analysts
Data Science in Societal Applications: Concepts and Implications (Studies in Big Data #114)
by Siddharth Swarup Rautaray, Manjusha Pandey, Nhu Gia Nguyen
The book provides an insight into the practical applications and theoretical foundation of data science. The book discusses new ways of embracing agile approaches to various facets of data science, including machine learning and artificial intelligence, data mining, data visualization, and communication. The book includes contributions from academia and industry experts detailing the shortfalls of current tools and techniques used and generating the blueprint of the new technologies. The topics covered in the book range from theoretical and foundational research, platforms, methods, applications, and tools in data science. The chapters in the book add a social, geographical, and temporal dimension to data science research. The included papers are application-oriented: they prepare and use data in discovery research. This book will provide researchers and practitioners with a detailed snapshot of current progress in data science. Moreover, it will stimulate new study, research, and the development of new applications.
Data Science Landscape: Towards Research Standards And Protocols (Studies in Big Data #38)
by Usha Mujoo Munshi, Neeta Verma
The edited volume deals with different contours of data science with special reference to data management for the research innovation landscape. The data is becoming pervasive in all spheres of human, economic and development activity. In this context, it is important to take stock of what is being done in the data management area and begin to prioritize, consider and formulate adoption of a formal data management system including citation protocols for use by research communities in different disciplines and also address various technical research issues. The volume, thus, focuses on some of these issues drawing typical examples from various domains. The idea of this work germinated from the two day workshop on “Big and Open Data – Evolving Data Science Standards and Citation Attribution Practices”, an international workshop, led by the ICSU-CODATA and attended by over 300 domain experts. The Workshop focused on two priority areas (i) Big and Open Data: Prioritizing, Addressing and Establishing Standards and Good Practices and (ii) Big and Open Data: Data Attribution and Citation Practices. This important international event was part of a worldwide initiative led by ICSU, and the CODATA-Data Citation Task Group. In all, there are 21 chapters (with 21st Chapter addressing four different core aspects) written by eminent researchers in the field which deal with key issues of S&T, institutional, financial, sustainability, legal, IPR, data protocols, community norms and others, that need attention related to data management practices and protocols, coordinate area activities, and promote common practices and standards of the research community globally. In addition to the aspects touched above, the national / international perspectives of data and its various contours have also been portrayed through case studies in this volume.
Data Science, Learning by Latent Structures, and Knowledge Discovery (Studies in Classification, Data Analysis, and Knowledge Organization)
by Berthold Lausen, Sabine Krolak-Schwerdt, Matthias Böhmer
This volume comprises papers dedicated to data science and the extraction of knowledge from many types of data: structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering and pattern recognition methods; strategies for modeling complex data and mining large data sets; applications of advanced methods in specific domains of practice. The contributions offer interesting applications to various disciplines such as psychology, biology, medical and health sciences; economics, marketing, banking and finance; engineering; geography and geology; archeology, sociology, educational sciences, linguistics and musicology; library science. The book contains the selected and peer-reviewed papers presented during the European Conference on Data Analysis (ECDA 2013) which was jointly held by the German Classification Society (GfKl) and the French-speaking Classification Society (SFC) in July 2013 at the University of Luxembourg.
Data Science: New Issues, Challenges and Applications (Studies in Computational Intelligence #869)
by Janusz Kacprzyk, Gintautas Dzemyda, Jolita Bernatavičienė
This book contains 16 chapters by researchers working in various fields of data science. They focus on theory and applications in language technologies, optimization, computational thinking, intelligent decision support systems, decomposition of signals, model-driven development methodologies, interoperability of enterprise applications, anomaly detection in financial markets, 3D virtual reality, monitoring of environmental data, convolutional neural networks, knowledge storage, data stream classification, and security in social networking. The respective papers highlight a wealth of issues in, and applications of, data science. Modern technologies allow us to store and transfer large amounts of data quickly. They can be very diverse - images, numbers, streaming, related to human behavior and physiological parameters, etc. Whether the data is just raw numbers, crude images, or will help solve current problems and predict future developments, depends on whether we can effectively process and analyze it. Data science is evolving rapidly. However, it is still a very young field. In particular, data science is concerned with visualizations, statistics, pattern recognition, neurocomputing, image analysis, machine learning, artificial intelligence, databases and data processing, data mining, big data analytics, and knowledge discovery in databases. It also has many interfaces with optimization, block chaining, cyber-social and cyber-physical systems, Internet of Things (IoT), social computing, high-performance computing, in-memory key-value stores, cloud computing, data feeds, overlay networks, cognitive computing, crowdsource analysis, log analysis, container-based virtualization, and lifetime value modeling. Again, all of these areas are highly interrelated.
In addition, data science is now expanding to new fields of application: chemical engineering, biotechnology, building energy management, materials microscopy, geographic research, learning analytics, radiology, metal design, ecosystem homeostasis investigation, and many others.
Data Science on the Google Cloud Platform: Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning
by Valliappa Lakshmanan
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Through the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches.
Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.
You’ll learn how to:
- Automate and schedule data ingest, using an App Engine application
- Create and populate a dashboard in Google Data Studio
- Build a real-time analysis pipeline to carry out streaming analytics
- Conduct interactive data exploration with Google BigQuery
- Create a Bayesian model on a Cloud Dataproc cluster
- Build a logistic regression machine-learning model with Spark
- Compute time-aggregate features with a Cloud Dataflow pipeline
- Create a high-performing prediction model with TensorFlow
- Use your deployed model as a microservice you can access from both batch and real-time pipelines
Data Science on the Google Cloud Platform: Implementing End-to-end Real-time Data Pipelines: From Ingest To Machine Learning
by Valliappa Lakshmanan
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud native tools on GCP.
Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way.
You'll learn how to:
- Employ best practices in building highly scalable data and ML pipelines on Google Cloud
- Automate and schedule data ingest using Cloud Run
- Create and populate a dashboard in Data Studio
- Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
- Conduct interactive data exploration with BigQuery
- Create a Bayesian model with Spark on Cloud Dataproc
- Forecast time series and do anomaly detection with BigQuery ML
- Aggregate within time windows with Dataflow
- Train explainable machine learning models with Vertex AI
- Operationalize ML with Vertex AI Pipelines
Data Science Programming All-In-One For Dummies
by John Paul Mueller, Luca Massaron
Your logical, linear guide to the fundamentals of data science programming
Data science is exploding, in a good way, with a forecast of 1.7 megabytes of new information created every second for each human being on the planet by 2020 and 11.5 million job openings by 2026. It clearly pays dividends to be in the know. This friendly guide charts a path through the fundamentals of data science and then delves into the actual work: linear regression, logistic regression, machine learning, neural networks, recommender engines, and cross-validation of models. Data Science Programming All-In-One For Dummies is a compilation of the key data science, machine learning, and deep learning programming languages: Python and R. It helps you decide which programming languages are best for specific data science needs. It also gives you the guidelines to build your own projects to solve problems in real time.
- Get grounded: the ideal start for new data professionals
- What lies ahead: learn about specific areas that data is transforming
- Be meaningful: find out how to tell your data story
- See clearly: pick up the art of visualization
Whether you’re a beginning student or already mid-career, get your copy now and add even more meaning to your life, and everyone else’s!
Data Science Projects with Python: A case study approach to successful data science projects using Python, pandas, and scikit-learn
by Stephen Klosterman
Gain hands-on experience with industry-standard data analysis and machine learning tools in Python
Key Features
- Learn techniques to use data to identify the exact problem to be solved
- Visualize data using different graphs
- Identify how to select an appropriate algorithm for data extraction
Book Description
Data Science Projects with Python is designed to give you practical guidance on industry-standard data analysis and machine learning tools in Python, with the help of realistic data. The book will help you understand how you can use pandas and Matplotlib to critically examine a dataset with summary statistics and graphs, and extract the insights you seek to derive. You will continue to build on your knowledge as you learn how to prepare data and feed it to machine learning algorithms, such as regularized logistic regression and random forest, using the scikit-learn package. You’ll discover how to tune the algorithms to provide the best predictions on new and unseen data. As you delve into later chapters, you’ll be able to understand the working and output of these algorithms and gain insight into not only the predictive capabilities of the models but also their reasons for making these predictions.
By the end of this book, you will have the skills you need to confidently use various machine learning algorithms to perform detailed data analysis and extract meaningful insights from unstructured data.
What you will learn
- Install the required packages to set up a data science coding environment
- Load data into a Jupyter Notebook running Python
- Use Matplotlib to create data visualizations
- Fit a model using scikit-learn
- Use lasso and ridge regression to reduce overfitting
- Fit and tune a random forest model and compare performance with logistic regression
- Create visuals using the output of the Jupyter Notebook
Who this book is for
If you are a data analyst, data scientist, or a business analyst who wants to get started with using Python and machine learning techniques to analyze data and predict outcomes, this book is for you. Basic knowledge of computer programming and data analytics is a must. Familiarity with mathematical concepts such as algebra and basic statistics will be useful.
Data Science Projects with Python: A case study approach to gaining valuable insights from real data with machine learning, 2nd Edition
by Stephen Klosterman
Gain hands-on experience of Python programming with industry-standard machine learning techniques using pandas, scikit-learn, and XGBoost
Key Features
- Think critically about data and use it to form and test a hypothesis
- Choose an appropriate machine learning model and train it on your data
- Communicate data-driven insights with confidence and clarity
Book Description
If data is the new oil, then machine learning is the drill. As companies gain access to ever-increasing quantities of raw data, the ability to deliver state-of-the-art predictive models that support business decision-making becomes more and more valuable.
In this book, you'll work on an end-to-end project based around a realistic data set and split up into bite-sized practical exercises. This creates a case-study approach that simulates the working conditions you'll experience in real-world data science projects.
You'll learn how to use key Python packages, including pandas, Matplotlib, and scikit-learn, and master the process of data exploration and data processing, before moving on to fitting, evaluating, and tuning algorithms such as regularized logistic regression and random forest.
Now in its second edition, this book will take you through the end-to-end process of exploring data and delivering machine learning models. Updated for 2021, this edition includes brand new content on XGBoost, SHAP values, algorithmic fairness, and the ethical concerns of deploying a model in the real world.
By the end of this data science book, you'll have the skills, understanding, and confidence to build your own machine learning models and gain insights from real data.
What you will learn
- Load, explore, and process data using the pandas Python package
- Use Matplotlib to create compelling data visualizations
- Implement predictive machine learning models with scikit-learn
- Use lasso and ridge regression to reduce model overfitting
- Evaluate random forest and logistic regression model performance
- Deliver business insights by presenting clear, convincing conclusions
Who this book is for
Data Science Projects with Python – Second Edition is for anyone who wants to get started with data science and machine learning. If you're keen to advance your career by using data analysis and predictive modeling to generate business insights, then this book is the perfect place to begin. To quickly grasp the concepts covered, it is recommended that you have basic experience of programming with Python or another similar language, and a general interest in statistics.
Data Science Revealed: With Feature Engineering, Data Visualization, Pipeline Development, and Hyperparameter Tuning
by Tshepo Chris Nokeri
Get insight into data science techniques such as data engineering and visualization, statistical modeling, machine learning, and deep learning. This book teaches you how to select variables, optimize hyper parameters, develop pipelines, and train, test, and validate machine and deep learning models. Each chapter includes a set of examples allowing you to understand the concepts, assumptions, and procedures behind each model.
The book covers parametric methods or linear models that combat under- or over-fitting using techniques such as Lasso and Ridge. It includes complex regression analysis with time series smoothing, decomposition, and forecasting. It takes a fresh look at non-parametric models for binary classification (logistic regression analysis) and ensemble methods such as decision trees, support vector machines, and naive Bayes. It covers the most popular non-parametric method for time-event data (the Kaplan-Meier estimator). It also covers ways of solving classification problems using artificial neural networks such as restricted Boltzmann machines, multi-layer perceptrons, and deep belief networks. The book discusses unsupervised learning clustering techniques such as the K-means method, agglomerative and Dbscan approaches, and dimension reduction techniques such as Feature Importance, Principal Component Analysis, and Linear Discriminant Analysis. And it introduces driverless artificial intelligence using H2O.
After reading this book, you will be able to develop, test, validate, and optimize statistical machine learning and deep learning models, and engineer, visualize, and interpret sets of data.
What You Will Learn
- Design, develop, train, and validate machine learning and deep learning models
- Find optimal hyper parameters for superior model performance
- Improve model performance using techniques such as dimension reduction and regularization
- Extract meaningful insights for decision making using data visualization
Who This Book Is For
Beginning and intermediate level data scientists and machine learning engineers
Data Science Solutions on Azure: Tools and Techniques Using Databricks and MLOps
by Julian Soh, Priyanshi Singh
Understand and learn the skills needed to use modern tools in Microsoft Azure. This book discusses how to practically apply these tools in the industry, and help drive the transformation of organizations into a knowledge and data-driven entity. It provides an end-to-end understanding of data science life cycle and the techniques to efficiently productionize workloads. The book starts with an introduction to data science and discusses the statistical techniques data scientists should know. You'll then move on to machine learning in Azure where you will review the basics of data preparation and engineering, along with Azure ML service and automated machine learning. You'll also explore Azure Databricks and learn how to deploy, create and manage the same. In the final chapters you'll go through machine learning operations in Azure followed by the practical implementation of artificial intelligence through machine learning. Data Science Solutions on Azure will reveal how the different Azure services work together using real life scenarios and how-to-build solutions in a single comprehensive cloud ecosystem.
What You'll Learn
- Understand big data analytics with Spark in Azure Databricks
- Integrate with Azure services like Azure Machine Learning and Azure Synapse
- Deploy, publish and monitor your data science workloads with MLOps
- Review data abstraction, model management and versioning with GitHub
Who This Book Is For
Data Scientists looking to deploy end-to-end solutions on Azure with latest tools and techniques.
Data Science Solutions on Azure: The Rise of Generative AI and Applied AI
by Julian Soh, Priyanshi Singh
This revamped and updated book focuses on the latest in AI technology: Generative AI. It builds on the first edition by moving away from traditional data science into the area of applied AI using the latest breakthroughs in Generative AI. Based on real-world projects, this edition takes a deep look into new concepts and approaches such as Prompt Engineering, testing and grounding of Large Language Models, fine tuning, and implementing new solution architectures such as Retrieval Augmented Generation (RAG). You will learn about new embedded AI technologies in Search, such as Semantic and Vector Search. Written with a view on how to implement Generative AI in software, this book contains examples and sample code. In addition to traditional Data Science experimentation in Azure Machine Learning (AML) that was covered in the first edition, the authors cover new tools such as Azure AI Studio, specifically for testing and experimentation with Generative AI models.
What's New in this Book
- Provides new concepts, tools, and technologies such as Large and Small Language Models, Semantic Kernel, and Automatic Function Calling
- Takes a deeper dive into using Azure AI Studio for RAG and Prompt Engineering design
- Includes new and updated case studies for Azure OpenAI
- Teaches about Copilots, plugins, and agents
What You'll Learn
- Get up to date on the important technical aspects of Large Language Models, based on Azure OpenAI as the reference platform
- Know about the different types of models: GPT3.5 Turbo, GPT4, GPT4o, Codex, DALL-E, and Small Language Models such as Phi-3
- Develop new skills such as Prompt Engineering and fine tuning of Large/Small Language Models
- Understand and implement new architectures such as RAG and Automatic Function Calling
- Understand approaches for implementing Generative AI using LangChain and Semantic Kernel
- See how real-world projects help you identify great candidates for Applied AI projects, including Large/Small Language Models
Who This Book Is For
Software engineers and architects looking to deploy end-to-end Generative AI solutions on Azure with the latest tools and techniques.
Data Science Solutions with Python: Fast and Scalable Models Using Keras, PySpark MLlib, H2O, XGBoost, and Scikit-Learn
by Tshepo Chris Nokeri
Apply supervised and unsupervised learning to solve practical and real-world big data problems. This book teaches you how to engineer features, optimize hyperparameters, train and test models, develop pipelines, and automate the machine learning (ML) process. The book covers an in-memory, distributed cluster computing framework known as PySpark, machine learning framework platforms known as scikit-learn, PySpark MLlib, H2O, and XGBoost, and a deep learning (DL) framework known as Keras. The book starts off presenting supervised and unsupervised ML and DL models, and then it examines big data frameworks along with ML and DL frameworks. Author Tshepo Chris Nokeri considers a parametric model known as the Generalized Linear Model and a survival regression model known as the Cox Proportional Hazards model along with Accelerated Failure Time (AFT). Also presented is a binary classification model (logistic regression) and an ensemble model (Gradient Boosted Trees). The book introduces DL and an artificial neural network known as the Multilayer Perceptron (MLP) classifier. A way of performing cluster analysis using the K-Means model is covered. Dimension reduction techniques such as Principal Components Analysis and Linear Discriminant Analysis are explored. And automated machine learning is unpacked. This book is for intermediate-level data scientists and machine learning engineers who want to learn how to apply key big data frameworks and ML and DL frameworks. You will need prior knowledge of the basics of statistics, Python programming, probability theories, and predictive analytics.
What You Will LearnUnderstand widespread supervised and unsupervised learning, including key dimension reduction techniquesKnow the big data analytics layers such as data visualization, advanced statistics, predictive analytics, machine learning, and deep learningIntegrate big data frameworks with a hybrid of machine learning frameworks and deep learning frameworksDesign, build, test, and validate skilled machine models and deep learning modelsOptimize model performance using data transformation, regularization, outlier remedying, hyperparameter optimization, and data split ratio alteration Who This Book Is ForData scientists and machine learning engineers with basic knowledge and understanding of Python programming, probability theories, and predictive analytics
Data Science Strategy For Dummies
by Ulrika Jägare
All the answers to your data science questions
Over half of all businesses are using data science to generate insights and value from big data. How are they doing it? Data Science Strategy For Dummies answers all your questions about how to build a data science capability from scratch, starting with the “what” and the “why” of data science and covering what it takes to lead and nurture a top-notch team of data scientists. With this book, you’ll learn how to incorporate data science as a strategic function into any business, large or small. Find solutions to your real-life challenges as you uncover the stories and value hidden within data.
- Learn exactly what data science is and why it’s important
- Adopt a data-driven mindset as the foundation to success
- Understand the processes and common roadblocks behind data science
- Keep your data science program focused on generating business value
- Nurture a top-quality data science team
In non-technical language, Data Science Strategy For Dummies outlines new perspectives and strategies to effectively lead analytics and data science functions to create real value.
Data Science Thinking: The Next Scientific, Technological and Economic Revolution (Data Analytics)
by Longbing Cao
This book explores answers to the fundamental questions driving the research, innovation and practices of the latest revolution in scientific, technological and economic development: how does data science transform existing science, technology, industry, economy, profession and education? How does one remain competitive in the data science field? What is responsible for shaping the mindset and skillset of data scientists? Data Science Thinking paints a comprehensive picture of data science as a new scientific paradigm from the scientific evolution perspective, as data science thinking from the scientific-thinking perspective, as a trans-disciplinary science from the disciplinary perspective, and as a new profession and economy from the business perspective. The topics cover an extremely wide spectrum of essential and relevant aspects of data science, spanning its evolution, concepts, thinking, challenges, discipline, and foundation, all the way to industrialization, profession, education, and the vast array of opportunities that data science offers. The book's three parts each detail layers of these different aspects. The book is intended for decision-makers, data managers (e.g., analytics portfolio managers, business analytics managers, chief data analytics officers, chief data scientists, and chief data officers), policy makers, management and decision strategists, research leaders, and educators who are responsible for pursuing new scientific, innovation, and industrial transformation agendas, enterprise strategic planning, and next-generation profession-oriented course development, as well as those who are involved in data science, technology, and economy from an advanced perspective.
Research students in data science-related courses and disciplines will find the book useful for positioning their innovative scientific journey, planning their unique and promising career, and competing within and being ready for the next generation of science, technology, and economy.