Data Protection: Ensuring Data Availability
by Preston de Guise
This is the fundamental truth about data protection: backup is dead. Or rather, backup and recovery, as a standalone topic, no longer has relevance in IT. As a standalone topic, it’s been killed off by seemingly exponential growth in storage and data, by the cloud, and by virtualization. So what is data protection? This book takes a holistic, business-based approach to data protection. It explains how data protection is a mix of proactive and reactive planning, technology and activities that allow for data continuity. It shows how truly effective data protection comes from a holistic approach considering the entire data lifecycle and all required SLAs. Data protection is neither RAID nor is it continuous availability, replication, snapshots or backups; it is all of them, combined in a considered and measured approach to suit the criticality of the data and meet all the requirements of the business. The book also discusses how businesses seeking to creatively leverage their IT investments and to drive through cost optimization are increasingly looking at data protection as a mechanism to achieve those goals. In addition to being a type of insurance policy, data protection is becoming an enabler for new processes around data movement and data processing. This book arms readers with information critical for making decisions on how data can be protected against loss in the cloud, on-premises, or in a mix of the two. It explains the changing face of recovery in a highly virtualized data center and techniques for dealing with big data. Moreover, it presents a model for where data recovery processes can be integrated with IT governance and management in order to achieve the right focus on recoverability across the business.
Data Protection: Governance, Risk Management, and Compliance
by David G. Hill
Failure to appreciate the full dimensions of data protection can lead to poor data protection management, costly resource allocation issues, and exposure to unnecessary risks. Data Protection: Governance, Risk Management, and Compliance explains how to gain a handle on the vital aspects of data protection. The author begins by building the foundatio
Data Protection in a Post-Pandemic Society: Laws, Regulations, Best Practices and Recent Solutions
by Chaminda Hewage Yogachandran Rahulamathavan Deepthi Ratnayake
This book offers the latest research results and predictions in data protection with a special focus on post-pandemic society. It also includes various case studies and applications on data protection, covering the Internet of Things (IoT), smart cities, federated learning, the Metaverse, cryptography and cybersecurity. Data protection has burst onto the computer security scene due to the increased interest in securing personal data. Data protection is a key aspect of information security where personal and business data need to be protected from unauthorized access and modification. Stolen personal information has been used for many purposes such as ransom, bullying and identity theft. Due to the wider usage of the Internet and social media applications, people make themselves vulnerable by sharing personal data. This book discusses the challenges associated with personal data protection before, during and after the COVID-19 pandemic. Some of these challenges are caused by technological advancements (e.g. Artificial Intelligence (AI)/Machine Learning (ML) and ChatGPT). In order to preserve the privacy of the data involved, novel techniques such as zero-knowledge proofs, fully homomorphic encryption and multi-party computation are being deployed. The tension between data privacy and data utility drives innovation in this area, where numerous start-ups around the world have started receiving funding from government agencies and venture capitalists. This fuels the adoption of privacy-preserving data computation techniques in real applications, and the field is rapidly evolving. Researchers and students studying or working in data protection and related security fields will find this book useful as a reference.
Data Protection in the Financial Services Industry
by Mandy Webster
Privacy and data protection are now important issues for companies across the financial services industry. Financial records are amongst the most sensitive for many consumers and the regulator is keen to promote good data handling practices in an industry that is looking towards increased customer profiling, for both risk management and opportunity spotting. Mandy Webster's Data Protection in the Financial Services Industry explains how to manage privacy and data protection issues throughout the customer cycle, from making contact to seeking additional business from current customers. She also looks at the precise role of the Financial Services Authority and its response to compliance or non-compliance. Each of the Eight Principles of the Data Protection Act is reviewed and explained.
Data Protection Law: A Comparative Analysis of Asia-Pacific and European Approaches
by Robert Walters Leon Trakman Bruno Zeller
This book provides a comparison and practical guide for academics, students, and the business community of the current data protection laws in selected Asia Pacific countries (Australia, India, Indonesia, Japan, Malaysia, Singapore, Thailand) and the European Union. The book shows how over the past three decades the range of economic, political, and social activities that have moved to the internet has increased significantly. This technological transformation has resulted in the collection of personal data, its use and storage across international boundaries at a rate with which governments have been unable to keep pace. The book highlights challenges and potential solutions related to data protection issues arising from cross-border problems in which personal data is being considered as intellectual property, within transnational contracts and in anti-trust law. The book also discusses the emerging challenges in protecting personal data and promoting cyber security. The book provides a deeper understanding of the legal risks and frameworks associated with data protection law for local, regional and global academics, students, businesses, industries, the legal profession and individuals.
Data Protection vs. Freedom of Information
by Paul Ticher
The Freedom of Information Act (FOI) was a milestone in UK legislation and, for the first time, the lid was legally lifted on a lot of what the UK government was doing in the name of the citizens of the country. While the FOI applies only to public sector organisations, it covers a wide range of information. The Data Protection Act, which applies equally in both the public and private sector, had already given individuals the right to find out what information was being held about them, and to insist on having that information kept accurate and up to date. Of course, the Data Protection Act also placed an obligation on organisations to protect the personal data of those people about whom they collected this information and to ensure that this data was not disclosed, either deliberately or accidentally, to anyone not entitled to see it.
Clear and practical guidance for data governance professionals
Inevitably, information that could and should be disclosed pursuant to a freedom of information enquiry could quite conceivably also contain information that the data controller must protect, and herein lies a challenge for those in the public sector. Data management frameworks must be designed with two apparently contradictory objectives in mind: ensuring that information that might have to be disclosed pursuant to an FOI enquiry can quickly be found and provided, while simultaneously ensuring that personal data that has to be protected remains protected. This is a key data governance issue and, until now, there has been little useful guidance on how to tackle it for those charged with designing processes and infrastructure that meet these two sets of legal requirements. This pocket guide focuses on and addresses this critical issue, providing clear and practical guidance for data governance professionals on how to resolve this conundrum.
Data Quality: Empowering Businesses with Analytics and AI
by Prashanth Southekal
Discover how to achieve business goals by relying on high-quality, robust data.
In Data Quality: Empowering Businesses with Analytics and AI, veteran data and analytics professional Prashanth Southekal delivers a practical and hands-on discussion on how to accelerate business results using high-quality data. In the book, you’ll learn techniques to define and assess data quality, discover how to ensure that your firm’s data collection practices avoid common pitfalls and deficiencies, improve the level of data quality in the business, and guarantee that the resulting data is useful for powering high-level analytics and AI applications. The author shows you how to:
- Profile for data quality, including the appropriate techniques, criteria, and KPIs
- Identify the root causes of data quality issues in the business, in addition to discussing the 16 common root causes that degrade data quality in the organization
- Formulate the reference architecture for data quality, including practical design patterns for remediating data quality
- Implement the 10 best data quality practices and the required capabilities for improving operations, compliance, and decision-making capabilities in the business
An essential resource for data scientists, data analysts, business intelligence professionals, chief technology and data officers, and anyone else with a stake in collecting and using high-quality data, Data Quality: Empowering Businesses with Analytics and AI will also earn a place on the bookshelves of business leaders interested in learning more about what sets robust data apart from the rest.
Data Quality Engineering in Financial Services: Applying Manufacturing Techniques To Data
by Brian Buzzelli
Data quality will either make you or break you in the financial services industry. Missing prices, wrong market values, trading violations, client performance restatements, and incorrect regulatory filings can all lead to harsh penalties, lost clients, and financial disaster. This practical guide provides data analysts, data scientists, and data practitioners in financial services firms with the framework to apply manufacturing principles to financial data management, understand data dimensions, and engineer precise data quality tolerances at the datum level and integrate them into your data processing pipelines.
You'll get invaluable advice on how to:
- Evaluate data dimensions and how they apply to different data types and use cases
- Determine data quality tolerances for your data quality specification
- Choose the points along the data processing pipeline where data quality should be assessed and measured
- Apply tailored data governance frameworks within a business or technical function or across an organization
- Precisely align data with applications and data processing pipelines
- And more
Data Quality in Southeast Asia: Analysis of Official Statistics and Their Institutional Framework as a Basis for Capacity Building and Policy Making in the ASEAN
by Manuel Stagars
This book explores the reliability of official statistical data in the ASEAN (the Association of Southeast Asian Nations), and the benefits of a better vocabulary to discuss the quality of publicly available data to address the needs of all users. It introduces a rigorous method to disaggregate and rate data quality into principal factors containing a total of ten dimensions, which serves as the basis for a discussion on the opportunities and challenges for data quality, capacity building programs and data policy in Southeast Asia. Tools to standardize and monitor statistical capacity and data quality are presented, as well as methods and data sources to analyse data quality. The book analyses data quality in Indonesia, Malaysia, Singapore, the Philippines, Thailand, Vietnam, Brunei, Laos, Cambodia, and Myanmar, before concluding with thoughts on Open Data and the ASEAN Economic Community (AEC).
Data Quality Management in the Data Age: Excellence in Data Quality for Enhanced Digital Economic Growth (SpringerBriefs in Service Science)
by Haiyan Yu
This book addresses data quality management for data markets, including foundational quality issues in modern data science. By clarifying the concept of data quality, its impact on real-world applications, and the challenges stemming from poor data quality, it will equip data scientists and engineers with advanced skills in data quality management, with a particular focus on applications within data markets. This will help them create an environment that encourages potential data sellers with high-quality data to join the market, ultimately leading to an improvement in overall data quality. High-quality data, as a novel factor of production, has assumed a pivotal role in driving digital economic development. The acquisition of such data is particularly important for contemporary decision-making models. Data markets facilitate the procurement of high-quality data and thereby enhance the data supply. Consequently, potential data sellers with high-quality data are incentivized to enter the market, an aspect that is particularly relevant in data-scarce domains such as personalized medicine and services. Data scientists have a pivotal role to play in both the intellectual vitality and the practical utility of high-quality data. Moreover, data quality control presents opportunities for data scientists to engage with less structured or ambiguous problems. The book will foster fruitful discussions on the contributions that various scientists and engineers can make to data quality and the further evolution of data markets.
Data Rules: Reinventing the Market Economy (Acting with Technology)
by Jannis Kallinikos Cristina Alaimo
A new social science framework for studying the unprecedented social and economic restructuring driven by digital data.
Digital data have become the critical frontier where emerging economic practices and organizational forms confront the traditional economic order and its institutions. In Data Rules, Cristina Alaimo and Jannis Kallinikos establish a social science framework for analyzing the unprecedented social and economic restructuring brought about by data. Working at the intersection of information systems and organizational studies, they draw extensively on intellectual currents in sociology, semiotics, cognitive science and technology, and social theory. Making the case for turning “data-making” into an area of inquiry of its own, the authors uncover how data are deeply implicated in rewiring the institutions of the market economy.
The authors associate digital data with the decentering of organizations. As they point out, centered systems make sense only when firms (and formal organizations more broadly) can keep the external world at arm’s length and maintain a relative operational independence from it. These patterns no longer hold. Data transform the production of goods and services to an endless series of exchanges and interactions that defeat the functional logics of markets and organizations. The diffusion of platforms and ecosystems is indicative of these broader transformations. Rather than viewing data as simply a force of surveillance and control, the authors place the transformative potential of data at the center of an emerging socioeconomic order that restructures society and its institutions.
Data Science: Konzepte, Erfahrungen, Fallstudien und Praxis
by Detlev Frick Andreas Gadatsch Jens Kaufmann Birgit Lankes Christoph Quix Andreas Schmidt Uwe Schmitz
Data science has arrived in many organizations and is often everyday practice. Nevertheless, many decision-makers face the challenge of engaging with concrete questions for the first time or of developing ongoing projects further. The range of methods, tools and possible applications is very wide and continues to evolve. The many publications on data science are specialized and focus on individual aspects. This work gives readers a comprehensive orientation on the status quo from a scientific perspective, along with numerous in-depth treatments of practice-relevant aspects. The content builds on the academic CAS certificate courses on big data and data science offered by Hochschule Niederrhein in cooperation with Hochschule Bonn-Rhein-Sieg and FH Dortmund. It takes into account scientific foundations and advanced topics as well as concrete experience from data science projects. The book addresses practice-relevant questions at an academic level from the perspective of the roles of "Data Strategist", "Data Architect" and "Data Analyst", and incorporates proven practical experience, including that of seminar participants. For interested readers, the book offers insight into the currently relevant range of aspects of data science and big data, and provides pointers for practical implementation.
Data Science: Create Teams That Ask the Right Questions and Deliver Real Value
by Doug Rose
Learn how to build a data science team within your organization rather than hiring from the outside. Teach your team to ask the right questions to gain actionable insights into your business. Most organizations still focus on objectives and deliverables. Instead, a data science team is exploratory. They use the scientific method to ask interesting questions and run small experiments. Your team needs to see if the data illuminate their questions. Then, they have to use critical thinking techniques to justify their insights and reasoning. They should pivot their efforts to keep their insights aligned with business value. Finally, your team needs to deliver these insights as a compelling story. Insight!: How to Build Data Science Teams that Deliver Real Business Value shows that the most important thing you can do now is help your team think about data. Management coach Doug Rose walks you through the process of creating and managing effective data science teams. You will learn how to find the right people inside your organization and equip them with the right mindset. The book has three overarching concepts:
- You should mine your own company for talent. You can't change your organization by hiring a few data science superheroes.
- You should form small, agile-like data teams that focus on delivering valuable insights early and often.
- You can make real changes to your organization by telling compelling data stories. These stories are the best way to communicate your insights about your customers, challenges, and industry.
What You Will Learn:
- Create data science teams from existing talent in your organization to cost-efficiently extract maximum business value from your organization's data
- Understand key data science terms and concepts
- Follow practical guidance to create and integrate an effective data science team with key roles and the responsibilities for each team member
- Utilize the data science life cycle (DSLC) to model essential processes and practices for delivering value
- Use sprints and storytelling to help your team stay on track and adapt to new knowledge
Who This Book Is For:
Data science project managers and team leaders. The secondary readership is data scientists, DBAs, analysts, senior management, HR managers, and performance specialists.
Data Science: A First Introduction (Chapman & Hall/CRC Data Science Series)
by Tiffany Timbers Trevor Campbell Melissa Lee
Data Science: A First Introduction focuses on using the R programming language in Jupyter notebooks to perform data manipulation and cleaning, create effective visualizations, and extract insights from data using classification, regression, clustering, and inference. The text emphasizes workflows that are clear, reproducible, and shareable, and includes coverage of the basics of version control. All source code is available online, demonstrating the use of good reproducible project workflows. Based on educational research and active learning principles, the book uses a modern approach to R and includes accompanying autograded Jupyter worksheets for interactive, self-directed learning. The book will leave readers well-prepared for data science projects. The book is designed for learners from all disciplines with minimal prior knowledge of mathematics and programming. The authors have honed the material through years of experience teaching thousands of undergraduates in the University of British Columbia’s DSCI100: Introduction to Data Science course.
Data Science, AI, and Machine Learning in Drug Development (Chapman & Hall/CRC Biostatistics Series)
by Harry Yang
The confluence of big data, artificial intelligence (AI), and machine learning (ML) has led to a paradigm shift in how innovative medicines are developed and healthcare delivered. To fully capitalize on these technological advances, it is essential to systematically harness data from diverse sources and leverage digital technologies and advanced analytics to enable data-driven decisions. Data science stands at a unique moment of opportunity to lead such a transformative change. Intended to be a single source of information, Data Science, AI, and Machine Learning in Drug Research and Development covers a wide range of topics on the changing landscape of drug R & D, emerging applications of big data, AI and ML in drug development, and the build of robust data science organizations to drive biopharmaceutical digital transformations.
Features:
- Provides a comprehensive review of challenges and opportunities as related to the applications of big data, AI, and ML in the entire spectrum of drug R & D
- Discusses regulatory developments in leveraging big data and advanced analytics in drug review and approval
- Offers a balanced approach to data science organization build
- Presents real-world examples of AI-powered solutions to a host of issues in the lifecycle of drug development
- Affords sufficient context for each problem and provides a detailed description of solutions suitable for practitioners with limited data science expertise
Data Science and Analytics: 4th International Conference On Recent Developments In Science, Engineering And Technology, Redset 2017, Gurgaon, India, October 13-14, 2017, Revised Selected Papers (Communications In Computer And Information Science #799)
by Brajendra Panda Sudeep Sharma Nihar Ranjan Roy
This book constitutes the refereed proceedings of the 4th International Conference on Recent Developments in Science, Engineering and Technology, REDSET 2017, held in Gurgaon, India, in October 2017. The 66 revised full papers presented were carefully reviewed and selected from 329 submissions. The papers are organized in topical sections on big data analysis, data centric programming, next generation computing, social and web analytics, and security in data science analytics.
Data Science and Analytics for SMEs: Consulting, Tools, Practical Use Cases
by Afolabi Ibukun Tolulope
Master the tricks and techniques of business analytics consulting, specifically applicable to small-to-medium businesses (SMEs). Written to help you hone your business analytics skills, this book applies data science techniques to help solve problems and improve upon many aspects of a business's operations. SMEs are looking for ways to use data science and analytics, and this need is becoming increasingly pressing with the ongoing digital revolution. The topics covered in the book will help to provide the knowledge leverage needed for implementing data science in small business. Small businesses' demand for data analytics coincides with the growing number of freelance data science consulting opportunities; hence this book will provide insight on how to navigate this new terrain.
This book uses a do-it-yourself approach to analytics and introduces tools that are easily available online and are non-programming based. Data science will allow SMEs to understand their customer loyalty, market segmentation, sales and revenue increase, and more. Data Science and Analytics for SMEs is particularly focused on small businesses and explores the analytics and data that can help them succeed further in their business.
What You'll Learn:
- Create and measure the success of your analytics project
- Start your business analytics consulting career
- Use solutions taught in the book in practical use cases and problems
Who This Book Is For:
Business analytics enthusiasts who are not particularly programming inclined, small business owners and data science consultants, data science and business students, and SME (small-to-medium enterprise) analysts
Data Science and Analytics Strategy: An Emergent Design Approach (Chapman & Hall/CRC Data Science Series)
by Kailash Awati Alexander Scriven
This book describes how to establish data science and analytics capabilities in organisations using Emergent Design, an evolutionary approach that increases the chances of successful outcomes while minimising upfront investment. Based on their experiences and those of a number of data leaders, the authors provide actionable advice on data technologies, processes, and governance structures so that readers can make choices that are appropriate to their organisational contexts and requirements. The book blends academic research on organisational change and data science processes with real-world stories from experienced data analytics leaders, focusing on the practical aspects of setting up a data capability. In addition to a detailed coverage of capability, culture, and technology choices, a unique feature of the book is its treatment of emerging issues such as data ethics and algorithmic fairness. Data Science and Analytics Strategy: An Emergent Design Approach has been written for professionals who are looking to build data science and analytics capabilities within their organisations as well as those who wish to expand their knowledge and advance their careers in the data space. Providing deep insights into the intersection between data science and business, this guide will help professionals understand how to help their organisations reap the benefits offered by data. Most importantly, readers will learn how to build a fit-for-purpose data science capability in a manner that avoids the most common pitfalls.
Data Science and Applications for Modern Power Systems (Power Electronics and Power Systems)
by Le Xie Yang Weng Ram Rajagopal
This book offers a comprehensive collection of research articles that utilize data, in particular large data sets, in modern power systems operation and planning. As the power industry moves towards actively utilizing distributed resources with advanced technologies and incentives, it is becoming increasingly important to benefit from the available heterogeneous data sets for improved decision-making. The authors present a first-of-its-kind comprehensive review of big data opportunities and challenges in the smart grid industry. This book provides succinct and useful theory, practical algorithms, and case studies to improve power grid operations and planning utilizing big data, making it a useful graduate-level reference for students, faculty, and practitioners on the future grid.
Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data
by EMC Education Services
Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities and methods and tools that Data Scientists use. The content focuses on concepts, principles and practical applications that are applicable to any industry and technology environment, and the learning is supported and explained with examples that you can replicate using open-source software.
This book will help you:
- Become a contributor on a data science team
- Deploy a structured lifecycle approach to data analytics problems
- Apply appropriate analytic techniques and tools to analyzing big data
- Learn how to tell a compelling story with data to drive business action
- Prepare for EMC Proven Professional Data Science Certification
Data Science and Big Data Analytics in Smart Environments
by Marta Chinnici; Florin Pop; Cătălin Negru
Most applications generate large datasets, like social networking and social influence programs, smart cities applications, smart house environments, Cloud applications, public web sites, scientific experiments and simulations, data warehouse, monitoring platforms, and e-government services. Data grows rapidly, since applications produce continuously increasing volumes of both unstructured and structured data. Large-scale interconnected systems aim to aggregate and efficiently exploit the power of widely distributed resources. In this context, major solutions for scalability, mobility, reliability, fault tolerance and security are required to achieve high performance and to create a smart environment. The impact on data processing, transfer and storage is the need to re-evaluate the approaches and solutions to better answer the user needs. A variety of solutions for specific applications and platforms exist, so a thorough and systematic analysis of existing solutions for data science, data analytics, methods and algorithms used in Big Data processing and storage environments is significant in designing and implementing a smart environment.
Fundamental issues pertaining to smart environments (smart cities, ambient assisted living, smart houses, green houses, cyber physical systems, etc.) are reviewed. Most of the current efforts still do not adequately address the heterogeneity of different distributed systems, the interoperability between them, and the systems' resilience. This book will primarily encompass practical approaches that promote research in all aspects of data processing, data analytics, and data processing in different types of systems: Cluster Computing, Grid Computing, Peer-to-Peer, Cloud/Edge/Fog Computing, all involving elements of heterogeneity, having a large variety of tools and software to manage them.
The main role of resource management techniques in this domain is to create the suitable frameworks for development of applications and deployment in smart environments, with respect to high performance. The book focuses on topics covering algorithms, architectures, management models, high performance computing techniques and large-scale distributed systems.
Data Science and Digital Business
by Fausto Pedro García Márquez Benjamin Lev
This book combines the analytic principles of digital business and data science with business practice and big data. The interdisciplinary, contributed volume provides an interface between the main disciplines of engineering and technology and business administration. Written for managers, engineers and researchers who want to understand big data and develop new skills that are necessary in the digital business, it not only discusses the latest research, but also presents case studies demonstrating the successful application of data in the digital business.
Data Science and Machine Learning: Mathematical and Statistical Methods (Chapman And Hall/crc Machine Learning And Pattern Recognition Ser.)
by Dirk P. Kroese Zdravko Botev Thomas Taimre Radislav Vaisman
"This textbook is a well-rounded, rigorous, and informative work presenting the mathematics behind modern machine learning techniques. It hits all the right notes: the choice of topics is up-to-date and perfect for a course on data science for mathematics students at the advanced undergraduate or early graduate level. This book fills a sorely-needed gap in the existing literature by not sacrificing depth for breadth, presenting proofs of major theorems and subsequent derivations, as well as providing a copious amount of Python code. I only wish a book like this had been around when I first began my journey!" -Nicholas Hoell, University of Toronto
"This is a well-written book that provides a deeper dive into data-scientific methods than many introductory texts. The writing is clear, and the text logically builds up regularization, classification, and decision trees. Compared to its probable competitors, it carves out a unique niche." -Adam Loy, Carleton College
The purpose of Data Science and Machine Learning: Mathematical and Statistical Methods is to provide an accessible, yet comprehensive textbook intended for students interested in gaining a better understanding of the mathematics and statistics that underpin the rich variety of ideas and machine learning algorithms in data science.
Key Features:
- Focuses on mathematical understanding.
- Presentation is self-contained, accessible, and comprehensive.
- Extensive list of exercises and worked-out examples.
- Many concrete algorithms with Python code.
- Full color throughout.
Further resources can be found on the authors' website: https://github.com/DSML-book/Lectures
Data Science and Machine Learning Applications in Subsurface Engineering
by Daniel Asante Otchere
This book covers unsupervised learning, supervised learning, clustering approaches, feature engineering, explainable AI and multioutput regression models for subsurface engineering problems. Processing voluminous and complex data sets is the primary focus of the field of machine learning (ML). ML aims to develop data-driven methods and computational algorithms that can learn to identify complex and non-linear patterns to understand and predict the relationships between variables by analysing extensive data. Although ML models provide the final output for predictions, several steps need to be performed to achieve accurate predictions. These steps (data pre-processing, feature selection, feature engineering and outlier removal) are all covered in this book. New models are also developed using existing ML architecture and learning theories to improve the performance of traditional ML models and handle small and big data without manual adjustments. This research-oriented book will help subsurface engineers, geophysicists, and geoscientists become familiar with data science and ML advances relevant to subsurface engineering. Additionally, it demonstrates the use of data-driven approaches for salt identification, seismic interpretation, estimating enhanced oil recovery factor, predicting pore fluid types, petrophysical property prediction, estimating pressure drop in pipelines, bubble point pressure prediction, enhancing drilling mud loss, smart well completion and synthetic well log predictions.
Data Science and Machine Learning for Non-Programmers: Using SAS Enterprise Miner (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series)
by Dothang Truong
As data continues to grow exponentially, knowledge of data science and machine learning has become more crucial than ever. Machine learning has grown exponentially; however, the abundance of resources can be overwhelming, making it challenging for new learners. This book aims to address this disparity and cater to learners from various non-technical fields, enabling them to utilize machine learning effectively. Adopting a hands-on approach, readers are guided through practical implementations using real datasets and SAS Enterprise Miner, a user-friendly data mining software that requires no programming. Throughout the chapters, two large datasets are used consistently, allowing readers to practice all stages of the data mining process within a cohesive project framework. This book also provides specific guidelines and examples on presenting data mining results and reports, enhancing effective communication with stakeholders. Designed as a guiding companion for both beginners and experienced practitioners, this book targets a wide audience, including students, lecturers, researchers, and industry professionals from various backgrounds.