Hands-on Scikit-Learn for Machine Learning Applications: Data Science Fundamentals with Python

Ebook · 348 pages · 2 hours

About this ebook

With this book, aspiring data science professionals can learn the Scikit-Learn library along with the fundamentals of machine learning. The book combines the Anaconda Python distribution with the popular Scikit-Learn library to demonstrate a wide range of supervised and unsupervised machine learning algorithms. Care is taken to walk you through the principles of machine learning with clear examples written in Python that you can try out and experiment with at home on your own machine.
All applied math and programming skills required to master the content are covered in this book. In-depth knowledge of object-oriented programming is not required, as working and complete examples are provided and explained. The coding examples are in-depth and complex when necessary, yet concise, accurate, and complete, and they complement the machine learning concepts introduced. Working through the examples helps build the skills necessary to understand and apply complex machine learning algorithms.
Hands-on Scikit-Learn for Machine Learning Applications is an excellent starting point for those pursuing a career in machine learning. Students of this book will learn the fundamentals that are a prerequisite to competency. Readers will be exposed to the Anaconda distribution of Python, which is designed specifically for data science professionals, and will build skills in the popular Scikit-Learn library that underlies many machine learning applications in the world of Python.

What You'll Learn
  • Work with simple and complex datasets common to Scikit-Learn
  • Manipulate data into vectors and matrices for algorithmic processing
  • Become familiar with the Anaconda distribution used in data science
  • Apply machine learning with Classifiers, Regressors, and Dimensionality Reduction
  • Tune algorithms and find the best algorithms for each dataset
  • Load data from and save to CSV, JSON, NumPy, and pandas formats

Who This Book Is For
Aspiring data scientists who want to break into machine learning by mastering the underlying fundamentals that are sometimes skipped over in the rush to be productive. Some knowledge of object-oriented programming and very basic applied linear algebra will make learning easier, although anyone can benefit from this book.

Language: English
Publisher: Apress
Release date: Nov 16, 2019
ISBN: 9781484253731

    Book preview

    Hands-on Scikit-Learn for Machine Learning Applications - David Paper

    © David Paper 2020

    D. Paper, Hands-on Scikit-Learn for Machine Learning Applications, https://doi.org/10.1007/978-1-4842-5373-1_1

    1. Introduction to Scikit-Learn

    David Paper, Logan, UT, USA

    Scikit-Learn is a Python library that provides simple and efficient tools for implementing supervised and unsupervised machine learning algorithms. The library is accessible to everyone because it is open source and commercially usable. It is built on the NumPy, SciPy, and Matplotlib libraries, which makes it reliable, robust, and well integrated with the scientific Python ecosystem.

    Scikit-Learn is focused on data modeling rather than data loading, cleansing, munging, or manipulating. It is also easy to use and relatively free of programming bugs.

    Machine Learning

    Machine learning is about getting computers to program themselves. We use algorithms to make this happen. An algorithm is a set of rules a computer follows to calculate or solve a problem.

    Machine learning practitioners create, study, and apply algorithms to improve performance on data-driven tasks. They use tools and technology to answer questions about data by training a machine how to learn.

    The goal is to build robust algorithms that can manipulate input data to predict an output while continually updating outputs as new data becomes available. Any information or data sent to a computer is considered input. Data produced by a computer is considered output.

    In the machine learning community, input data is referred to as the feature set and output data is referred to as the target. The feature set is also referred to as the feature space. Sample data is typically referred to as training data. Once the algorithm is trained with sample data, it can make predictions on new data. New data is typically referred to as test data.
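
    To make this terminology concrete, here is a minimal sketch (my own illustration, not code from the book) that splits the Iris feature set and target into training data and test data with Scikit-Learn's train_test_split:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split

    # feature set X and target y
    X, y = load_iris(return_X_y=True)
    # hold out 20% of the samples as test data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)
    print (X_train.shape, X_test.shape)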

    Machine learning is divided into two main areas: supervised and unsupervised learning. Since machine learning typically focuses on prediction based on known properties learned from training data, our focus is on supervised learning.

    Supervised learning is when the data set contains both inputs (or the feature set) and desired outputs (or targets). That is, we know the properties of the data. The goal is to make predictions. This ability to supervise algorithm training is a big part of why machine learning has become so popular.

    To classify or regress new data, we must train on data with known outcomes. We classify data by organizing it into relevant categories. We regress data by finding the relationship between feature set data and target data.
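
    As a quick illustration of the difference (again a sketch of my own; the estimator choices are arbitrary), a classifier predicts a discrete class while a regressor predicts a continuous value, but both follow the same Scikit-Learn pattern:

    from sklearn.datasets import load_iris
    from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

    X, y = load_iris(return_X_y=True)

    # classify: predict a discrete category (the iris species)
    clf = KNeighborsClassifier().fit(X, y)
    print (clf.predict(X[:1]))

    # regress: predict a continuous value (petal width from the
    # other three measurements)
    reg = KNeighborsRegressor().fit(X[:, :3], X[:, 3])
    print (reg.predict(X[:1, :3]))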

    With unsupervised learning, the data set contains only inputs but no desired outputs (or targets). The goal is to explore the data and find some structure or way to organize it. Although not the focus of the book, we will explore a few unsupervised learning scenarios.
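
    For contrast, here is a minimal unsupervised sketch (my own illustration) that clusters the Iris feature set without ever seeing the targets:

    from sklearn.datasets import load_iris
    from sklearn.cluster import KMeans

    # unsupervised learning: only the feature set, no targets
    X, _ = load_iris(return_X_y=True)
    kmeans = KMeans(n_clusters=3, random_state=0, n_init=10)
    labels = kmeans.fit_predict(X)
    # cluster assignments, not predefined classes
    print (labels[:10])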

    Anaconda

    You can use any Python installation, but I recommend installing Python with Anaconda for several reasons. First, it has over 15 million users. Second, Anaconda allows easy installation of the desired version of Python. Third, it preinstalls many useful libraries for machine learning including Scikit-Learn. Follow this link to see the Anaconda package lists for your operating system and Python version: https://docs.anaconda.com/anaconda/packages/pkg-docs/. Fourth, it includes several very popular editors including IDLE, Spyder, and Jupyter Notebooks. Fifth, Anaconda is reliable and well-maintained and removes compatibility bottlenecks.

    You can easily download and install Anaconda with this link: https://www.anaconda.com/download/. You can update with this link: https://docs.anaconda.com/anaconda/install/update-version/. Just open Anaconda and follow the instructions. I recommend updating to the current version.

    Scikit-Learn

    Python’s Scikit-Learn is one of the most popular machine learning libraries. It is built on Python libraries NumPy, SciPy, and Matplotlib. The library is well-documented, open source, commercially usable, and a great vehicle to get started with machine learning. It is also very reliable and well-maintained, and its vast collection of algorithms can be easily incorporated into your projects. Scikit-Learn is focused on modeling data rather than loading, manipulating, visualizing, and summarizing data. For such activities, other libraries such as NumPy, pandas, Matplotlib, and seaborn are covered as encountered. The Scikit-Learn library is imported into a Python script as sklearn.
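
    A quick sanity check (a sketch of my own, not from the book) confirms that the libraries this chapter relies on are installed and importable:

    import sys
    import sklearn, numpy, scipy, matplotlib

    print ('Python:', sys.version.split()[0])
    print ('Scikit-Learn:', sklearn.__version__)
    print ('NumPy:', numpy.__version__)
    print ('SciPy:', scipy.__version__)
    print ('Matplotlib:', matplotlib.__version__)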

    Data Sets

    A great way to understand machine learning applications is by working through Python data-driven code examples. We use Scikit-Learn, UCI Machine Learning Repository, and seaborn data sets for all examples. The Scikit-Learn data sets package embeds some small data sets for getting started and provides helpers to fetch larger data sets commonly used by the machine learning community to benchmark algorithms on data from the world at large. The UCI Machine Learning Repository maintains 468 data sets to serve the machine learning community. Seaborn provides an API on top of Matplotlib that offers simplicity when working with plot styles, color defaults, and high-level functions for common statistical plot types, which facilitates visualization. It also integrates nicely with pandas DataFrame functionality.
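
    The sketch below (my own illustration; it assumes seaborn is installed and, for the seaborn data set, that a network connection is available on first use) shows the loading routes used throughout the book:

    from sklearn import datasets
    import seaborn as sns

    # small data set embedded in Scikit-Learn
    iris = datasets.load_iris()
    print (iris.data.shape)

    # larger data sets are downloaded by fetchers, e.g.,
    # datasets.fetch_20newsgroups()

    # seaborn data set (downloaded on first use)
    tips = sns.load_dataset('tips')
    print (tips.head())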

    We chose the data sets for our examples because the machine learning community uses them for learning, exploring, benchmarking, and validating, so we can compare our results to others while learning how to apply machine learning algorithms.

    Our data sets are categorized as either classification or regression data. Classification data complexity ranges from simple to relatively complex. Simple classification data sets include load_iris, load_wine, bank.csv, and load_digits. Complex classification data sets include fetch_20newsgroups, MNIST, and fetch_lfw_people. Regression data sets include tips, redwine.csv, whitewine.csv, and load_boston.

    Characterize Data

    Before working with algorithms, it is best to understand each data set's characteristics. Each data set was carefully chosen to help you gain experience with the most common aspects of machine learning. We begin by describing the characteristics of each data set to better understand its composition and purpose. The data sets are organized by classification and regression data.

    Classification data is further organized by complexity. That is, we begin with simple classification data sets that are not complex so that the reader can focus on the machine learning content rather than on the data. We then move on to more complex data sets.

    Simple Classification Data

    Classification is a machine learning technique for predicting the class to which a dependent variable belongs. A class is a discrete response. In machine learning, a dependent variable is typically referred to as the target. A class is predicted based upon the independent variables of a data set. Independent variables are typically referred to as the feature set or feature space. Feature space is the collection of features used to characterize the data.

    Simple data sets are those with a limited number of features. Such a data set is referred to as one with a low-dimensional feature space.

    Iris Data

    The first data set we characterize is load_iris, which consists of Iris flower data. Iris is a multivariate data set consisting of 50 samples from each of three species of iris (Iris setosa, Iris virginica, and Iris versicolor). Each sample contains four features, namely, length and width of sepals and petals in centimeters. Iris is a typical test case for machine learning classification. It is also one of the best known data sets in the data science literature, which means you can test your results against many other verifiable examples.

    The first code example, shown in Listing 1-1, loads the Iris data and displays its keys, the shapes of the feature set and target, the feature and target names, a slice from the DESCR key, and the feature importances (from most to least important).

    from sklearn import datasets
    from sklearn.ensemble import RandomForestClassifier

    if __name__ == "__main__":
        br = '\n'
        # load the Iris data set embedded in Scikit-Learn
        iris = datasets.load_iris()
        keys = iris.keys()
        print (keys, br)
        # feature set X and target y
        X = iris.data
        y = iris.target
        print ('features shape:', X.shape)
        print ('target shape:', y.shape, br)
        features = iris.feature_names
        targets = iris.target_names
        print ('feature set:')
        print (features, br)
        print ('targets:')
        print (targets, br)
        # display a slice of the data set description
        print (iris.DESCR[525:900], br)
        # train a random forest to compute feature importances
        rnd_clf = RandomForestClassifier(random_state=0,
                                         n_estimators=100)
        rnd_clf.fit(X, y)
        rnd_name = rnd_clf.__class__.__name__
        feature_importances = rnd_clf.feature_importances_
        importance = sorted(zip(feature_importances, features),
                            reverse=True)
        print ('most important features' + ' (' + rnd_name + '):')
        [print (row) for i, row in enumerate(importance)]

    Listing 1-1

    Characterize the Iris data set

    Go ahead and execute the code from Listing 1-1. Remember that you can find the example in the book's example download. You don't need to type the example by hand; it's easier to access the example download and copy and paste.

    Your output from executing Listing 1-1 should resemble the following:

    dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

    features shape: (150, 4)
    target shape: (150,)

    feature set:
    ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

    targets:
    ['setosa' 'versicolor' 'virginica']

        ============== ==== ==== ======= ===== ====================
                        Min  Max   Mean    SD   Class Correlation
        ============== ==== ==== ======= ===== ====================
        sepal length:   4.3  7.9   5.84   0.83    0.7826
        sepal width:    2.0  4.4   3.05   0.43   -0.4194
        petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
        petal width:

    most important features (RandomForestClassifier):
    (0.4604447396171521, 'petal length (cm)')
    (0.4241162651271012, 'petal width (cm)')
    (0.09090795402103086, 'sepal length (cm)')
    (0.024531041234715754, 'sepal width (cm)')

    The code begins by importing the datasets and RandomForestClassifier packages. RandomForestClassifier is an ensemble learning method that constructs a multitude of decision trees at training time and outputs the class that is the mode of the classes predicted by the individual trees.

    In this example, we are only using it to return feature importances. The main block begins by loading the data and displaying its characteristics. Loading feature set data into variable X and target data into variable y is a convention in the machine learning community.

    The code concludes by training RandomForestClassifier on the data so that it can return feature importances. When data arrives as a pandas DataFrame, we convert it to NumPy arrays before modeling for optimum performance. Keep in mind that the keys are available because the data set is embedded in Scikit-Learn.
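
    A tiny sketch of that conversion (my own illustration with made-up values):

    import pandas as pd

    # hypothetical DataFrame standing in for loaded data
    df = pd.DataFrame({'sepal length (cm)': [5.1, 4.9],
                       'petal length (cm)': [1.4, 1.4]})
    X = df.values  # NumPy array holding the same data
    print (type(X), X.shape)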

    Notice that we only took a small slice from DESCR, which holds a lot of information about the data set. I always recommend displaying at least the shape of the original data set before embarking on any machine learning experiment.

    Tip

    RandomForestClassifier is a powerful machine learning algorithm that not only models training data but also returns feature importances.

    Wine Data

    The next data set we characterize is load_wine. The load_wine data set consists of 178 data elements. Each element has thirteen features, and each element belongs to one of three target classes. The data set is considered a classic in the machine learning community and offers an easy multiclass classification problem.

    The next code example, shown in Listing 1-2, loads the wine data and displays its keys, the shapes of the feature set and target, the first feature vector, the feature and target names, and the feature importances (from most to least important).

    from sklearn.datasets import load_wine
    from sklearn.ensemble import RandomForestClassifier

    if __name__ == "__main__":
        br = '\n'
        # load the wine data set embedded in Scikit-Learn
        data = load_wine()
        keys = data.keys()
        print (keys, br)
        # feature set X and target y
        X, y = data.data, data.target
        print ('features:', X.shape)
        print ('targets', y.shape, br)
        # display the first feature vector
        print (X[0], br)
        features = data.feature_names
        targets = data.target_names
        print ('feature set:')
        print (features, br)
        print ('targets:')
        print (targets, br)
        # train a random forest to compute feature importances
        rnd_clf = RandomForestClassifier(random_state=0,
                                         n_estimators=100)
        rnd_clf.fit(X, y)
        rnd_name = rnd_clf.__class__.__name__
        feature_importances = rnd_clf.feature_importances_
        importance = sorted(zip(feature_importances, features),
                            reverse=True)
        # display the n most important features
        n = 6
        print (n, 'most important features' + ' (' + rnd_name + '):')
        [print (row) for i, row in enumerate(importance) if i < n]

    Listing 1-2

    Characterize load_wine

    After executing code from Listing 1-2, your output should resemble the following:

    dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names'])

    features: (178, 13)
    targets (178,)

    [1.423e+01 1.710e+00 2.430e+00 1.560e+01 1.270e+02 2.800e+00 3.060e+00
     2.800e-01 2.290e+00 5.640e+00 1.040e+00 3.920e+00 1.065e+03]

    feature set:
    ['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']

    targets:
    ['class_0' 'class_1' 'class_2']

    6 most important features (RandomForestClassifier):
    (0.19399882779940295, 'proline')
    (0.16095401215681593, 'flavanoids')
    (0.1452667364559143, 'color_intensity')
    (0.11070045042456281, 'alcohol')
    (0.1097465262717493, 'od280/od315_of_diluted_wines')
    (0.08968972021098301, 'hue')

    Tip

    To create (instantiate) a machine learning algorithm (model), just assign it to a variable (e.g., model = algorithm()). To train the model, just fit it to the data (e.g., model.fit(X, y)).
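
    A minimal sketch of that pattern (my own illustration; the choice of estimator is arbitrary):

    from sklearn.datasets import load_wine
    from sklearn.linear_model import LogisticRegression

    X, y = load_wine(return_X_y=True)
    model = LogisticRegression(max_iter=5000)  # instantiate
    model.fit(X, y)                            # train
    print (model.score(X, y))                  # accuracy on the training data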

    The code begins by importing load_wine and RandomForestClassifier. The main block displays the keys, loads data into X and y, displays the shapes, displays the first vector from feature set X, and displays feature set and target information. The code concludes by training RandomForestClassifier on X and y so that we can display the six most important features. Notice that we display the first vector from feature set X to verify that all features are numeric.

    Bank Data

    The next code example, shown in Listing 1-3, works with bank data. The bank.csv data set is composed of data from direct marketing campaigns of a Portuguese banking institution. The target describes whether a client will subscribe (yes/no) to a term deposit (target label y). The data set consists of 41188 data elements with 20 features for each element. A 10% random sample of 4119 data elements is also available from the UCI Machine Learning Repository for more computationally expensive algorithms such as svm and KNeighborsClassifier.

    import pandas as pd

    if __name__ == "__main__":
        br = '\n'
        # load bank data from CSV into a pandas DataFrame
        f = 'data/bank.csv'
        bank = pd.read_csv(f)
        # column names (the features plus target column y)
        features = list(bank)
        print (features, br)
        # feature set X and target y as NumPy arrays
        X = bank.drop(['y'], axis=1).values
        y = bank['y'].values
        print (X.shape, y.shape, br)
        # display a few choice features
        print (bank[['job', 'education', 'age', 'housing',
                     'marital', 'duration']].head())

    Listing 1-3

    Characterize bank data

    After executing code from Listing 1-3, your output should resemble the following:

    ['age', 'job', 'marital', 'education', 'default', 'housing', 'loan', 'contact', 'month', 'day_of_week', 'duration', 'campaign', 'pdays', 'previous', 'poutcome', 'emp.var.rate', 'cons.price.idx', 'cons.conf.idx', 'euribor3m', 'nr.employed', 'y']

    (41188, 20) (41188,)

             job    education  age housing  marital  duration
    0  housemaid     basic.4y   56      no  married       261
    1   services  high.school   57      no  married       149
    2   services  high.school   37     yes  married       226
    3     admin.     basic.6y   40      no  married       151
    4   services  high.school   56      no  married       307

    The code example begins by importing the pandas package. The main block loads bank data from a CSV file into a pandas DataFrame and displays the column names (or features). To retrieve the column names, all we need to do is pass the DataFrame to list() and assign the result to a variable. Next, feature set X and target y are created. Finally, the X and y shapes are displayed along with a few choice features.

    Digits Data

    The final code example in this
