Lec1 24th Nov
WILP
AIML CLZG516
ML System Optimization
Murali Parameswaran
[email protected]
BITS Pilani
Pilani Campus
Distributed Computing
AIML CLZG516
ML System Optimization
Session 1
Course Introduction
Sec-1: Prof. Madhusudhanan B {[email protected]}
Sec-2: Prof. Murali P {[email protected]} (I-C)
Lifecycle of New Technologies
[lifecycle diagram with stages: Scientist, Research, Hype, Marketing, Engineering, Manufacturing, Commerce, Commodity]
Lifecycle of AI/ML – Where are we ?
ChatGPT?
[same lifecycle diagram: where does AI/ML sit today?]
Machine Learning – Enterprise Practice
• AI and Machine Learning are becoming central to organizations:
• No longer a one-off activity
• Multiple problems / perspectives addressed through ML
• Multiple ML solutions deployed
• ML is becoming a continual activity:
• Data change; Context changes
• Drift in the solution
• Problems change; Requirements change;
• New model(s) required
• World changes; Expectations change
• Performance and Standardization critical ==>
• Packaging vs. Pricing
Operationalizing AI/ML
Deployment
Develop, Deploy, and Infer
Internal Compliance
Regulatory Compliance
ML is part of an application
Hardware/Platform to use
Will our model fit in the pipeline?
Operationalizing AI/ML
Deployment
Software pipelining:
Test whether the model is working within the pipeline:
Whether inputs come in the same form
Whether inputs come in the order needed for your model to respond
Whether responses are being consumed appropriately
Whether responses are to be consumed one at a time, or a whole sequence of responses has to be consumed
May need to return to training after validation/testing.
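These checks can be sketched as simple pre-deployment assertions. The field names (`user_id`, `timestamp`, `features`) and the two validation rules below are illustrative assumptions, not from any specific framework:

```python
# Sketch: pre-deployment checks that a model fits its pipeline.
# The schema and the ordering rule (sorted by timestamp) are assumptions.
EXPECTED_FIELDS = ["user_id", "timestamp", "features"]   # assumed input schema

def validate_request(request: dict) -> None:
    # Do inputs come in the same form the model was trained on?
    missing = [f for f in EXPECTED_FIELDS if f not in request]
    if missing:
        raise ValueError(f"missing fields: {missing}")

def validate_order(requests: list) -> None:
    # Do inputs arrive in the order the model needs (here: by timestamp)?
    stamps = [r["timestamp"] for r in requests]
    if stamps != sorted(stamps):
        raise ValueError("requests are out of order")

batch = [
    {"user_id": 1, "timestamp": 10, "features": [0.1]},
    {"user_id": 2, "timestamp": 20, "features": [0.4]},
]
for r in batch:
    validate_request(r)
validate_order(batch)
print("pipeline checks passed")
```

In practice such checks run in integration tests before deployment; a failure here is what sends the team back to training or to reworking the pipeline interface.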
Machine Learning – Enterprise Practice-Recap
• AI and Machine Learning are becoming central to organizations:
• No longer a one-off activity
• Multiple problems / perspectives addressed through ML
• Multiple ML solutions deployed
• ML is becoming a continual activity:
• Data change; Context changes
• Drift in the solution
• Problems change; Requirements change;
• New model(s) required
• World changes; Expectations change
• Performance and Standardization critical ==>
• Packaging vs. Pricing
• Depends on competition
• Cost for model inference
Operationalizing AI/ML
• Example
• E.g. SVM has a time complexity between O(d·N²) and O(d·N³), where
• d is the number of dimensions (of the data points) and
• N is the number of data points
• For a large dataset, say N = 10⁹ and d = 5, this could be costly:
• Assuming 2 simple arithmetic operations per data point:
• this amounts to at least 10¹⁹ (= 5 · 2 · 10⁹ · 10⁹) operations
• Given a 2.5 GHz processor, i.e. a 0.4 ns clock cycle,
• and 1 CPI (i.e. cycles per instruction), a measure of processor throughput
• [simplistic but close to reality!]
• 10¹⁹ operations will take close to 127 years on a single core
• Reducing running time during training is a big focus in this course!
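The back-of-envelope arithmetic above is easy to verify in a few lines, assuming the stated single-core, 1-CPI, 2.5 GHz figures:

```python
# Back-of-envelope estimate of SVM training time at O(d * N^2).
d = 5                  # dimensions per data point
N = 10**9              # number of data points
ops_per_point = 2      # simple arithmetic operations per point (assumed)
clock_hz = 2.5e9       # 2.5 GHz processor, CPI = 1, single core

total_ops = ops_per_point * d * N * N       # 10^19 operations
seconds = total_ops / clock_hz              # ~4e9 seconds
years = seconds / (365 * 24 * 3600)
print(f"{total_ops:.1e} ops -> about {years:.0f} years on one core")
```

The point is not the exact number but the order of magnitude: single-core execution is hopeless at this scale, which motivates the parallel and distributed techniques below.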
Reducing running time
• Typical methods:
• Parallelize or distribute computation:
• Multi-threaded programming on multi-core processors
• Massively multi-threaded programming on many-core GPGPUs
• Distributed Programming on Scale-out Clusters of CPUs or GPUs
• Hand-tuning or compiler-performed code optimization
• Rewritten for parallelism or generated by compilers
• Process = Program + Address Space (at run time)
• Threads share address space:
• Each thread gets its own call stack
• Heap and global area are shared by all threads
• Threads run on a shared-memory model (e.g., multi-core and many-core processors)
• Distributed programming runs on distributed memory, i.e., the memory of multiple computers (each a processor + memory + disk + OS)
Cost during training
• Megatron-Turing NLG:
• 530 billion parameters
• Microsoft and Nvidia claim to have used hundreds of DGX A100 servers
• Each server costs ~$200,000
• Adding networking costs, the infrastructure cost is ~$100M
• Each server consumes 6.5 kW of power
• Add a comparable cooling cost!
• We will NOT do much about power consumption in this course!
• But we will look at reducing model size as an important aspect!
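Under the stated figures, the cost arithmetic can be sketched in a few lines; the server count of 500 is an assumption standing in for "hundreds":

```python
# Rough infrastructure estimate for Megatron-Turing NLG-scale training.
n_servers = 500                      # assumed; the claim is only "hundreds"
server_cost_usd = 200_000            # ~$200k per DGX A100 server
power_per_server_kw = 6.5            # per-server draw

hardware_usd = n_servers * server_cost_usd   # $100M before networking
power_kw = n_servers * power_per_server_kw   # servers alone; cooling is extra

print(f"hardware ~${hardware_usd/1e6:.0f}M, power ~{power_kw:.0f} kW")
```

Even with generous rounding, hardware alone reaches the quoted ~$100M, and the multi-megawatt power draw (before cooling) shows why model-size reduction matters.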
Sizes of NLP models over the years
Model Size
• LLMs (Large Language Models) like GPT-4 and Bard are notoriously
large.
• But there are systematic approaches to reduce model size
• Without compromising the accuracy too much.
• We will look at model compression in this course!
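As a taste of model compression, one common technique is post-training 8-bit quantization of the weights. The sketch below is a minimal, framework-free version; the matrix shape and the symmetric scaling scheme are illustrative assumptions, not any particular library's API:

```python
# Minimal sketch: post-training symmetric int8 quantization of a weight matrix.
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.5, size=(256, 256)).astype(np.float32)

# Map [-max|w|, +max|w|] linearly onto the int8 range [-127, 127].
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)   # 4x smaller than float32
dequant = q.astype(np.float32) * scale          # approximate reconstruction

err = np.abs(weights - dequant).max()
print(f"size: {weights.nbytes} -> {q.nbytes} bytes, max error {err:.4f}")
```

Storage drops 4x (int8 vs. float32) while the worst-case reconstruction error stays below half a quantization step, which is the sense in which compression need not compromise accuracy too much.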
Cost during Deployment
• Ans.
• Part (A): If the model server is parallel, multiple threads or processes can respond in parallel, thereby improving response time and throughput.
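A minimal sketch of that idea, using a thread pool as the parallel model server; the `infer` stub and its 50 ms latency are placeholders, not a real model:

```python
# Sketch: a pool of workers serving inference requests concurrently.
from concurrent.futures import ThreadPoolExecutor
import time

def infer(request_id):
    time.sleep(0.05)          # stand-in for model inference latency
    return request_id * 2     # stand-in for a prediction

requests = list(range(8))

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(infer, requests))
elapsed = time.perf_counter() - start

# 8 requests x 50 ms each would take ~0.4 s serially;
# with 4 workers the wall-clock time is roughly a quarter of that.
print(results, f"{elapsed:.2f}s")
```

Because the stub sleeps rather than computes, threads suffice here; a CPU-bound model in Python would typically use processes (or a server that releases the GIL) to get the same effect.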