by Ayush Jain
© 2023 AIM Media House LLC and/or its affiliates. All rights reserved. For more
information, email [email protected] or visit aimresearch.co.
September 2023
RESEARCH
LLM Economics
Table of Contents
Foreword............................................................................................................................................................3
Executive Summary.......................................................................................................................................4
Introduction......................................................................................................................................................5
Cost Analysis....................................................................................................................................................13
Self-hosting LLMs.............................................................................................................................................................................21
Conclusion .........................................................................................................................................................36
Foreword
Generative Artificial Intelligence (AI) has gained
significant attention for its potential to transform
various industries. Some of the ways that an
organisation can use generative AI are - Personalising
customer experiences, streamlining operations and
efficiency, enhancing decision-making, preserving
privacy and security, fraud detection and
cybersecurity. However, most organisations are
encountering challenges when implementing
generative AI in their systems. Understanding the
costs involved and developing sustainable solutions is
crucial for organisations looking to leverage
generative AI effectively.
3 LLM Economics
AIM Research
Executive Summary
As we find ourselves amidst the 'ChatGPT moment', LLMs stand at the fulcrum of a transformative wave, prompting industry leaders to regard this development as a powerful tool to 'reduce costs and increase profits'. But the market does not seem to be unfolding as per the hype. A comprehensive understanding of the infrastructure necessary to maximize the potential of this new domain, including insights into the cost-benefit ratio, pertinent use cases, and the motivations driving organizations to adopt such tools, remains elusive.

On top of that, Gartner's recent research forecasts a significant slowdown in enterprise deployments in the general AI space. The study projects that over the next two years the overwhelming costs will exceed the value generated, culminating in about 50% of large enterprises abandoning their large-scale AI model developments by 2028.

To get to the crux of reality, AIM Research hosted a roundtable discussion comprising several AI leaders from different industries working in this space. Here are some key insights that came to light:

- Identifying the appropriate use case with quantifiable business benefits is critical. It involves understanding the technology's capabilities and aligning them with business objectives.

- The future of AI seems to be leaning towards agent technology, where multiple AI agents work together to achieve specific tasks instead of a single AI entity handling all tasks. These technologies would be industry-specific and would collaborate similarly to a human mind, although achieving this level of integration and function is still a far-off goal.

- Organizations are evaluating both API and open-source options for AI integration, weighing factors like speed to market, customization, and regulatory requirements. While APIs might be favored for pilot projects due to their quick deployment, open source might be the choice for full-fledged production, offering better audit facilities and customization options.

Thus, a report like this can serve as a vital tool in this process, helping stakeholders assess the potential costs and benefits associated with different implementation strategies, whether through API or open-source pathways. It clarifies the complexities of both direct and indirect costs, facilitating smarter decisions that consider factors such as quick deployment and customization options. Ultimately, it can guide organizations in choosing the most suitable and cost-effective solutions for AI integration.
Introduction
While the allure of Generative AI in enterprise solutions is undeniable, there exists a cloud of uncertainty surrounding its actual costs of implementation. Many enterprises struggle with the less concrete parts of using AI, such as getting to know the complex technology, the unpredictable nature of generative models, and issues related to data privacy and control. However, the main worry is the financial aspect.

The direct and indirect costs associated with integrating AI into production systems are ambiguous, often leading to misconceptions. For businesses, especially SMEs, deciphering these costs is crucial, from initial deployment to the long-term aspects of maintenance, updates, data management, and security. The rapid evolution of AI technologies further compounds this challenge, as models need frequent monitoring, updating, and retraining to stay effective.

This ambiguity calls for comprehensive research that can demystify the various components influencing the cost of implementing Generative AI. By breaking down the myriad elements, from external APIs to self-hosting open-source models on the cloud, a clearer picture can emerge, dispelling myths and giving organizations a more grounded understanding. Such research offers a reality check against the myriad numbers often cited, guiding enterprises in their AI endeavors with more precision and confidence.

"I believe it's crucial to begin and experiment. Given this is a new space, from a CXO's perspective the right thing to do would be to focus on internal use cases initially, as they would carry less risk. There are numerous scenarios where this can make a significant difference. Start there to build momentum and gain experience, and then transition to tackling more impactful use cases, including customer-facing ones."
- Arvind Mathur, Chief Information Officer AMEA at Kellogg's
Methodology
The research design for this study employs a case study approach. We consider four use cases from different industries within the MarTech lifecycle and calculate estimated costs under various implementation scenarios. The research is then validated through secondary studies, consultations with industry experts, and focus groups. This multi-faceted approach bridges the gap between theoretical estimations and the practical realities of developing these models for enterprise use cases.

Additionally, for each use case, we estimate the cost of developing it as a chatbot.

"From a marketing communication perspective, I expect generative AI implementations to happen sooner. This is because they don't require constant changes. I envision numerous use cases emerging within the next year, with a lot more industries coming up around that as well."
Research Objectives
The research objectives for this study are as follows:
Section 1: Defining the Use Case - We lay the groundwork with descriptive case studies that shed light on different real-world applications. This is to arrive at a realistic estimation of how many tokens are generated for each use case.
Section 2: Cost Analysis - Next, we conduct a detailed cost analysis, focusing on both API and Cloud GPU pathways to provide a balanced view. At the end of each analysis, you'll see a visual representation of the costs over and above simple integration.
Section 3: Strategies to Reduce Cost - In this section, we explore potential strategies to trim costs effectively
without compromising output quality.
We have also developed a cost calculator tool to facilitate an easy estimation of approximate costs for your
specific use case. You can access this tool [here].
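The arithmetic behind such a calculator is straightforward. The sketch below shows the shape of it in Python; the tokens-per-word ratio and the per-1K-token prices are illustrative assumptions, not quoted rates from any provider.

```python
# Rough cost calculator for an API-based LLM use case.
# All prices and ratios below are illustrative assumptions.

def estimate_tokens(words: int, tokens_per_word: float = 1.33) -> int:
    """English text averages roughly 1.3 tokens per word."""
    return round(words * tokens_per_word)

def monthly_api_cost(input_tokens: int, output_tokens: int,
                     input_price_per_1k: float = 0.0015,
                     output_price_per_1k: float = 0.002) -> float:
    """Cost = tokens / 1000 * per-1K-token price, summed over input and output."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k

# Example: 100k requests/month, 300-word prompts, 150-word responses
reqs = 100_000
inp = estimate_tokens(300) * reqs  # ~39.9M input tokens
out = estimate_tokens(150) * reqs  # ~20M output tokens
print(f"${monthly_api_cost(inp, out):,.2f} per month")
```

Swapping in a provider's actual per-token prices and a measured token count per request turns this into a first-order monthly estimate.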
Chapter 1
Defining the Use Case
Exhibit 1
Implementing Generative AI Solutions Across the MarTech Lifecycle

Use Case: Churn Analysis & Incentive-Based Win-Back in BFSI
Industry Example: JPMorgan Chase's IndexGPT application
Prompting/Finetuning Technique: Few-shot learning, temperature tuning
Timeframe: 8-12 months from research to on-ground implementation
Exhibit 2
Comparing External API and Self-hosted Models

Customization (advantage: self-hosted): APIs might not provide as much flexibility or customization compared to self-hosted solutions.
Maintenance (advantage: external API): Self-hosting requires ongoing attention to keep the systems running smoothly and securely.
Chapter 2
Cost Analysis
The cost analysis section comprises several use cases in which the costs of using an external API are compared with those of self-hosting on cloud GPUs. This section also explores potential areas for cost optimization.
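One way to frame the comparison that follows is as a break-even calculation: a pay-per-token API scales with volume, while a rented GPU is a largely fixed monthly cost. The sketch below uses hypothetical prices to show the shape of the calculation; the $/1K-token rate and GPU rental rate are assumptions, not figures from this report's analysis.

```python
# Break-even sketch: at what monthly token volume does self-hosting
# on a rented GPU undercut a pay-per-token API? Figures are illustrative.

def api_cost(tokens: int, price_per_1k: float = 0.002) -> float:
    """Pay-per-token API: cost scales linearly with volume."""
    return tokens / 1000 * price_per_1k

def self_host_cost(gpu_hourly: float = 2.0, hours: float = 730) -> float:
    """Fixed monthly cost of keeping one GPU instance up (~730 h/month)."""
    return gpu_hourly * hours

def breakeven_tokens(price_per_1k: float = 0.002,
                     gpu_hourly: float = 2.0, hours: float = 730) -> float:
    """Token volume at which the two options cost the same."""
    return self_host_cost(gpu_hourly, hours) / price_per_1k * 1000

print(f"Break-even at ~{breakeven_tokens() / 1e6:.0f}M tokens/month")
```

Below the break-even volume the API is cheaper; above it, self-hosting starts to pay off, provided utilization is high enough to actually consume that volume.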
Step 3
Calculate the expected cost of Fine-tuning and Inference

[Table: for each model, the fine-tuning input and output usage, the input cost for 60 million tokens, the output cost for 120 million tokens, and the total cost of inference for 1 month.]
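The table's arithmetic can be sketched as follows. The model names and per-1K-token prices here are placeholders, not the figures used in the analysis; only the structure (input cost for 60M tokens plus output cost for 120M tokens gives the monthly inference total) mirrors the table.

```python
# Mirrors the table's columns: input cost for 60M tokens + output cost
# for 120M tokens = total cost of inference for one month.
# Prices are placeholder values, not real rate cards.

PRICES = {  # model -> (input $/1K tokens, output $/1K tokens), assumed
    "model-a": (0.0015, 0.0020),
    "model-b": (0.0030, 0.0060),
}

def monthly_inference_cost(model: str,
                           input_tokens: int = 60_000_000,
                           output_tokens: int = 120_000_000) -> float:
    in_price, out_price = PRICES[model]
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price

for model in PRICES:
    print(model, f"${monthly_inference_cost(model):,.2f}/month")
```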
Step 2
Calculate the expected cost of Inference
Vector DB

[Table: embedding usage, dimensions, usage cost per month, and monthly cost for 150 QPS*.]

*This is just an average estimate; the cost will vary based on the kind of database used.
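A back-of-the-envelope version of this estimate treats the monthly bill as storage for the embeddings plus serving capacity for the query load. The unit prices below are assumptions; as the footnote notes, real pricing varies widely by database.

```python
# Rough vector-database monthly cost: storage + query-rate capacity.
# Unit prices are assumptions; real provider pricing varies widely.

def vector_db_monthly_cost(n_vectors: int, dimensions: int, qps: float,
                           storage_per_gb: float = 0.25,
                           cost_per_qps: float = 0.50) -> float:
    # float32 embeddings: 4 bytes per dimension per vector
    storage_gb = n_vectors * dimensions * 4 / 1e9
    return storage_gb * storage_per_gb + qps * cost_per_qps

# Example: 10M vectors of 1536 dimensions, 150 QPS sustained
print(f"${vector_db_monthly_cost(10_000_000, 1536, 150):,.2f}/month")
```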
[Figure: cost build-up over the theoretical minimum cost (~+70%), driven by embedding cost, prompt length, and fine-tuning iterations. Upstream factors are decisions that are already locked in to the cost; downstream factors are degrees of freedom that are still available to affect the cost.]
Self-hosting LLMs
Use Case 3 - Churn Analysis and Incentive-Based Win-Back in BFSI

Step 2: Calculate the number of tokens generated for Inference
Step 3
Calculate the expected cost of Fine-tuning and Inference
Self-hosting LLMs
Use Case 4 - Content Personalization in
FMCG
Step 2
Calculate the expected cost of Inference
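For self-hosting, the natural unit of cost is GPU time rather than billed tokens, so converting a rental rate into an effective per-token price makes the pathway directly comparable with APIs. The throughput, rental rate, and utilization figures below are assumptions for illustration.

```python
# Self-hosted inference cost is driven by GPU time, not tokens billed.
# Converting GPU rental to an effective per-token price makes the two
# pathways comparable. Throughput and rental figures are assumptions.

def cost_per_million_tokens(gpu_hourly: float, tokens_per_second: float,
                            utilization: float = 0.5) -> float:
    """Effective $ per 1M generated tokens at a given utilization."""
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hourly / tokens_per_hour * 1_000_000

# Example: a $2.50/h GPU sustaining 40 tokens/s at 50% utilization
print(f"${cost_per_million_tokens(2.50, 40):,.2f} per 1M tokens")
```

Note how sensitive the result is to utilization: a GPU that sits idle half the time doubles the effective per-token cost, which is why traffic shape matters as much as raw throughput.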
Self-hosting LLMs
Optimizing Cost Using Optimization Libraries
Self-hosting LLMs
Balancing Costs in Self-hosting
Exhibit 4
As the complexity of the use case increases, it will affect each of these cost
variables to different degrees.
[Figure: cost build-up over the theoretical minimum cost (~+55%), driven by sampled responses, infrastructure, and fine-tuning iterations. Upstream factors are decisions that are already locked in to the cost; downstream factors are degrees of freedom that are still available to affect the cost.]
Chapter 3
Strategies to Reduce Cost
Optimize Hardware Resources: Choose hardware resources that match the model's requirements without overprovisioning. This involves selecting the right CPU, GPU, or TPU configurations based on the model's complexity and workload, minimizing unnecessary expenses.

Use Efficient Model Architectures: Explore efficient model architectures that strike a balance between performance and resource consumption. Smaller or lightweight models can still deliver satisfactory results for many use cases while reducing computational costs.

Caching Responses: Employ caching mechanisms to store and reuse frequently generated responses, reducing the need for repetitive computations and lowering overall processing costs.

Monitoring and Optimization: Continuously monitor resource utilization and model performance to identify areas for improvement. Fine-tune the model and infrastructure configurations based on real-time data to maximize efficiency.
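The caching strategy above can be as simple as an in-memory map keyed on a normalized prompt. A minimal sketch follows; `call_model` is a stand-in for the actual (expensive) LLM call, not a real API.

```python
# Minimal response cache for the "Caching Responses" strategy: identical
# prompts are served from memory instead of re-hitting the model.

import hashlib

_cache: dict[str, str] = {}

def call_model(prompt: str) -> str:
    return f"response to: {prompt}"  # stand-in for the expensive call

def cached_completion(prompt: str) -> str:
    # Normalize whitespace so trivial prompt variants still hit the cache
    key = hashlib.sha256(" ".join(prompt.split()).encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(prompt)
    return _cache[key]

cached_completion("What is churn?")    # computed once
cached_completion("What  is churn? ")  # whitespace variant: cache hit
print(len(_cache))  # 1 entry
```

Production systems would add an eviction policy and a TTL, and may cache semantically similar prompts via embeddings rather than exact matches, but the cost mechanics are the same: every hit is a model call avoided.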
Chapter 4
Roadmap for
Implementation
In this section, we outline a roadmap to financially sustainable
implementations of text-based Large Language Models (LLMs).
Focusing on long-term cost-effectiveness, we will explore strategies
for optimizing expenses and resource allocation, ensuring a
balanced budget while maintaining optimal performance.
[Figure: implementation decision tree covering application type, project scale, tolerance for latency, and data security. Broadly: small-to-medium projects with high tolerance for latency point to a pre-trained API with reasonable rate limits, or to embeddings for simpler tasks; larger or more complex use cases point to a self-hosted model with fine-tuning; and where data must be kept secure, self-hosted models with powerful GPUs are indicated.]
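The roadmap's decision flow can be encoded as a rough rule of thumb. The branch conditions below are our reading of the simplified tree, not a prescriptive policy.

```python
# The roadmap's decision tree as a rule of thumb. Branch labels are an
# interpretation of the (simplified) figure, not a prescriptive policy.

def recommend(project_scale: str, latency_tolerance: str,
              secure_data: bool) -> str:
    if secure_data:
        return "self-hosted model on dedicated GPUs"
    if project_scale in ("small", "medium") and latency_tolerance == "high":
        return "pre-trained API (or embeddings for simpler tasks)"
    return "self-hosted model with fine-tuning"

print(recommend("small", "high", secure_data=False))
```

Encoding the tree this way also makes its gaps visible: real decisions weigh budget, compliance regimes, and in-house skills, none of which fit in a three-branch conditional.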
Conclusion
In conclusion, as we step into a period that is
expected to be quite dynamic in the coming years, it's
crucial to keep a close eye on the changes happening
in the marketing departments of various industries.
These shifts, spurred by generative AI advancements,
are setting the stage for deeper and more
personalized connections with customers.
Acknowledgements
The preparation of this report was greatly enriched
by the invaluable contributions of numerous
professionals including Narasimha Medeme, VP
Head Data Science at MakeMyTrip, Ashwin Swarup,
VP Data Science at NimbleWork.Inc., Anirban Nandi,
VP AI Products & Business Analytics at Rakuten India,
Sourav Banerjee, Head of Innovation at
TheMathCompany, and Srinath Sivalenka, Senior
Manager - Generative AI Capabilities, who generously
dedicated their time and expertise to the peer-review
process.
Organisations across the
world utilize us for advice and
tools to lead their digital
transformation using data.
aimresearch.co
© 2023 AIM Media House LLC and/or its affiliates. All rights reserved.
Images or text from this
publication may not be reproduced or distributed in any
form without prior written permission from Analytics India Magazine.
The information contained in this publication has been obtained from
sources believed to be reliable. Analytics India Magazine disclaims all
warranties as to the accuracy, completeness, or adequacy of such
information and shall have no liability for errors, omissions, or
inadequacies in such information. This publication consists of the
opinions of Analytics India Magazine and should not be construed as
statements of fact. The opinions expressed herein are subject to
change without notice.
RESEARCH
AIM India
#280, 2nd floor, 5th Main, 15 A cross,
Sector 6, HSR layout Bengaluru, Karnataka
560102
AIM Americas
2955, 1603 Capitol Avenue, Suite 413A,
Cheyenne, WY, Laramie, US, 82001
www.aimresearch.co