PWC AI Engineer Interview Assignment Guidelines


Serverless GenAI Question Answering application using AWS CDK, AWS Lambda, Poetry, Langchain, OpenAI, Pinecone and Gradio

This post demonstrates building a basic Document Q&A Generative AI application using AWS serverless technologies, Langchain, AWS Cloud Development Kit (CDK), Poetry, OpenAI, Pinecone and Gradio.
Large Language Models (LLMs) perform quite well on general knowledge retrieval tasks because they have been trained on a large corpus of openly available internet data. However, in an enterprise environment, or in use cases where we need to ask questions on more specialized topics, a general-purpose LLM struggles to produce precise responses. LLMs can be applied to more complex, knowledge-intensive tasks using an approach called Retrieval-Augmented Generation (RAG). RAG combines retrieval-based methods with generative language models to improve the quality of generated text, especially in question answering and text generation tasks.
For the purposes of this example, we will build a simple system to do question-answering on openly available clinical trial protocol documents (https://clinicaltrials.gov/). To set up the knowledge base for the RAG approach, we will build a data pipeline that ingests a clinical protocol document (PDF format), converts the PDF to text, chunks the document into smaller pieces, converts the chunks into embeddings using an embedding model, and stores the embeddings in a vector database.
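The chunking step of the pipeline can be sketched in a few lines. This is a minimal illustration, not the Langchain text splitter the pipeline actually uses (Langchain's splitters add recursion over separators and token-aware sizing); the chunk_size and overlap values are illustrative.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping fixed-size character chunks, so content
    cut at a chunk boundary still appears whole in the neighboring chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

The overlap is what preserves sentences that straddle a boundary: the tail of each chunk is repeated at the head of the next one.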
Now, when we ask the LLM a question about the clinical trial, we can provide additional context by including relevant chunks of the document in the prompt. This can be done with a framework such as Langchain, which takes the user question, converts it into a vector representation, performs a semantic search against the knowledge base, retrieves the relevant chunks, ranks them by relevance, and sends the chunks as context in the prompt to the LLM.
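Stripped of the framework, the last step reduces to prompt assembly. A minimal sketch follows; the instruction wording and separators are illustrative, and Langchain's "stuff" chain (used later in this post) does essentially this internally:

```python
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Assemble a RAG prompt: retrieved context first, then the user question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```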

Architecture
The following diagram shows a high-level architecture of the system:

PwC | GenAI Question-Answering Application Using AWS Serverless Technologies


AWS Services and OSS Frameworks Used

AWS Cloud Development Kit v2 (CDK): Framework for defining cloud infrastructure in code and provisioning it through infrastructure-as-code. It enables developers to use programming languages like Python, TypeScript and Java to define AWS resources in a higher-level, declarative manner. It provides benefits such as code completion in an IDE instead of wrangling with YAML files, sensible out-of-the-box default configurations, declarative syntax, reusable code through constructs, and built-in leading practices.
Pinecone: Vector engine to store and search high-dimensional vector embeddings in real time. You could also use AWS alternatives for the vector database. AWS recently announced the preview launch of a vector engine for Amazon OpenSearch Serverless (https://aws.amazon.com/opensearch-service/serverless-vector-engine/). Other options available on AWS include Amazon Aurora PostgreSQL 15.3 with the pgvector extension.
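What a vector database does at query time can be illustrated with a toy in-memory index; a real engine such as Pinecone uses approximate nearest-neighbor structures to make this fast at scale. The ids and two-dimensional vectors below are made up, standing in for the 1536-dimensional embeddings used later.

```python
import math

def euclidean(a: list[float], b: list[float]) -> float:
    """Euclidean distance, the metric we select when creating the Pinecone index."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query: list[float], index: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the ids of the k stored vectors closest to the query."""
    ranked = sorted(index, key=lambda item: euclidean(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy "embeddings": chunk-1 and chunk-3 point in a similar direction.
toy_index = [
    ("chunk-1", [0.9, 0.1]),
    ("chunk-2", [0.1, 0.9]),
    ("chunk-3", [0.8, 0.2]),
]
```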
LangChain: LangChain is a framework for developing applications powered by language models. It enables developers to combine LLMs with external data to create custom-knowledge chatbots, question-answering systems, summarization tools, and other applications that interact with users in natural language. We will build an AWS Lambda layer that includes Langchain and other relevant libraries for the backend systems to use as part of the application.
Poetry: Poetry is a dependency management and packaging tool for Python. It aims to simplify and improve the process of managing project dependencies, creating virtual environments, and packaging Python applications and libraries. Poetry provides a unified solution that combines the capabilities of existing tools like pip, virtualenv and setuptools into a single workflow. Developers declare project dependencies in a dedicated pyproject.toml file; Poetry performs dependency resolution to confirm that the specified packages and their versions are compatible with each other, and generates a lock file, poetry.lock, which records the exact versions of the dependencies used in the project.
Gradio: Gradio is an open-source Python framework for creating interactive web applications for data science and machine learning projects with minimal effort. It lets users build dynamic web interfaces directly from their Python code, without extensive web development expertise.

Backend Data Ingestion Pipeline


•	We will start by installing the AWS CDK and creating a template CDK project.
Install the AWS CDK Toolkit globally using the following Node Package Manager command.

Unset
npm install -g aws-cdk

Run the following command to verify you installed the tool correctly and print the version installed.

Unset
cdk --version

Run the following command to create a sample CDK project in Python.

Unset
cdk init app --language python



•	Within the sample CDK project folder, create a sample Poetry project. Copy the pyproject.toml file over to the base CDK folder, then delete the sample Poetry project folder and the .venv directory that was set up when the CDK initialized the project.
•	Run 'poetry shell' to start a new virtual environment. Then run 'poetry add aws-cdk-lib', 'poetry add constructs' and 'poetry add langchain pinecone-client pypdf openai gradio python-dotenv' to add the packages.
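The pyproject.toml itself is not reproduced in this post; after the poetry add commands it would look roughly like the sketch below. The pins for aws-cdk-lib, constructs, langchain and gradio match the versions that appear in the install output shown in this post; the remaining pins and the project metadata are illustrative.

```toml
[tool.poetry]
name = "genai_blog"
version = "0.1.0"
description = "Serverless GenAI document Q&A demo"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"
aws-cdk-lib = "^2.91.0"
constructs = "^10.2.69"
langchain = "^0.0.251"
pinecone-client = "*"
pypdf = "*"
openai = "*"
gradio = "^3.40.1"
python-dotenv = "*"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```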

You should see output similar to the following after you run the poetry install command.

Unset
(genai-blog-py3.9) [ssm-user@redacted genai_blog]$ poetry install
Updating dependencies
Resolving dependencies... (2.4s)

Package operations: 52 installs, 0 updates, 0 removals

  • Installing attrs (23.1.0)
  • Installing exceptiongroup (1.1.2)
  • Installing packaging (23.1)
  • Installing six (1.16.0)
  • Installing typing-extensions (4.7.1)
  • Installing zipp (3.16.2)
  • Installing cattrs (23.1.2)
  • Installing certifi (2023.7.22)
  • Installing charset-normalizer (3.2.0)
  • Installing frozenlist (1.4.0)
  • Installing idna (3.4)
  • Installing importlib-resources (6.0.1)
  • Installing marshmallow (3.20.1)
  • Installing multidict (6.0.4)
  • Installing mypy-extensions (1.0.0)
  • Installing publication (0.0.3)
  • Installing python-dateutil (2.8.2)
  • Installing typeguard (2.13.3)
  • Installing urllib3 (2.0.4)
  • Installing aiosignal (1.3.1)
  • Installing async-timeout (4.0.3)
  • Installing greenlet (2.0.2)
  • Installing jsii (1.87.0)
  • Installing marshmallow-enum (1.5.1)
  • Installing numpy (1.25.2)
  • Installing pydantic (1.10.12)
  • Installing requests (2.31.0)
  • Installing typing-inspect (0.9.0)
  • Installing yarl (1.9.2)
  • Installing aiohttp (3.8.5)
  • Installing aws-cdk-asset-awscli-v1 (2.2.200)
  • Installing aws-cdk-asset-kubectl-v20 (2.1.2)
  • Installing aws-cdk-asset-node-proxy-agent-v5 (2.0.166)
  • Installing click (8.1.6)
  • Installing constructs (10.2.69)
  • Installing dataclasses-json (0.5.9)
  • Installing langsmith (0.0.22)
  • Installing mccabe (0.7.0)
  • Installing numexpr (2.8.5)
  • Installing openapi-schema-pydantic (1.2.4)
  • Installing pathspec (0.11.2)
  • Installing platformdirs (3.10.0)
  • Installing pycodestyle (2.11.0)
  • Installing pyflakes (3.1.0)
  • Installing pyyaml (6.0.1)
  • Installing sqlalchemy (2.0.19)
  • Installing tenacity (8.2.2)
  • Installing tomli (2.0.1)
  • Installing aws-cdk-lib (2.91.0)
  • Installing black (23.7.0)
  • Installing flake8 (6.1.0)
  • Installing langchain (0.0.251)

Writing lock file

Installing the current project: genai_blog (0.1.0)

•	Create an index in Pinecone (https://www.pinecone.io). You could use the free Starter tier, which allows one index and one project. Give the index a name, enter 1536 for the dimension size and choose euclidean as the metric.

We use 1536 as the dimension because we will be using OpenAI embeddings for this application. Please refer to this link for details: https://platform.openai.com/docs/guides/embeddings/what-are-embeddings
Next, create an API key within your Pinecone account. Note the API key value, environment and Pinecone index name.
•	We will use the GPT-3.5 model as the LLM for this application. Go to https://platform.openai.com/account/api-keys and create an API key.
•	We will save all the keys into a secret in AWS Secrets Manager and retrieve them wherever required in our application using the boto3 library. Create an api_keys.json file, enter your keys and create a secret using the AWS CLI.
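The CLI step might look like the following. The secret name demo/gb matches what the Lambda code in this post reads, and the key names must match the lookups in that code; the file contents shown are placeholders.

```shell
# api_keys.json should contain:
# {
#   "PINECONE_API_KEY": "...",
#   "PINECONE_API_ENV": "...",
#   "PINECONE_INDEX_NAME": "...",
#   "OPENAI_API_KEY": "..."
# }
aws secretsmanager create-secret \
    --name demo/gb \
    --secret-string file://api_keys.json
```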



•	Create a LangChain layer in AWS Lambda: create a folder layers/common-layer/python, then create a requirements.txt file under common-layer and add langchain, urllib3, openai and tiktoken as dependencies. The resulting structure places requirements.txt next to the python/ folder.
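The screenshot of the file structure is not reproduced here; the intended layout is roughly the following (the python/ directory is where Lambda expects layer packages, and it is populated at build time):

```text
layers/
└── common-layer/
    ├── requirements.txt   # langchain, urllib3, openai, tiktoken
    └── python/            # layer packages installed here at build time
```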

‭•‬ ‭Create a Makefile at the project root level.‬
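The Makefile's contents are not shown in the original; a sketch of the usual pattern for building a Lambda layer follows, assuming the layers/common-layer structure created earlier. The target name and pip flags are illustrative, and recipe lines must be indented with a tab.

```makefile
# Install the layer's dependencies into the python/ directory that
# lambda_.Code.from_asset("./layers/common-layer") will package.
build:
	pip install -r layers/common-layer/requirements.txt \
		-t layers/common-layer/python
```

The 'make' command referenced in cdk.json can then invoke this target (e.g. "build": "make build") so the layer is rebuilt before each synth.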



‭•‬ ‭Add a ‘make' command in cdk.json.‬



•	The project structure and the initial dependencies have now been set up. We can add more libraries as we go along building the application via 'poetry add'. If a library needs to be part of the Lambda layer package, we also need to add it to the requirements.txt file under the layers folder.
•	We will create a stack 'genai_blog_stack' using the Python programming language. The code will do the following:
–	Create an S3 bucket
–	Create an SQS queue to receive events when documents land in the "genai-demo" bucket under the 'raw' prefix
–	Attach an event notification to the S3 bucket to send notifications to the SQS queue
–	Build a common layer for libraries such as langchain
–	Add a Lambda function to poll the queue, read the PDF, extract its text and convert it to embeddings
–	Grant the necessary permissions to the Lambda function
–	Configure the Lambda function to be triggered by messages from the SQS queue
–	Grant the Lambda function permission to access the secret

All this in less than 50 lines of code! The following is the code snippet:

Python
from aws_cdk import RemovalPolicy, Stack, Duration
from aws_cdk import aws_s3 as s3
from aws_cdk import aws_sqs as sqs
from aws_cdk import aws_s3_notifications as s3n
from aws_cdk import aws_lambda as lambda_
from aws_cdk import aws_lambda_event_sources as event_sources
from constructs import Construct
from aws_cdk import aws_secretsmanager as secretsmanager


class GenaiBlogStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Create the S3 bucket
        bucket = s3.Bucket(self, "genai-demo", removal_policy=RemovalPolicy.DESTROY)

        # Create a SQS queue to receive events when documents land in the
        # "genai-demo" bucket under the 'raw' prefix
        queue = sqs.Queue(self, "genai-demo-queue", visibility_timeout=Duration.minutes(15))

        # Pass the queue URL as an environment variable
        env_queue_url = queue.queue_url

        # Attach an event to the S3 bucket to send notifications to the SQS queue
        bucket.add_event_notification(
            s3.EventType.OBJECT_CREATED,
            s3n.SqsDestination(queue),
            s3.NotificationKeyFilter(prefix="raw/", suffix=".pdf"),
        )

        # Build the common layer for libraries such as langchain
        common_lambda_layer = lambda_.LayerVersion(
            self,
            "CommonLambdaLayer",
            code=lambda_.Code.from_asset("./layers/common-layer"),
            compatible_runtimes=[lambda_.Runtime.PYTHON_3_9],
            layer_version_name="common_lambda_layer",
        )

        # Add a Lambda function to poll the queue, read the PDF, extract its
        # text and convert it to embeddings
        pdf_extraction_lambda = lambda_.Function(
            self,
            "GenaiPDFHandler",
            runtime=lambda_.Runtime.PYTHON_3_9,
            timeout=Duration.seconds(300),
            memory_size=2048,
            code=lambda_.Code.from_asset("lambda"),
            handler="genaiblog_pdf_extraction.handler",
            layers=[common_lambda_layer],
            environment={"QUEUE_URL": env_queue_url},
        )

        # Grant necessary permissions to the Lambda function
        queue.grant_send_messages(pdf_extraction_lambda)

        # Configure the Lambda function to be triggered by messages from the SQS queue
        queue_trigger = event_sources.SqsEventSource(queue, batch_size=1)
        pdf_extraction_lambda.add_event_source(queue_trigger)

        # Add S3 read permissions to the Lambda function
        bucket.grant_read(pdf_extraction_lambda)

        # Grant permission to the Lambda function to access the secret
        example_secret = secretsmanager.Secret.from_secret_name_v2(
            scope=self, id="secretExample", secret_name="demo/gb"
        )
        example_secret.grant_read(grantee=pdf_extraction_lambda)

•	Write the Lambda function to read the PDF file uploaded to the S3 bucket, extract its text, convert it to embeddings and store them in the vector database.

Python
import json
import boto3
import os
import pinecone
from langchain import PromptTemplate
from langchain.document_loaders import PyPDFLoader
from langchain.llms.openai import OpenAI
from langchain.vectorstores.pinecone import Pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from botocore.exceptions import ClientError

s3 = boto3.client('s3')


def get_secret():
    secret_name = "demo/gb"
    region_name = "us-east-1"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager', region_name=region_name)

    # Retrieve the secret value
    try:
        get_secret_value_response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        raise e

    # Decrypt the secret using the associated KMS key.
    secret = get_secret_value_response['SecretString']
    secret_json = json.loads(secret)
    return secret_json


def handler(event, context):
    print("request: {}".format(json.dumps(event)))

    # Loop through the records. We have set the queue batch size to 1,
    # so only one record will be retrieved.
    for record in event["Records"]:
        body = record["body"]
        body_json = json.loads(body)

        bucket_name = body_json["Records"][0]["s3"]["bucket"]["name"]
        print(bucket_name)
        key = body_json["Records"][0]["s3"]["object"]["key"]
        print(key)

        # Read the PDF file from S3 into Lambda temporary storage
        file_name = os.path.basename(key)
        local_file_path = '/tmp/' + file_name  # Path to store the downloaded file locally
        s3.download_file(bucket_name, key, local_file_path)

        # Load the PDF using pypdf into an array of documents, where each
        # document contains the page content and metadata with the page number.
        loader = PyPDFLoader(local_file_path)
        pages = loader.load_and_split()
        print(pages[0])

        # Retrieve API keys from Secrets Manager
        secret = get_secret()
        pinecone_api_key = secret['PINECONE_API_KEY']
        pinecone_env = secret['PINECONE_API_ENV']
        pinecone_index_name = secret['PINECONE_INDEX_NAME']
        openai_api_key = secret['OPENAI_API_KEY']

        # Generate embeddings for the pages, insert them into the Pinecone
        # vector database, and expose the index in a retriever interface
        pinecone.init(api_key=pinecone_api_key, environment=pinecone_env)
        embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
        Pinecone.from_documents(pages, embeddings, index_name=pinecone_index_name)
        print(f'{key} loaded to pinecone vector store...')

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/plain"},
        "body": "Document successfully loaded to vector store",
    }

•	Run the following command to synthesize the AWS CloudFormation templates based on the CDK code.

Unset
cdk synth

•	Run the following command to deploy the stack.

Unset
cdk deploy



•	Wait for the application to deploy successfully. We will upload an openly available protocol document to our S3 bucket. Use the following command to upload the PDF to your bucket. The protocol is available to download here: https://classic.clinicaltrials.gov/ProvidedDocs/78/NCT03078478/Prot_000.pdf

Unset
aws s3 cp NCT03078478.pdf s3://genai-blog-pipeline-genaidemoaxxx-cxxxxcxx/raw/

•	Uploading the PDF document to S3 generates an event whose details are stored in the SQS queue. Our 'genaiblog_pdf_extraction' Lambda function receives the details of the document, reads it from S3, converts the PDF to text, splits the document into chunks, calls the OpenAI embeddings model to convert the chunks to embeddings and stores them in the Pinecone vector database.
•	After the Lambda has completed execution, check your Pinecone dashboard to confirm the documents were stored in the vector database. In our run, 249 vectors were stored after the execution of the Lambda.
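The nested unwrapping the Lambda performs, an S3 event JSON carried inside each SQS record's body, is easy to get wrong and easy to test in isolation. The helper below mirrors what the handler does; the sample event is hand-built for illustration.

```python
import json

def extract_s3_objects(event: dict) -> list[tuple[str, str]]:
    """Pull (bucket, key) pairs out of S3 event notifications delivered via SQS.
    Each SQS record's body is itself a JSON-encoded S3 event."""
    objects = []
    for record in event["Records"]:
        body = json.loads(record["body"])
        for s3_record in body["Records"]:
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]
            objects.append((bucket, key))
    return objects

# A hand-built sample of the SQS-wrapped S3 event shape.
sample_event = {
    "Records": [
        {
            "body": json.dumps({
                "Records": [
                    {
                        "s3": {
                            "bucket": {"name": "genai-demo"},
                            "object": {"key": "raw/NCT03078478.pdf"},
                        }
                    }
                ]
            })
        }
    ]
}
```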



•	This pipeline can now be used to ingest additional PDF documents into the knowledge base in the vector store.

Frontend Chat User Interface

We will use the Gradio framework to build a chatbot interface for our question-answering application.
•	Add gradio as a dependency to your project using 'poetry add gradio'.

Unset
(genai-blog-py3.9) [ssm-user@redacted genai_blog]$ poetry add gradio
Using version ^3.40.1 for gradio

Updating dependencies
Resolving dependencies... (15.3s)

Package operations: 43 installs, 0 updates, 0 removals

  • Installing rpds-py (0.9.2)
  • Installing sniffio (1.3.0)
  • Installing anyio (3.7.1)
  • Installing h11 (0.14.0)
  • Installing referencing (0.30.2)
  • Installing filelock (3.12.2)
  • Installing fsspec (2023.6.0)
  • Installing httpcore (0.17.3)
  ...

•	Our Gradio frontend takes user input about the protocol documents and uses Langchain to retrieve the relevant chunks of information from the Pinecone vector store, passing the query and the retrieved chunks as context to the LLM. Therefore, we need to provide the relevant API keys to the Gradio application. In this example, we run the frontend application locally. However, this application could be containerized, deployed on an AWS Elastic Kubernetes Service cluster and scaled using Application Load Balancers.
•	For our local deployment, we will use the python-dotenv package to provide the API keys to the application. Create a folder 'frontend' in your project directory and add the app.py and .env files.
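The .env file simply lists the same four keys stored in Secrets Manager, with the values elided here:

```text
PINECONE_API_KEY=...
PINECONE_API_ENV=...
PINECONE_INDEX_NAME=...
OPENAI_API_KEY=...
```

Keep the .env file out of version control, since it holds live credentials.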



Python
import os
import gradio as gr
from langchain.vectorstores import Pinecone
from langchain.chains import RetrievalQA
from langchain.llms.openai import OpenAI
from langchain.embeddings import OpenAIEmbeddings
import pinecone
from dotenv import load_dotenv

load_dotenv()

pinecone_api_key = os.getenv("PINECONE_API_KEY")
pinecone_env = os.getenv("PINECONE_API_ENV")
pinecone_index_name = os.getenv("PINECONE_INDEX_NAME")
openai_api_key = os.getenv("OPENAI_API_KEY")

pinecone.init(api_key=pinecone_api_key, environment=pinecone_env)
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)


def predict(message, history):
    # Initialize the OpenAI module, load and run the Retrieval Q&A chain
    vectordb = Pinecone.from_existing_index(
        index_name=pinecone_index_name, embedding=embeddings
    )
    retriever = vectordb.as_retriever()

    llm = OpenAI(temperature=0, openai_api_key=openai_api_key)
    qa = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=retriever)
    response = qa.run(message)
    return response


gr.ChatInterface(
    predict,
    title="Clinical Trials Q&A Bot",
    description="Ask questions about Clinical Trial protocol documents...",
).launch()

Test

‭Run the frontend application using the following command:‬

‭Unset‬
gradio app.py‬

•	This should bring up a chatbot interface at the following local URL: http://127.0.0.1:7860/. Now you can ask very specific questions and have the LLM respond with information from the protocol documents using the Retrieval-Augmented Generation (RAG) approach.



References
•	AWS Cloud Development Kit (AWS CDK) v2 (https://docs.aws.amazon.com/cdk/v2/guide/home.html)
•	Amazon OpenSearch Service (https://docs.aws.amazon.com/opensearch-service/index.html)
•	Pinecone vector database (https://www.pinecone.io/)
•	LangChain (https://python.langchain.com/docs/get_started/introduction.html)
•	Poetry (https://python-poetry.org/docs/)
•	Gradio (https://www.gradio.app)



‭Contact Us‬



‭Thank you‬

‭pwc.com‬

