FDA-GPT

Description

FDA-GPT is a semantic search engine that performs document retrieval and Question Answering (QA) over a collection of FDA approval packets, so users can find relevant passages and answers in a large set of FDA documents.

This page covers the high-level design of the application so that the infrastructure team can support and extend it.

High-Level Design

src: Lucidchart

User Access: Users access FDA-GPT through test.rdchat.roche.com by selecting FDA-GPT as the model.
Connection to fda-chat Pod: rdchat connects to the fda-chat pod via ClusterIP.
Metadata Extraction Model: The Metadata Extraction Model uses a checkpoint to load a pre-trained Named Entity Recognition (NER) model. This model extracts drug names from user queries to filter relevant documents. The checkpoint for the Metadata Extraction Model can be found at: ai-fda-gpt-dev-bucket/ddi_drug_model.zip. This zip file is downloaded by an initContainer to a Persistent Volume (PV) when the pod starts and is mounted under /model_artifacts/ddi_drug_model.
Embedding Generation: The text-embedding-ada-002 model generates embeddings for user queries.
Vector Storage and Search: Vector embeddings for FDA document chunks are stored in a PostgreSQL database with the pgvector extension. Search runs as SQL queries that compare the user query vector against the FDA document vectors.
Conversational QA: The gpt-4 model handles the conversational QA component, summarizing the documents retrieved based on the user query.
Response: The summarized response is sent back to the user.
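
The retrieval steps above (drug-name filter plus vector comparison in pgvector) can be sketched as a parameterized SQL query. This is an illustrative sketch, not the application's actual code: the table name fda_chunks and the columns chunk_text, drug_name, and embedding are hypothetical, and the `%s` placeholders assume a psycopg-style driver. The `<=>` operator is pgvector's cosine distance; the application may use a different distance operator.

```python
from typing import List, Sequence, Tuple


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding as a pgvector text literal such as '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_similarity_query(
    embedding: Sequence[float],
    drug_names: Sequence[str],
    top_k: int = 5,
) -> Tuple[str, List[object]]:
    """Build a (sql, params) pair for a nearest-neighbour chunk search.

    Table and column names are hypothetical placeholders.
    """
    sql = (
        "SELECT chunk_text, embedding <=> %s::vector AS distance "
        "FROM fda_chunks "
    )
    params: List[object] = [to_vector_literal(embedding)]

    # Only filter by drug name when the NER step found at least one drug.
    if drug_names:
        sql += "WHERE drug_name = ANY(%s) "
        params.append(list(drug_names))

    # Smaller cosine distance means a closer match.
    sql += "ORDER BY distance LIMIT %s"
    params.append(top_k)
    return sql, params


if __name__ == "__main__":
    query, args = build_similarity_query([0.1, 0.2], ["warfarin"], top_k=3)
    print(query)
    print(args)
```

The returned pair can be passed to a cursor's execute method as `cursor.execute(sql, params)`. Keeping the embedding and drug names as bound parameters avoids interpolating user-derived text into the SQL string.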

The FDA pod runs on GPU nodes.

Team Responsible

The following team maintains and troubleshoots the application.

Main Point of Contact:

Jaymin Soni
- Email: JAYMIN.SONI@CONTRACTORS.ROCHE.COM
- unix-id: sonij3
Data Engineering Team
- Slack: #ecdi-ace-data-engineering
- Slack: #fda-gpt-ace

GitHub

The repository for the application can be found at github.com/gred-ecdi/ace-nlp-fda-gpt.

Deployment Information

	Staging	Production
git branch	`main`	-
Deployment environment	ace-test (EKS) Namespace: `ace-nlp-fda-gpt`	-
Deployment tool	https://argocd.eks.test.gred.ai/applications/argocd/ace-nlp-fda-gpt	-
URL	via test.rdchat.roche.com	-
S3 Bucket	- `ai-fda-gpt-dev-bucket` - Application connects to s3 via `ai-fda-gpt-dev-read-app-fda-gpt` s3 access point	-
Service Account	- serviceAccount Name: `ace-nlp-fda-gpt` - irsa role: `arn:aws:iam::712649426017:role/irsa.ace-test.ace-nlp-fda-gpt.fda`	-

The Terraform workspace for the infrastructure can be found at terraform-ace-prod/us-west-2/ai-fda-gpt-dev.

How does the application get access to its S3 bucket?

EKS Service account -> IRSA IAM Role -> The role has the tag id:app -> IAM Policy ace-eks-app-s3-access

Src: https://github.com/gred-ecdi/Containerization-Template

Branching Flow

How does the main branch get versions?

If the git commit message contains ‘fix’, ‘feat’, or ‘hotfix’, the release GitHub Action bumps the version automatically.
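
To check locally whether a commit message would trigger a release, the rule above can be approximated as follows. This is a minimal sketch under assumptions: the actual release action may match the keywords differently (for example only as a prefix, or case-sensitively), and the function name triggers_release is hypothetical. Here the keywords are matched as whole words, case-insensitively.

```python
import re

# Whole-word match so that e.g. "prefix" does not count as "fix".
RELEASE_KEYWORDS = re.compile(r"\b(fix|feat|hotfix)\b", re.IGNORECASE)


def triggers_release(commit_message: str) -> bool:
    """Return True if the message contains 'fix', 'feat', or 'hotfix'."""
    return RELEASE_KEYWORDS.search(commit_message) is not None


if __name__ == "__main__":
    for message in ("feat: add drug filter", "docs: update wiki"):
        print(message, "->", triggers_release(message))
```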

How does the Helm chart get new versions?

Update the Helm chart version in Chart.yaml manually.
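
Because the chart version is edited by hand, a small helper can reduce typos. The sketch below bumps the patch component of the top-level `version:` field in the text of a Chart.yaml. It assumes an unquoted `X.Y.Z` version on its own line; the function name bump_chart_patch and the sample chart contents are hypothetical, and it is not part of the repository's tooling.

```python
import re

# Matches only the top-level "version:" line, not "appVersion:" or "apiVersion:".
VERSION_LINE = re.compile(r"^version:[ \t]*(\d+)\.(\d+)\.(\d+)[ \t]*$", re.MULTILINE)


def bump_chart_patch(chart_yaml: str) -> str:
    """Return the Chart.yaml text with the patch version incremented by one."""

    def _bump(match: "re.Match[str]") -> str:
        major, minor, patch = (int(group) for group in match.groups())
        return f"version: {major}.{minor}.{patch + 1}"

    new_text, count = VERSION_LINE.subn(_bump, chart_yaml, count=1)
    if count == 0:
        raise ValueError("no top-level 'version: X.Y.Z' line found")
    return new_text


if __name__ == "__main__":
    sample = 'apiVersion: v2\nname: example-chart\nversion: 0.3.9\nappVersion: "1.0.0"\n'
    print(bump_chart_patch(sample))
```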

Monitoring

Monitoring Dashboard: Grafana
- Metrics being monitored: Memory usage by pod, CPU usage by pod, Replica Availability, PV Capacity
- Alerts and notifications:
  1. PersistentVolumeUsage
Logging Dashboard: Logs are available on the same Grafana Dashboard.
- To search logs in more detail, log in to OpenSearch > EKS ACE Test - Applications Dashboard, then choose the namespace ace-nlp-fda-gpt.

TODOs

Data is currently written to the database with the master admin user. Create a dedicated app user for this purpose.
The text-embedding-ada-002 model is outdated. Update to the latest version.
Improve the CI/CD pipeline.
