Neuron: A Learning Project and PoC implementing a private ChatGPT-like (and beyond) Generative AI platform.


About This Presentation

This project aims to build a fully private, ChatGPT-like generative AI platform using a suite of open-source tools, primarily Ollama and Open WebUI. By running models on company-managed servers and storage, the initiative minimizes risk, cuts costs, and ensures complete control over systems, data, and models.


Slide Content

2024-09-19
Neuron: A Learning Project and PoC implementing a private ChatGPT-like (and beyond) Generative AI platform.
Robert McDermott

Disclaimer: This system is not used with PHI or other sensitive information. It's a private PoC and is not available for use by any other Fred Hutch staff. The system runs on Fred Hutch managed systems behind the campus firewall. At this point, it's only a demonstration of a learning project.

Goal: Create a completely private ChatGPT-like Generative AI platform using a collection of open-source software and models that can run on Fred Hutch managed servers and storage, eliminating the risk of data leakage by maintaining full control of all systems, data, and models… and learn a lot along the way.

This is not conceptual; it's a working Proof of Concept of a general-purpose Generative AI platform that I use on an almost daily basis and that has slowly evolved and improved over more than a year. The staff roles indicated show how responsibilities could be distributed if it were productionized at an organization.


ChatGPT Interface

Neuron User Interface

ChatGPT vs Neuron

ChatGPT
•A SaaS service by OpenAI
•Subscription based
•Access to the GPT series of LLMs
•Multi-modal: text & vision
•Data stored on OpenAI's systems
•Custom "GPTs" with "Plus" subscription

Neuron Platform
•Hosted on the FH Research Network
•Open-source software
•Access to hundreds of open LLMs (+ custom)
•Multi-modal: text & vision
•All data stays on Fred Hutch managed systems
•Function calling, filters, pipelines + more
•API access for non-interactive use cases

How good are the open models compared to commercial?
In general, SOTA commercial models such as GPT-4o and Claude 3.5 Sonnet perform better than open models, but the gap is closing, and open models can sometimes perform better, especially with domain-specific fine-tuning.
It took three tries for it to get it right ☺

Side-by-side model evaluations

Side-by-side model evaluations
Click to merge responses
Multiple responses provided as
context to generate a new answer

Computer vision with understanding

Acquiring data from real-time internet searches
Correctly answering a question that was breaking news just 24 hours earlier
Links to source pages that were used as context to answer the question

Pointing to websites for it to use as context
Prefix your prompt with a "#" followed by the URL, hit Enter, then ask your question.
The system scrapes the content of the provided URL and uses it as context to answer your questions.
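A hypothetical example of the prompt format described above (the URL and question are placeholders, not taken from the slides):

#https://www.example.org/news/some-article
What are the key points made in this article?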

Documents as background knowledge (RAG)
The system chunks the document into small pieces, generates embeddings of the chunks, generates an embedding of your question, and uses cosine similarity to find the chunks relevant to your question. It then injects those chunks into the context so the model can answer your question. This technique is called RAG, or Retrieval Augmented Generation.
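A minimal sketch of the retrieval step just described, assuming the sentence-transformers package and the all-MiniLM-L6-v2 embedding model; the chunking strategy and model choice are illustrative, not Neuron's actual implementation:

# Minimal RAG retrieval sketch (assumed libraries/model; not Neuron's actual code)
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def chunk(text, size=500):
    # Naive fixed-size character chunking, for illustration only
    return [text[i:i + size] for i in range(0, len(text), size)]

def top_k_chunks(document, question, k=3):
    chunks = chunk(document)
    chunk_vecs = model.encode(chunks, normalize_embeddings=True)
    q_vec = model.encode([question], normalize_embeddings=True)[0]
    scores = np.dot(chunk_vecs, q_vec)  # cosine similarity (vectors are normalized)
    best = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in best]

# The selected chunks are then injected into the prompt as context for the LLM.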

Creating custom chat agents is simple

Custom Chat Agent Example
Reference Documents

AI programming with models trained specifically for writing code

Integrating VSCode with Neuron for AI powered coding assistance

Example application with API - completely written with Neuron
Interactive IT Org Chart
Required 20+ iterative prompts to get to this state

Human feedback that can be used to improve models (RLHF)
Data can be used for RLHF (Reinforcement Learning from Human Feedback)

Function Calling (aka Tool Usage)
Models can be provided a menu of custom internal tools they have at their disposal to help fulfill your request. If they find a tool they think they need, they can call it and have the results of the call injected into the chat context to provide an answer. This only works with models that have been trained or fine-tuned to support function calling.
Example of the "menu" of tools that is provided to an LLM for its consideration:
tools_function_calling_prompt = 'Tools: [{"name": "calendar_check", "description": "Check the calendar to see
scheduled events", "parameters": {"type": "object", "properties": {}, "required": []}}, {"name":
"get_current_time", "description": "Get the current time in a human-readable format.", "parameters": {"type":
"object", "properties": {}, "required": []}}, {"name": "get_user_name_and_email_and_id", "description": "Get the
user name, email and ID from the user object.", "parameters": {"type": "object", "properties": {}, "required":
[]}}, {"name": "send_email", "description": "Send an email", "parameters": {"type": "object", "properties":
{"subject": {"type": "string", "description": "The subject of the email."}, "body": {"type": "string",
"description": "The body of the email."}}, "required": ["subject", "body"]}}, {"name": "get_fredhutch_status",
"description": "Get the status of the Fred Hutch Center and overall Information Technology (IT) systems.",
"parameters": {"type": "object", "properties": {}, "required": []}}]\nIf a function tool doesn\'t match the query,
return an empty string. Else, pick a function tool, fill in the parameters from the function tool\'s schema, and
return it in the format { "name": "functionName", "parameters": { "key": "value" } }. Only call a function if the
user asks.'
Custom AI accessible collections of
tools – written in Python
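A hypothetical sketch of the dispatch step: the function names mirror the tool menu above, but the implementations and the dispatcher are illustrative only, not Neuron's actual tool code.

# Illustrative tool functions and dispatch for a function-calling model (not Neuron's actual code)
import json
from datetime import datetime

def get_current_time() -> str:
    # Return the current time in a human-readable format.
    return datetime.now().strftime("%A, %B %d %Y at %I:%M %p")

def send_email(subject: str, body: str) -> str:
    # Placeholder: a real tool would call the organization's mail gateway here.
    return f"Email queued with subject: {subject!r}"

TOOLS = {"get_current_time": get_current_time, "send_email": send_email}

def dispatch(llm_reply: str) -> str:
    # The model replies with JSON like {"name": "...", "parameters": {...}};
    # look the function up, call it, and return the result to inject into the chat context.
    call = json.loads(llm_reply)
    func = TOOLS[call["name"]]
    return func(**call.get("parameters", {}))

result = dispatch('{"name": "get_current_time", "parameters": {}}')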

Function Calling (aka Tool Usage)
Custom AI accessible collections of
tools – written in Python
Email received moments later

In-browser code execution and visualization - no client dependencies
Data file for analysis

Content Filtering: Blocking Toxic Usage
Custom filters and pipelines can be defined (in Python) that check, filter, block or modify content both before it gets sent to the LLM and/or before it is provided back to the user. In this example a small BERT model sits between the user and the LLM and blocks toxic content before the user-provided data is even sent to the selected LLM. The module/model used for this is located here:
https://github.com/unitaryai/detoxify
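A minimal sketch of this kind of pre-filter using the detoxify package linked above; the threshold and the way the check is wired into the pipeline are assumptions, not Neuron's actual filter code:

# Minimal toxicity pre-filter sketch using the detoxify package (threshold is an assumption)
from detoxify import Detoxify

detector = Detoxify("original")  # small BERT-based toxicity classifier

def is_toxic(user_message: str, threshold: float = 0.8) -> bool:
    # predict() returns scores such as 'toxicity', 'insult', 'threat', etc.
    scores = detector.predict(user_message)
    return any(score > threshold for score in scores.values())

def filter_inlet(user_message: str) -> str:
    # Block the message before it is ever sent to the selected LLM.
    if is_toxic(user_message):
        raise ValueError("Message blocked by content filter.")
    return user_message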

Content Filtering: Preventing the use of PII and PHI
Detection and de-identification with the Clinical BERT Model:
•https://huggingface.co/obi/deid_roberta_i2b2
•Robust DeID: https://huggingface.co/obi/deid_bert_i2b2
•https://github.com/obi-ml-public/ehr_deidentification
Info about the BERT model used
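A minimal sketch of using one of the de-identification models listed above via the Hugging Face transformers pipeline; the aggregation strategy and redaction logic are assumptions, and the published ehr_deidentification tooling provides a more complete workflow:

# PHI/PII detection and redaction sketch with a de-identification model (illustrative only)
from transformers import pipeline

deid = pipeline("token-classification",
                model="obi/deid_roberta_i2b2",
                aggregation_strategy="simple")

def redact(text: str) -> str:
    # Replace each detected PHI span (names, dates, IDs, etc.) with its entity label.
    redacted = text
    for entity in sorted(deid(text), key=lambda e: e["start"], reverse=True):
        redacted = redacted[:entity["start"]] + f'[{entity["entity_group"]}]' + redacted[entity["end"]:]
    return redacted

print(redact("Patient John Smith, MRN 1234567, was seen on 2024-09-19."))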

Automated Neuron access via API
Way more than shown here
User Generated API Keys
Full API Documentation
Python Example using OpenAI module
Curl Example using an OpenAI-compatible endpoint
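A minimal sketch of the Python usage indicated on the slide, assuming Neuron exposes an OpenAI-compatible endpoint; the base URL, model name, and API key are placeholders, not the actual values:

# Hypothetical example: calling Neuron's OpenAI-compatible API with the openai Python module
from openai import OpenAI

client = OpenAI(
    base_url="https://neuron.example.org/api/v1",  # placeholder URL, not the real endpoint
    api_key="YOUR_USER_GENERATED_API_KEY",
)

response = client.chat.completions.create(
    model="llama3.1:70b",  # example open model name; actual available models may differ
    messages=[{"role": "user", "content": "Summarize what RAG is in two sentences."}],
)
print(response.choices[0].message.content)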

What’s next after the PoC?

2024-09-19
Questions?
Robert McDermott