[GEMINI EXTERNAL DECK] Introduction to Gemini.pptx
AhmedElbaloug
347 views
146 slides
May 07, 2024
About This Presentation
tripoli · Size: 23.48 MB · Language: en

Slide Content
Intro to Gemini: The official Google AI deck!
Ahmed Elbaloug · GDG Tripoli · @elbaloug
Learning more: https://trygcp.dev/e/build-with-ai-tripoli · https://ai.google.dev/docs/discord
Get started! Visit the cookbook (newly added!): https://github.com/google-gemini/cookbook
Start developing: go to Google AI Studio, log in with your Google account, and create an API key. Use a quickstart for Python, or call the REST API using curl. Read on to learn more about the Gemini API.
What is generative AI?
What is an LLM? LLMs Explained
[Diagram: next-token prediction. Given the prompt "It's raining cats and …", the model assigns a probability to each candidate next token, with "Dogs" by far the most likely (0.9) and alternatives such as Rain, Drops, Fish and Wind trailing far behind (0.03, 0.02, 0.01, 0.0).]
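The diagram reduces to next-token prediction: the model scores every candidate token and decoding picks from that distribution. A toy sketch of greedy decoding (illustrative only, not the actual Gemini decoder; the probabilities follow the slide):

```python
def pick_next_token(probs: dict[str, float]) -> str:
    """Greedy decoding: return the candidate token with the highest probability."""
    return max(probs, key=probs.get)

# "It's raining cats and ___" -> the model scores candidate continuations.
probs = {"Dogs": 0.9, "Rain": 0.03, "Drops": 0.02, "Fish": 0.01, "Wind": 0.0}
print(pick_next_token(probs))  # Dogs
```

Real decoders sample from the distribution (controlled by temperature, top-P and top-K) rather than always taking the argmax.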
Roses are red, Violets are blue, Sugar is sweet, LLMs Explained
LLMs Explained
User: Explain this joke: "I used to be addicted to soap, but I'm clean now."
Model: The joke is a pun on the word "clean". It has two meanings: 1. To be free of dirt or other unwanted substances. 2. To be free of addiction. The joke is funny because it is unexpected: the listener expects the speaker to be talking about the first meaning of "clean", but they are actually talking about the second meaning. This creates a surprise and a laugh.
LLMs Explained
User: Give me five neat ideas for science fair projects.
Model:
1. **Build a volcano.** This is a classic science fair project that is both fun and educational. You can learn about the science of volcanoes, earthquakes, and plate tectonics.
2. **Make a solar oven.** This is a great project to learn about renewable energy and how to harness the power of the sun. You can use your solar oven to cook food, dry fruit, or even melt chocolate.
3. **Create a 3D printer.** This is a more challenging project, but it is also very rewarding. You can learn about 3D printing technology and design your own custom objects.
4. **Build a robot.** This is a great project for those who are interested in engineering and robotics. You can learn about basic electronics, mechanics, and programming.
5. **Design a water filtration system.** This is a project that can help you learn about the importance of clean water and how to filter it. You can use your water filtration system to clean water from a stream or river.
Why are large language models different? LLMs are characterized by emergent abilities: the ability to perform tasks that were not present in smaller models. LLMs' contextual understanding of human language changes how we interact with data and intelligent systems, and LLMs can find patterns and connections in massive, disparate data corpora. Applications: search, conversation, content generation.
Google is the industry pioneer in AI
Responsible AI: 3,000 researchers, 7,000 publications. Built and tested for safety; privacy in design; upholds high scientific standards; accountable to people; socially beneficial; avoids creating unfair bias.
2015: Google DeepMind's AlphaGo defeats the Go champion
2016: Google's DeepMind helps detect eye disease
2017: Google invents the Transformer, kickstarting the LLM revolution
2018: Google's groundbreaking large language model, BERT
2019: Text-to-Text Transfer Transformer, a 10B-parameter LLM, is open sourced
2020: Google's LaMDA model, trained to converse
2022: AlphaFold predicts the structures of all known proteins
2023: A conversational AI service powered by PaLM 2
2024: A family of multimodal LLMs and products
Multimodality
(Android AICore)
ai.google.dev/gemma
The Gemini Ecosystem: the most advanced AI from Google
For Developers: the Gemini API, in Google AI Studio and at ai.google.dev (this is what this deck is about).
For Business and Enterprise: Gemini for Google Workspace, Gemini for Google Cloud, Gemini in Vertex AI.
For Consumers: the Gemini app and web, Gemini in the Google App, Gemini in Gmail, Docs…
Getting started with the Gemini API Train the Trainer in AI
AI Studio (aistudio.google.com): generate API keys · create, test, and save prompts · customize models in minutes · generate starter code
ai.google.dev: REST API + client libraries (SDKs) for Python, Node, Java, and Swift.
Gemini API and Vertex AI: both give access to the Gemini family of models. Vertex AI adds enterprise-grade support and full MLOps (examples: model evaluation, monitoring, registry). Check it out when you're ready for production: ai.google.dev/docs/migrate_to_cloud
Usage patterns Train the Trainer in AI
Usage Patterns: we'll cover basic prompting, code generation, search/retrieval and information synthesis, function calling, image understanding, and tuning.
Basic prompting Train the Trainer in AI
Write a poem about a magic backpack. In a world of wonder, behold, A backpack with tales yet untold. Its fabric shimmers, alive with might, A portal to realms, day and night. ... aistudio.google.com
Generate a short tip about JavaScript. Good tips are short and contain a code example showing the tip. **Tip: Use Array.prototype.flat() to flatten an array of arrays.** ```javascript const array = [[1, 2], [3, 4], [5, 6]]; console.log(array.flat()); // Output: [1, 2, 3, 4, 5, 6] ```
Generate a short tip about JavaScript. Good tips are short and contain a code example showing the tip.
topic: let and const
tip: Prefer using `let` and `const` to initialise variables, instead of `var`, as `var` has no block scope or protection against mutability. e.g. const myVar = "EXAMPLE";
topic: arrow syntax
tip:
AI Studio Endpoints Prompting
import google.generativeai as genai

model = genai.GenerativeModel('models/gemini-pro')
resp = model.generate_content(
    'Write the first paragraph of a story about a magic backpack')

>>> print(resp.text)
In a bustling city, amidst the vibrant tapestry of human existence, there existed a peculiar entity named Archie. Archie, however, was no ordinary backpack...
chat = model.start_chat()
response = chat.send_message("Hello, what should I have for dinner?")
print(response.text)  # 'Here are some suggestions...'
response = chat.send_message("How do I cook the first one?")
Generate Content Request
Contents: a list of Content objects. A Content of role "user" can contain multimodal Parts; a Content of role "model" carries the model's reply; a "user" Content may be as simple as a single text Part.
Tools: tools specified by the caller.
Safety Settings: safety settings configured by the caller.
Generation Config: temperature, top P, top K, stop sequences, max output tokens, etc.
Generate Content Response
Candidate(s): each Candidate carries the response Content (note: only one candidate is returned today), a Finish Reason (why the model stopped generating), a Finish Message (set if a finish reason is present), and Safety Ratings (how safe the response is).
Feedback: feedback on the prompt, with a Block Reason and its own Safety Ratings.
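Those response fields can be inspected programmatically. A small sketch that walks a response-shaped dict, roughly as the REST API returns it (the field names follow the slide; the Python SDK exposes the same data as attributes, so exact access paths may differ):

```python
def summarize_response(resp: dict) -> dict:
    """Extract the key fields from a GenerateContentResponse-shaped dict."""
    cand = resp["candidates"][0]  # note: only one candidate is returned today
    return {
        "text": cand["content"]["parts"][0]["text"],
        "finish_reason": cand.get("finish_reason"),        # why the model stopped
        "safety_ratings": cand.get("safety_ratings", []),  # how safe the response is
        "prompt_feedback": resp.get("prompt_feedback"),    # feedback on the prompt
    }

# A fabricated response for illustration:
fake = {
    "candidates": [{
        "content": {"parts": [{"text": "Hello!"}]},
        "finish_reason": "STOP",
        "safety_ratings": [],
    }],
    "prompt_feedback": {"block_reason": None, "safety_ratings": []},
}
print(summarize_response(fake)["finish_reason"])  # STOP
```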
Prompting best practices: https://ai.google.dev/docs/prompt_best_practices
Chained prompts: make a plan, then execute it.
Context: few-shot prompts.
Generation parameters: temperature, safety settings, top-P, top-K.
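A few-shot prompt is just examples concatenated ahead of the new query, like the JavaScript-tips prompt earlier in the deck. A minimal sketch (the topic/tip labels follow that example; the formatting itself is a convention, not an API requirement). The assembled string then goes to generate_content, optionally with a GenerationConfig carrying temperature, top_p and top_k:

```python
def build_few_shot_prompt(instructions: str,
                          examples: list[tuple[str, str]],
                          query: str) -> str:
    """Concatenate instructions and (topic, tip) examples ahead of the new query."""
    blocks = [instructions]
    blocks += [f"topic: {topic}\ntip: {tip}" for topic, tip in examples]
    blocks.append(f"topic: {query}\ntip:")
    return "\n\n".join(blocks)

prompt = build_few_shot_prompt(
    "Generate a short tip about JavaScript.",
    [("let and const", "Prefer `let` and `const` over `var`.")],
    "arrow syntax",
)
# With a configured model (parameter values here are illustrative):
# model.generate_content(prompt,
#     generation_config=genai.GenerationConfig(temperature=0.4, top_p=0.95, top_k=40))
```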
Code generation Train the Trainer in AI
Code Generation
Content generation: convert human language requests to machine requests.
Rapid tool building: generate code based on a user prompt.
Examples (code generation): generate data · generate a SQL query · simulate execution · quick coding help: a smart pair programmer.
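For the SQL-generation example, grounding the prompt in the actual table schema keeps the model from inventing column names. A sketch of that prompt assembly (the table and columns here are illustrative, loosely modeled on the thelook_ecommerce example later in the deck):

```python
def build_sql_prompt(table: str, columns: list[str], question: str) -> str:
    """Embed the table schema in the prompt so generated SQL uses real columns."""
    schema = f"Table `{table}` has columns: {', '.join(columns)}."
    return (f"{schema}\n"
            f"Write one SQL query, on a single line, that answers: {question}\n"
            f"Return only the SQL.")

prompt = build_sql_prompt(
    "thelook_ecommerce.orders",
    ["order_id", "returned_at", "created_at"],
    "What percentage of orders are returned?",
)
# resp = model.generate_content(prompt)  # with a configured GenerativeModel
```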
Write a unit test for this Python function:

def is_diver_neutrally_buoyant(density_water, density_diver):
    """Check if a diver is neutrally buoyant."""
    return density_water == density_diver
Search and Information Synthesis Train the Trainer in AI
Search & IR
LLMs are not fact engines: models have knowledge cut-offs and no exposure to private data. Bring your own (BYO) data.
Search & IR: use the prompt's context window
Instructions + context + question, all in the prompt. Easy to implement: no extra code, just ask.
model = genai.GenerativeModel('gemini-pro')
document = pathlib.Path('document.txt').read_text()

result = model.generate_content(f"""
Explain how deep-sea life survives.

Please answer based on the following document:
{document}
""")
Use the prompt's context window: limited by the model's context length (gemini-1.0-pro: 30K tokens).
Use the prompt's context window: gemini-1.5-pro raises the limit to 1M tokens. Remember: everything is tokens. Using context may be more flexible than Retrieval Augmented Generation (RAG), but more input context means higher latency.
Join the waitlist: https://aistudio.google.com/app/waitlist/97445851
https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024
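Since everything is tokens, it pays to budget a document before stuffing it into the context window. model.count_tokens gives the exact count against the API; offline, a rough rule of thumb is about four characters per token for English text (a heuristic only, not how the tokenizer actually works):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough offline token estimate (~4 characters/token for English text)."""
    return max(1, round(len(text) / chars_per_token))

doc = "Deep-sea life survives crushing pressure and near-total darkness. " * 50
print(estimate_tokens(doc))
# Exact count via the API:
# model.count_tokens(doc).total_tokens
```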
Use Attributed Question Answering (AQA) with inline chunks: instructions + chunks + question in one request. The AQA model is specifically tuned for RAG. AQA is a new feature; feedback is welcome.
import google.generativeai as genai
import google.generativeai.answer

chunks = [
    "Your Googlecar has a climate control system that allo...",
    "Your Googlecar has a large touchscreen display that p...",
    "Your Googlecar has an automatic transmission. To shif...",
]
answer = genai.answer.generate_answer(
    contents='How do you make it less windy?',
    answer_style=AnswerStyle.ABSTRACTIVE,
    inline_passages=chunks)
answer {
  content {
    parts {
      text: "To decrease the airflow inside the car, "
            "turn the airflow knob counterclockwise."
    }
  }
  ...
  grounding_attributions {
    source_id { grounding_passage { passage_id: "0" } }
    content {
      parts {
        text: "Your Googlecar has a climate control system that allows..."
      }
    }
  }
}
answerable_probability: 0.995222747
Use AQA with inline chunks: simple to use (for example, handle chunks returned by a search tool), but limited by the request size (max 4 MB) and inefficient when asking about the same data repeatedly.
Use AQA with the retriever service: a Corpus < Document < Chunk hierarchy. Pass a reference to a document or corpus to the generate_answer function.
from google.generativeai import retriever

my_corpus = retriever.create_corpus(display_name="My Documents")
for document in my_documents:
    doc = my_corpus.create_document(display_name=document.title)
    doc.batch_create_chunks(htmlchunker.chunk(document.html))

answer = genai.answer.generate_answer(
    semantic_retriever_config=my_corpus, ...)  # Coming soon
Use AQA with the retriever service: supports filtering based on chunk metadata. Limits: 10K documents per corpus, 1M chunks per corpus, 5 corpora per user.
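Filling a corpus presupposes a chunking step; the htmlchunker in the snippet above is a placeholder for whatever splitter you use. A naive fixed-size chunker with overlap, purely for illustration (real pipelines usually split on semantic boundaries such as headings or paragraphs):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks, overlapping neighbours by `overlap` chars."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 500, size=200, overlap=20)
print(len(chunks))  # 3
```

The overlap keeps a sentence that straddles a boundary fully visible in at least one chunk.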
Function Calling Train the Trainer in AI
Function Calling https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/function-calling/sql-talk-app SELECT ROUND(( COUNT(DISTINCT IF(returned_at IS NOT NULL, order_id, NULL)) / COUNT(DISTINCT order_id)) * 100, 2) AS return_rate FROM thelook_ecommerce.orders goo.gle/gemini-fn-call-sql-github
Function Calling
Describe external functions to the model. The model may ask you to call a function to help it respond to your queries. Endless possibilities for integrating external tools.
Function calling - Basics: how? The google.generativeai SDK will inspect the function's type hints to determine the schema. Allowed types are limited:
AllowedTypes = (int | float | str | bool | list | dict)
https://ai.google.dev/tutorials/function_calling_python_quickstart
def multiply(a: float, b: float):
    """Returns a * b."""
    return a * b

model = genai.GenerativeModel(
    model_name='gemini-1.0-pro',
    tools=[multiply])
Function calling - Basics: because function calling requires alternating turns, it's easiest to use through chat. Enable "automatic function calling" when you start a chat, and the ChatSession will call the function(s) for you. You don't have to use automatic function calling; it just makes simple cases easier.
https://ai.google.dev/tutorials/function_calling_python_quickstart
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message(
    'I have 57 cats, each owns 44 mittens, '
    'how many mittens is that in total?')
print(response.text)
# The number of mittens in total is 2508.
Function calling - Basics: what happened? Use the chat history to find out; it collects all the function calls and responses that took place.
https://ai.google.dev/tutorials/function_calling_python_quickstart
for content in chat.history:
    part = content.parts[0]
    print(content.role, "->", type(part).to_dict(part))
# user -> {'text': 'I have 57 cats, each owns 44 mittens, '
#                  'how many mittens is that in total?'}
# model -> {'function_call': {'name': 'multiply',
#                             'args': {'a': 57.0, 'b': 44.0}}}
# user -> {'function_response': {'name': 'multiply',
#                                'response': {'result': 2508.0}}}
# model -> {'text': 'The number of mittens in total is 2508.'}
Function Calling interaction
If one or more function declarations are provided, the function calling feature turns on. Based on the user's content, the model may predict a Function Call; given the Function Response, the model can then generate text, or another function call.
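When you drive that loop yourself, without automatic function calling, your code reads the model's function_call, runs the matching local function, and returns a function_response Part. The dispatch step can be sketched as follows (the dict shapes mirror the chat-history printout shown earlier, not a real SDK type):

```python
def dispatch(function_call: dict, registry: dict) -> dict:
    """Run the local function named in a model function_call and wrap the
    result in a function_response-shaped dict to send back to the model."""
    fn = registry[function_call["name"]]
    result = fn(**function_call["args"])
    return {"function_response": {"name": function_call["name"],
                                  "response": {"result": result}}}

def multiply(a: float, b: float):
    """Returns a * b."""
    return a * b

fc = {"name": "multiply", "args": {"a": 57.0, "b": 44.0}}
print(dispatch(fc, {"multiply": multiply}))
# {'function_response': {'name': 'multiply', 'response': {'result': 2508.0}}}
```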
Function calling - More Examples: Wikipedia research aid. Integrates a search tool and uses the Gemini API inside the function call to summarize pages.
def wikipedia_search(queries: list[str]) -> list[str]: ...
https://ai.google.dev/docs/search_reranking_using_embeddings/
model = genai.GenerativeModel('gemini-pro', tools=[wikipedia_search])
chat = model.start_chat(enable_automatic_function_calling=True)
query = "Explain how deep-sea life survives."
res = chat.send_message(instructions.format(query=query))
# Searching for "How do deep-sea creatures survive the extreme pressure?"
# Related search terms: ['Deep sea', 'Deep-sea community', 'Deep-sea fish']
# Fetching page: "Deep sea"
# Information Source: https://en.wikipedia.org/wiki/Deep_sea
# Fetching page: "Deep-sea community"
# Information Source: https://en.wikipedia.org/wiki/Deep-sea_community
# Fetching page: "Deep-sea fish"
# Information Source: https://en.wikipedia.org/wiki/Deep-sea_fish
# Searching for "How do deep-sea creatures survive the cold temperatures?"
# Related search terms: ['Deep-sea community', 'Deep sea', 'Deep-water coral']
# Fetching page: "Deep-water coral"
# Information Source: https://en.wikipedia.org/wiki/Deep-water_coral
# Searching for "What adaptations do deep-sea creatures have to survive?"
# Related search terms: ['Deep sea', 'Deep-sea community', 'Deep-sea fish']
# Searching for "What are the unique characteristics of deep-sea creatures?"
# Related search terms: ['Deep-sea community', 'Deep-sea fish', 'Deep sea']
# Searching for "How do deep-sea creatures find food?"
# Related search terms: ['Deep-sea community', 'Deep-sea fish', 'The Deep (TV series)']
# Fetching page: "The Deep (TV series)"
# Information Source: https://en.wikipedia.org/wiki/The_Deep_(TV_series)
# Searching for "How do deep-sea creatures communicate with each other?"
Function calling - More Examples Function Calling SQL Talk Use function calling to talk to a database. Live example: https://goo.gle/gemini-fn-call-sql https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/function-calling/sql-talk-app
sql_query_func = FunctionDeclaration(
    name="sql_query",
    description="Get information from data in BigQuery using SQL queries",
    parameters={
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "SQL query on a single line ...
Function calling - Schema: automatically building the schema from the type hints doesn't [currently, 02/2024] work for everything. The allowed types are actually:
AllowedType = (int | float | str | bool | list['AllowedType'] | dict[str, 'AllowedType'])
https://ai.google.dev/tutorials/function_calling_python_quickstart
Function calling - Schema: let's look at how the schema is constructed.
https://ai.google.dev/tutorials/function_calling_python_quickstart
model = genai.GenerativeModel(
    model_name='gemini-1.0-pro',
    tools=[multiply])
model._tools.to_proto()
[function_declarations {
  name: "multiply"
  description: "returns a * b."
  parameters {
    type_: OBJECT
    properties {
      key: "b"
      value { type_: NUMBER }
    }
    properties {
      key: "a"
      value { type_: NUMBER }
    }
    required: "a"
    required: "b"
  }
}]
Function calling - Schema: it's an OpenAPI schema, written as a protobuf. The protobuf classes are available in the google.ai.generativelanguage client library.
Reference docs: https://ai.google.dev/api/python/google/ai/generativelanguage/FunctionDeclaration
https://ai.google.dev/tutorials/function_calling_python_quickstart
import google.ai.generativelanguage as glm

calculator = glm.Tool(
    function_declarations=[
        glm.FunctionDeclaration(
            name='multiply',
            description="Returns the product of two numbers.",
            parameters=glm.Schema(
                type=glm.Type.OBJECT,
                properties={
                    'a': glm.Schema(type=glm.Type.NUMBER),
                    'b': glm.Schema(type=glm.Type.NUMBER)},
                required=['a', 'b']))])
Function calling - Schema: schemas can be written out as JSON-compatible objects as well.
https://ai.google.dev/tutorials/function_calling_python_quickstart
model = genai.GenerativeModel(
    model_name='gemini-1.0-pro',
    tools=[calculator])
Function calling - Structured data: structured data extraction. You can just ask the model to do it and return JSON.
https://ai.google.dev/tutorials/structured_data_extraction
response = model.generate_content(textwrap.dedent("""\
    Please return JSON describing the people, places, things and relationships from this story using the following schema:

    {"people": list[PERSON], "places": list[PLACE], "things": list[THING], "relationships": list[RELATIONSHIP]}
    PERSON = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    PLACE = {"name": str, "description": str}
    THING = {"name": str, "description": str, "start_place_name": str, "end_place_name": str}
    RELATIONSHIP = {"person_1_name": str, "person_2_name": str, "relationship": str}

    Here is the story:
    """) + story)
Function calling - Structured data: asking for JSON often works, but function calling lets you strictly describe the schema, and with a strict description you can enforce that that is exactly what gets returned.
https://ai.google.dev/tutorials/structured_data_extraction
model = genai.GenerativeModel(
    model_name='gemini-1.0-pro',
    tools=[add_to_database])
Function calling - Structured data: why? Function calling doesn't return a string; it returns data structures, so you don't parse text.
https://ai.google.dev/tutorials/structured_data_extraction
response = model.generate_content(f"""
Please add the people, places, things and relationships
from this story to the database:
{story}
""")

fc = response.candidates[0].content.parts[0].function_call
pprint.pprint(type(fc).to_dict(fc))
{'name': 'add_to_database',
 'args': {'people': [{'description': 'A curious girl with wild hair of '
                                     'midnight blue and eyes that shone like...',
                      'name': 'Luna',
                      ...
          'places': [{'description': 'Nestled between rolling hills and '
                                     'sparkling rivers, where Luna and her '
                                     'family lived.',
                      'name': 'Small town'}],
          'relationships': [{'person_1_name': 'Luna',
                             'person_2_name': 'Elara',
                             'relationship': 'Mother-daughter'},
                            {'person_1_name': 'Luna',
                             'person_2_name': 'Orion',
                             'relationship': 'Father-daughter'}],
          'things': [{'description': 'A dusty, forgotten backpack that Luna '
                                     'found in the attic, which possessed a...',
                      'end_place_name': None,
                      'name': 'Backpack',
                      'start_place_name': 'Attic'}],
          ...
Train the Trainer in AI Image understanding
Image understanding (Multimodality)
Images are just tokens in the input. They can be used for instructions, context, or as the query subject.
import google.generativeai as genai # Use the Gemini vision model. PRO_VISION = 'models/gemini-pro-vision' model = genai.GenerativeModel(PRO_VISION)
!wget -O instrument.jpg -q https://goo.gle/instrument-img

import PIL.Image
img = PIL.Image.open('instrument.jpg')

# Preview the image
(thumb := img.copy()).thumbnail((200, 200))
thumb
response = model.generate_content([
    'What instrument is this?', img,
    'What kinds of music would use it?'])
print(response.text)

This is a pipe organ. It is a musical instrument that produces sound by driving pressurized air through pipes. Organs are often used in churches, concert halls, and other large venues. They can be used to play a wide variety of music, from classical to contemporary.
Image understanding (Multimodality): structured data extraction · image conditioning · RAG
Image conditioning (Multimodality): generate text or structured data from images.
AI Studio Endpoints Images Structured Data
import enum
from pydantic import BaseModel

class Temperament(str, enum.Enum):
    FRIENDLY = 'friendly'
    SHY = 'shy'
    AGGRESSIVE = 'aggressive'

class Dog(BaseModel):
    name: str
    breed: str
    color: str
    age_years: float
    temperament: Temperament

class Customer(BaseModel):
    name: str
    phone: str

class Schedule(BaseModel):
    regular: bool
    days_and_times: list[str]

class DogApplication(BaseModel):
    """Represents an instance of a completed dog-walking application form."""
    dog: Dog
    owner: Customer
    schedule: Schedule
from llama_index.multi_modal_llms import GeminiMultiModal
from llama_index.multi_modal_llms.generic_utils import load_image_urls
from llama_index.program import MultiModalLLMCompletionProgram
from llama_index.output_parsers import PydanticOutputParser

images = load_image_urls(['https://goo.gle/dog-form-img'])
prompt = ("Please extract each of the fields in this "
          "form and provide a JSON representation: ")

# Load the Gemini Pro Vision model.
gemini_llm = GeminiMultiModal(api_key=GOOGLE_API_KEY, model_name=PRO_VISION)
# Build a callable that will map the images to the Pydantic schema.
image_to_application = MultiModalLLMCompletionProgram.from_defaults(
    output_parser=PydanticOutputParser(DogApplication),
    image_documents=images,
    prompt_template_str=prompt,
    multi_modal_llm=gemini_llm,
)

# Image ORM!
form = image_to_application()
assert isinstance(form, DogApplication)
assert form.dog.name == 'Fido'
assert form.owner.name == 'Mark'
assert form.schedule.regular is True
assert 'Wednesday Morning' in form.schedule.days_and_times