Literature review for prompt engineering of ChatGPT.pptx


About This Presentation

prompt engineering for ChatGPT


Slide Content

Extracting accurate materials data from research papers with conversational language models and prompt engineering

Background
There is a growing effort to replace manual data extraction with automated extraction, but automated approaches still require substantial effort, expertise, and coding. ChatExtract can fully automate highly accurate data extraction with minimal initial effort and background knowledge, reaching close to 90% precision. Databases for critical cooling rates of metallic glasses and for yield strength are developed. Prompt engineering has already become standard practice in the field of image generation.

The data extraction – two main stages
1. Initial classification with a simple prompt that filters out all sentences that do not contain data.
2. A series of categorized prompts that control the data extraction:
   2.1. Split texts into single-valued and multi-valued (single entries are more likely to be extracted properly).
   2.2. Include the possibility that some data may be missing from the text.
   2.3. Use uncertainty-inducing prompts that make the model re-analyze the text.
   2.4. Embed all the questions in a single conversation.
   2.5. Enforce a strict Yes or No answer to reduce uncertainty.
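A minimal sketch of this two-stage flow, assuming the OpenAI Python client (openai >= 1.0); the prompt wording, model name, and helper names are illustrative assumptions, not the paper's exact prompts.

```python
# Sketch of a ChatExtract-style two-stage flow (illustrative prompts only).
from openai import OpenAI

client = OpenAI()   # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4"     # placeholder model name

def ask(messages):
    """Send a conversation to the model and return its reply text."""
    response = client.chat.completions.create(model=MODEL, messages=messages, temperature=0)
    return response.choices[0].message.content.strip()

def contains_data(sentence: str, prop: str = "bulk modulus") -> bool:
    """Stage 1: simple Yes/No classification that filters out sentences without data."""
    messages = [{"role": "user",
                 "content": f'Does the following sentence report a value of {prop}? '
                            f'Answer Yes or No only.\n\n"{sentence}"'}]
    return ask(messages).lower().startswith("yes")

def is_multi_valued(sentence: str, prop: str = "bulk modulus") -> bool:
    """Stage 2 entry point: route the text to the single- or multi-valued branch."""
    messages = [{"role": "user",
                 "content": f'Text: "{sentence}"\nDoes the text report more than one value '
                            f'of {prop}? Answer Yes or No only.'}]
    return ask(messages).lower().startswith("yes")
```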

Single or multiple values
Texts with only a single value are much simpler. Texts with multiple values need a careful analysis of the relations between words to determine the correspondence. Each extracted material property is a triplet: Material, Value, Unit.
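A minimal record type for these triplets; the class and field names are assumptions for illustration, not from the paper's code.

```python
# Assumed container for one extracted datapoint: (Material, Value, Unit).
from dataclasses import dataclass

@dataclass
class PropertyRecord:
    material: str   # e.g. a composition or alloy name
    value: float    # numeric value of the property
    unit: str       # e.g. "GPa" for bulk modulus, "K/s" for critical cooling rate
```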

Then the text is analyzed. For a single-valued text, questions about the data can be asked directly, with a separate question for each item. If a negative answer is given, the text is discarded and no data is extracted. For a multi-valued sentence, the model is asked to provide structured data in the form of a table. The text is provided repeatedly with each prompt; this repetition helps the model maintain all the details about the text being analyzed. Enforcing Yes or No answers enables the automation of the data extraction process.
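A rough sketch of the multi-valued branch, reusing the ask helper from the earlier sketch; the prompt wording and the None convention are assumptions.

```python
# Illustrative follow-up prompts for a multi-valued text (assumed wording).
def extract_multi(ask, text: str, prop: str = "bulk modulus") -> str:
    """Ask for a structured table; the full text is repeated with every prompt
    so no details are lost between conversation turns."""
    messages = []

    def turn(prompt: str) -> str:
        messages.append({"role": "user", "content": f'Text: "{text}"\n{prompt}'})
        answer = ask(messages)
        messages.append({"role": "assistant", "content": answer})
        return answer

    # A gatekeeping question with an enforced Yes/No answer keeps the pipeline automatic.
    if not turn(f"Does the text give values of {prop} for specific materials? "
                "Answer Yes or No only.").lower().startswith("yes"):
        return ""  # negative answer: discard the text, extract nothing

    # Structured-output request for multi-valued texts.
    return turn("List all materials with their values and units as a table with the "
                "columns Material, Value, Unit. If an item is missing, write None.")
```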

Blue boxes represent prompts given to the model. Gray boxes are instructions to the user: 'Yes', 'No', 'None'. The bold text in '[ ]' is to be replaced with the appropriate value of the named item: material, value, or unit.

The performance of the ChatExtract approach was investigated on multiple property examples: bulk modulus, metallic glass critical cooling rate, and high entropy alloy yield stress. The reason for choosing bulk modulus: papers very often report other elastic properties, such as Young's modulus or shear modulus, which have similar names and ranges of values, and other measurements have units similar to those of bulk modulus.

1. Single-value sentences have higher precision and recall than multi-valued sentences for the same models.
2. Two core features of ChatGPT are exploited in ChatExtract:
   2.1. Redundant prompts that introduce the possibility of uncertainty about the previously extracted data.
   2.2. The information about previous prompts and answers is kept, which allows follow-up questions to relate to the entire conversation.
3. Removing the follow-up questions decreases the overall precision to just 42.7% and 26.5% for ChatGPT-4 and ChatGPT-3.5, from 90.8% and 70.1% respectively.
4. Other models, LLaMA2-chat and ChemDataExtractor2 (CDE2), have been tested as well.
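A sketch of the two features in point 2: the whole conversation is carried forward, and a redundant, uncertainty-inducing follow-up gives the model a chance to retract a value before it is accepted (prompt wording assumed).

```python
# Redundant re-check within a retained conversation (assumed prompt wording).
def confirmed(ask, messages: list, candidate: str, prop: str) -> bool:
    """Uncertainty-inducing follow-up: keep a value only if the model re-confirms it."""
    messages.append({"role": "user",
                     "content": f'You reported "{candidate}" as the {prop}. Re-analyze the text '
                                f"above carefully: is this value really a {prop} reported in the "
                                "text? Answer Yes or No only."})
    answer = ask(messages)                   # the follow-up sees the entire conversation so far
    messages.append({"role": "assistant", "content": answer})
    return answer.lower().startswith("yes")  # only re-confirmed values are kept
```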

Application to tables and figures
Data can also be found in tables and figures. Analyzing figures is still an ongoing challenge. For tables, the same method as in the general ChatExtract workflow is applied. For figures, only the figure caption is used in the classification; if it is positive, the figure can be downloaded for later manual data extraction. Precision is very high for extracting tables (98%). Assessing accuracy for figure classification is more difficult: for example, out of 436 figures, 45 were manually classified as containing bulk modulus data (80% precision). Some figures require expertise to extract the data.
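A minimal sketch of the caption-only figure classification, reusing the assumed ask helper; the prompt wording is illustrative.

```python
# Caption-only classification of figures (assumed prompt wording).
def figure_is_relevant(ask, caption: str, prop: str = "bulk modulus") -> bool:
    """Positive captions mark figures for download and later manual data extraction."""
    messages = [{"role": "user",
                 "content": f'Figure caption: "{caption}"\n'
                            f"Is this figure likely to report {prop} data? Answer Yes or No only."}]
    return ask(messages).lower().startswith("yes")
```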

Results of real-life data extraction: critical cooling rates of metallic glasses
The databases are presented in raw, cleaned, and standardized forms obtained via manual post-processing and machine learning. Despite challenges, ChatExtract demonstrated reasonable precision and recall in data extraction. The final standardized database contained 557 datapoints, with duplicates retained. A separate database focusing on metallic materials was generated, containing 298 unique datapoints. ChatExtract showed efficiency compared to manual methods, with consistent performance.

Results of real-life data extraction: yield strength of high entropy alloys
10,269 raw datapoints were extracted, resulting in 8,900 cleaned datapoints. The data ranged from 12 MPa to 19.16 GPa, with a peak around 400 MPa. 2,456 datapoints were extracted from tables, and 1,848 figures were classified as relevant. ChatExtract's general and transferable approach proves effective for diverse data extraction tasks. Expansions to the ChatExtract workflow are proposed to handle additional constraints or data types. Assessing the accuracy of the expanded ChatExtract functionalities would require further refinement and testing.

Conclusions
ChatGPT extracts high-quality materials data from research texts with engineered prompts. It achieved over 90% precision and 87.7% recall on bulk modulus data, and 91.6% precision and 83.6% recall on critical cooling rates. The success is attributed to purposeful redundancy, uncertainty, and information retention within the conversation. Two databases were developed: critical cooling rates for metallic glasses and yield strengths for high entropy alloys. The quality of the data and the simplicity of the approach suggest the potential to replace labor-intensive manual methods. ChatExtract's independence from a specific model suggests potential for improvement with newer LLMs.