Generative AI models can greatly enhance knowledge management by automating knowledge discovery, extraction, organisation and sharing. In the best case, GPT-based tools will help us make better sense of the ever-growing piles of data and improve decision making at all levels, from short-term personal decisions to long-term strategic planning. In the worst case, however, generative AI tools could add fuel to heated discussions on societally sensitive topics by misinforming service users. Careless application of this technology in crisis and disaster management and related domains, such as climate change adaptation and mitigation or resilient and sustainable development, can lead to expensive and potentially disastrous decisions at all levels of responsibility and further aggravate the already alarming loss of trust in authorities. This talk discusses (1) the underlying reasons why generative AI technology cannot and should not be blindly used as a knowledge base, (2) commonly used methods for minimising the underlying risks and enhancing the usability of generative AI in knowledge management applications, and (3) the author's real-world experience with generative AI models for extracting and synthesising data in the climate change adaptation and mitigation domain.
Slides: 32 pages
Slide Content
Hotel Marriott, Vienna, Austria, 13th September 2024
Enhancing knowledge management in societally sensitive domains with Generative AI
Denis Havlik and Aradina Chettakattu, AIT Austrian Institute of Technology GmbH
DSC DACH Conference on AI-Powered Sustainability & Responsible AI, Vienna, 11-13 September 2024
What are "societally sensitive domains"? Some examples:
- Climate change adaptation and mitigation: a loud minority of the population doesn't believe the problem exists, that it is human-made, or that we can or should do anything about it.
- Pandemics management: common-sense measures against COVID resulted in a huge backlash and strengthened populist parties in Europe and in the US.
- Energy transition: most of the technology is already in place, and often already less expensive than traditional solutions, but "big oil" and co. have a huge interest in slowing it down.
- Societal cohesion, sustainability, migration, …
Why knowledge management matters for these domains:
- Information overflow: stakeholders cannot make informed decisions because there is too much information available.
- Misinformation and disinformation: the information space is polluted with disinformation and misinformation; "good" information is scattered across project and policy pages that most people aren't aware of; Google & co. cannot discern between information and (well-made) disinformation, so web search is a minefield.
- Lack of actionable knowledge and wisdom: it's difficult to find exactly what one needs, when one needs it, and in a form that helps to make decisions.
Illustration: https://sketchplanations.com/dikw
" Dieses Foto " von Unbekannter Autor ist lizenziert gemäß CC BY Why is misinformation and disinformation a problem? DSC DACH conference „AI-powered sustainability & responsible AI“, Vienna 11-13 September 2024 " Dieses Foto " von Unbekannter Autor ist lizenziert gemäß CC BY-NC-ND " Dieses Foto " von Unbekannter Autor ist lizenziert gemäß CC BY-SA " Dieses Foto " von Unbekannter Autor ist lizenziert gemäß CC BY-NC-ND
What is the worst that can happen?
What is already happening?
Why do we need AI?
- The amount of knowledge that needs to be captured, analysed, organised, packaged and published in some actionable form is enormous and growing fast.
- Knowledge management is tedious work for highly skilled people who would prefer working on the creation or application of knowledge instead.
- AI tools could at least make the work more productive, if not completely automate it.
Why MAIA? What's MAIA got to do with AI?
The Project | Context
The MAIA project aims to act as an impact multiplier of climate research projects funded under the Horizon Europe and Horizon 2020 programmes.
MAIA = project-results impact multiplier
The Project | Why MAIA?
Current issues:
1. Limited reach: climate research projects remain largely fragmented, resulting in limited reach, diffusion and exploitation.
2. Less visibility: the visibility of project outputs often diminishes significantly once a project comes to its end.
3. Reduced impact: the lack of realistic business cases has limited the true impact and cost-efficiency of previous programmes.
Combined, these issues result in inefficiency and lost opportunities to expedite progress in reducing the climate vulnerability of Europe's regions.
The Project | Context
The project aims to make the currently dispersed knowledge more:
- Interoperable
- Accessible
- Usable
and to render economically sustainable outcomes.
Preliminary tests (1)
Prompt 1: "Key performance indicators show performance achieved and help track the progress of an activity. Can you list all the indicators mentioned in the latest IPCC AR6 reports?"
Chat systems prompted: OpenAI ChatGPT 3.5, ChatClimate.ai GPT-3.5-turbo/GPT-4 and IPCC Chat GPT.
Findings:
- GPT-3.5 wasn't aware of the IPCC AR6 report (not included in its training materials, not fine-tuned, and not using Retrieval-Augmented Generation!).
- ChatClimate with GPT-4 provided a partial list, but claimed the report doesn't contain enough information.
- IPCC Chat claims the report doesn't contain this information (it is not clear which GPT version it uses and whether the model was fine-tuned or only used in RAG mode).
Common errors in Generative AI systems used for knowledge extraction in the climate action domain; arXiv:2402.00830
Preliminary tests (2)
Prompt 2: "Let's test this approach by looking at a climate-related question together and then checking the sources that you use to answer my question. Could you tell me what the environmental impact of AI-generated language models is and provide scientific references in the text and links to the sources mentioned?"
Prompt 3: "The references you have provided for your answer are inaccurate and therefore cannot be verified. This is a problem that I noticed when asking you to provide sources for the information you give on a multitude of different issues. Do you think this is a problem that is endemic to the way you work, and what do you think the consequences could be for the spread of misinformation about issues such as climate change?"
Chat systems prompted: OpenAI ChatGPT 3.5.
Findings: GPT-3.5 provided a relatively good answer but botched the details and references. Most references weren't entirely relevant, and one was freely invented.
Common errors in Generative AI systems used for knowledge extraction in the climate action domain; arXiv:2402.00830
Why do generative AI systems lie to us?
- A large part of the training material is from "the Web". These materials inevitably contain errors, especially in domains where misinformation and disinformation are widespread.
- LLMs aren't knowledge databases, but "probabilistic models of knowledge bases" (T. G. Dietterich, https://youtu.be/cEyHsMzbZBs). Interpolating and extrapolating from the knowledge they have is what they were designed to do!
- Companies and researchers developing the AI models aren't experts in climate change or crisis management. They may also lack the incentive to tune their models for this type of work, and may not share "our" ethical and societal values.
Can we "fix" this?
Generative models will always "invent" their answers, but there are plenty of ways to improve the outputs.
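One of the most common ways to improve the outputs is Retrieval-Augmented Generation (RAG), used in several of the experiments below: ground the model in retrieved passages instead of letting it answer from its probabilistic parametric memory. The following is a minimal, self-contained sketch of the idea; the bag-of-words "embedding", the function names and the prompt wording are illustrative assumptions, not the project's actual code (a real system would use a proper embedding model and vector database).

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term-frequency vector.
    # Real RAG systems use dense neural embeddings instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, documents: list, k: int = 2) -> list:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list) -> str:
    # Ground the model in retrieved passages and ask it to admit gaps,
    # instead of inventing an answer.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer ONLY from the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "IPCC AR6 discusses adaptation indicators for coastal regions.",
    "The marketplace lists EU climate projects from CORDIS.",
    "KeyBERT extracts keywords using BERT embeddings.",
]
prompt = build_prompt("Which indicators does IPCC AR6 mention?", docs)
```

The resulting prompt would then be sent to the LLM; as noted later in the talk, this suppresses hallucinations but does not eliminate them.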
Should we just give up and try again in 10 years?
Decision: concentrate on using generative AI as a support tool for experts, until we have more confidence in its answers.
Key Technologies Used/Tested
- Keyword extraction: KeyBERT
- Databases: MinIO (S3 document storage); Marqo (vector database, good for AI-powered semantic search across unstructured data, including multimedia)
- LLM frameworks: LlamaIndex (ideal for applications needing efficient and sophisticated querying of text data using LLMs, such as chatbots or knowledge systems); Ollama (a platform that simplifies running open-source LLMs locally by managing the complexities)
- AI cores: GPT-3.5 Turbo, Mixtral, Gemma 2 and LLaMA 3.1
SummarAIse tool
- AI extracts structured knowledge from input data. Expected to be useful for search and for cross-correlating content in knowledge services (far from perfect, but humans are also quite bad at this).
- Even if the results aren't perfect, they are helpful for experts: it's easier to check them than to write everything by hand.
- Prompt engineering has a huge impact on the quality of the responses.
- This is a "chat with your document(s)" type of application, but designed to work unsupervised, in batch mode.
- Typically, the service should be used by experts (e.g. scientists, decision makers, knowledge curators), and the results should not be shown to the public before validation. A "publish first, review later" model may be suitable for some applications.
Experiment 1: Characterising the Solutions
Background
Project: Path2DEA
Evaluation of agroecological farming tools based on certain performance indicators and rating criteria.
Goal: find out what the different tools do, so that we can suggest the best tools for different use cases in the next step.
Experiment 1: Characterising the Solutions
Setup
- Defined rating criteria P1, P2, N1, N2. A P1 or N1 rating marks a positive or negative effect of the tool on an indicator with a direct reference in the document; a P2 or N2 rating marks a positive or negative effect inferred or deduced by the tool evaluator, in this case the LLM tool.
- Defined 80 indicators and their descriptions.
- A description and detailed explanation of the tool that needs to be assessed.
- Feed the three inputs above into the LLM using Retrieval-Augmented Generation (RAG).
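The setup above can be sketched as a prompt template plus a parser that maps the model's free-text reply back onto the controlled P1/P2/N1/N2 vocabulary. This is only an illustration of the approach under stated assumptions: the prompt wording, function names and scale descriptions are hypothetical, not the project's actual code, and in the experiment the tool description reached the LLM via RAG rather than directly in the prompt.

```python
import re

# Rating scale from Experiment 1: P/N = positive/negative effect,
# 1 = directly referenced in the document, 2 = inferred by the evaluator.
RATING_SCALE = {
    "P1": "positive effect, with a direct reference in the document",
    "P2": "positive effect, inferred by the evaluator",
    "N1": "negative effect, with a direct reference in the document",
    "N2": "negative effect, inferred by the evaluator",
}

def rating_prompt(indicator: str, description: str, tool_text: str) -> str:
    # Assemble the three inputs (criteria, indicator, tool description)
    # into one instruction for the LLM.
    scale = "\n".join(f"{label}: {meaning}" for label, meaning in RATING_SCALE.items())
    return (
        f"Rate the tool's effect on the indicator '{indicator}' ({description}) "
        f"using exactly one label from:\n{scale}\n"
        "Answer as 'RATING: <label>' followed by a one-sentence justification.\n\n"
        f"Tool description:\n{tool_text}"
    )

def parse_rating(reply: str):
    # Constrain the model's free-text reply back to the controlled vocabulary;
    # return None when no valid label is found.
    match = re.search(r"RATING:\s*(P1|P2|N1|N2)", reply)
    return match.group(1) if match else None
```

Forcing a fixed answer format like this makes batch evaluation of 80 indicators feasible, though, as the results below show, the ratings themselves remained unreliable.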
Experiment 1: Characterising the Solutions
Results and Conclusions
- The models appeared unbiased, but provided nearly identical ratings across different indicators, suggesting difficulty with the rating task.
- While the ratings were inconsistent, the reasoning and explanations behind each indicator were more reliable and accurate.
- The models struggled to assign negative ratings, so most indicators received positive scores.
Experiment 2: Assessing the Regional Adaptation Plans
Background
- Similar to Experiment 1, but the input data is sensitive.
- Instead of farming tools, here we assess the eligibility of different cities for a certain use case using specific evaluation criteria. The ratings were from 1 to 15, with 1 being the lowest.
- Goal: find the 5 best cities that match the use case from roughly 50 candidate cities.
Experiment 2: Assessing the Regional Adaptation Plans
Results and Conclusions
- As in Experiment 1, the LLMs were unbiased and gave similar scores to each city.
- Almost all criteria received above-average evaluation scores (in the range 11-15).
- In their reasoning, the LLMs tried to be mostly positive.
Experiment 3: RAG-based "Matching" Using a Vector DB
Background
Project: MAIA+
Access to the marketplace via the 'MAIA Discovery Services' uses a widget that provides tailored results based on the user type and descriptions. The results are drawn from EU project data in CORDIS.
Experiment 3: RAG-based "Matching" Using a Vector DB
Setup
1. Feed data: from the marketplace into the vector DB.
2. Send the question/query to the vector DB, find the top 3 projects that match the query, and pass them to the LLM.
3. An LLM of the user's choice interprets the project context provided by the vector database.
4. Results: the answer from the LLM plus details of the top 3 matched projects.
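The matching step above can be sketched as follows. This is a simplified stand-in, not the MAIA+ implementation: it uses word-overlap (Jaccard) similarity in place of the embedding-space similarity a real vector database such as Marqo computes, and the project names and texts are invented for illustration.

```python
def jaccard(a, b):
    # Word-overlap similarity; a crude stand-in for embedding similarity.
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def top_matches(query, projects, k=3):
    # Step 2: rank all stored projects against the query, keep the top k.
    q = query.lower().split()
    return sorted(projects,
                  key=lambda pid: jaccard(q, projects[pid].lower().split()),
                  reverse=True)[:k]

def llm_context(query, projects, k=3):
    # Step 3: only the top-k matched descriptions are handed to the LLM,
    # which then interprets and rephrases them for the user type.
    picked = top_matches(query, projects, k)
    return "\n".join(f"{pid}: {projects[pid]}" for pid in picked)

# Hypothetical project records, as they might come from CORDIS data.
projects = {
    "PROJ-A": "Urban flood adaptation measures for coastal cities",
    "PROJ-B": "Soil health monitoring for agroecological farming",
    "PROJ-C": "Heatwave early warning systems for cities",
}
```

Because only the top-k descriptions fit into the LLM context, relevant projects ranked below the cut-off are silently dropped, which is exactly the context-limit issue noted in the conclusions.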
Experiment 3: RAG-based "Matching" Using a Vector DB
Results and Conclusions
- Improved results by using a vector DB.
- RAG can suppress hallucinations by LLMs (not eliminate them!).
- LLMs are good at rephrasing content for different user types.
- Due to LLM context limits in RAG, some relevant projects may be missed when delivering content to users.
Experiment 4: Keyword Extraction and Tagging
Background
Project: MAIA+
Tag different EU-funded climate projects in CORDIS with relevant keywords. The two approaches are keyword extraction directly from the text, and keyword assignment, i.e. assigning keywords from a well-defined taxonomy such as the IPCC's.
Experiment 4: Keyword Extraction and Tagging
Setup
User input: documents in PDF/DOCX format and the vocabularies/taxonomies to use.
KeyBERT process:
- Step 1: KeyBERT converts the text to BERT embeddings, retrieves N-grams and embeds these N-grams.
- Step 2: It computes the cosine similarity between the document embedding and each N-gram embedding.
- Step 3: It sorts the keywords in descending order of score and returns the best matching terms.
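The three KeyBERT steps can be illustrated with a self-contained sketch. Note the deliberate simplification: a toy term-frequency vector stands in for BERT embeddings, so this version only captures word overlap, whereas real KeyBERT captures semantic similarity; all function names here are ours, not KeyBERT's API.

```python
import math
from collections import Counter

def toy_embed(text):
    # Toy term-frequency "embedding"; real KeyBERT uses BERT sentence
    # embeddings, which capture meaning rather than plain word overlap.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def candidate_ngrams(words, max_n=2):
    # Step 1 (candidates): all uni- and bigrams from the document.
    return {" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)}

def extract_keywords(doc, top=3):
    words = doc.lower().split()
    doc_vec = toy_embed(doc)
    # Steps 2-3: score each candidate against the whole document,
    # then sort by descending similarity and keep the best.
    ranked = sorted(candidate_ngrams(words),
                    key=lambda c: cosine(toy_embed(c), doc_vec),
                    reverse=True)
    return ranked[:top]
```

Even in this toy version, candidates containing the document's most frequent terms score highest, which mirrors how KeyBERT favours phrases close to the document embedding.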
Experiment 4: Keyword Extraction and Tagging
Results and Conclusions
- Good for keyword extraction, i.e. extracting from the text the keywords that best describe the document.
- Doesn't work well for keyword assignment or topic modelling, i.e. it won't find semantically or contextually related words across the text and the taxonomies. For example, if the document contains the word "CO2" and the taxonomy has the keyword "carbon dioxide", KeyBERT won't find a match between the two.
- In the world of LLMs, there's no straightforward or advanced method for effectively assigning keywords.
- Alternative approach: expand the taxonomies with synonyms, find the best matching terms using AI, and map them back to the taxonomy keywords.
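The alternative approach mentioned above can be sketched as a synonym-expanded lookup that maps surface forms back to canonical taxonomy terms. The synonym table below is a hand-written, hypothetical example; in practice it could be generated with an LLM or taken from a thesaurus and then reviewed by a knowledge curator.

```python
# Hypothetical synonym expansion of a taxonomy: each canonical taxonomy
# term maps to the surface forms that may appear in documents.
SYNONYMS = {
    "carbon dioxide": ["co2", "carbon dioxide"],
    "greenhouse gas": ["ghg", "greenhouse gas", "greenhouse gases"],
}

def assign_taxonomy_terms(text, synonyms):
    # Match any known surface form in the text and map it back to the
    # canonical taxonomy keyword, solving the "CO2" vs "carbon dioxide" gap.
    lowered = text.lower()
    return {canonical for canonical, variants in synonyms.items()
            if any(variant in lowered for variant in variants)}
```

Simple substring matching like this would need refinement for production (word boundaries, ambiguity), but it shows how the taxonomy mapping that KeyBERT misses can be recovered.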
Final thoughts
- Generative models occasionally produce false statements BY DESIGN. Authorities should be aware of the risks before endorsing "GPT chat" type applications for "societally sensitive" domains (and concentrate on applications for experts for now).
- Some, but not all, of the issues encountered can be resolved with prompt engineering, RAG, fine-tuning, …
- At the training phase, misinformation and bias can be suppressed through reinforcement learning. However, this enforces the moral and ethical norms of the model owner => this can be dangerous.
Don't ignore GPT, but don't trust it too much! OpenAI says it's perfect, but it's really not!
(Inspired by the Baby Lasagna song "Don't hate yourself, but don't love yourself too much")
Acknowledgements and disclaimer
The information presented in this presentation has been generated within the context of the MAIA project, in cooperation with several other EU-funded research projects (KNOWING, PATH2DEA, ClimEmpower, ICARIA). The MAIA (Maximising impact and accessibility of European climate research) project has received funding from the European Union's Horizon Europe research and innovation programme under grant agreement No 101056935. The contents of this presentation do not necessarily reflect the opinion or the position of the European Commission.