Chat(GPT)? ChatGPT is a generative artificial intelligence chatbot developed by OpenAI and released on November 30, 2022 By January 2023, ChatGPT had become the fastest-growing consumer software application in history, gaining over 100 million users in two months. As of May 2025, ChatGPT's website is among the 5 most-visited websites globally
Chat(GPT) ChatGPT originated from the problem of language translation. In any language, a word changes meaning according to context. ChatGPT solves the problem by having a database of the likely phrases that can follow a given phrase and it is based on probabilities determined from billions of text on web pages. It has access to all the internet pages developed before September 2021.
The infinite monkey theorem The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, including the complete works of William Shakespeare.
Bayes Rule Suppose the event B has occurred. What is now the probability that A occurs knowing that B has occurred? This is called conditional probability and we denote it by P(A|B).Clearly, P(A|B)P(B) = P(A∩B)= probability that both A and B have occurred .This is called Bayes’ rule.
Graph theory and adjacency matrices A graph is an arrangement of nodes (sometimes called vertices) together with an “adjacency relation” represented by an edge .
Markov process In a certain town, it is observed that if it is sunny today, then there is 80% chance it will be sunny tomorrow and 20% chance of rain. If it is rainy today, then there is 60% chance of sunshine tomorrow and 40% chance of rain. With P as the transition matrix, the situation two days from now is given by P2 and n days from now, by Pn . Markov’s theorem is that Pn converges to a matrix limit which can be determined by solving vP =v.
P as the transition matrix, the situation two days from now is given by P2 and n days from now, by P n . Markov’s theorem is that Pn CONVERGES to a matrix limit which can be determined by solving vP =v. In the above example, v=(3/4,1/4).In other words ,it is sunny 3⁄4 of the time and rainy 1⁄4 of the time. The j- th component of the vector v encodes the probability of the process being in state j at time infinity.
The facial recognition problem How to distinguish between a “cat” and a “dog”? By feeding the computer millions of images of cats and millions of images of dogs, it is able to determine the weight vectors and eventually “learn” to distinguish between the two by deriving the mathematical function.
GPT=generative pre-trained transformer The “pre-trained” means that the weight vectors have been corrected empirically with existing data and this is often called “back-propagation”.