Data as the new commodity: AI, Sustainability and Financial Innovation. Prof. Luis Seco
XX Century: Small data. Lots of equations. Physical space.
XXI Century: Lots of data. No equations. Mental space.
Innovation. Evolution.
Ptolemy - بطليموس
Complexity
1. No Poverty
2. Zero Hunger
3. Good Health and Well-being
4. Quality Education
5. Gender Equality
6. Clean Water and Sanitation
7. Affordable and Clean Energy
8. Decent Work and Economic Growth
9. Industry, Innovation and Infrastructure
10. Reduced Inequalities
11. Sustainable Cities and Communities
12. Responsible Consumption and Production
13. Climate Action
14. Life Below Water
15. Life on Land
16. Peace, Justice and Strong Institutions
17. Partnerships for the Goals
Socium driven investing May 24, 1990
100 years of financial innovation
1930: Company accounting information. Value investing.
1950: Time series analysis. Portfolio theory. Computer.
2000: Numerical methods. PC.
2020: Textual data (news, web, social media, documents, filings, …). NLP, LLM, etc. AI.
Meta-data: Traditional data sources are being replaced by unstructured data sources.
Numerical data. Textual data. Sensor data: images, satellite, video, etc.
Database: numbers, words, sensors.
ESG - Sibli (LLM)
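Decision making
Game 1. Choice A: $900 with probability 100%. Choice B: $1000 with probability 90%, $0 with probability 10%.
Game 2. Choice A: $-900 with probability 100%. Choice B: $-1000 with probability 90%, $0 with probability 10%.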
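The two games above can be compared numerically. The following is a minimal sketch, not something shown on the slide: the names `GAMES`, `expected_value` and `variance` are illustrative. It computes the expected payoff and the variance of each choice, which shows that within each game the two choices have the same expected value and differ only in risk.

```python
from fractions import Fraction

# Each choice is a list of (probability, payoff in dollars) pairs,
# copied from the slide. Fractions keep the arithmetic exact.
GAMES = {
    "Game 1": {
        "Choice A": [(Fraction(1), 900)],
        "Choice B": [(Fraction(9, 10), 1000), (Fraction(1, 10), 0)],
    },
    "Game 2": {
        "Choice A": [(Fraction(1), -900)],
        "Choice B": [(Fraction(9, 10), -1000), (Fraction(1, 10), 0)],
    },
}


def expected_value(lottery):
    """Probability-weighted average payoff."""
    return sum(p * x for p, x in lottery)


def variance(lottery):
    """Probability-weighted squared deviation from the expected payoff."""
    mean = expected_value(lottery)
    return sum(p * (x - mean) ** 2 for p, x in lottery)


for game, choices in GAMES.items():
    for name, lottery in choices.items():
        print(game, name, "EV =", expected_value(lottery),
              "variance =", variance(lottery))
```

Because the expected values coincide, any preference between Choice A and Choice B in this sketch comes from the attitude towards risk, not from the average payoff.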
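Game theory
Choice A: $900 with probability 100%. Choice B: $1000 with probability 90%, $0 with probability 10%.
Choice A: $-900 with probability 100%. Choice B: $-1000 with probability 90%, $0 with probability 10%.
Same game! Two players (1 and 2). If they disagree on the value, they trade with each other.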
Risk trading
Code of Hammurabi (1750 BC): beginning of the insurance sector.
Article 2.1.c (Paris Accord): new risk trading paradigm.
Correlation distance (asset management)
Use representations learned from financial time-series data with deep neural networks in a supervised fashion:
- Propose a distance metric for hierarchical clustering using embeddings.
- Perform an inverse-variance allocation (risk parity) with weights calculated between hierarchical clusters.
"Financial Neural Embedding and Its Applications", Luis Seco, Alik Sokolov, Joshua Ha Rim Kim.
Figure panels: Classical correlation distance. Embedding distance.
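As a point of reference for the two ingredients named above, here is a minimal sketch of a classical correlation distance and of inverse-variance weights. The slide does not state formulas, so two assumptions are made: the distance is taken to be the commonly used d = sqrt(0.5 * (1 - rho)), and the weights are taken to be proportional to 1 / variance. The function names and the toy return series are illustrative, and the embedding-based distance from the paper is not reproduced here.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Assumed classical form: sqrt(0.5 * (1 - rho)), in [0, 1]."""
    rho = max(-1.0, min(1.0, pearson(x, y)))  # clamp floating-point noise
    return math.sqrt(0.5 * (1.0 - rho))


def inverse_variance_weights(variances):
    """Weights proportional to 1 / variance, normalised to sum to 1."""
    inverse = [1.0 / v for v in variances]
    total = sum(inverse)
    return [w / total for w in inverse]


# Made-up toy return series, for illustration only.
a = [0.01, -0.02, 0.015, 0.005]
b = [2 * r for r in a]    # moves exactly with a
c = [-r for r in a]       # moves exactly against a

print(correlation_distance(a, b))   # close to 0: perfectly correlated
print(correlation_distance(a, c))   # close to 1: perfectly anti-correlated
print(inverse_variance_weights([0.04, 0.01]))  # lower variance gets more weight
```

In a hierarchical allocation, the same inverse-variance rule would be applied to the variances of two sibling clusters rather than to individual assets. That step is omitted to keep the sketch short.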
Results
Index replication
Non-linear regression (SVR-Climate Risk). Single risk factor: CO2 emissions.
Contributors to GHG emissions
Emission benchmarking
Support vector regression
Wendy Xie, Davis Li, Luis Seco
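Support vector regression differs from least squares mainly in its loss: deviations between actual and fitted values that stay inside a tolerance band of width epsilon cost nothing, and larger deviations are penalised linearly. The sketch below illustrates only that loss, not a full SVR fit and not the authors' model. The function names, the value of `epsilon` and the emission figures are made up for illustration.

```python
def residuals(actual, fitted):
    """Benchmarking gap per observation: actual minus fitted value."""
    return [a - f for a, f in zip(actual, fitted)]


def epsilon_insensitive_loss(actual, fitted, epsilon):
    """Loss used by support vector regression.

    Residuals with absolute value at most epsilon contribute zero;
    beyond that the loss grows linearly with the excess.
    """
    return [max(0.0, abs(r) - epsilon) for r in residuals(actual, fitted)]


# Hypothetical emission figures (arbitrary units), illustration only.
actual = [10.0, 12.0, 20.0]
fitted = [10.5, 12.0, 17.0]

print(residuals(actual, fitted))
print(epsilon_insensitive_loss(actual, fitted, epsilon=1.0))
```

In a benchmarking reading, only the third observation emits more than its fitted benchmark by more than the tolerance, so it is the only one that contributes to the loss.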
Emission Chromodynamics actual - fitted values
Causal relationships
Feature variables represented via Directed Acyclic Graphs (DAG).
We establish relationships using numerical data and LLM for causality (GPT-4).
Feature cluster engineering.
"Towards Automating Causal Discovery in Financial Markets and Beyond", Alik Sokolov, Fabrizzio Sabelli, Behzad Azadie Faraz, Wuding Li, Luis Seco.
Steps for causal DAG
Causal exploration: Generate the set of all possible valid causal relationships for a group of features using GPT-4 and model this as an undirected graph.
Causal inference: Generate a causal DAG from the set of causal relationships obtained in the previous step by determining the directions of all the causal relationships using GPT-4.
Causal validation: Verify the validity of all the causal relationships and correct any mistakes made during the initial generation of the DAG using GPT-4. Verify the corrections made were valid and verify if there are any remaining mistakes in the DAG using GPT-4. If there were any final mistakes, correct these mistakes and return the final version of the causal DAG.
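The steps above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the authors' implementation: the GPT-4 queries are replaced by hypothetical callables (`is_related`, `pick_cause`) backed by a hard-coded lookup, the feature names are invented, and the validation shown is only a structural acyclicity check, which complements but does not replace the LLM-based verification described on the slide.

```python
from graphlib import CycleError, TopologicalSorter
from itertools import combinations


def explore(features, is_related):
    """Exploration: undirected candidate relationships between features."""
    return [(a, b) for a, b in combinations(features, 2) if is_related(a, b)]


def infer(undirected_edges, pick_cause):
    """Inference: orient each candidate edge as (cause, effect)."""
    directed = []
    for a, b in undirected_edges:
        cause = pick_cause(a, b)
        effect = b if cause == a else a
        directed.append((cause, effect))
    return directed


def is_acyclic(directed_edges):
    """Structural validation only: True when the edges form a DAG."""
    predecessors = {}
    for cause, effect in directed_edges:
        predecessors.setdefault(cause, set())
        predecessors.setdefault(effect, set()).add(cause)
    try:
        list(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False


# Hypothetical stand-ins for the GPT-4 answers; a real pipeline would
# query a language model here instead of a lookup table.
KNOWN_CAUSES = {
    frozenset({"interest_rate", "bond_price"}): "interest_rate",
    frozenset({"bond_price", "portfolio_value"}): "bond_price",
}


def is_related(a, b):
    return frozenset({a, b}) in KNOWN_CAUSES


def pick_cause(a, b):
    return KNOWN_CAUSES[frozenset({a, b})]


features = ["interest_rate", "bond_price", "portfolio_value"]
candidates = explore(features, is_related)
dag_edges = infer(candidates, pick_cause)
print(dag_edges, "acyclic:", is_acyclic(dag_edges))
```

Keeping the model calls behind plain callables makes each stage testable without network access, and the acyclicity check catches one class of orientation mistakes cheaply before any further LLM-based verification round.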
Feature cluster relationships
Inter-cluster relationships. Feature relationships. Clusters of individual features.