Profiling Cryptocurrency Influencers with Few-shot Learning 2023

frandzi 37 views 38 slides Sep 09, 2024

About This Presentation

Within the framework of the Pro2Haters project, and as part of its third milestone, we organized the international shared task "Profiling Cryptocurrency Influencers with Few-shot Learning". Its goal has been, in a low-resource setting (the emerging crypto world, where there is a lot of data but few cura...


Slide Content

Profiling Cryptocurrency Influencers with Few-shot Learning - Overview. Mara Chinea-Rios 1*, Ian Borrego-Obrador 1, Marc Franco-Salvador 1, Francisco Rangel 1, and Paolo Rosso 2. 1 Symanto Research, Spain; 2 Universitat Politècnica de València. *[email protected]

Background. Author profiling: identifying personal traits such as demographics, personality, native language, and language variety from writings. Previous tasks: Demographics: Gender (Rangel et al., 2013; Rangel et al., 2015; Rangel et al., 2016; Rangel et al., 2017; Rangel et al., 2018); Age (Rangel et al., 2013; Rangel et al., 2014; Rangel et al., 2015); Bots (Rangel et al., 2019); Fake News (Rangel et al., 2020); Hate Speech (Rangel et al., 2021); Profiling Irony and Stereotypes Spreaders (IROSTEREO) (Ortega Bueno et al., 2022).

Cryptocurrency Profiling - Motivation. Cryptocurrencies have massively increased in popularity in recent years. This complex and dynamic ecosystem encompasses a wide range of elements, including various cryptocurrencies such as Bitcoin and Ethereum, smart contracts, and decentralized applications (DApps), among others. We believe that many users trust social media influencers to bridge their lack of knowledge and later make investment decisions. As a consequence, profiling those influential actors becomes relevant.

Profiling Cryptocurrency Influencers with Few-shot Learning

Task goal. We aim to profile cryptocurrency influencers in social media from a low-resource perspective. Specifically, we focus on English Twitter posts for three different subtasks: Low-resource influencer profiling (subtask 1), classes: (1) non-influencer, (2) nano, (3) micro, (4) macro, (5) mega. Low-resource influencer interest identification (subtask 2), classes: (1) technical information, (2) price update, (3) trading matters, (4) gaming, (5) other. Low-resource influencer intent identification (subtask 3), classes: (1) subjective opinion, (2) financial information, (3) advertising, (4) announcement. Profile examples: {"twitter user id":"05ca545f2f700d0d5c916657251d010b","texts":[{"text":"I got $20 on Boston winning tonight, who trying to bet? \ud83d\udc40"},{"text":"Is there an alternative search engine besides Google? I hate when I search a question and the answer has absolutely nothing to do with the question"}],"tweet ids":[{"tweet id":"65408feeb147b509e4bc47280c062e16"},{"tweet id":"10a27a0fae34f8411a2ed3b0631db42d"}]} {"twitter user id":"062492818c984febba843b650a4a602e","texts":[{"text":"@1inch, my favorite aggregator has a sweet booth this year, if you ignore the glare lol https:\/\/t.co\/BJdG60wpKR"},{"text":"Takes 2 $Matic or 100 $BANK on polygon. That's one of the lowest cost ways to now flex commitment to community, and value the work of your peers in so doing."}],"tweet ids":[{"tweet id":"841834bda2a5703a27a9e2a2e0e11471"},{"tweet id":"6539be1639225a7b362e71dba7dcf18a"}]}
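The profile format above is one JSON object per user. A minimal sketch of reading it, using the field names visible in the examples ("twitter user id", "texts", "tweet ids"); the helper name and the shortened sample record are illustrative:

```python
import json

def parse_profile(line: str) -> dict:
    """Parse one JSON-encoded user profile into a flat dict."""
    obj = json.loads(line)
    return {
        "user_id": obj["twitter user id"],
        "texts": [t["text"] for t in obj["texts"]],
        "tweet_ids": [t["tweet id"] for t in obj["tweet ids"]],
    }

# Shortened sample record based on the first example above.
sample = ('{"twitter user id": "05ca545f2f700d0d5c916657251d010b", '
          '"texts": [{"text": "I got $20 on Boston winning tonight"}], '
          '"tweet ids": [{"tweet id": "65408feeb147b509e4bc47280c062e16"}]}')
profile = parse_profile(sample)
```

In a dataset file with one record per line, the same function would be applied line by line.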

Datasets - Methodology. We built the datasets as follows: (1) identified users who are crypto influencers; (2) classified their interest and intent.

Datasets - Methodology. Identified users who are crypto influencers. Crypto influencer candidates must satisfy two conditions: tweets that contain the ticker cashtag of different crypto projects, e.g. $ETH, $BTC, $UNI; and tweets that mention the name of the crypto projects, e.g. Ethereum, Bitcoin, and Uniswap. We extracted the number of followers for those users and used a scale to determine their influence grade: Non-influencer: individuals with a minimal social media following; 0 to 1k followers. Nano influencers: individuals with a small social media following; 1k to 10k followers. Micro influencers: individuals with a moderately sized social media following; 10k to 100k followers. Macro influencers: individuals with a substantial social media following; 100k to 1M followers. Mega influencers: individuals with an extensive social media following; more than 1M followers. Cashtags are like hashtags, but instead of a hash sign (#) they use a dollar sign ($). They are used to tag tweets related to a specific cryptocurrency or stock.
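The candidate condition and follower scale above can be sketched as follows. The thresholds come from the slide (boundary placement at exactly 1k, 10k, etc. is an assumption, since the slide gives inclusive ranges); the cashtag regex is our own guess at a reasonable pattern, not the organizers' actual filter:

```python
import re

# Assumed pattern: "$" followed by 2-6 letters, e.g. $ETH, $BTC, $UNI.
CASHTAG = re.compile(r"\$[A-Za-z]{2,6}\b")

def has_cashtag(tweet: str) -> bool:
    """Condition 1: the tweet contains a crypto ticker cashtag."""
    return bool(CASHTAG.search(tweet))

def influence_grade(followers: int) -> str:
    """Map a follower count to the five-class influence scale."""
    if followers <= 1_000:
        return "non-influencer"
    if followers <= 10_000:
        return "nano"
    if followers <= 100_000:
        return "micro"
    if followers <= 1_000_000:
        return "macro"
    return "mega"
```

Note that the letters-only pattern deliberately ignores dollar amounts such as "$20", which use digits.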

Datasets - Methodology. Classified their interest and intent. Annotation process: we provided a list of labels and definitions for each task. Interest labels: Technical information: posts related to the technical aspects of a crypto project or asset. Trading matters: posts related to technical analysis, volatility, trading decisions, etc. Price update: posts related to price, price speculation, price prediction, etc. Gaming: posts mentioning a project or topic related to gaming. Other: if none of the other labels applies. Intent labels: Announcement: posts related to the announcement of events, progress, etc. Subjective opinion: posts with some subjective opinion, response, or advice. Financial information: posts related to price updates without subjective comment. Advertising: posts introducing or promoting a project, account, user, website, etc. Corpus construction: three independent annotators labelled the data, with majority voting to select the class; in case of annotators' disagreement, the tweet was discarded. Fleiss' kappa agreement between the annotators: interest -> 0.65; intent -> 0.67.
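The corpus-construction rule can be sketched in a few lines: with three annotators, keep the label chosen by at least two, and discard the tweet when all three disagree (function name and the `None` sentinel are our own choices for illustration):

```python
from collections import Counter

def majority_label(labels):
    """Return the majority label of the annotators, or None to discard."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else None

# Two annotators agree -> keep "gaming"; full disagreement -> discard.
kept = majority_label(["gaming", "gaming", "other"])
discarded = majority_label(["gaming", "other", "price update"])
```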

Datasets - Statistics. Total number of users per class:
SubTask 1, train: macro: 32, mega: 32, micro: 30, nano: 32, non-influencer: 32. Test: macro: 42, mega: 45, micro: 46, nano: 45, non-influencer: 42.
SubTask 2, train: technical information: 64; trading matters: 64; price update: 64; gaming: 64; other: 64. Test: technical information: 42; trading matters: 112; price update: 108; gaming: 40; other: 100.
SubTask 3, train: announcement: 64; subjective opinion: 64; financial information: 64; advertising: 64. Test: announcement: 37; subjective opinion: 160; financial information: 43; advertising: 52.
Train partition: SubTask 1, a maximum of ten English tweets per user; SubTasks 2 and 3, one English tweet per user. Test partition: a maximum of fifty English tweets per user.

Evaluation measures. Since the datasets are very unbalanced, we used the Macro F1 measure and ranked the systems' performance by that metric. We also included accuracy and per-class F1 in our analysis to show the participants' performance.
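Macro F1 averages per-class F1 with equal weight, so minority classes count as much as majority ones, which is why it suits unbalanced test sets. A pure-Python sketch (equivalent to `sklearn.metrics.f1_score(..., average="macro")`):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: mean of per-class F1 scores over all classes seen."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```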

Baselines. Random: labels are randomly selected with equal probability. Character n-grams with logistic regression (user-char-lr): we use [1..5] character n-grams with a TF-IDF weighting calculated using all texts. LDSE: the Low Dimensionality Semantic Embedding (LDSE) method represents documents on the basis of the probability distribution of occurrence of their tokens in the different classes, e.g. non-influencer vs. mega; the distribution of weights for a given document is expected to be closer to the weights of its corresponding class. T5-large (bi-encoders) - ZS: zero-shot (ZS) text classification employing a T5-large model with bi-encoders. T5-large (label tuning) - FS: few-shot (FS) text classification employing a T5-large model with a label-tuning training strategy.
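The user-char-lr baseline can be sketched with scikit-learn: TF-IDF-weighted character 1-5 grams fed to logistic regression. This is our reconstruction of the described setup, not the organizers' exact code, and the tiny training set below is invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Character [1..5]-grams with TF-IDF weighting, as in the user-char-lr baseline.
user_char_lr = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 5)),
    LogisticRegression(max_iter=1000),
)

# Made-up toy data; the real baseline is trained on all user texts.
texts = ["$BTC to the moon", "$ETH gas fees update",
         "lovely weather today", "my cat is sleeping"]
labels = ["influencer", "influencer", "non-influencer", "non-influencer"]
user_char_lr.fit(texts, labels)
pred = user_char_lr.predict(["$BTC price update"])[0]
```

Character n-grams are a strong low-resource baseline because they capture orthographic cues (cashtags, emoji, capitalization) without any language-specific tokenization.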

Participation. 27 teams from 8 countries participated in the cryptocurrency influencer task; 12 of them submitted their working notes 1. 1 We received a description of their systems from the teams that did not submit working notes. https://mapchart.net/world.html

Approaches. Participants' approaches from four viewpoints: data preprocessing, data augmentation, features, and classification methods.

Participants' approaches - Preprocessing. 13 teams cleaned the text. To this end, most of them removed or masked specific elements such as URLs, user mentions, hashtags, emojis, emails, dates, money, or numbers. Two teams lowercased the tweets, removed stopwords, and treated character flooding. Team ids: core (Espinosa et al.) and kumar (Kumar et al.). The hive team (Huallpa et al.) removed words shorter than a given number of characters.

Participants' approaches - Data augmentation. Some teams employed data augmentation techniques to increase the size of the training data and mitigate overfitting and underfitting. The holo team (Balanzá García et al.) used the Pegasus paraphrase model to generate variations of each original tweet while keeping its meaning. Two teams proposed different approaches using ChatGPT. Team ids: stellar (Villa-Cueva et al.) and ethereum (Coto et al.). Other authors employed back-translation augmentation methods. Team ids: alchemy-pay (Siino et al.) and wax (Lomonaco et al.).

Participants' approaches - Features. n-gram features: two teams used TF-IDF n-grams. Team ids: terra (de Castro Isasi) and icon (Muslihuddeen et al.). The icon team (Muslihuddeen et al.) combined different types of features: TF-IDF n-grams with word2vec embeddings. Deep learning-based features: five teams employed different transformer models to extract features. For instance, two teams employed the sentence-T5 model. Team ids: shashirekha (Girish et al.) and sushiswap (García Bohigues et al.). In another case, the ethereum team (Coto et al.) tested different BERT models, e.g. BERT and BERTweet. Also, two teams used mpnet models trained on the SNLI 1 and MNLI 2 datasets to get the text representations. Team ids: waves (Jaramillo-Hernández et al.) and MRL-LLP (López Cuerva et al.). 1 Stanford Natural Language Inference (SNLI) corpus. 2 Multi-Genre Natural Language Inference (MultiNLI) corpus.

Participants' approaches - Classification methods (traditional methods). Only two teams approached the task with traditional classifier methods: Support Vector Machine (SVM), Random Forest, and Logistic Regression. Team ids: icon (Muslihuddeen et al.) and shashirekha (Girish et al.). Five teams complemented the analysis with the following approaches: Gradient Boosting classifier, Stochastic Gradient Descent (SGD) classifier, K-neighbors classifier, Gaussian Naive Bayes, and Multi-layer Perceptron classifier. Team ids: shiba-inu (Carbonell Granados et al.), vechain (Cardona-Lorenzo), tron (Casamayor Segarra), terra (de Castro Isasi), ethereum (Coto et al.).

Participants' approaches - Classification methods (transformer-based models). Most participants relied on fine-tuning transformer-based models. We can group the transformer models used into three main groups: Encoder models: use only the encoder of a transformer model. Encoder-decoder models: trained with the goal of reconstructing [MASK] tokens within texts. Decoder models: trained with the goal of generating continuations of texts.

Participants' approaches - Classification methods (transformer-based models). Encoder models: sixteen teams employed models based on the BERT architecture, e.g. BERT, BERTweet, CryptoBERT, TwHIN-BERT, RoBERTa, DistilBERT, Twitter-RoBERTa, RoBERTuito, XLM-RoBERTa, DeBERTa, and DeBERTaV3. Encoder-decoder models: three teams tested XLNet and different T5 models, e.g. T5-base, T5-large, and FLAN-T5-XL. Team ids: nexo (Siino et al.), shiba-inu (Carbonell Granados), iota (Iranzo Sánchez et al.). Decoder models: the harmony team (Rodríguez Ferrero et al.) presented a prompt-tuning approach with the BLOOM model and GPT2.

Participants' approaches - Classification methods (stacking ensemble). The stellar team (Villa-Cueva et al.) proposed an ensemble approach that combines fine-tuned language models and natural language inference models, employing a soft-voting mechanism. They tested their approach with several models, for example BERT, CryptoBERT, FinBERT, RoBERTa, and DeBERTa. The symbol team (Giglou et al.) studied the integration of bi-encoders with large language models (Flan-T5) to enhance the semantic representation of authors. The dogecoin team (Labadie et al.) focused on leveraging metric learning and Parameter-Efficient Fine-Tuning (PEFT), applying PEFT techniques to CryptoBERT and BLOOM-1b1 models. The kumar team (Kumar et al.) proposed combining the bodies of two pre-trained transformer models: TwHIN-BERT and RoBERTa.
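The soft-voting mechanism mentioned above can be sketched in a few lines: average the class-probability vectors produced by the ensemble members and pick the argmax. The probability values below are made up for illustration; in the actual system they would come from the fine-tuned and NLI models:

```python
def soft_vote(prob_dicts):
    """Average class probabilities across models and return the argmax class."""
    classes = prob_dicts[0].keys()
    avg = {c: sum(p[c] for p in prob_dicts) / len(prob_dicts) for c in classes}
    return max(avg, key=avg.get)

model_outputs = [
    {"gaming": 0.6, "other": 0.4},  # e.g. a fine-tuned BERT
    {"gaming": 0.3, "other": 0.7},  # e.g. CryptoBERT
    {"gaming": 0.8, "other": 0.2},  # e.g. an NLI model
]
winner = soft_vote(model_outputs)
```

Unlike hard (majority) voting, soft voting lets a confident model outweigh two mildly uncertain ones.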

Participants' approaches - Classification methods (other deep learning approaches). The waves team (Jaramillo-Hernández et al.) employed a Convolutional Neural Network (CNN) with sentence embeddings. The hive team (Huallpa et al.) used prototypical networks 1 with a sentence transformer model (MiniLM-L12). 1 Jake Snell, Kevin Swersky, and Richard S. Zemel. Prototypical Networks for Few-shot Learning. Advances in Neural Information Processing Systems, 2017.
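The prototypical-network idea fits few-shot settings well: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. A minimal sketch with toy 2-d vectors; in the hive team's system the embeddings would come from the sentence transformer (MiniLM-L12), and all function names here are our own:

```python
import math

def mean_vec(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(xs) / len(xs) for xs in zip(*vectors)]

def classify(query, support):
    """support: {class: [embeddings]}. Return the class of the nearest prototype."""
    prototypes = {c: mean_vec(vs) for c, vs in support.items()}
    return min(prototypes, key=lambda c: math.dist(query, prototypes[c]))

# Toy few-shot episode with two classes and two support examples each.
support = {"gaming": [[0.9, 0.1], [1.0, 0.0]],
           "price update": [[0.1, 0.9], [0.0, 1.0]]}
label = classify([0.8, 0.2], support)
```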

Global results. Subtask 1 -> the holo team (Balanzá García et al.) (62.32), fine-tuning the RoBERTuito model. Subtask 2 -> the stellar team (Villa-Cueva et al.) (67.12), with data augmentation techniques and an ensemble of transformer-based models. Subtask 3 -> the terra-classic team (Cano-Caravaca et al.) (67.46) obtained the best results by fine-tuning DeBERTaV3.

Global results. Results show that around 46% of the final submissions outperformed our best baseline for each subtask. The best systems achieved an improvement of up to 10% absolute Macro F1 over our best baselines. The traditional approaches could not outperform the deep learning-based ones, ranking on average from the twelfth position of the ranking downwards.

Global results. Distribution of results in terms of Macro F1 and accuracy for each subtask. For SubTask 1, the median results are 50.21 Macro F1 and 52.27 accuracy, with inter-quartile ranges of 12.50% and 11.59%. For SubTasks 2 and 3, the median Macro F1 scores are 55.05 and 57.62, with inter-quartile ranges of 14.80% and 10.53%. Nevertheless, in the case of SubTask 1, the standard deviation is higher (12.01 vs. 10.97 and 9.97), due to the higher number of outliers.

Global results. Distribution of results in terms of F1 per class. For SubTask 1, the standard deviation values are high in almost all classes (mega: 20.52, macro: 17.43, micro: 9.06, nano: 15.67, and non-influencer: 15.31). For SubTask 2, the standard deviation is 11.79 on average, and the median 54.35 on average. For SubTask 3, the standard deviation is 10.87 on average, with an average median of 57.83.

Error analysis - Subtask 1. Macro and mega are easier to identify than the other classes, but they get confused with each other (30% and 43%). There is similar behavior between nano and micro, with lower values (34% and 27%). Non-influencers can be confused with nano and micro influencers (12% and 20%).

Error analysis - Subtask 2. The most difficult class to classify is other (44%), and the easiest is gaming (77%). Also, the trading matters class is mainly confused with the price update class (23%).

Error analysis - Subtask 3. The financial information class obtained the highest value (79%), from which we could interpret that authors with this intent are more recognizable. In contrast, the advertising class is confused with the announcement class (24% and 15%).

Conclusions. Looking at the results and the error analysis, we can conclude that: The participants used different features to address the task, mainly n-grams and deep learning-based ones, from classical word embeddings (e.g. word2vec) to more modern transformer-based representations (e.g. BERT). Some teams employed data augmentation techniques to increase the size of the training data and alleviate the low-resource issue. Concerning classification, most teams approached the task with deep learning techniques, with transformer-based classifiers (BERT and DeBERTa) being the most popular. Transformer-based models outperform traditional approaches for profiling cryptocurrency influencers, which highlights their potential for low-resource author profiling scenarios. It is possible to automatically identify cryptocurrency influencers and their interest and intent with competitive Macro F1, although there is room for improvement.

Task impact

Industry at PAN (Author Profiling). Organisation. Sponsors.

On behalf of the author profiling task organizers: thank you very much for participating!


Grant PLEC2021-007681 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. XAI-DisInfodemics: eXplainable AI for disinformation and conspiracy detection during infodemics.

Grant CPP2021-008994 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR. ANDHI: ANomalous Diffusion of Harmful Information.

IMINOD/2022/106 OBULEX: OBservatorio del Uso del Lenguaje sEXista en la red (Observatory of Sexist Language Use on the Web). Project co-financed by FEDER funds, within the FEDER Operational Programme of the Comunitat Valenciana.