Unit 5 Honors in AIML- SOCIAL MEDIA ANALYTICS ppt.pptx

Varad76 36 views 37 slides Aug 08, 2024

About This Presentation

HAIML


Slide Content

Text, Web & Social Media Analytics, Unit 5. Prof. Sampada Lovalekar, Assistant Professor, Dept. of Information Technology, SIES Graduate School of Technology

Unit 5: Social Media Mining
With more than half of the world's population using one or more social media platforms every day, businesses in all industries have noticed the importance of social media data mining. As its name implies, social media data mining refers to the process of mining social data. Unlike regular data mining, social media data mining explores beyond the internal databases and systems of a given company or research firm. It typically involves the collection, processing, and analysis of raw data obtained from social media platforms such as Facebook, Instagram, Twitter, TikTok, LinkedIn, YouTube, and others, to uncover meaningful patterns and trends, draw conclusions, and provide insightful and actionable information.

What data is collected for social media data mining?
Social media data mining harvests various types of social data that are either publicly available (e.g., age, gender, profession, geographic location) or generated on a daily basis on social media platforms (e.g., comments, likes, clicks). Typically, the data represents people’s attitudes, connections, behavior, and feelings towards a certain topic, product, or service. Depending on the platform in question, this data may include the number of followers, comments, likes, or shares on Facebook; the number of retweets or impressions on Twitter; or engagement rates and hashtag usage on Instagram.

Social media mining is closely tied to social computing. Social computing is defined as "any computing application where software is used as an intermediary or centre for a social relationship." Social computing involves applications used for interpersonal communication as well as applications and research activities related to "computational social studies" or "social behavior." A social media platform refers to any of the various kinds of information services used collaboratively by many people; these services fall into the subcategories shown below.

Category: Examples
Blogs: Blogger, LiveJournal, WordPress
Social news: Digg, Slashdot
Social bookmarking: Delicious, StumbleUpon
Social networking platform: Facebook, LinkedIn, Myspace, Orkut
Microblogs: Twitter, Google Buzz
Opinion mining: Epinions, Yelp
Photo and video sharing: Flickr, YouTube
Wikis: Scholarpedia, Wikihow, Wikipedia, Event


The broad use of social media platforms is not limited to one geographical region of the world. Orkut, a popular social networking platform operated by Google, has most of its users outside the United States, and the use of social media among Internet users is now mainstream in many parts of the globe, including Asia, Africa, Europe, South America, and the Middle East. Social media also drives significant changes in companies, and businesses need to decide on their policies to keep pace with this new medium.

Motivations for Data Mining in Social Media: First, social media data sets are large. Consider the example of the most popular social media platform, Facebook, with 2.41 billion active users. Without automated data processing, analyzing social media data becomes infeasible in any reasonable time frame. Second, social media data sets can be noisy. For example, spam blogs are numerous in the blogosphere, as are unimportant tweets on Twitter. Third, data from online social media platforms is dynamic: frequent modifications and updates over a short period are not only common but also a significant aspect to consider when dealing with social media data.

Approaches to data mining
Data mining can help us understand large sets of data. There are two approaches to data mining: supervised and unsupervised. These approaches provide algorithms that you can use to identify the hidden patterns in your data. The supervised approach depends on knowledge inferred from previous data. An unsupervised approach, by contrast, automatically characterizes data by grouping similar elements together.

One example of an unsupervised approach in data mining is clustering. In clustering algorithms, the given data is characterized without any prior instruction as to what kinds of patterns the algorithm will generate. In other words, the algorithm partitions a data set so that similar elements end up in the same homogeneous group. Its main function is to separate the data into such groups of similar elements, called clusters.
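As a rough illustration of clustering, the following is a minimal k-means sketch in plain Python. The feature choice (posts per day, average likes per post), the sample values, and the deterministic choice of starting centroids are all assumptions made for this example; the slides do not name a specific clustering algorithm.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: returns one cluster index per input point."""
    # Assumption: start from k evenly spaced points so the sketch is reproducible.
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# Hypothetical users described by (posts per day, average likes per post).
users = [(1, 5), (2, 4), (1, 6), (20, 300), (22, 280), (19, 310)]
labels = kmeans(users, k=2)
print(labels)
```

No labels are supplied in advance: the two groups (casual users and heavy users) emerge only from the distances between points.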

An example of the supervised approach is classification. In this technique, the algorithm learns from labeled training data. It then automatically assigns newly observed data to the distinct classes defined by that previously gathered training set.
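To make the classification idea concrete, here is a small sketch of a Naive Bayes text classifier with add-one smoothing. Naive Bayes is chosen here only for illustration (the slides do not prescribe a classifier), and the training posts and the "spam"/"ham" labels are invented.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (text, label). Returns per-class word counts and class counts."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts

def classify(text, word_counts, class_counts):
    """Pick the class with the highest log-probability under Naive Bayes."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled training data.
training = [
    ("win a free prize now", "spam"),
    ("free followers click now", "spam"),
    ("great talk at the conference today", "ham"),
    ("lunch with the team today", "ham"),
]
model = train(training)
prediction = classify("click for a free prize", *model)
print(prediction)
```

The classes come from the labeled training set; the new post is then assigned to whichever of those classes fits it best.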


How does social data mining work?
Generally, the process of mining social data involves a combination of statistical techniques, mathematics, and machine learning. The first step is to gather and process social data from different social media sources. Apart from social media platforms such as Facebook, Twitter, or YouTube, data miners also extract data from various blogs, news sites, forums, or any other public pages where users interact and leave comments. All of this information must then be processed before proceeding to the next step.

Once the data is collected and processed, various data mining techniques are applied to identify common patterns and correlations among data points in large datasets. Some of the more commonly used social media data mining techniques include classification, association, tracking patterns, predictive analytics, keyword extraction, sentiment analysis, and market/trend analysis.
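Two of the techniques listed above, keyword extraction and sentiment analysis, can be sketched in a few lines. This is a deliberately simple illustration, assuming frequency-based keywords and a tiny hand-made sentiment lexicon; the stop-word list, the lexicon, and the posts are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop words and sentiment lexicon, chosen only for this illustration.
STOP_WORDS = {"the", "a", "is", "and", "i", "this", "my", "it", "but", "was"}
POSITIVE = {"love", "great", "fast"}
NEGATIVE = {"slow", "broken", "hate"}

def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_keywords(posts, n=3):
    """Most frequent non-stop-word tokens across all posts."""
    counts = Counter(t for post in posts for t in tokens(post)
                     if t not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]

def sentiment(post):
    """Positive-word count minus negative-word count."""
    words = tokens(post)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

posts = [
    "I love this phone and the camera is great",
    "The phone is fast but the battery is slow",
    "My phone arrived broken",
]
keywords = top_keywords(posts, n=1)
scores = [sentiment(p) for p in posts]
print(keywords, scores)
```

A score above zero suggests a positive post, below zero a negative one, and zero a neutral or mixed one.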

Moreover, social media data mining also relies on a number of software solutions to optimize the mining process. Some of the best-known data mining software solutions include Microsoft SharePoint, Sisense, IBM Cognos, RapidMiner, and Dundas BI. If a more in-depth examination of the data is needed, data miners may decide to use machine learning in the process as well.

The final step in the mining process is to create a visual representation of the insights obtained from the whole process in order to deliver the information to the targeted audience. This is usually done by using social media analytics or a variety of data visualization tools, such as Infogram, ChartBlocks, Tableau, and Datawrapper.

How is social media data mining used, and who is using it?
Companies, hotels, retailers, airlines, manufacturers, and even political groups buy data sets from data mining companies to help them personalize the customer's experience, improve marketing strategies and service satisfaction, and optimize their businesses in general. Here are some examples of who uses social media data mining and how:
• Some of its major uses in businesses include targeted marketing campaigns, market research, sales enablement, predictive analytics, influencer marketing, and monitoring of brand reputation.

• Trend analysis - Businesses use social media data mining to gain valuable insights into currently trending keywords, mentions, and topics on social media platforms.
• Social spam detection - Social media data mining allows for easier detection of spammers and bots on social media platforms like Instagram and Twitter.
• Ecommerce - Social media data mining is used to analyze how people talk about products.

• Digital media - Social media data mining is also applied to the field of digital media. For example, the content to be shown on a particular digital billboard may be decided through a social media data mining process in order to cater to the audience’s preferences or needs.
• Bloggers and social media influencers - Social media data mining is often used by bloggers and social media influencers to help them analyze the attitudes and feelings of their followers, what they are talking about, and how they feel about certain topics of discussion.

• Brands - Social media data mining helps brands with important decision-making, for example, when deciding about potential future markets.
• Research purposes - Researchers find social media data a valuable asset to their work due to the magnitude and easy accessibility of the data. Social media data mining can be applied in different research domains, including social science research, health research, and technology research. Some of its uses in the research field include gathering opinions, conducting research, recruiting study participants, undertaking participative "citizen science", or fostering stakeholder involvement.

• Government agencies - Social media data mining is also increasingly being used by government agencies for the purpose of welfare-focused interventions. One way it does this is by tracking residents’ movements as they document their activities at tagged locations throughout the day. Clearly, social media mining can be a powerful tool that can help improve residents’ lives and the safety of communities.

Examples of social media data mining software

Challenges of recommender systems


Classical recommendation algorithms
1. Content-based
2. Collaborative filtering

Content-based method
Assumption: A user’s interest should match the description of the items that the system recommends to that user.
Goal: Find the similarity between the user and all the existing items; this is the core of this type of recommendation system.

A Content-Based Recommender works with data that we take from the user, either explicitly (ratings) or implicitly (clicking on a link). From this data we create a user profile, which is then used to make suggestions to the user. As the user provides more input or takes more actions on the recommendations, the engine becomes more accurate.


User Profile: In the User Profile, we create vectors that describe the user’s preferences. To create a user profile, we use the utility matrix, which describes the relationship between users and items. With this information, the best estimate we can make of which items the user likes is some aggregation of the profiles of those items.
Item Profile: In a Content-Based Recommender, we must build a profile for each item that represents the important characteristics of that item. For example, if the item is a movie, then its actors, director, release year, and genre are its most significant features. We can also add its rating from IMDb (Internet Movie Database) to the Item Profile.

Utility Matrix: The Utility Matrix captures the user’s preference for certain items. In the data gathered from the user, we have to find some relation between the items the user likes and those the user dislikes; the utility matrix serves this purpose. In it, we assign a value to each user-item pair, known as the degree of preference. We then lay out a matrix of users against items to identify their preference relationships.
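A utility matrix of this kind can be assembled from raw (user, item, rating) records, roughly as in the sketch below. The function name, the user and item names, and the rating values are all hypothetical and are not taken from the slides; missing ratings are stored as `None`.

```python
def build_utility_matrix(ratings):
    """ratings: list of (user, item, value) tuples.

    Returns the sorted users, the sorted items, and a matrix with one row
    per user and one column per item; unrated pairs hold None.
    """
    users = sorted({user for user, _, _ in ratings})
    items = sorted({item for _, item, _ in ratings})
    lookup = {(user, item): value for user, item, value in ratings}
    matrix = [[lookup.get((user, item)) for item in items] for user in users]
    return users, items, matrix

# Hypothetical degree-of-preference values on a 1-5 scale.
ratings = [
    ("ana", "film_a", 4), ("ana", "film_c", 1),
    ("ben", "film_a", 5), ("ben", "film_b", 3),
]
users, items, matrix = build_utility_matrix(ratings)
for user, row in zip(users, matrix):
    print(user, row)
```

The `None` entries are the blank cells of the matrix: pairs for which the user has given no input yet.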


Some of the cells in the matrix are blank because we do not get complete input from the user every time, and the goal of a recommendation system is not to fill every cell but to recommend a movie that the user will prefer. Based on this table, our recommender system won’t suggest Movie 3 to User 2: User 1 and User 2 gave Movie 1 approximately the same rating, and User 1 gave Movie 3 a low rating, so it is highly likely that User 2 won’t like it either.

1. Describe the items to be recommended.
2. Create a profile of the user that describes the types of items the user likes.
3. Compare items with the user profile to determine what to recommend.
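The three steps above can be sketched as follows. This is only one possible reading: it assumes item profiles are binary genre vectors, the user profile is the average of the liked items' vectors, and cosine similarity is the comparison measure. The movie names and genre flags are hypothetical.

```python
import math

# Step 1: item profiles as feature vectors.
# Hypothetical genre flags in the order (action, comedy, drama).
items = {
    "Movie A": [1, 0, 0],
    "Movie B": [1, 0, 1],
    "Movie C": [0, 1, 0],
    "Movie D": [1, 0, 0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_user_profile(liked):
    # Step 2: aggregate (here: average) the profiles of the liked items.
    vectors = [items[name] for name in liked]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def recommend(liked, n=1):
    # Step 3: rank the items the user has not seen by similarity to the profile.
    profile = build_user_profile(liked)
    candidates = [name for name in items if name not in liked]
    ranked = sorted(candidates,
                    key=lambda name: cosine(profile, items[name]),
                    reverse=True)
    return ranked[:n]

top = recommend(["Movie A", "Movie B"])
print(top)
```

A user who liked two action movies gets the remaining action movie ahead of the comedy, because its profile is closest to the averaged user profile.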

Collaborative filtering
A recommender system can be either personalized or non-personalized. A non-personalized system can be simpler, but a personalized system tends to work better because it caters to the needs of each individual user. Collaborative filtering is a common method for personalized recommender systems; it filters information using interaction data from other, similar users. Since it works by predicting user ratings, it can be viewed as a regression task. There are two general types of collaborative filtering: user to user and item to item.

User to user collaborative filtering operates under the assumption that users who gave similar ratings to a certain item are likely to have the same preferences for other items as well. This method therefore relies mainly on finding similarity between users. However, in some cases, user preferences might be too abstract to be broken down. This is where item to item collaborative filtering comes in handy: here, similarity between items is used instead of similarity between users. The rest of this section focuses on user to user collaborative filtering.


The process starts by converting the rating data into a utility matrix in which the users are the rows and the items are the columns. The next step is the neighborhood collaborative filtering model, where we use a similarity function to compute the similarity between users; the output is a similarity matrix. A certain number (K) of the most similar users (also known as neighbors) is taken, and the rating prediction is obtained by doing regression on these neighbors’ rating data. The items are then sorted by predicted rating, and the top items are recommended to the user.
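A minimal sketch of this neighborhood process is given below. It assumes cosine similarity over co-rated items as the similarity function and a similarity-weighted average of the K neighbors' ratings as the regression step; both are common choices but are assumptions here, since the slides name neither. The user IDs, item IDs, and ratings are hypothetical.

```python
import math

# Hypothetical utility matrix as nested dicts: user -> {item: rating}.
# An item missing from a user's dict is an unrated (blank) cell.
ratings = {
    "u1": {"i1": 5, "i2": 4, "i3": 1},
    "u2": {"i1": 5, "i2": 5},
    "u3": {"i1": 1, "i2": 2, "i3": 5},
    "u4": {"i1": 4, "i2": 4, "i3": 2},
}

def similarity(a, b):
    """Cosine similarity over the items that both users rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in common)
    norm_a = math.sqrt(sum(ratings[a][i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings[b][i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item, k=2):
    """Similarity-weighted average rating from the k nearest neighbors."""
    neighbors = sorted(
        ((similarity(user, other), other) for other in ratings
         if other != user and item in ratings[other]),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None  # no usable neighbors for this item
    return sum(sim * ratings[other][item] for sim, other in neighbors) / total

prediction = predict("u2", "i3")
print(round(prediction, 2))
```

Here the two users most similar to u2 both rated i3 low, so the predicted rating for u2 is low as well. To produce recommendations, the same prediction would be computed for every item the user has not rated and the items sorted by predicted rating.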