Fundamentals Concepts on Text Analytics.pptx

aini658222 35 views 17 slides May 20, 2024

About This Presentation

Text analytics, also known as text mining, is the process of deriving high-quality information from text sources using software. It is a multidisciplinary field that combines elements of data mining, machine learning, statistics, and natural language processing (NLP) to process and analyze large amo...


Slide Content

FUNDAMENTAL CONCEPTS OF TEXT ANALYTICS

Content
Abstract
Introduction
What is Text Analytics (Text Mining)?
Text Analytics Methods
Text Analytics Tasks
Application
Trends

INTRODUCTION
Text analytics, or text mining, makes it possible to analyze large amounts of text information objectively and efficiently. The field has received a lot of attention because of the ever-increasing need to manage the information that resides in the vast number of available text documents.

What is Text Analytics (Text Mining)?
“The objective of Text Mining is to exploit information contained in textual documents in various ways, including … discovery of patterns and trends in data, associations among entities, predictive rules, etc.” (Grobelnik et al., 2001)
“It’s the process of discovering new, previously unknown knowledge by computer, using the automated extraction of information from various textual resources.” (Fouad Sabry, 2022)

Quantitative and Qualitative Data
To really understand text mining, we need to establish some key concepts, such as the difference between qualitative and quantitative data.
Qualitative Data: Most of the human language we find in everyday life is qualitative data. It describes the characteristics of things, their qualities, and expresses a person’s reasoning, emotion, preferences and opinions. Qualitative data can be very rich and complex. It’s also often highly subjective, since it comes from a single person or, in the case of conversation or collaborative writing, a small group of people.
Quantitative Data: Quantitative data is numerical; it tells you about quantity. It’s excellent at telling you precisely about measurements and quantities, in the past or present, which makes it invaluable for analysis and predictions. However, it can’t provide the ‘why’ information that textual data gives a human reader or speaker.
Text analysis takes qualitative textual data and turns it into quantitative, numerical data. It does things like counting the number of times a theme, topic or phrase occurs in a large corpus of textual data in order to determine the importance or prevalence of a topic. It can also assess the difference between multiple data sources in terms of the words or topics mentioned per quantity of text.
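As a rough illustration of turning qualitative text into numbers, the sketch below counts how often each word occurs and compares two sources by occurrences per 1,000 words. The sample texts, the simple regular-expression tokenizer, and the helper names `tokenize`, `term_counts` and `rate_per_1000` are assumptions made for this example, not part of the presentation.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_counts(text):
    """Count how many times each token occurs in the text."""
    return Counter(tokenize(text))

def rate_per_1000(text, term):
    """Occurrences of `term` per 1,000 tokens, so texts of different length compare fairly."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return 1000 * tokens.count(term.lower()) / len(tokens)

# Two made-up sources (hypothetical sample data).
reviews = "The delivery was slow. Slow delivery, but the product is great."
support = "The product stopped working. Great support fixed the product quickly."

print(term_counts(reviews).most_common(3))
print(rate_per_1000(reviews, "product"), rate_per_1000(support, "product"))
```

Normalizing by text length is what makes the comparison between sources meaningful; raw counts alone would favor whichever source is longer.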

Structured and Unstructured Data
Unstructured data is language in its natural form, as created for and by human beings. Most of the text we consume day to day is in the form of unstructured documents, and even this text is an example of unstructured data. As well as text documents, unstructured data can take the form of video or audio files.
Structured data is information presented in a consistent format so that it’s easy for computers to analyze and store. A list of phone numbers is an example of structured data. So is a spreadsheet showing a business’s annual accounts.
Semi-structured data is somewhere between the two: it is in an organized form, but it lacks the format computers need in order to analyze it. An example would be an email inbox. Data is somewhat organized into received, sent, spam, junk and so on, but the data within each email is not organized in any consistent way by the email software.
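One way to picture semi-structured data is a single email message: a few labelled fields sit on top of a free-text body. The sketch below uses Python's standard `email` module to separate the two. The sample message and the helper name `split_email` are hypothetical, and using header fields (rather than inbox folders) as the organized part is a choice made only for illustration.

```python
from email import message_from_string

# A made-up message: labelled header fields, then a free-text body.
raw_email = """\
From: alice@example.com
To: bob@example.com
Subject: Quarterly report

Hi Bob, the numbers look better than last quarter, although shipping
complaints are up. Can we talk tomorrow?
"""

def split_email(raw):
    """Return (organized header fields, unstructured body text)."""
    msg = message_from_string(raw)
    headers = {name: msg[name] for name in ("From", "To", "Subject")}
    body = msg.get_payload().strip()
    return headers, body

headers, body = split_email(raw_email)
print(headers)  # consistent fields, easy to store in a table
print(body)     # natural language, needs text analytics to interpret
```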

General Text Mining Methods

Text Mining Methods
NATURAL LANGUAGE PROCESSING (Pre-processing Phase): A subfield of computer science and artificial intelligence that focuses on the interaction between computers and human language.
MACHINE LEARNING (Modelling/Learning Phase): A type of artificial intelligence that uses algorithms to identify patterns in data and make predictions.
DATA VISUALIZATION (Evaluation & Validation Phase): The graphical representation of data and information to facilitate understanding and communication.
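The three methods can be sketched end to end on a toy example. This is a minimal illustration under stated assumptions, not the pipeline from the slides: the stopword list, the training sentences, and the word-overlap scoring used in `predict` are all made up here, and a real project would typically rely on an NLP or machine learning library instead.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, far from complete).
STOPWORDS = {"the", "a", "is", "was", "and", "it", "this", "i"}

def preprocess(text):
    """Pre-processing phase: lowercase, tokenize, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def train(labelled_docs):
    """Modelling phase: count how often each word occurs under each label."""
    model = {}
    for text, label in labelled_docs:
        model.setdefault(label, Counter()).update(preprocess(text))
    return model

def predict(model, text):
    """Pick the label whose training words overlap most with the text."""
    tokens = preprocess(text)
    scores = {label: sum(counts[t] for t in tokens) for label, counts in model.items()}
    return max(scores, key=scores.get)

def bar_chart(counts, top=3):
    """Visualization phase: render the most frequent words as text bars."""
    return "\n".join(f"{word:<10}{'#' * n}" for word, n in counts.most_common(top))

# Hypothetical labelled examples.
training = [
    ("The service was great and friendly", "positive"),
    ("Great product, I love it", "positive"),
    ("The delivery was slow and the box was damaged", "negative"),
    ("Slow support, terrible experience", "negative"),
]
model = train(training)
print(predict(model, "great friendly staff"))
print(bar_chart(model["negative"]))
```

The overlap score is deliberately simplistic; it only shows where a learning step sits between pre-processing and visualization.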

Text Mining Techniques

General Text Mining Process

Text Mining Process

Text Mining Resource

Text Mining Tasks

Applications

Text Mining Trends

Thanks