Qualitative Research Methodology and NLP.pptx

kimrichies 64 views 38 slides Jun 10, 2024

About This Presentation

Qualitative research methodology and an introduction to NLP. There is also an example of how to use a pre-trained model to perform sentiment analysis on user feedback. A Google Colab Notebook is provided in the slides.


Slide Content

Qualitative Data Analysis Introduction to NLP-Sentiment Analysis Richard Kimera

Qualitative research - Approaches. The common approaches are: grounded theory, ethnography, action research, phenomenological research, narrative research, and content analysis.

1. Grounded Theory

Grounded Theory: A Grounded Theory Analysis of Introductory Computer Science Pedagogy. After the interviews, codes and concepts were formed by learning from the data.

Grounded Theory: Example. Categories were formed to guide theory formation. Theory I: Novice programmers attempt to adapt problem-solving strategies from other domains, such as mathematics or essay composition. Theory II: Planning by means of pseudocode is achievable for novice programmers. Theory III: Students who planned their programs are more likely to report a positive experience in the class.

2. Ethnography

3. Action Research Researchers and participants collaboratively link theory to practice to drive social change. https://sonyaterborg.com/2016/02/17/action-research/

4. Phenomenological research. Researchers investigate a phenomenon or event by describing and interpreting participants’ lived experiences. They interpret the participants’ feelings, perceptions, and beliefs to clarify the essence of the phenomenon under investigation. Phenomenological research design requires researchers to bracket whatever a priori assumptions they have about the experience or phenomenon. Data collection methods may include: participant observation, interviews, conversations with participants, analysis of personal text, action research, and focus group discussions/meetings. https://delvetool.com/blog/phenomenology

5. Narrative research Researchers examine how stories are told to understand how participants perceive and make sense of their experiences. http://edrm600narrativedesign.weebly.com/steps.html

6. Content Analysis. It is used to evaluate patterns within a piece of content, e.g. a collection of political speeches. You can identify the number of times an item has been mentioned, e.g. quantum computing. Data must be grouped according to codes, and the frequency of a specific word is checked. It works most effectively if you search with a specific question in mind. Drawbacks: It can be very time-consuming because of the large amount of text. It concentrates on a specific point in time, as it doesn't consider the before and after. Ref: Mariette Bengtsson, “How to plan and perform a qualitative study using content analysis”
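The counting step described above can be sketched in a few lines of Python. This is only an illustration: the three "speeches" are invented stand-ins for a real corpus, and the helper names `count_term` and `word_frequencies` are hypothetical, not taken from the slides or from Bengtsson's paper.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a collection of political speeches.
speeches = [
    "Quantum computing will change research. We must fund quantum computing.",
    "Our schools need funding before we talk about quantum computing.",
    "Healthcare and schools come first.",
]

def count_term(texts, term):
    """Count case-insensitive, whole-phrase occurrences of `term` in each text."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE)
    return [len(pattern.findall(text)) for text in texts]

def word_frequencies(texts):
    """Count how often each lowercase word appears across all texts."""
    words = []
    for text in texts:
        words.extend(re.findall(r"[a-z']+", text.lower()))
    return Counter(words)

per_speech = count_term(speeches, "quantum computing")
total = sum(per_speech)
freq = word_frequencies(speeches)

print("Mentions per speech:", per_speech)
print("Total mentions:", total)
print("Frequency of 'schools':", freq["schools"])
```

Counting alone only covers the manifest level of the text; deciding which terms matter still depends on the research question.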

Content Analysis: Planning

Content Analysis: Data Collection Data collection: The information collected is susceptible to half-truth, influence of the interviewer, etc. Methods such as audio and video need to be transcribed and things like text, tone, poses, etc might be hard to write down. In all, the researcher must have a transcription plan.

Content Analysis: Data Analysis. Whether the analysis is latent or manifest, the same methods can be used. A table mapping how the process was done should be provided. Validity and reliability must be investigated, and bias must be avoided. Validity: internal - the quality of the research design and methods as applied to the research question; external - the generalization of results. Reliability: replicating the methods gives the same results. Four stages have been identified. (1) Decontextualization: the researcher must understand the transcribed data before it is broken down into meaning units, each labelled with a code according to its context. Codes can be created before (deductively) or after (inductively) data collection. (2) Recontextualization: the original text is checked alongside the meaning units to confirm that everything relevant to the aim has been captured. The researcher must try to keep a distance from the text and let some content go.

Content Analysis: Data Analysis. (3) Categorization: first merge or condense some content (be especially careful if the analysis is latent); categories can then be based on the questions asked or on the identified context and themes. (4) Compilation: when performing a qualitative content analysis, the investigator must consider the collected data from a neutral perspective and reflect on their own objectivity. For a manifest analysis, the researcher must keep referring back to the text; for a latent one, the researcher must keep identifying the underlying meaning. Compare your results with existing literature to see how logical they are. You can test the validity of the results by talking to the respondents. Two sets of terms exist: credibility vs validity, dependability vs reliability, and transferability vs generalization.

Content Analysis: Process

Qualitative Research data collection methods. One or more of the following data collection methods are used: Observations: Recording what you have seen, heard, or encountered in detailed field notes. Interviews: Personally asking people questions in one-on-one conversations. Focus groups: Asking questions and generating discussion among a group of people. Surveys: Distributing questionnaires with open-ended questions. Secondary research: Collecting existing data in the form of texts, images, audio or video recordings, etc.

Example To research the culture of a large tech company, you decide to take an ethnographic approach. You work at the company for several months and use various methods to gather data: You take field notes with observations and reflect on your own experiences of the company culture. You distribute open-ended surveys to employees across all the company’s offices by email to find out if the culture varies across locations. You conduct in-depth interviews with employees in your office to learn about their experiences and perspectives in greater detail.

Data Analysis Qualitative data can take the form of texts, photos, videos and audio. For example, you might be working with interview transcripts, survey responses, fieldnotes, or recordings from natural settings. Most types of qualitative data analysis share the same five steps: Prepare and organize your data. This may mean transcribing interviews or typing up fieldnotes. Review and explore your data. Examine the data for patterns or repeated ideas that emerge. Develop a data coding system. Based on your initial ideas, establish a set of codes that you can apply to categorize your data. Assign codes to the data. For example, in qualitative survey analysis, this may mean going through each participant’s responses and tagging them with codes in a spreadsheet. As you go through your data, you can create new codes to add to your system if necessary. Identify recurring themes. Link codes together into cohesive, overarching themes.
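The coding and theme steps above can be illustrated with a small Python sketch. It assumes a deductive codebook based on keywords, which is only one simple way to assign codes; the codebook, the themes, and the survey responses are all invented for illustration.

```python
from collections import Counter

# Hypothetical codebook: code -> keywords that trigger it (deductive coding).
codebook = {
    "workload": ["overtime", "deadline", "busy"],
    "support": ["mentor", "help", "team"],
    "flexibility": ["remote", "flexible"],
}

# Hypothetical grouping of codes into overarching themes.
themes = {
    "work pressure": ["workload"],
    "work environment": ["support", "flexibility"],
}

# Hypothetical open-ended survey responses.
responses = [
    "My mentor and my team help me a lot.",
    "Too much overtime before every deadline.",
    "I like the flexible hours, but deadlines keep me busy.",
]

def assign_codes(text, codebook):
    """Return the sorted codes whose keywords occur in the text."""
    lowered = text.lower()
    return sorted(
        code
        for code, keywords in codebook.items()
        if any(keyword in lowered for keyword in keywords)
    )

# Step 4: tag every response with codes.
coded = [assign_codes(response, codebook) for response in responses]

# Step 5: count responses per code, then roll the counts up into themes.
code_counts = Counter(code for codes in coded for code in codes)
theme_counts = {
    theme: sum(code_counts[code] for code in codes)
    for theme, codes in themes.items()
}

for response, codes in zip(responses, coded):
    print(codes, "<-", response)
print(theme_counts)
```

Keyword matching is a rough first pass; in practice the researcher would review each tagged response and add new codes where the codebook misses something.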

Data Analysis methods

1. Content Analysis

2. Thematic Analysis

3. Textual Analysis

4. Discourse analysis

Advantages and disadvantages. Advantages of qualitative research: Flexibility: The data collection and analysis process can be adapted as new ideas or patterns emerge. They are not rigidly decided beforehand. Natural settings: Data collection occurs in real-world contexts or in naturalistic ways. Meaningful insights: Detailed descriptions of people’s experiences, feelings and perceptions can be used in designing, testing or improving systems or products. Generation of new ideas: Open-ended responses mean that researchers can uncover novel problems or opportunities that they wouldn’t have thought of otherwise. Disadvantages of qualitative research: Unreliability: The real-world setting often makes qualitative research unreliable because of uncontrolled factors that affect the data. Subjectivity: Due to the researcher’s primary role in analyzing and interpreting data, qualitative research cannot be replicated. The researcher decides what is important and what is irrelevant in data analysis, so interpretations of the same data can vary greatly. Limited generalizability: Small samples are often used to gather detailed data about specific contexts. Despite rigorous analysis procedures, it is difficult to draw generalizable conclusions because the data may be biased and unrepresentative of the wider population. Labor-intensive: Although software can be used to manage and record large amounts of text, data analysis often has to be checked or performed manually.

Example 1: A qualitative study of the causes and circumstances of drowning in Uganda. Introduction: World Health Rankings reported in 2017 that 3,186 deaths resulted from drowning, equal to 1.23% of total deaths. Planning: Aim: The research was deductive in nature, judging by how the aim was stated: semi-structured interviews were to be used to describe the circumstances of drowning and contextually appropriate interventions for drowning prevention and surveillance in Uganda. Purposive sampling was used to select 14 of 60 districts, and separate quantitative and qualitative studies were done. The focus group discussions were balanced, and back-to-back translation was used to translate into various languages. Minimizing bias and quality checks: Interviews were done in a local language and translated, and the results were then discussed with the interviewees. Data Analysis: 1) The focus group discussions (FGDs) were analyzed using the Framework analysis approach (as described in the content analysis process), and 2) interviews (primary data) as well as secondary data (published articles selected using inclusion and exclusion criteria) were analyzed using the ATLAS.ti software, from which themes were formed. The identified themes formed the basis of the results: reporting and recording drowning incidents, describing drowning circumstances, and identifying potential interventions for the prevention and surveillance of drowning.

Example 2: “Then they prayed, they did nothing else, they just prayed for the boy and he was well”: A qualitative investigation into the perceptions and behaviors surrounding snakebite and its management in rural communities of Kitui county, Kenya. Snakebites are common in sub-Saharan Africa, primarily because the climate favors snakes and secondarily because most people live in structures that are not strong enough to keep snakes out. There are 140 known snake species in Kenya, of which 29 are venomous and 13 are medically important. Envenoming by spitting cobras, puff adders, neurotoxic cobras and mambas is reported to dominate snakebite incidents. Aim: To explore the beliefs and perceptions regarding snakes and snakebites and methods of prevention and management among members of communities in four sub-counties in Kitui County, Kenya. Ethical considerations: Approval was sought from the University, informed consent was obtained, patient anonymity was maintained, and no compensation was given to respondents. Study design: Semi-structured interviews were used. Sampling and recruitment: Stratified purposeful sampling was used to have representation from all selected villages. Households for interviewing were selected using the random number method. In total, 23 interviews were conducted. The research was deductive, since themes were formed beforehand. However, they changed during the process of testing and consultations. Data Analysis: Recording and transcribing was done using f4transkript software, and transcripts were anonymized using NVivo 12.0. Themes were made and coding was done. Results were provided, with sample quotations from respondents as evidence for each result. The discussion section was well referenced to previous research. The conclusion section provided the key takeaways from the research.
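The household selection step in Example 2 can be pictured as drawing random households within each village, so that every village is represented. The sketch below is an assumption about how such a draw could be done in Python, not the procedure the study's authors used. The village names, household identifiers, and the `select_households` helper are all hypothetical.

```python
import random

# Hypothetical sampling frame: village -> household identifiers.
households_by_village = {
    "Village A": [f"A-{i:02d}" for i in range(1, 21)],
    "Village B": [f"B-{i:02d}" for i in range(1, 16)],
    "Village C": [f"C-{i:02d}" for i in range(1, 11)],
}

def select_households(frame, per_village, seed):
    """Draw `per_village` households at random from every village (stratum)."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return {
        village: sorted(rng.sample(households, per_village))
        for village, households in frame.items()
    }

selected = select_households(households_by_village, per_village=3, seed=42)

for village, households in selected.items():
    print(village, households)
```

Recording the seed alongside the sampling frame lets another researcher reproduce exactly the same selection.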

Tools used in Qualitative Analysis

Case study: NLP usage

Problem case!

NLP usage

Natural Language Processing (NLP)

Qualitative NLP pipeline

NLP Example: Sentiment Analysis. I used a pre-trained RoBERTa model to determine whether the tweet is positive, neutral or negative.

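Example: Sentiment Analysis. I used a pre-trained model called RoBERTa to determine whether the tweet is positive, neutral or negative.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from scipy.special import softmax
# Example of a tweet
tweet = '@kimrichies My students are the best @HGU 😒 https://globalautosystems.co.ug/'
# preprocess tweet: split it into a list of words so that mentions and links
# can be separated from the rest of the text (the '@user' and 'http'
# placeholders are assumed from the model card's convention)
tweet_words = []
for word in tweet.split(' '):
    if word.startswith('@') and len(word) > 1:
        word = '@user'
    elif word.startswith('http'):
        word = 'http'
    tweet_words.append(word)
tweet_proc = ' '.join(tweet_words)
# load model and tokenizer
roberta = "cardiffnlp/twitter-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(roberta)
tokenizer = AutoTokenizer.from_pretrained(roberta)
# run the model on the encoded tweet, then determine probabilities with softmax
encoded_tweet = tokenizer(tweet_proc, return_tensors='pt')
output = model(**encoded_tweet)
scores = softmax(output[0][0].detach().numpy())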

Output The tweet is 95% positive
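To show how a statement like the one above can be derived from the model's raw scores, here is a minimal sketch that needs no model download. It assumes the label order negative, neutral, positive given on the model card for cardiffnlp/twitter-roberta-base-sentiment. The logits are invented for illustration and are not the values behind the slide, and `summarize` is a hypothetical helper.

```python
import math

# Label order as documented on the cardiffnlp/twitter-roberta-base-sentiment model card.
LABELS = ["negative", "neutral", "positive"]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def summarize(logits):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits for one tweet (not taken from the slides).
example_logits = [-2.0, 0.5, 3.6]
label, prob = summarize(example_logits)
print(f"The tweet is {prob:.0%} {label}")
```

In the notebook, the same mapping would be applied to the `scores` array produced by the model rather than to hand-written logits.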

Running the Notebook Google Colab Notebook