Research Methodology-Data Processing

26,092 views 25 slides Jan 06, 2022

Slide Content

DATA PROCESSING

Data processing is concerned with editing, coding, classifying, tabulating, and charting or diagramming research data. Data processing in research consists of five important steps:
1. Editing of data
2. Coding of data
3. Classification of data
4. Tabulation of data
5. Data diagrams
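The five steps above can be sketched as a small pipeline. This is an illustrative example only; the function names, record fields, and code values are hypothetical, not part of the slides.

```python
# Illustrative sketch of the data-processing steps as a pipeline.
# All function names, fields, and code values are hypothetical.

def edit(records):
    """Editing: drop incomplete responses and strip stray whitespace."""
    return [{k: v.strip() if isinstance(v, str) else v for k, v in r.items()}
            for r in records if all(v not in (None, "") for v in r.values())]

def code(records):
    """Coding: map text answers to numeric codes."""
    codes = {"yes": 1, "no": 0}
    return [{**r, "answer": codes[r["answer"].lower()]} for r in records]

def classify(records):
    """Classification: group records by a common characteristic."""
    groups = {}
    for r in records:
        groups.setdefault(r["answer"], []).append(r)
    return groups

def tabulate(groups):
    """Tabulation: count of records per class, ready for a table or chart."""
    return {k: len(v) for k, v in groups.items()}

raw = [{"id": 1, "answer": "Yes "}, {"id": 2, "answer": "no"}, {"id": 3, "answer": ""}]
table = tabulate(classify(code(edit(raw))))
print(table)  # the incomplete record (id 3) is dropped during editing
```

The final `table` maps each coded class to its count; a charting step (data diagrams) would consume exactly this kind of summary.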

Data processing occurs when data is collected and translated into usable information. It starts with data in its raw form and converts it into a more readable format (graphs, documents, etc.), giving it the form and context necessary to be interpreted by computers and utilized by employees throughout an organization.

SIX STAGES OF DATA PROCESSING

1. Data collection
Collecting data is the first step in data processing. Data is pulled from available sources, including data lakes and data warehouses. It is important that the available data sources are trustworthy and well built, so that the data collected is of the highest possible quality.

2. Data preparation
Once the data is collected, it enters the data preparation stage. Data preparation, often referred to as "pre-processing", is the stage at which raw data is cleaned up and organized for the following stage of data processing. During preparation, raw data is thoroughly checked for errors. The purpose of this step is to eliminate incomplete or incorrect data and begin to create high-quality data for the best business intelligence.
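Data preparation can be illustrated with a minimal sketch. The records, field names, and validity rules below are hypothetical; the point is simply that duplicates, incomplete entries, and implausible values are screened out before further processing.

```python
# Hypothetical data-preparation ("pre-processing") sketch: drop duplicate,
# incomplete, and out-of-range survey records before further processing.

raw = [
    {"id": 1, "age": 34},
    {"id": 1, "age": 34},      # duplicate
    {"id": 2, "age": None},    # incomplete
    {"id": 3, "age": 210},     # implausible value
    {"id": 4, "age": 52},
]

seen, prepared = set(), []
for record in raw:
    if record["id"] in seen:
        continue                       # remove duplicates
    if record["age"] is None:
        continue                       # remove incomplete records
    if not 0 <= record["age"] <= 120:
        continue                       # remove implausible values
    seen.add(record["id"])
    prepared.append(record)

print(prepared)  # only records 1 and 4 survive the checks
```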

3. Data input
The clean data is then entered into its destination and translated into a language the system can understand. Data input is the first stage at which raw data begins to take the form of usable information.

4. Processing
During this stage, the data input in the previous stage is actually processed for interpretation. Processing is often done using machine learning algorithms, though the process itself may vary slightly depending on the source of the data being processed (data lakes, social networks, connected devices, etc.) and its intended use (examining advertising patterns, medical diagnosis from connected devices, determining customer needs, etc.).

5. Data output/interpretation
The output/interpretation stage is the stage at which data finally becomes usable to non-data scientists. It is translated, readable, and often presented in the form of graphs, videos, images, plain text, etc. Members of the company or institution can now begin to self-serve the data for their own data analytics projects.

6. Data storage and report writing
The final stage of data processing is storage. After all of the data is processed, it is stored for future use. While some information may be put to use immediately, much of it will serve a purpose later on. Properly stored data is also a necessity for compliance with data protection legislation such as the GDPR. When data is properly stored, it can be quickly and easily accessed by members of the organization when needed.

EDITING

The first step in analysis is to edit the raw data. Editing detects errors and omissions and corrects them wherever possible. The editor's responsibility is to guarantee that data are: accurate; consistent with the intent of the questionnaire; uniformly entered; complete; and arranged to simplify coding and tabulation. Editing of data may be accomplished in two ways:

(i) Field editing: Field editing is preliminary editing of data by a field supervisor on the same day as the interview. Its purpose is to identify technical omissions, check legibility, and clarify responses that are logically or conceptually inconsistent. When gaps are present from interviews, a call-back should be made rather than guessing what the respondent would probably have said. The supervisor should re-interview at least a few respondents on some pre-selected questions as a validity check.

(ii) In-house editing (also called central editing): In central or in-house editing, all the questionnaires undergo thorough editing. It is a rigorous job performed by central office staff.
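A small sketch of an in-house editing check follows. The questionnaire fields and the consistency rule are invented for illustration; the idea is that inconsistent or incomplete responses are flagged for review (or a call-back) rather than guessed at.

```python
# Hypothetical in-house editing sketch: flag questionnaires whose answers
# are incomplete or logically inconsistent, so an editor can review them
# rather than guess the respondent's intent.

def edit_check(response):
    """Return a list of editing flags for one questionnaire response."""
    flags = []
    if response.get("age") is None or response.get("employment") is None:
        flags.append("incomplete")
    elif response["age"] < 16 and response["employment"] == "retired":
        flags.append("inconsistent: retired under 16")
    return flags

responses = [
    {"id": 101, "age": 14, "employment": "retired"},
    {"id": 102, "age": 45, "employment": "employed"},
    {"id": 103, "age": 30, "employment": None},
]

for r in responses:
    print(r["id"], edit_check(r))  # 101 and 103 get flags; 102 passes
```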

CODING

Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes. Such classes should be appropriate to the research problem under consideration. They must also possess the characteristic of exhaustiveness (i.e., there must be a class for every data item). Coding is necessary for efficient analysis; through it, the many replies may be reduced to a small number of classes that contain the critical information required for analysis.

Coding decisions should usually be taken at the design stage of the questionnaire. This makes it possible to pre-code the questionnaire choices, which in turn is helpful for computer tabulation, as one can key-punch straight from the original questionnaires. In the case of hand coding, some standard method may be used. One such method is to code in the margin with a coloured pencil; another is to transcribe the data from the questionnaire to a coding sheet. Whatever method is adopted, one should see that coding errors are altogether eliminated or reduced to the minimum level.
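A minimal coding scheme might look like the following sketch. The category labels and code values are hypothetical; the explicit catch-all class is what makes the scheme exhaustive, as the slide requires.

```python
# Hypothetical codebook for a survey question. An explicit "other" class
# keeps the scheme exhaustive: every answer receives some code.

CODEBOOK = {
    "strongly agree": 1,
    "agree": 2,
    "neutral": 3,
    "disagree": 4,
    "strongly disagree": 5,
}
OTHER = 9  # catch-all class so the scheme is exhaustive

def code_answer(answer):
    """Map a raw text answer to its numeric code."""
    return CODEBOOK.get(answer.strip().lower(), OTHER)

answers = ["Agree", "strongly disagree", "no opinion"]
print([code_answer(a) for a in answers])  # → [2, 5, 9]
```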

CODING SHEET EXAMPLE

CLASSIFICATION

Most research studies result in a large volume of raw data which must be reduced into homogeneous groups if we are to get meaningful relationships. This necessitates classification of data, which is the process of arranging data in groups or classes on the basis of common characteristics. Data having a common characteristic are placed in one class, and in this way the entire data set is divided into a number of groups or classes.

DATA CLASSIFICATION EXAMPLE

Classification can be one of the following two types, depending upon the nature of the phenomenon involved:

(a) Classification according to attributes: As stated above, data are classified on the basis of common characteristics, which can be either descriptive (such as literacy, sex, honesty, etc.) or numerical (such as weight, height, income, etc.). Descriptive characteristics refer to qualitative phenomena which cannot be measured quantitatively; only their presence or absence in an individual item can be noticed.

(b) Classification according to class intervals: Unlike descriptive characteristics, numerical characteristics refer to quantitative phenomena which can be measured through some statistical units. Data relating to income, production, age, weight, etc. come under this category. Such data are known as statistics of variables and are classified on the basis of class intervals.
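Classification by class intervals can be sketched as follows. The income figures and the interval width are hypothetical; each value is placed into the equal-width class it falls within.

```python
# Sketch of classification according to class intervals: hypothetical
# incomes are sorted into equal-width classes of 10,000 each.

def class_interval(value, width=10_000):
    """Return the (lower, upper) bounds of the class a value falls into."""
    lower = (value // width) * width
    return (lower, lower + width)

incomes = [12_500, 18_000, 23_400, 31_000, 15_750]
classes = {}
for income in incomes:
    classes.setdefault(class_interval(income), []).append(income)

for bounds, members in sorted(classes.items()):
    print(f"{bounds[0]}-{bounds[1]}: {len(members)} item(s)")
```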

TABULATION

Tabulation is a systematic and logical presentation of numeric data in rows and columns to facilitate comparison and statistical analysis. It facilitates comparison by bringing related information close together and helps in further statistical analysis and interpretation. In other words, the method of placing organised data into a tabular form is called tabulation. It may be simple, double, or complex depending upon the nature of categorisation.
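A double (two-way) tabulation can be sketched in a few lines. The respondent data below is invented; the sketch counts responses by two characteristics and prints them as a rows-and-columns table.

```python
# Hypothetical two-way (double) tabulation: counting respondents by
# gender and response, then printing a simple row/column table.

data = [
    ("female", "yes"), ("male", "no"), ("female", "yes"),
    ("male", "yes"), ("female", "no"), ("male", "no"),
]

rows = sorted({g for g, _ in data})
cols = sorted({r for _, r in data})
counts = {(g, r): 0 for g in rows for r in cols}
for g, r in data:
    counts[(g, r)] += 1

print("gender".ljust(8) + "".join(c.ljust(6) for c in cols))
for g in rows:
    print(g.ljust(8) + "".join(str(counts[(g, c)]).ljust(6) for c in cols))
```

Placing the two categories along the rows and columns is exactly what lets related figures sit next to each other for comparison.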

TABULATION EXAMPLE

GRAPHICAL REPRESENTATION

Graphical representation is a way of analysing numerical data. It exhibits the relation between data, ideas, information, and concepts in a diagram. It is easy to understand and is one of the most important learning strategies. The choice of graph always depends on the type of information in a particular domain.

There are different types of graphical representation:

Line graphs – A line (or linear) graph is used to display continuous data and is useful for predicting future events over time.
Bar graphs – A bar graph is used to display categorical data; it compares the data using solid bars to represent the quantities.
Histograms – A graph that uses bars to represent the frequency of numerical data organised into intervals. Since all the intervals are equal and continuous, all the bars have the same width.
Line plot – Shows the frequency of data on a given number line. An 'x' is placed above the number line each time that value occurs.
Frequency table – A table showing the number of pieces of data that fall within each given interval.
Circle graph – Also known as a pie chart; it shows the relationship of the parts to the whole. The circle represents 100%, and each category occupies its specific percentage, such as 15%, 56%, etc.
Stem and leaf plot – The data are organised from the least value to the greatest value. The digits of the least place value form the leaves, and the next place-value digits form the stems.
Box and whisker plot – This plot summarises the data by dividing it into four parts. A box and whisker plot shows the range (spread) and the middle (median) of the data.
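The frequency table that underlies a histogram can be computed directly. The scores and interval choices below are hypothetical; the printed 'x' marks double as a crude line-plot-style display.

```python
# Sketch: computing the frequency table behind a histogram, using
# equal-width intervals (hypothetical exam scores).

scores = [52, 67, 71, 48, 85, 90, 73, 66, 59, 78]
low, width, n_bins = 40, 10, 6   # intervals 40-50, 50-60, ..., 90-100

freq = [0] * n_bins
for s in scores:
    freq[min((s - low) // width, n_bins - 1)] += 1

for i, count in enumerate(freq):
    lo = low + i * width
    print(f"{lo}-{lo + width}: {'x' * count}")
```

Because the intervals are equal and continuous, plotting these counts as bars of the same width yields a histogram.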

HYPOTHESIS

A hypothesis is a statement of the researcher's expectation or prediction about a relationship among study variables. The research process begins and ends with the hypothesis. It is core to the entire procedure and, therefore, of the utmost importance. The hypothesis is nothing but the heart of the research.

CHARACTERISTICS OF HYPOTHESIS

The following are the characteristics of a hypothesis:
1. The hypothesis should be clear and precise if it is to be considered reliable.
2. If the hypothesis is a relational hypothesis, it should state the relationship between variables.
3. The hypothesis must be specific and should have scope for conducting further tests.
4. The explanation of the hypothesis must be simple, though it should be understood that the simplicity of a hypothesis is not related to its significance.

TYPES OF HYPOTHESIS

1. Simple hypothesis
2. Complex hypothesis
3. Null hypothesis
4. Alternative hypothesis
5. Logical hypothesis
6. Empirical hypothesis
7. Statistical hypothesis

Simple hypothesis
A simple hypothesis predicts the relationship between two variables: the independent variable and the dependent variable. This relationship is demonstrated through these examples:
- Drinking sugary drinks daily leads to being overweight.
- Smoking cigarettes daily leads to lung cancer.

Complex hypothesis
A complex hypothesis also describes a relationship between variables; however, it is a relationship between two or more independent variables and two or more dependent variables. The following example illustrates a complex hypothesis:
- Adults who (1) drink sugary beverages on a daily basis and (2) have a family history of health issues are more likely to (a) become overweight and (b) develop diabetes or other health issues.

Null hypothesis
A null hypothesis, denoted by H0, proposes that two factors or groups are unrelated and that there is no difference between certain characteristics of a population or process. You must test the likelihood of the null hypothesis, in tandem with an alternative hypothesis, in order to disprove or discredit it. An example of a null hypothesis:
- There is no significant change in a person's health when they drink green tea only versus coffee only.

Alternative hypothesis
An alternative hypothesis, denoted by H1 or HA, is a claim that contradicts the null hypothesis. Researchers pair the alternative hypothesis with the null hypothesis in order to test which one the data supports. If the null hypothesis is rejected, the alternative hypothesis is accepted; if the null hypothesis is not rejected, the alternative hypothesis is not accepted. An example of an alternative hypothesis:
- A person's health improves when they drink green tea only, as opposed to coffee only.
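Testing a null hypothesis against an alternative can be sketched numerically. The sample figures below are invented for illustration, and the test shown is a standard one-sample proportion z-test, not a method from the slides.

```python
# Sketch of testing a null hypothesis about a proportion, stdlib only.
# Hypothetical figures: H0 says 50% of a population lives beyond age 70;
# in a sample of 400 people, 230 did.

import math

p0, n, successes = 0.50, 400, 230
p_hat = successes / n

# z statistic for a one-sample proportion test
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Reject H0 at the 5% level (two-sided) if |z| exceeds 1.96
reject_h0 = abs(z) > 1.96
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

Here the sample proportion (0.575) is far enough from the hypothesised 0.50 that H0 is rejected and the alternative hypothesis is accepted.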

Logical hypothesis
A logical hypothesis is a proposed explanation using limited evidence. Generally, you want to turn a logical hypothesis into an empirical hypothesis by putting your theories or postulations to the test. For the following example, there is currently no evidence to support the hypothesis; however, a hypothesis can still be formed from the available data to draw a logical conclusion:
- Beings from Mars would not be able to breathe the air in Earth's atmosphere.

Empirical hypothesis
An empirical hypothesis, or working hypothesis, comes to life when a theory is being put to the test using observation and experiment. It is no longer just an idea or notion; rather, it goes through trial and error, perhaps changing those independent variables along the way. Examples:
- Roses watered with liquid vitamin B grow faster than roses watered with liquid vitamin E.
- Women taking vitamin E grow hair faster than those taking vitamin K.

Statistical hypothesis
A statistical hypothesis is an examination of a portion of a population or a statistical model. In this type of analysis, you use statistical information about an area. For example, if you wanted to conduct a study on the life expectancy of people from Savannah, examining every single resident would not be practical. Therefore, you would conduct your research using a statistical hypothesis, based on a sample of Savannah's population. Examples:
- 50% of Madurai's population lives beyond the age of 70.
- 45% of the poor in Tamil Nadu are illiterate.

INTERPRETATION

Interpretation is (1) the act or the result of interpreting, i.e. explanation, or (2) a particular adaptation or version of a work, method, or style. Interpretation is the act of explaining, reframing, or otherwise showing your own understanding of something. It requires you to first understand the piece of music, text, language, or idea, and then give your explanation of it.

INFERENCE

Inference is using observation and background knowledge to reach a logical conclusion. You probably practise inference every day. For example, if you see someone eating a new food and he or she makes a face, you infer that he does not like it. Or if someone slams a door, you can infer that she is upset about something.