Introduction to AWK utility on unix.pptx

abuadu 27 views 27 slides May 02, 2024

About This Presentation

Notes on AWK utility


Slide Content

SYSTEM PROGRAMMING OLUFEMI OLOLADE OLAEWE ASSISTANT SOFTWARE DEVELOPER [BSc., M.PHIL]

Introduction to AWK utility AWK is a programming language created by Aho, Weinberger, and Kernighan, whose initials give the language its name. It is useful for: manipulation of data files, text retrieval and processing, generation of reports, and prototyping and experimenting with algorithms.

Versions: awk, nawk, mawk, pgawk, and gawk (GNU).

AWK CONTD An AWK program is a sequence of pattern {action} pairs and function definitions. Short programs are entered on the command line usually enclosed in ' ' to avoid shell interpretation. Longer programs can be read in from a file with the -f option.
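The two ways of supplying a program described above can be sketched as follows. This is a minimal illustration, not taken from the slides: the sample input and the variable names (`input`, `prog`, `inline_out`, `file_out`) are assumptions made for the example.

```shell
# Hypothetical sample input: three short lines.
input=$(printf 'alpha 1\nbeta 2\ngamma 3\n')

# Short program given on the command line, quoted with ' ' so the
# shell does not try to expand $1 itself.
inline_out=$(printf '%s\n' "$input" | awk '{ print $1 }')

# The same program stored in a temporary file and loaded with -f.
prog=$(mktemp)
printf '%s\n' '{ print $1 }' > "$prog"
file_out=$(printf '%s\n' "$input" | awk -f "$prog")
rm -f "$prog"

echo "$inline_out"
echo "$file_out"
```

Both invocations run the same one-line program, so they should print the same first column.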

AWK CONTD.

Syntax

AWK CONTD.

• pattern {action}
• Either the pattern or the {action} can be omitted, but not both.
• i.e., a program must have a pattern, an {action}, or both.
• If the pattern is missing, the action is applied to all lines (every line is implicitly matched).
• If the action is missing, each matched line is printed (the action is implicitly {print}).
• E.g., the command awk '/for/' testfile prints all lines in testfile that contain the string "for".
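The three forms (pattern only, action only, and both) can be compared side by side. This is a small sketch using made-up input in place of testfile; the variable names are assumptions for illustration.

```shell
# Hypothetical input standing in for testfile.
data=$(printf 'wait for me\nhello world\nformat this\n')

# Pattern only: matching lines are printed (implicit {print}).
pattern_only=$(printf '%s\n' "$data" | awk '/for/')

# Action only: the action runs for every line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

# Pattern and action: the action runs only for matching lines.
both=$(printf '%s\n' "$data" | awk '/for/ { print $1 }')

echo "$pattern_only"
echo "$action_only"
echo "$both"
```

Note that /for/ matches the substring anywhere in the line, so "format this" is selected as well as "wait for me".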

Basic Terminology of Input files
• Data in the input file is broken into records as determined by the record separator variable, RS.
• By default, RS = "\n", i.e. a newline.
• With the default RS, each line of data or text in the input file is therefore a record.
• Records are read in one at a time, and the current record is stored in the variable $0.

• A record is split into fields, which are stored in the field variables $1, $2, ..., $NF.
• A field is thus a unit of data in a line (record).
• Each field in a record is separated from the other fields by the field separator, FS.
• The default field separator is whitespace.
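A short sketch of how RS and FS shape records and fields. The input strings and variable names here are invented for the example; only RS, FS (set through -F), $1, $2, $3 and $NF come from awk itself.

```shell
# Default separators: one record per line, fields split on whitespace.
default_out=$(printf 'one two three\nfour five\n' | awk '{ print $1, $NF }')

# Custom field separator set with -F: colon-separated fields.
colon_out=$(printf 'root:x:0\nuser:y:1000\n' | awk -F: '{ print $1, $3 }')

# Custom record separator: records end at ";" instead of at a newline.
rs_out=$(printf 'a b;c d;e f' | awk 'BEGIN { RS = ";" } { print $2 }')

echo "$default_out"
echo "$colon_out"
echo "$rs_out"
```

In the first command $NF refers to the last field of each record, whatever the number of fields on that line happens to be.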

Some System/Built-in Variables

EXAMPLE 1

EXAMPLE 2

• A pattern can be: BEGIN, END, an expression, or a range written as expression, expression.
• Note that BEGIN and END patterns require an action.
• An AWK script can be divided into three main parts as follows:

• BEGIN: performs pre-processing that must be completed before awk starts reading records from the input file, mostly to initialize variables and to create report headings.
• BODY: contains the main processing logic to be applied to input records. It works like a loop that processes input data one record at a time: the body executes once for each record.
• END: contains post-processing logic to be executed after all input data have been processed, such as printing a report's grand total.
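The three parts can be seen together in a small report script. This is only a sketch: the item/quantity input and the names `report` and `total` are assumptions made for the example.

```shell
# Hypothetical input: an item name and a quantity on each line.
report=$(printf 'pens 3\nbooks 5\nbags 2\n' | awk '
BEGIN { print "ITEM QTY"; total = 0 }   # pre-processing: heading and initial value
      { print $1, $2; total += $2 }     # body: runs once per input record
END   { print "TOTAL", total }          # post-processing: grand total
')

echo "$report"
```

The heading is printed before any input is read, and the total is printed only after the last record has been processed.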

Statements
• Statements in an AWK program are terminated by newlines, semicolons, or both.
• Groups of statements, such as actions or loop bodies, are blocked via {...} as in C.
• The last statement in a block doesn't need a terminator.
• Blank lines have no meaning; an empty statement is terminated with a semicolon.
• Long statements can be continued with a backslash, \.
• A statement can be broken without a backslash after a comma, left brace, &&, ||, do, else, the right parenthesis of an if, while or for statement, and the right parenthesis of a function definition.
• A comment in AWK starts with #.
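Several of these rules can be seen in one short action. The sketch below uses invented variable names (`a`, `b`, `sum`, `stmt_out`) and a single made-up input line.

```shell
stmt_out=$(printf '4 5\n' | awk '
# A comment starts with a hash sign and runs to the end of the line.
{
    a = $1; b = $2          # two statements separated by a semicolon
    sum = a + \
          b                 # a long statement continued with a backslash
    if (sum > 5 &&
        a < b)              # a line may break after && without a backslash
        print "sum", sum    # the last statement in a block needs no terminator
}')

echo "$stmt_out"
```

The newline after the closing parenthesis of the if is also one of the permitted break points, so the print statement can sit on its own line.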

Expressions and operators
Primary AWK expressions are:
– numeric constants,
– string constants,
– variables,
– fields,
– arrays, and
– function calls.

• The identifier for a variable, array or function is a sequence of letters, digits and underscores that does not start with a digit.
• Variables are not declared; they come into existence when first referenced and are initialized to null (the empty string, which also acts as 0 in numeric contexts).
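The behaviour of undeclared variables can be checked directly. In this sketch, `count` and `label_1` are hypothetical names chosen for the example; neither is assigned before its first use.

```shell
var_out=$(awk 'BEGIN {
    # count and label_1 are never assigned before use (hypothetical names).
    print count + 0          # used as a number, an unset variable acts as 0
    print "[" label_1 "]"    # used as a string, it acts as the empty string
    count++                  # incrementing the null value gives 1
    print count
}')

echo "$var_out"
```

Because the whole program is a BEGIN block, awk runs it without reading any input.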

New expressions are composed with the following operators in order of increasing precedence.

Expression pattern types
• An expression pattern uses matching.
• It either searches through an entire record for a possible match, using a regular expression enclosed in slashes (/),
• or explicitly searches for a match in a particular field or group of fields, using the operators ~ (match) or !~ (no match).
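The difference between a whole-record search and a field match can be sketched as follows. The name/role input and the variable names are assumptions made for the example.

```shell
# Hypothetical input: a name followed by a role.
people=$(printf 'ann admin\nbob user\nadmiral guest\n')

# Whole-record search: any line containing "adm" anywhere.
record_match=$(printf '%s\n' "$people" | awk '/adm/')

# Field match: only lines whose second field matches.
field_match=$(printf '%s\n' "$people" | awk '$2 ~ /adm/')

# Negated field match: lines whose second field does not match.
field_nomatch=$(printf '%s\n' "$people" | awk '$2 !~ /adm/')

echo "$record_match"
echo "$field_match"
echo "$field_nomatch"
```

The line "admiral guest" is selected by the whole-record search but not by the field match, because "adm" appears in its first field rather than its second.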