1 00:00:01 --> 00:00:04 Welcome to the spoken tutorial on the awk command.
2 00:00:05 --> 00:00:08 In this tutorial, we will learn the awk command.
3 00:00:09 --> 00:00:11 We will do this through some examples.
4 00:00:12 --> 00:00:22 To record this tutorial, I am using: Ubuntu Linux 12.04 OS and GNU Bash v. 4.2.24.
5 00:00:23 --> 00:00:28 Please note, GNU Bash version 4 or above is recommended to practice this tutorial.
6 00:00:29 --> 00:00:32 Let us start with an introduction to awk.
7 00:00:33 --> 00:00:37 The awk command is a very powerful text manipulation tool.
8 00:00:38 --> 00:00:43 It is named after its authors, Aho, Weinberger and Kernighan.
9 00:00:44 --> 00:00:45 It can perform several functions.
10 00:00:46 --> 00:00:50 It operates at the field level of a record.
11 00:00:51 --> 00:00:55 So, it can easily access and edit the individual fields of the record.
12 00:00:56 --> 00:00:58 Let us see some examples.
13 00:00:59 --> 00:01:03 For demonstration purposes, we use the awkdemo.txt file.
14 00:01:04 --> 00:01:08 Let us see the contents of the awkdemo.txt file.
15 00:01:09 --> 00:01:16 Now open the terminal window by pressing the Ctrl, Alt and T keys simultaneously on your keyboard.
16 00:01:17 --> 00:01:21 Let us see how to print using the awk command.
17 00:01:22 --> 00:01:37 Type: awk space (within single quotes) front slash Pass front slash opening curly bracket print closing curly bracket (after the quotes) space awkdemo.txt, that is: awk '/Pass/{print}' awkdemo.txt
18 00:01:38 --> 00:01:39 Press Enter.
19 00:01:40 --> 00:01:43 Here, Pass is the selection criterion.
20 00:01:44 --> 00:01:48 All the lines of awkdemo.txt where Pass occurs are printed.
21 00:01:49 --> 00:01:51 The action here is print.
22 00:01:52 --> 00:01:55 We can also use regular expressions in awk.
23 00:01:56 --> 00:02:00 Say, we want to print records of students with the name "Mira".
24 00:02:01 --> 00:02:26 We would type: awk space (within single quotes) front slash M opening square bracket ei closing square bracket asterisk ra asterisk front slash opening curly bracket print closing curly bracket (after the quotes) space awkdemo.txt, that is: awk '/M[ei]*ra*/{print}' awkdemo.txt
25 00:02:27 --> 00:02:28 Press Enter.
26 00:02:29 --> 00:02:32 "*" will match zero or more occurrences of the previous character.
27 00:02:33 --> 00:02:39 Thus, entries with more than one occurrence of i, e and a will also be listed.
28 00:02:40 --> 00:02:41 For example,
29 00:02:42 --> 00:02:44 * Mira
30 00:02:45 --> 00:02:46 * Meera
31 00:02:47 --> 00:02:51 * Meeraa
32 00:02:52 --> 00:02:57 awk supports extended regular expressions (ERE),
33 00:02:58 --> 00:03:02 which means we can match multiple patterns separated by a PIPE.
34 00:03:03 --> 00:03:04 Let me clear the prompt.
35 00:03:05 --> 00:03:05 Now type:
36 00:03:06 --> 00:03:22 awk space (within single quotes) front slash civil PIPE electrical front slash opening curly bracket print closing curly bracket (after the quotes) space awkdemo.txt, that is: awk '/civil|electrical/{print}' awkdemo.txt
37 00:03:23 --> 00:03:25 Press Enter.
38 00:03:26 --> 00:03:30 Now entries for both "civil" and "electrical" are given.
39 00:03:31 --> 00:03:33 Let us go back to our slides.
40 00:03:34 --> 00:03:40 Parameters: awk has some special parameters to identify individual fields of a line.
41 00:03:41 --> 00:03:44 $1 (dollar 1) indicates the first field.
42 00:03:45 --> 00:03:52 Similarly, we can have $2, $3 and so on for the respective fields.
43 00:03:53 --> 00:03:55 $0 represents the entire line.
44 00:03:56 --> 00:03:58 Let us come back to our terminal.
45 00:03:59 --> 00:04:04 Note that each word is separated by a PIPE in the file awkdemo.txt.
46 00:04:05 --> 00:04:08 In this case, PIPE is called a delimiter.
47 00:04:09 --> 00:04:12 A delimiter separates words from each other.
48 00:04:13 --> 00:04:15 A delimiter can also be a single white space.
49 00:04:16 --> 00:04:23 To specify a delimiter, we have to give the -F (minus capital F) flag followed by the delimiter.
50 00:04:24 --> 00:04:24 Let us see.
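The field parameters and the -F flag described above can be tried on a small file. The following is only a sketch under assumptions: the contents of awkdemo.txt are not reproduced in this transcript, so the sample file, its five-column layout (roll no, name, stream, marks, result) and the student rows are hypothetical.

```shell
# Hypothetical sample data; the real awkdemo.txt is not shown in the transcript.
# Assumed layout: roll no|name|stream|marks|result
sample=$(mktemp)
cat > "$sample" <<'EOF'
101|Mira|civil|78|Pass
102|Meera|electrical|45|Fail
103|Ravi|computers|88|Pass
EOF

# $1 is the first field once -F sets the delimiter to a pipe.
first_fields=$(awk -F"|" '{print $1}' "$sample")

# $0 is the whole line; the pattern keeps only civil or electrical rows.
whole_lines=$(awk -F"|" '/civil|electrical/{print $0}' "$sample")

# $2 and $3 are name and stream here; the comma prints a space between them.
name_stream=$(awk -F"|" '/civil|electrical/{print $2, $3}' "$sample")

echo "$first_fields"
echo "$whole_lines"
echo "$name_stream"
rm -f "$sample"
```

Without the -F flag, awk splits on white space, so each of these pipe-separated lines would be treated as a single field.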
51 00:04:25 --> 00:04:50 Type: awk space minus capital F space within double quotes PIPE space within single quotes front slash civil PIPE electrical front slash opening curly bracket print space dollar 0 closing curly bracket after the quotes space awkdemo.txt, that is: awk -F "|" '/civil|electrical/{print $0}' awkdemo.txt
52 00:04:51 --> 00:04:52 Press Enter.
53 00:04:53 --> 00:04:57 This prints the entire line since we have used $0.
54 00:04:58 --> 00:05:03 Notice that the names and streams of students are the second and third fields.
55 00:05:04 --> 00:05:07 Say we only want to print two fields.
56 00:05:08 --> 00:05:14 We will replace $0 with $2 and $3 in the above command.
57 00:05:15 --> 00:05:17 Press Enter.
58 00:05:18 --> 00:05:20 Only two fields are shown.
59 00:05:21 --> 00:05:25 Though it gives the right result, the display is jagged and unformatted.
60 00:05:26 --> 00:05:31 We can provide formatted output by using the C-style printf statement.
61 00:05:32 --> 00:05:39 We can also provide a serial number by using the built-in variable NR.
62 00:05:40 --> 00:05:43 We will see more about built-in variables later.
63 00:05:44 --> 00:06:32 Now type: awk space minus capital F within double quotes PIPE after the double quotes space within single quotes front slash Pass front slash opening curly bracket printf within double quotes percentage sign 4d space percentage sign minus 25s space percentage sign minus 15s space backslash n, after the double quotes comma NR comma $2 comma $3 closing curly bracket after the single quote space awkdemo.txt, that is: awk -F"|" '/Pass/{printf "%4d %-25s %-15s \n", NR,$2,$3}' awkdemo.txt
64 00:06:33 --> 00:06:33 Press Enter.
65 00:06:34 --> 00:06:36 We see the difference.
66 00:06:37 --> 00:06:40 Here, NR stands for number of records; it holds the number of the current record.
67 00:06:41 --> 00:06:44 Record numbers are integers, hence we have written %d.
68 00:06:45 --> 00:06:49 Name and Stream are strings. So we have used %s.
69 00:06:50 --> 00:06:54 Here, 25s will reserve 25 spaces for the Name field.
70 00:06:55 --> 00:07:00 15s will reserve 15 spaces for the Stream field.
71 00:07:01 --> 00:07:04 The minus sign is used to left justify the output.
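The effect of the printf format described above can be checked on a small file. This is a minimal sketch, not the tutorial's actual data: the sample file, its column layout and the student rows are hypothetical. It reuses the format string from the narration.

```shell
# Hypothetical sample data; the real awkdemo.txt is not shown in the transcript.
# Assumed layout: roll no|name|stream|marks|result
sample=$(mktemp)
cat > "$sample" <<'EOF'
101|Mira|civil|78|Pass
102|Meera|electrical|45|Fail
103|Ravi|computers|88|Pass
EOF

# %4d right-justifies the record number in 4 columns.
# %-25s and %-15s left-justify name and stream in 25 and 15 columns.
formatted=$(awk -F"|" '/Pass/{printf "%4d %-25s %-15s \n", NR, $2, $3}' "$sample")

echo "$formatted"
rm -f "$sample"
```

With this sample, the second line printed is numbered 3, not 2. NR counts every record awk has read, including the non-matching "Fail" line, so it gives the line's position in the file and not a running count of matches.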
72 00:07:05 --> 00:07:07 This brings us to the end of this tutorial.
73 00:07:08 --> 00:07:09 Let us move back to our slides.
74 00:07:10 --> 00:07:10 Let us summarize.
75 00:07:11 --> 00:07:15 In this tutorial, we learnt: * To print using awk
76 00:07:16 --> 00:07:20 * Regular expressions in awk * To list the entries for a particular stream
77 00:07:21 --> 00:07:23 * To list only the second and the third fields
78 00:07:24 --> 00:07:27 * To display a formatted output.
79 00:07:28 --> 00:07:28 As an assignment,
80 00:07:29 --> 00:07:33 display the roll no., stream and marks of Ankit Saraf.
81 00:07:34 --> 00:07:36 Watch the video available at the link shown below.
82 00:07:37 --> 00:07:39 It summarizes the Spoken Tutorial project.
83 00:07:40 --> 00:07:44 If you do not have good bandwidth, you can download and watch it.
84 00:07:45 --> 00:07:47 The Spoken Tutorial Project Team: Conducts workshops using spoken tutorials.
85 00:07:48 --> 00:07:51 Gives certificates to those who pass an online test.
86 00:07:52 --> 00:07:57 For more details, please write to contact@spoken-tutorial.org
87 00:07:58 --> 00:08:00 Spoken Tutorial Project is a part of the Talk to a Teacher project.
88 00:08:01 --> 00:08:06 It is supported by the National Mission on Education through ICT, MHRD, Government of India.
89 00:08:07 --> 00:08:11 More information on this Mission is available at: [1]
90 00:08:12 --> 00:08:17 This is Ashwini Patil from IIT Bombay, signing off. Thank you for joining.