Awk Chinese Manual

1. Brief introduction to awk

Awk is a programming language for processing text and data on Linux/Unix. Input can come from standard input, from one or more files, or from the output of other commands. It supports advanced features such as user-defined functions and dynamic regular expressions, which makes it a powerful programming tool on Linux/Unix. It can be used directly on the command line, but is more often run as a script.

Awk processes text by scanning the input from the first line to the last, looking for lines that match a given pattern and performing the requested action on them. If no action is given, each matching line is printed to standard output (the screen); if no pattern is given, the action is applied to every line.

The name awk comes from the initials of its three authors' surnames: Alfred Aho, Peter Weinberger, and Brian Kernighan. Gawk is the GNU version of awk; it includes the Bell Labs extensions and adds some GNU extensions of its own. The awk described below is GNU gawk. On many Linux systems awk is a link to gawk, so the rest of this manual simply says awk.

2. awk command format and options

2.1. awk syntax

Awk can be invoked in two forms:

* awk [options] 'script' var=value file(s)
* awk [options] -f scriptfile var=value file(s)

2.2. Command options

-F fs or --field-separator fs
    Sets the input field separator. fs is a string or a regular expression, for example -F:.

-v var=value or --assign var=value
    Assigns a user-defined variable.

-f scriptfile or --file scriptfile
    Reads the awk program from a script file.

-mf NNN and -mr NNN
    Set internal limits to the value NNN: -mf limits the maximum number of fields and -mr limits the maximum record size. These two options are extensions of the Bell Labs version of awk and do not apply to standard awk.

-W compat or --compat, -W traditional or --traditional
    Runs awk in compatibility mode: gawk behaves exactly like standard awk and all gawk extensions are ignored.

-W copyleft or --copyleft, -W copyright or --copyright
    Prints brief copyright information.

-W help or --help, -W usage or --usage
    Prints a short summary of all awk options.

-W lint or --lint
    Prints warnings about constructs that are dubious or not portable to other awk implementations.

-W lint-old or --lint-old
    Prints warnings about constructs that are not portable to the original Unix awk.

-W posix or --posix
    Turns on compatibility mode with these additional restrictions: \x escape sequences are not recognized; when FS is a single space, only space and tab act as field separators (newline does not); the keyword func is not accepted as a synonym for function; the operators ** and **= cannot be used in place of ^ and ^=; and fflush() is not available.

-W re-interval or --re-interval
    Enables interval expressions in regular expressions. See also the POSIX character classes described for grep, such as the bracket expression [[:alpha:]].

-W source program-text or --source program-text
    Uses program-text as awk source code; this option can be mixed with -f.

-W version or --version
    Prints version information and where to report bugs.

3. Patterns and actions

An awk script is made up of patterns and actions:

    pattern { action }

for example $ awk '/root/' test or $ awk '$3 < 100' test.

Both parts are optional. With no pattern, the action is applied to every record; with no action, every matching record is printed. By default each input line is one record, but a different record separator can be set through the RS variable.

3.1. Patterns

A pattern can be any of the following:

* /regular expression/: uses the extended set of metacharacters.
* Relational expression: compares strings or numbers with the relational operators from the operator table below, for example $2 > $1, which selects lines whose second field is greater than the first.
* Pattern-matching expression: uses the operators ~ (matches) and !~ (does not match).
* Range pattern: pattern1, pattern2 selects a range of lines. A range cannot include the BEGIN or END patterns.
* BEGIN: specifies what happens before the first input record is processed; global variables are usually set here.
* END: specifies the action to take after the last input record has been read.

3.2. Actions

An action consists of one or more commands, functions, or expressions, separated by newlines or semicolons and enclosed in braces. Actions fall into four main groups:

* Variable or array assignment
* Output commands
* Built-in functions
* Control-flow commands

4. awk built-in variables

Table 1. awk built-in variables

Variable     Description
$n           The nth field of the current record; fields are separated by FS.
$0           The complete input record.
ARGC         The number of command-line arguments.
ARGIND       The position in ARGV of the file currently being processed (ARGV is indexed from 0).
ARGV         Array containing the command-line arguments.
CONVFMT      Number conversion format (default %.6g).
ENVIRON      Associative array of environment variables.
ERRNO        Description of the last system error.
FIELDWIDTHS  List of field widths (separated by spaces).
FILENAME     Name of the current file.
FNR          Like NR, but relative to the current file.
FS           Field separator (default: any whitespace).
IGNORECASE   If true, matching ignores case.
NF           Number of fields in the current record.
NR           Number of the current record.
OFMT         Output format for numbers (default %.6g).
OFS          Output field separator (default: a single space).
ORS          Output record separator (default: a newline).
RLENGTH      Length of the string matched by the match function.
RS           Record separator (default: a newline).
RSTART       Position of the first character of the string matched by the match function.
SUBSEP       Array subscript separator (default \034).

5. awk operators

Table 2. awk operators

Operator                   Description
= += -= *= /= %= ^= **=    Assignment
?:                         C conditional expression
||                         Logical OR
&&                         Logical AND
~ !~                       Matches / does not match a regular expression
< <= > >= != ==            Relational operators
(space)                    String concatenation
+ -                        Addition, subtraction
* / %                      Multiplication, division, remainder
+ - !                      Unary plus, unary minus, logical NOT
^ **                       Exponentiation
++ --                      Increment, decrement (prefix or postfix)
$                          Field reference
in                         Array membership

6. Records and fields

6.1. Records

Awk treats each line ending in a newline as one record.

Record separator: the default input and output record separators are both the newline character. They are stored in the built-in variables RS and ORS.

$0: refers to the entire record. For example, $ awk '{print $0}' test prints every record in the file test.

NR: a counter that is incremented by 1 each time a record is processed. For example, $ awk '{print NR, $0}' test prints every record in test preceded by its record number.

6.2. Fields

Each word in a record is called a field; by default fields are separated by spaces or tabs. Awk keeps track of the number of fields in the built-in variable NF. For example, $ awk '{print $1, $3}' test prints the first and third whitespace-separated columns (fields) of test.

6.3. Field separators

The built-in variable FS holds the input field separator; its default is space or tab. FS can be changed with the -F command-line option. For example, $ awk -F: '{print $1, $5}' test prints the first and fifth columns, using the colon as the separator.

Several field separators can be used at once by writing them inside square brackets. For example, $ awk -F'[ :\t]' '{print $1, $3}' test treats space, colon, and tab as separators.

The output field separator defaults to a space and is stored in OFS. A comma in a print statement stands for the value of OFS: in $ awk -F: '{print $1, $5}' test, a space is printed between $1 and $5.

7. gawk regular expression metacharacters

The general metacharacter set is not covered here; see my sed and grep study notes. The following are specific to gawk and are not available in Unix versions of awk.

\y    Matches the empty string at the beginning or end of a word.
\B    Matches the empty string within a word.
\<    Matches the empty string at the beginning of a word (anchors the start).
\>    Matches the empty string at the end of a word (anchors the end).
\w    Matches a word-constituent character (letter, digit, or underscore).
\W    Matches a character that is not word-constituent.
\`    Matches the empty string at the beginning of the string.
\'    Matches the empty string at the end of the string.

8. POSIX character classes

See my grep study notes.

9. Match operator (~)

Matches a regular expression against a record or field. For example, $ awk '$1 ~ /^root/' test prints the lines of test whose first field begins with root.

10. Comparison expressions

Conditional expression: expression1 ? expression2 : expression3. For example:

    $ awk '{max = ($1 > $3) ? $1 : $3; print max}' test

If the first field is greater than the third, $1 is assigned to max; otherwise $3 is assigned to max.

    $ awk '$1 + $2 < 100' test

Prints the line if the sum of the first and second fields is less than 100.

    $ awk '$1 > 5 && $2 < 10' test

Prints the line if the first field is greater than 5 and the second field is less than 10.

11. Range patterns

A range pattern matches all lines from the first occurrence of the first pattern through the first occurrence of the second pattern. If the second pattern never occurs, matching continues to the end of the input; if the first never occurs, nothing is matched. For example, $ awk '/root/,/mysql/' test prints all lines from the first occurrence of root through the first occurrence of mysql.

12. Example: validating a passwd file

1  $ cat /etc/passwd | awk -F: '\
2  NF != 7 {\
3  printf("line %d, does not have 7 fields: %s\n", NR, $0)}\
4  $1 !~ /[A-Za-z0-9]/ {printf("line %d, non alpha and numeric user id: %s\n", NR, $0)}\
5  $2 == "*" {printf("line %d, no password: %s\n", NR, $0)}'

Line 1: cat pipes its output to awk, which uses the colon as the field separator.
Line 2: If the number of fields (NF) is not equal to 7, the action that follows is executed.
Line 3: printf prints the string "line <n>, does not have 7 fields" followed by the record.
Line 4: If the first field contains no letters or digits, printf prints "non alpha and numeric user id" with the record number and the record.
Line 5: If the second field is an asterisk, printf prints "no password" with the record number and the record itself.

13. Several examples

* $ awk '/^(no|so)/' test ----- prints all lines that begin with no or so.
* $ awk '/^[ns]/ {print $1}' test ----- prints the first field if the record begins with n or s.
* $ awk '$1 ~ /[0-9][0-9]$/ {print $1}' test ----- prints the first field if it ends with two digits.
* $ awk '$1 == 100 || $2 < 50' test ----- prints the line if the first field equals 100 or the second field is less than 50.
* $ awk '$1 != 10' test ----- prints the line if the first field is not equal to 10.
* $ awk '/test/ {print $1 + 10}' test ----- if the record matches the regular expression test, adds 10 to the first field and prints the result.
* $ awk '{print ($1 > 5 ? "ok " $1 : "error " $1)}' test ----- if the first field is greater than 5, prints ok followed by the field value; otherwise prints error followed by the field value.
* $ awk '/^root/,/^mysql/' test ----- prints all records from one beginning with root through the next one beginning with mysql. If another record beginning with root is found later, printing resumes until the next record beginning with mysql or the end of the file.

14. awk programming

14.1. Variables

* In awk, variables do not need to be declared; they can be used directly.
  A variable's value can be a number or a string.
* Assignment has the form variable = expression. For example, $ awk '$1 ~ /test/ {count = $2 + $3; print count}' test scans the first field; whenever it matches test, awk adds the values of the second and third fields, assigns the result to the variable count, and prints it.
* Variables can be assigned on the command line and passed into the awk script. For example, in $ awk -F: -f awkscript month=4 year=2004 test, month and year are user-defined variables set to 4 and 2004 respectively. Inside the script they are used as if they had been created there. Note that assignments of this form are processed only when awk reaches them in the argument list: they are not yet set in the BEGIN block, and an assignment placed after test takes effect only after test has been read. Use the -v option for variables that BEGIN needs.
* Field variables can also be assigned and modified. For example, in $ awk '{$2 = 100 + $1; print}' test, if the second field does not exist, awk evaluates 100 plus the value of $1 and assigns the result to $2; if the second field exists, its original value is overwritten by the result. Another example: $ awk '$1 == "root" {$1 = "test"; print}' test sets the first field to "test" whenever its value is "root". Note that strings must be enclosed in double quotes.
* Using built-in variables. The variables were listed earlier; here is an example. $ awk -F: '{IGNORECASE = 1}; $1 == "MARY" {print NR, $1, $2, $NF}' test sets IGNORECASE to 1 so that case is ignored, and for each record whose first field is mary (in any case) prints the record number, the first field, the second field, and the last field.

14.2. The BEGIN block

A BEGIN pattern is followed by an action block, which is executed before awk processes any input file. It can therefore be tested without any input. It is typically used to change built-in variables such as OFS, RS, and FS, and to print headings. For example:

    $ awk 'BEGIN {FS = ":"; OFS = "\t"; ORS = "\n\n"} {print $1, $2, $3}' test

This sets the field separator (FS) to a colon, the output field separator (OFS) to a tab, and the output record separator (ORS) to two newlines before the input file is processed. $ awk 'BEGIN {print "TITLE TEST"}' prints only the heading.

14.3. The END block

An END pattern does not match any input line; its action block is executed after all input has been processed. For example, $ awk 'END {print "The number of records is " NR}' test prints the total number of records processed.

14.4. Redirection and pipes

* Awk can use the shell output redirection operators. For example: $ awk '$1 == 100 {print $1 > "output_file"}' test writes the first field to output_file whenever it equals 100. The >> operator can also be used; it does not empty the file but only appends to it.
* Input redirection uses the getline function. getline reads input from standard input, a pipe, or a file other than the one currently being processed. It fetches the next line of input and sets built-in variables such as NF, NR, and FNR. getline returns 1 if it reads a record, 0 at end of file, and -1 on error, for example when a file cannot be opened. Examples:

  $ awk 'BEGIN {"date" | getline d; print d}' test
  Runs the Linux date command and pipes its output to getline, which assigns it to the user-defined variable d; d is then printed.

  $ awk 'BEGIN {"date" | getline d; split(d, mon); print mon[2]}' test
  Runs the shell date command and pipes its output to getline, which reads it into d. The split function breaks d into the array mon, and the second element of mon is printed.

  $ awk 'BEGIN {while (("ls" | getline) > 0) print}'
  The output of the ls command is passed to getline as input; the loop makes getline read one line of ls output at a time and print it to the screen. No input file is needed, because the BEGIN block is executed before any input file is opened.

  $ awk 'BEGIN {printf "What is your name? "; getline name < "/dev/tty"} $1 ~ name {print "Found " name " on line " NR "."} END {print "See you, " name "."}' test
  Prints "What is your name?" on the screen and waits for the user to answer. When a line is entered, getline reads it from the terminal and stores it in the user-defined variable name. If the first field matches the value of name, the print statement is executed; the END block prints "See you," followed by the value of name.

  $ awk 'BEGIN {while ((getline < "/etc/passwd") > 0) lc++; print lc}'
  Awk reads /etc/passwd line by line, incrementing the counter lc until the end of the file, then prints lc. Note that getline returns -1 if the file does not exist, 0 at end of file, and 1 when a line is read. A loop written as while (getline < "/etc/passwd") would therefore run forever if the file did not exist, because the return value -1 is logically true.

* Awk can open pipes from within a program. Close a pipe with close() when it is no longer needed, since the number of pipes that can be open at once is limited. For example: $ awk '{print $1, $2 | "sort"} END {close("sort")}' test. Awk sends the output of the print statement through the pipe as input to the Linux sort command, and the END block closes the pipe.
* The system function runs a Linux command from within awk. For example: $ awk 'BEGIN {system("clear")}'.
* The fflush function flushes output buffers. With no argument, it flushes the buffer for standard output (in older gawk versions; newer versions flush all open output files and pipes in this case). With the empty string as its argument, as in fflush(""), it flushes the output buffers of all open files and pipes.

14.5. Conditional statements

Conditional statements in awk are borrowed from the C language and control the flow of a program.

14.5.1. The if statement

Format:

    {if (expression) {
        statement; statement; ...
        }
    }

$ awk '{if ($1 < $2) print $2 " too high"}' test
If the first field is less than the second, prints the second field followed by "too high".

$ awk '{if ($1 < $2) {count++; print "ok"}}' test
If the first field is less than the second, increments count and prints ok.

14.5.2. The if/else statement

Used for two-way decisions. Format:

    {if (expression) {
        statement; statement; ...
        }
    else {
        statement; statement; ...
        }
    }

$ awk '{if ($1 > 100) print $1 " bad"; else print "ok"}' test
If $1 is greater than 100, prints $1 followed by bad; otherwise prints ok.

$ awk '{if ($1 > 100) {count++; print $1} else {count--; print $2}}' test
If $1 is greater than 100, increments count and prints $1; otherwise decrements count and prints $2.

14.5.3. The if/else if statement

Used for multi-way decisions. Format:

    {if (expression) {
        statement; statement; ...
        }
    else if (expression) {
        statement; statement; ...
        }
    else if (expression) {
        statement; statement; ...
        }
    else {
        statement; statement; ...
        }
    }

14.6. Loops

* Awk has three kinds of loop: the while loop, the for loop, and the special for loop.
* $ awk '{i = 1; while (i <= NF) {print NF, $i; i++}}' test
  The variable i starts at 1. While i is less than or equal to NF (the number of fields in the record), the print statement is executed and i is incremented by 1, until i is greater than NF.
* $ awk '{for (i = 1; ...
* The break and continue statements. break exits a loop when a condition is met; continue skips the remaining statements in the loop body when a condition is met and returns to the top of the loop. For example:

    {for (x = 3; x <= NF; x++)
        if ($x < 0) {print "Bottomed out!"; break}}
    {for (x = 3; x <= NF; x++)
        if ($x == 0) {print "Get next item"; continue}}

* The next statement reads the next line from the input and restarts the awk script from the top. For example:

    {if ($1 ~ /test/) {next}
     else {print}
    }

* The exit statement ends the awk program but does not skip the END block. An exit status of 0 indicates success; a nonzero value indicates an error.

14.7. Arrays

Array subscripts in awk can be numbers or strings; such arrays are called associative arrays.

14.7.1. Subscripts and associative arrays

* Using a variable as the array subscript. For example: $ awk '{name[x++] = $2}; {for (i = 0; ...
* The special for loop reads the elements of an associative array. Format:

    {for (item in arrayname) {
        print arrayname[item]
        }
    }

  $ awk '/^tom/ {name[NR] = $1}; END {for (i in name) {print name[i]}}' test
  Prints the array elements that have values. The order in which they are printed is arbitrary.
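* Using a string as the subscript. For example: count["test"].
* Using a field value as the subscript, with a new form of the for loop: for (index_value in array) statement. For example: $ awk '{count[$1]++} END {for (name in count) print name, count[name]}' test prints the number of times each string occurs in $1. The first field is used as the subscript of the array count, so the index changes whenever the first field changes.
* The delete statement removes array elements. For example: $ awk '{line[x++] = $1} END {for (x in line) delete line[x]}' test. The value of the first field is assigned to the elements of the array line; after all records have been processed, the special for loop deletes every element.

14.8. awk built-in functions

14.8.1. String functions

* The sub function matches the regular expression against the leftmost, longest substring in the record and replaces it with the replacement string. If no target string is given, the whole record is used. Only the first match is replaced. Format:

    sub(regular expression, replacement string)
    sub(regular expression, replacement string, target string)

  Examples:

    $ awk '{sub(/test/, "mytest"); print}' testfile
    $ awk '{sub(/test/, "mytest", $1); print}' testfile

  The first example matches against the whole record; only the first match is replaced. To replace every match, use gsub. The second example matches against the first field of each record; again only the first match is replaced.

* The gsub function works like sub, but replaces every match. Format:

    gsub(regular expression, replacement string)
    gsub(regular expression, replacement string, target string)

  Examples:

    $ awk '{gsub(/test/, "mytest"); print}' testfile
    $ awk '{gsub(/test/, "mytest", $1); print}' testfile

  The first example replaces every occurrence of test in each record with mytest. The second replaces every occurrence of test in the first field of each record with mytest.

* The index function returns the position at which a substring first occurs in a string; positions are counted from 1. Format:

    index(string, substring)

  Example:

    $ awk '{print index("mytest", "test")}' testfile

  This returns the position of test within mytest, which is 3.

* The length function returns the number of characters in a string. Format:

    length(string)
    length

  Examples:

    $ awk 'BEGIN {print length("test")}'
    $ awk '{print length}' testfile

  The first example returns the length of the string test. The second returns the number of characters in each record of testfile.

* The substr function returns the substring that begins at the given position (positions are counted from 1). If the requested length runs past the end of the string, the rest of the string is returned. Format:

    substr(string, start position)
    substr(string, start position, length)

  Example:

    $ awk 'BEGIN {print substr("hello world", 7, 11)}'

  This extracts the substring world.

* The match function returns the index at which the regular expression matches in the string, or 0 if it is not found. match sets the built-in variable RSTART to the starting position of the matched substring and RLENGTH to the number of characters in the match. These variables can be used to extract the substring. Format:

    match(string, regular expression)

  Examples:

    $ awk 'BEGIN {start = match("this is a test", /[a-z]+$/); print start}'
    $ awk 'BEGIN {start = match("this is a test", /[a-z]+$/); print start, RSTART, RLENGTH}'

  The first example prints the starting position of the trailing run of lowercase letters, which is 11. The second also prints RSTART and RLENGTH: 11 (start), 11 (RSTART), and 4 (RLENGTH).

* The toupper and tolower functions convert the case of a string. They are available in gawk (and in other POSIX-conforming awks). Format:

    toupper(string)
    tolower(string)

  Example:

    $ awk 'BEGIN {print toupper("test"), tolower("TEST")}'

* The split function splits a string into an array using the given separator. If no separator is supplied, the current value of FS is used. Format:

    split(string, array, field separator)
    split(string, array)

  Example:

    $ awk 'BEGIN {split("20:18:00", time, ":"); print time[2]}'

  This splits the time string on colons into the array time and prints the second element, 18.

14.8.2. Time functions

* The systime function returns the number of seconds from January 1, 1970 to the current time (not counting leap seconds). Format:

    systime()

  Example:

    $ awk 'BEGIN {now = systime(); print now}'

* The strftime function formats a time using the strftime function of the C library. Format:

    strftime([format specification][, timestamp])

Table 3. Date and time format specifiers

Format  Description
%a      Abbreviated weekday name (Sun)
%A      Full weekday name (Sunday)
%b      Abbreviated month name (Oct)
%B      Full month name (October)
%c      Local date and time
%d      Day of the month as a decimal number
%D      Date in the form 08/20/99
%e      Day of the month, padded with a space if it has only one digit
%H      Hour on the 24-hour clock as a decimal number
%I      Hour on the 12-hour clock as a decimal number
%j      Day of the year, counted from January 1
%m      Month as a decimal number
%M      Minute as a decimal number
%p      AM/PM indicator for the 12-hour clock
%S      Second as a decimal number
%U      Week of the year as a decimal number (Sunday as the first day of the week)
%w      Day of the week as a decimal number (Sunday is 0)
%W      Week of the year as a decimal number (Monday as the first day of the week)
%x      Local date representation (08/20/99)
%X      Local time representation (12:00:00)
%y      Two-digit year (99)
%Y      Four-digit year
%Z      Time zone (PDT)
%%      A percent sign (%)

Examples:

    $ awk 'BEGIN {now = strftime("%D", systime()); print now}'
    $ awk 'BEGIN {now = strftime("%m/%d/%y"); print now}'

14.8.3. Built-in mathematical functions

Table 4. Built-in mathematical functions

Function     Return value
atan2(y, x)  Arctangent of y/x, in radians
cos(x)       Cosine of x
exp(x)       Exponential of x (e to the power x)
int(x)       Integer part of x
log(x)       Natural logarithm of x
rand()       Random number
sin(x)       Sine of x
sqrt(x)      Square root of x
srand(x)     Sets x as the seed for rand()

int(x) truncates; it does not round. rand() returns a random number greater than or equal to 0 and less than 1.

14.8.4. User-defined functions

You can also define your own functions in awk. Format:

    function name(parameter, parameter, parameter, ...) {
        statements
        return expression    # the return statement and expression are optional
    }

15. How-to

* How do you convert a column of data into a single row?

    awk '{printf("%s,", $1)}' filename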
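
The column-to-row one-liner prints a comma after every value, so the result ends with a trailing comma and has no final newline. The following is a minimal sketch of one way to avoid both, assuming the values to join are in the first field. The function name join_column and the sample data are hypothetical and serve only as illustration.

```shell
#!/bin/sh
# join_column (hypothetical helper): join field 1 of every input record
# into a single comma-separated line. Reads the files named as arguments,
# or standard input when no file is given.
join_column() {
    awk '
        # Print a comma before every value except the first one.
        { printf "%s%s", (NR > 1 ? "," : ""), $1 }
        # Finish the line with a newline, but only if something was printed.
        END { if (NR > 0) printf "\n" }
    ' "$@"
}

# Made-up sample data: three records, two fields each.
printf 'alpha 1\nbeta 2\ngamma 3\n' | join_column
# prints: alpha,beta,gamma
```

Putting the separator before each value rather than after it is what removes the trailing comma; the END rule supplies the newline that printf does not add on its own.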