===================
swap col 3 and 4 (tab-separated; `\t' for a tab is a GNU sed extension)

sed 's/^\([^\t]*\t[^\t]*\t\)\([^\t]*\t\)\([^\t]*\t\)\(.*\)/\1\3\2\4/'

-------------------
Regular expressions

c         a single char, if not special, is matched against text.
*         matches a sequence of zero or more repetitions of the previous
          char, grouped RE, or class.
\+        as *, but matches one or more.
\?        as *, but only matches zero or one.
\{i\}     as *, but matches exactly <i> sequences (<i> is a number between
          0 and some limit -- in Henry Spencer's regexp(3) library, this
          limit is 255)
\{i,j\}   matches between <i> and <j>, inclusive, sequences.
\{i,\}    matches <i> or more sequences.
\{,j\}    matches at most <j> sequences.
\(RE\)    groups RE as a whole; this is used to:
          - apply postfix operators: `\(abcd\)*' searches for zero or
            more whole sequences of "abcd", whereas `abcd*' searches
            for "abc" followed by zero or more "d"s
          - use back references (see below)
.         matches any single character
^         matches the null string at the beginning of the line, i.e. what
          appears after ^ must appear at the beginning of the line
          e.g. `^#include' matches only lines where "#include" is the
          first thing on the line; if there are one or two spaces before
          it, the match fails
$         the same as ^, but refers to the end of the line
\c        matches character `c' -- used to match special chars, referred
          above (and some more below)
[list]    matches any single char in list,
          e.g. `[aeiou]' matches any lowercase vowel
[^list]   matches any single char NOT in list
          a list may contain <char1>-<char2>, which means all chars
          between <char1> and <char2> (inclusive)
          to include `]' in the list, make it the first char
          to include `-' in the list, make it the first or last
RE1\|RE2  matches RE1 or RE2
\1 \2 \3 \4 \5 \6 \7 \8 \9
          => \i matches the <i>th \(\) reference on RE; this is called a
          back reference, and usually it is (very) slow

Notes:
------
- some implementations of sed may not have all REs mentioned, notably
  `\+', `\?' and `\|'
- the RE is greedy, i.e.
  if two or more matches are possible, it selects the one that starts
  first in the text; among matches starting at the same position, it
  selects the longest

Examples:
---------
`abcdef'      matches "abcdef"
`a*b'         matches zero or more "a"s followed by a single "b",
              like "b" or "aaaaaab"
`a\?b'        matches "b" or "ab"
`a\+b\+'      matches one or more "a"s followed by one or more "b"s;
              the minimum match is "ab", but "aaaab", "abbbbb" or
              "aaaaaabbbbbbb" also match
`.*'          all chars on line, of all lines (including empty ones)
`.\+'         all chars on line, but only on lines containing at least
              one char, i.e. empty lines will not be matched
`^main.*(.*)' matches a line that has "main" as the first thing on the
              line and that also contains an opening paren followed,
              later on, by a closing paren, with any number of chars
              (including none) before and between them
`^#'          all lines beginning with "#" (shell and make comments)
`\\$'         all lines ending with a `\' (there are two for escaping
              `\') -- line continuation in C, make, shell, etc.
`[a-zA-Z_]'   any single letter or underscore
`[^ ]\+'      (the brackets hold a space and a tab) -- a sequence of one
              or more chars that are neither space nor tab. Usually this
              means a word
`^.*A.*$'     matches a whole line that contains an "A" anywhere (not
              only in the center of the line)
`A.\{9\}$'    matches an "A" that is exactly the tenth character from
              the end of the line
`^.\{,15\}A'  matches the last "A" within the first 16 chars of the line

========================================================================
Substitution
------------

This command is so often used that it deserves a whole section!

(2)s/RE/<replacement>/[flags] -- (s)ubstitute

- on specified lines, text matched by RE, if any, is replaced by
  <replacement>
- if a replacement is done, the flag that permits the `test' command to
  be performed is set (more about this on the `t' command)
- the `/' separator could in fact be ANY character. Usually it is `/'
  because almost every program with regular expressions can use it.
  Exceptions are grep and lex, which don't use any char as a delimiter.
- <replacement> is raw text. The only exceptions are:
    &    is replaced by all text matched by RE. So s/RE/&/ is a null
         op, whatever the RE, except for setting the test flag
    \d   where `d' is a digit (see below for more), is replaced by the
         d-th grouped \(\) sub-RE.
         Some implementations of sed (more precisely, some
         implementations of the regex(3) library that some
         implementations of sed use) limit `d' to a single digit (1-9).
         Others, such as GNU sed (2.05 at least), accept a valid number.
         GNU sed also accepts and understands `\0' as `&', i.e. the
         whole matched RE. I don't know if this behavior is standard.
         If there isn't a d-th grouped \(\), then \d is replaced by the
         null string (newer seds, GNU sed included, may instead reject
         this as an invalid reference).
    \c   where `c' is any char except digits, quotes `c'
  Note that besides the above, _all_ other text is raw, so `\n' or `\t'
  don't work as one might expect. To insert a newline, for instance, one
  must type a backslash followed by a real newline (the replacement then
  continues on the next line):
      s/foo/bar-on-this-line\
- <flags> are optional, and can be combined
    g         replace all occurrences of RE by <replacement> (the
              default is to replace only the first)
    p         write the pattern space only if the substitution was
              successful
    w <file>  works as the `p' flag, but the pattern space is written
              to <file>
    d         where `d' is a digit, replace the d-th occurrence, if any,
              of RE by <replacement>
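
The substitution rules above can be tried out with a few one-liners. This is only a minimal sketch: the sample strings and the variable names (amp, swap, first, ...) are invented for illustration, and it sticks to features a POSIX-style sed is assumed to support (`&', `\1', the g, p and numeric flags, an alternative separator, and a backslash-newline in the replacement).

```shell
#!/bin/sh
# Illustrative sketch only: all input strings below are made up.

# `&' stands for the whole text matched by RE.
amp=$(printf '%s\n' 'hello world' | sed 's/world/[&]/')

# `\1' and `\2' refer to the first and second \(...\) groups.
swap=$(printf '%s\n' 'key=value' | sed 's/^\([^=]*\)=\(.*\)$/\2=\1/')

# No flag: only the first occurrence on the line is replaced.
first=$(printf '%s\n' 'a-b-c-d' | sed 's/-/+/')

# `g' flag: every occurrence is replaced.
all=$(printf '%s\n' 'a-b-c-d' | sed 's/-/+/g')

# Numeric flag: only the 2nd occurrence is replaced.
second=$(printf '%s\n' 'a-b-c-d' | sed 's/-/+/2')

# `p' flag together with -n: print only the lines where a
# substitution actually happened.
printed=$(printf '%s\n' 'foo' 'bar' 'foo bar' | sed -n 's/foo/FOO/p')

# Any char can act as the separator; `,' avoids escaping every `/'.
path=$(printf '%s\n' '/usr/bin' | sed 's,/usr,/opt,')

# Newline in the replacement: a backslash followed by a real newline.
split=$(printf '%s\n' 'foo' | sed 's/foo/bar-on-this-line\
next-line/')

printf '%s\n' "$amp" "$swap" "$first" "$all" "$second" \
              "$printed" "$path" "$split"
```

Picking a separator such as `,' is purely a readability choice here; the command behaves the same as the `/' form with every slash escaped.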