The awk, cut, grep, expr, sed and xargs commands provide many useful options for manipulating text. Credit: Afonasey/Shutterstock There are quite a few ways to extract words and substrings from lines of text on a Linux system, replace them with other strings, select delimiters, and even get rid of white space at the beginning and end of lines. These techniques can be extremely useful when you’re building scripts that may be used to process large amounts of data, cleaning up data files, or simply trying to grab a string for use in a subsequent command. This post describes many different commands that can make these tasks easier than you might imagine. Full words or portions of strings One important factor to keep in mind is whether you are trying to extract a complete word or a sequence of characters by position. The commands that you use will depend on what you want to grab or modify. Full words and substrings One of the easiest ways to grab a word from a segment of text based on its position (e.g., 3rd word) is to use the awk command. For example, to pull the third word from a phrase, you could use an awk command like this one: $ echo "Focus on Peace" | awk '{print $3}' Peace The $3 represents the third word in the phrase since the space character is the default delimiter. To do the same thing using the cut command, you would need to include the delimiter with the -d option in a command like this where -f3 represents the third word: $ echo "Focus on Peace" | cut -d' ' -f3 Peace You can also select multiple fields with the cut command as in these next examples: $ echo "Focus on Peace on Earth" | cut -d' ' -f3,5 Peace Earth $ echo "one two three 4 5 6" | cut -d' ' -f1-3,6 one two three 6 To use an alternate delimiter (in this case, a colon), use a command like this: $ cut -d':' -f1-3,5,6 /etc/passwd | tail -n 5 justme:x:1004:JustMe:/home/justme lola:x:1006::/home/lola dumdum:x:1007::/home/dumdum With awk, you can use more than one delimiter. In the following example, two delimiters are specified, so the awk command accepts either a colon or a blank to separate fields. The first two lines display the file, and the last two lines show the command and result. $ cat file Monday:1 Tuesday:2 Wednesday:3 Thursday:4 Friday:5 $ awk -F'[: ]' '{OFS=" ";print $1,$3,$4}' file Monday Tuesday 2 Selecting substrings To select an arbitrary sequence or characters from a string, you can use an awk command like the one below in which the $0 represents the entire phrase, 10 represents the first character position to be grabbed and 5 is the length of the string to be displayed. $ echo "Focus on Peace" | awk '{print substr($0,10,5)}' Peace To do the same kind of thing with the cut command, you would use a command like this in which the 13th through 22nd characters are extracted from the phrase and displayed. $ echo "Linux is an impressive OS" | cut -c 13-22 impressive In this next command, the cut command displays the 7th-12th characters from the lines in a file. The head command simply limits the display to the first 4 lines of output. $ cut -c 7-12 sayings | head -4 with 3 and ov nd be and be Using grep You can use the grep command to select multiple words from a file. In this example, only the selected words are displayed, not the entire lines. This is because the -o (display only the matched items) option is being used. $ cat sayings | grep -o "quarrel\|empty" quarrel quarrel empty Without the -o option, you would see the complete lines. $ cat sayings | grep "quarrel\|empty" They do not quarrel, So no one quarrels with them. Is that an empty saying? You can also select multiple-word phrases as shown in this example: $ cat sayings | grep -o "never falter\|do not quarrel" never falter do not quarrel Using expr The expr command can also be used to select a portion of a phrase by specifying its starting position and length. $ str="Learning to use Linux is fun" $ expr substr "$str" 13 9 use Linux Using sed The sed command provides a very convenient way to replace words in a string. $ echo "They never falter" | sed 's/falter/forget/' They never forget You can also use this kind of command to replace multiple words as in this example: $ echo "They never falter" | sed 's/never falter/always forget/' They always forget Using xargs To remove blanks at the beginning and endings of phrases, use the xargs command. $ echo " Keep your nose to the grindstone " | xargs Keep your nose to the grindstone The xargs command also removes blank lines and tabs. In the example below, a file containing two lines with only tabs and blanks and one starting with four blanks and ending with a saying is cut down to just the saying. $ cat eg Keep your nose to the grindstone $ cat eg | xargs Keep your nose to the grindstone Using bash parameter expansion When using bash parameter expansion, you can specify the starting and ending positions for the text that you want to extract. For example, you can create a variable by assigning it a value and then use syntax like that shown below to select a portion of it. $ string="Happy days are here again" $ echo ${string:1:10} appy days $ echo ${string:0:9} Happy days Note that the example above makes it clear that this technique starts position numbering at 0. So, in the next example, the 7 represents the eighth character in the string and the -2 means to drop the last 2 characters. As a result, the substring in the first example below has a single character and the second has all but the last two. $ string="1234567890" $ echo ${string:7:-2} 8 $ echo ${string:0:-2} 12345678 In this next example, we first create a variable using “set –” and then use echo to display the eighth and ninth characters. In other words, it starts with the eighth character (7) and then displays two characters. $ set -- 01234567890abcdef $ echo ${1:7:2} 78 NOTE: You could display the string created with the set command by simply using the command “echo $1”. This is what is referenced by the “1” in the example above. $ set -- 01234567890abcdef $ echo $1 01234567890abcdef Wrap-up Linux provides many commands to help you manipulate text. The awk, cut, grep, expr, sed and xargs commands along with bash parameter expansion provide you with many useful options. Related content how-to How to examine files on Linux Linux provides very useful options for viewing file attributes, such as owners and permissions, as well as file content. By Sandra Henry Stocker Oct 24, 2024 6 mins Linux how-to 8 easy ways to reuse commands on Linux Typing the same command again and again can become tiresome. Here are a number of ways you can make repeating commands – or repeating commands but with some changes – a lot easier than you might expect. By Sandra Henry-Stocker Oct 15, 2024 5 mins Linux news SUSE Edge upgrade targets Kubernetes and Linux at the edge SUSE Edge 3.1 includes a new stack validation framework and an image builder tool that are aimed at improving the scalability and manageability of complex Kubernetes and Linux edge-computing deployments. By Sean Michael Kerner Oct 15, 2024 6 mins Edge Computing Linux Network Management Software how-to Lesser-known xargs command is a versatile time saver Boost your Linux command line options and simplify your work with xargs, a handy tool for a number of data manipulation tasks. By Sandra Henry Stocker Oct 11, 2024 6 mins Linux PODCASTS VIDEOS RESOURCES EVENTS NEWSLETTERS Newsletter Promo Module Test Description for newsletter promo module. Please enter a valid email address Subscribe