Summary

The tr command performs transforms on a stream of text, producing a new stream as its output. You can substitute, delete, or convert characters according to rules you set on the command line.

Do you need a no-frills method for manipulating a stream of text in Linux? Look no further than the tr command, which can save you time in replacing, removing, combining, and compressing input text. This is how it’s done.

What Is the tr Command?

The Linux tr command is a fast and simple utility for stripping out unwanted characters from streams of text, and for other neat manipulation tricks. It gets its name from the word “translate,” and tr’s roots run deep in the Unix tradition.

As we all know, Linux is an open-source rewrite of Unix. It adds its own stuff into the mix, too. It isn’t a byte-for-byte clone, but it clearly takes much of its design principles and engineering steerage from the Unix operating system.

Although only two Linux distributions, EulerOS and Inspur K-UX, have so far been certified as POSIX compliant and rubber-stamped as being officially accepted as implementations of Unix, Linux has almost completely supplanted Unix in the business world.

All Linux distributions, at least in their core utilities, adhere to the Unix philosophy. The Unix philosophy encapsulates the vision the Unix pioneers had for their new operating system. It’s often paraphrased as “Write programs that do one thing well.” But there’s more to it than that.

One of the most powerful innovations was that programs should generate output that could be used as the input to other programs. The ability to daisy chain command line utilities together, using the output stream from one program as the input stream to the next program in line, is massively powerful.

Sometimes you’ll want to fine-tune or tweak the output from one program before it reaches the next program in line. Or perhaps you’re not taking your input from a Linux command line tool at all, but streaming text out of a file that wasn’t created with your particular needs in mind.

This is where tr comes into its own. It allows you to perform a set of simple transformations on its input stream, to produce its output stream. That output stream can be redirected into a file, fed into another Linux program, or even into another instance of tr to have multiple transforms applied to the stream.

Replacing Characters

The tr command operates on its input stream according to rules. Used without any command line options, the default action of tr is to substitute characters in the input stream for other characters.

Commands to tr usually require two sets of characters. The first set holds the characters that will be replaced if they are found in the input stream. The second set holds the characters that they will be replaced with.

The way this works is occurrences of the first character in set one will be replaced by the first character in set two. Occurrences of the second character in set one will be replaced by the second character in set two, and so on.

This example will look for the letter “c” in the input stream to tr, and replace each occurrence with the letter “z.” Note that tr is case-sensitive.

We’re using echo to push some text into tr.

All occurrences of “c” are replaced with “z” and the new string is written to the terminal window.
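A minimal sketch of that substitution might look like the following. The sample string is an assumption chosen for illustration.

```shell
# Replace every "c" in the (assumed) sample text with "z".
echo "abcdefabc" | tr 'c' 'z'
# prints: abzdefabz
```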

This time we’ll search for two letters, “a” and “c.” Note that we’re not searching for “ac.” We’re looking for “a”, then looking for “c.” We’re going to replace any occurrence of “a” with “x” and any occurrence of “c” with “z.”
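With the same assumed sample string, a two-character version could be written like this. Each character in set one is paired with the character at the same position in set two.

```shell
# "a" maps to "x" and "c" maps to "z"; all other characters pass through.
echo "abcdefabc" | tr 'ac' 'xz'
# prints: xbzdefxbz
```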

For this to work you must have the same number of characters in both sets. If you don’t, you’ll get predictable, but probably unwanted, behavior.

There are more characters in set one than in set two. The letters “d” to “m” have no corresponding character in set two. They’ll still get replaced, but they’re all replaced with the last character in set two.
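As a sketch of a mismatch like that, assuming GNU tr (other implementations may treat a short second set differently) and an illustrative input string:

```shell
# Set one runs from "a" to "m", but set two holds only three characters.
# GNU tr pads set two with its last character, so "d" onward become "z".
echo "abcdef" | tr 'abcdefghijklm' 'xyz'
# prints: xyzzzz
```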

It’s just about possible that this could be useful in some cases, but if you want to prevent this you can use the -t (truncate) option. This only replaces those characters contained in set one that have a matching character in set two.
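Here is a sketch of the truncate option with an illustrative input string. It assumes GNU tr, where -t shortens set one to the length of set two.

```shell
# Only "a", "b", and "c" have partners in set two; "d" to "m" are left alone.
echo "abcdef" | tr -t 'abcdefghijklm' 'xyz'
# prints: xyzdef
```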

Using Ranges and Tokens

Set one and set two can contain ranges of characters. For example, [a-z] represents all the lowercase letters, and [A-Z] represents all the uppercase letters. We can make use of this to change the case of a stream of text.

This will convert the input stream to uppercase.
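A lowercase-to-uppercase conversion with ranges might be sketched as follows. The sample text is an assumption, and the sets are quoted so the shell doesn't try to expand the brackets itself.

```shell
# Map each lowercase letter to the uppercase letter at the same position.
echo "Hello World" | tr '[a-z]' '[A-Z]'
# prints: HELLO WORLD
```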

To flip the case in the other direction, we can use the same command but with the uppercase and lowercase ranges swapped on the command line.
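Swapping the two ranges, again with an assumed sample string, would look something like this:

```shell
# Uppercase letters are now set one, so they are converted to lowercase.
echo "Hello World" | tr '[A-Z]' '[a-z]'
# prints: hello world
```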

There are tokens that we can use for some of the common cases that we might want to match with.

We can perform our lowercase to uppercase and uppercase to lowercase conversions just as easily, using tokens.
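As a sketch, the [:lower:] and [:upper:] tokens can stand in for the ranges. The sample text here is illustrative only.

```shell
# Lowercase to uppercase, using tokens instead of ranges.
echo "Hello World" | tr '[:lower:]' '[:upper:]'
# prints: HELLO WORLD

# Uppercase to lowercase, with the tokens swapped.
echo "Hello World" | tr '[:upper:]' '[:lower:]'
# prints: hello world
```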

Inverting the Matches

The -c (complement) option matches all characters apart from those in the first set. This command converts everything apart from the letter “c” to a hyphen “-”.
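A sketch of the complement option, using an assumed sample string. Note that the newline echo adds is not a “c” either, so it is converted to a hyphen as well.

```shell
# Everything that is not "c" becomes "-", including the trailing newline.
echo "abcdefc" | tr -c 'c' '-'
# prints: --c---c- (with no newline after it)
```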

This command adds the letter “a” to the first set. Anything apart from “a” or “c” is converted to a hyphen “-” character.
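With the same illustrative input, the two-character version might be written as:

```shell
# "a" and "c" survive; every other character, newline included, becomes "-".
echo "abcdefc" | tr -c 'ac' '-'
# prints: a-c---c- (with no newline after it)
```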

Deleting and Squeezing Characters

We can use tr to remove characters altogether, without any replacement.

This command uses the -d (delete) option to remove any occurrence of “a”, “d”, or “f” from the input stream.
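A minimal sketch of that deletion, assuming a sample string of our own:

```shell
# Remove every "a", "d", and "f"; nothing is put in their place.
echo "abcdefabc" | tr -d 'adf'
# prints: bcebc
```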

This is one instance where we only have one set of characters on the command line, not two.

Another is when we use the -s (squeeze-repeats) option. This option reduces repeated characters to a single character.

This example will reduce repeated sequences of the space character to a single space.
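One way to sketch this uses the [:blank:] token and an assumed sample string containing runs of spaces:

```shell
# Collapse each run of repeated blanks into a single character.
echo "one   two    three" | tr -s '[:blank:]'
# prints: one two three
```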

It’s a little confusing that the [:blank:] token represents only horizontal whitespace (spaces and tabs), while the [:space:] token represents all forms of whitespace, including newline characters.

Deleting Characters

The differences between [:blank:] and [:space:] become apparent when we delete characters. To do this, we use the -d (delete) option, and provide a set of characters that tr will look for in its input stream. Any that it finds are removed.
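As a sketch with an assumed sample string, deleting with [:blank:] looks like this:

```shell
# Delete spaces and tabs; the trailing newline from echo is kept.
echo "a b c d" | tr -d '[:blank:]'
# prints: abcd
```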

The spaces are deleted. Note that we get a newline after the output stream is written in the terminal window. If we repeat that command and use [:space:] instead of [:blank:], we’ll get a different result.

This time we don’t start a new line after the output; the command prompt is butted right up against it. This is because [:space:] includes newlines. Any spaces, tabs, and newline characters are removed from the input stream.
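The [:space:] variant, again with an illustrative input string, might be sketched as:

```shell
# Delete all whitespace, including the newline that echo appends.
echo "a b c d" | tr -d '[:space:]'
# prints: abcd (with no newline after it)
```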

Of course, you could use an actual space character as well.
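For instance, with a literal space in quotes and the same assumed sample text:

```shell
# The quoted space is the only character in the set to delete.
echo "a b c d" | tr -d ' '
# prints: abcd
```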

We can just as easily delete digits.
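A sketch using the [:digit:] token on an assumed mixed string:

```shell
# Strip out every digit, leaving the letters in place.
echo "abc123def456" | tr -d '[:digit:]'
# prints: abcdef
```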

By combining the -c (complement) and -d (delete) options we can delete everything apart from digits.

Note that “everything apart from digits” means all letters and all whitespace, so once again we lose the terminating newline.
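Combining the two options might look like this, with an illustrative input string:

```shell
# Delete the complement of the digits: letters, spaces, and the newline all go.
echo "abc123 def456" | tr -cd '[:digit:]'
# prints: 123456 (with no newline after it)
```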

Combining and Splitting Lines

If we substitute newline characters for spaces, we can split a line of text and place each word on its own line.
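A sketch of that split, assuming a short sample sentence. The `\n` escape is interpreted by tr itself as a newline.

```shell
# Turn each space into a newline so every word lands on its own line.
echo "one two three four" | tr ' ' '\n'
```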

We can change the delimiter that separates words, too. This command substitutes colons “:” for spaces.
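With the same assumed sample sentence, swapping the delimiter could be written as:

```shell
# Each space is replaced by a colon.
echo "one two three four" | tr ' ' ':'
# prints: one:two:three:four
```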

We can find whatever delimiter is in use and replace it with newline characters, splitting difficult-to-read text into easier-to-manage output.

The $PATH environment variable is a long string of many directory paths. A colon “:” separates each path. We’ll change them to newline characters.
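One way to do this is sketched below. The output depends on how $PATH is set on your own machine, so none is shown.

```shell
# Print each directory in $PATH on its own line.
echo "$PATH" | tr ':' '\n'
```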

That’s much easier to visually parse.

If we have output that we want to reformat into a single line, we can do that too. The file “lines.txt” contains some text, with one word on each line. We’ll feed that into tr and convert it to a single line.
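Here is a sketch of that conversion. The contents written to “lines.txt” are an assumption, created only so the example can be run on its own.

```shell
# Assumed sample contents for lines.txt: one word per line.
printf 'one\ntwo\nthree\n' > lines.txt

# Replace every newline with a space, joining the lines together.
tr '\n' ' ' < lines.txt
# prints: one two three (with a trailing space and no final newline)
```

Because the last newline is converted too, the output ends with a space rather than a line break.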

Using tr With Pipes

We can use the output from tr as the input for another program, or even to tr itself.
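As an illustration, here is a sketch that chains four instances of tr together. The input string and the choice of transforms are assumptions, picked only to show each stage doing one small job.

```shell
# Stage 1: delete digits.          Stage 2: squeeze repeated spaces.
# Stage 3: swap commas for semicolons. Stage 4: convert to uppercase.
echo "Hello,   World123" | tr -d '[:digit:]' | tr -s ' ' | tr ',' ';' | tr '[:lower:]' '[:upper:]'
# prints: HELLO; WORLD
```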

Simple Is as Simple Does

The tr command is great because it is simple. There’s not much to learn or remember. But its simplicity can be its downfall, too.

Make no mistake, you’ll frequently find that tr lets you do what you need without having to reach for more complicated tools like sed.

However, if you’re struggling to do something with tr, and you find yourself building long daisy chains of commands, you probably should be using sed.