Some Unix Tips

Here are some Unix commands and shortcuts that I have found useful, especially when working with large files of data.

Some of these commands work only in the csh or tcsh shells. Those shells are the default on the Math Department computers, so your account is most likely set up to use one of them. You can check by running "finger login", where "login" is your login name.

Shell shortcuts

Pipes and Redirection

One of the most useful features of Unix is the ability to "pipe" the output of one command into another command, and to "redirect" the output to a file. Pipes are denoted by vertical bars (|); redirections are denoted by "greater than" signs (>). Spaces around these symbols are not required, though for clarity you might want to surround the symbols with spaces.

Redirection of output is a very simple concept: instead of displaying its output on the screen, the program writes it to a file. A Unix pipe works much like a plumber's pipe: it takes the output of the command to the left of the pipe symbol (|) and uses it as the input of the command to the right of the pipe symbol. Multiple pipes can be chained together. You can also combine history shortcuts with pipes. For example, if, after displaying a sorted file with the "sort" command, you find that the output is too long to fit on one screen, the command "!! | less" reruns the previous command and displays its output one screen at a time.
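
As a minimal sketch of the difference between the two, the following script redirects the output of "sort" to a file and then pipes it into other commands. The file names and sample data are made up for illustration, and the script is written for a Bourne-type shell (sh or bash) so that it can be run non-interactively; the ">" and "|" syntax itself is the same in csh and tcsh.

```shell
# Work in a scratch directory so that no existing file is overwritten.
tmpdir=$(mktemp -d)
cd "$tmpdir"

# Create a small sample file (hypothetical data).
printf 'pear\napple\nfig\n' > fruits.txt

# Redirection: sort writes its output to sorted.txt instead of the screen.
sort fruits.txt > sorted.txt

# Pipe: the output of sort becomes the input of head.
first=$(sort fruits.txt | head -n 1)
echo "$first"

# Chained pipes: sort, keep the first two lines, then count them.
count=$(sort fruits.txt | head -n 2 | wc -l)
echo $count
```

The history shortcut "!!" is left out of the script because it only works in an interactive shell.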

The following examples illustrate the use of pipes. Most are self-explanatory; "file" stands for a generic filename.
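
A few representative pipelines, given here as a sketch rather than a definitive list. The sample contents of "file" are invented, and the script assumes a Bourne-type shell (sh or bash) only for the variable assignments; the pipelines themselves can be typed as-is in csh or tcsh.

```shell
# Work in a scratch directory with a small, made-up "file".
tmpdir=$(mktemp -d)
cd "$tmpdir"
printf 'b\na\nb\nc\nb\na\n' > file

# Count the distinct lines in file: sort, drop adjacent duplicates, count.
distinct=$(sort file | uniq | wc -l)

# Find the most frequent line: count duplicates, sort by count in
# descending order, keep the top entry, and strip the count.
top=$(sort file | uniq -c | sort -rn | head -n 1 | awk '{ print $2 }')

# Count the lines that contain the letter "a".
matches=$(grep a file | wc -l)

# Combine a pipe with a redirection: save the distinct lines to file.out.
sort file | uniq > file.out

echo "distinct lines: $distinct, most frequent: $top, lines with a: $matches"
```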

Tools for analyzing data files

The following are some handy tools for analyzing data files. The first three are part of any standard Unix distribution and come with man pages that you can consult. Perl comes with multiple man pages and an elaborate online documentation system (accessed with perldoc), and an enormous amount of information about it is available, both online and in print. See the separate Perl Tips page for more.

While learning the full power of these tools takes time and effort (years in the case of Perl), it is easy to learn enough to be able to use these utilities on the command line for simple data analysis tasks. To illustrate this, assume you have a file of numbers, three per line, separated by blanks (this is the most convenient format for the above utilities), like the following:

123  398  17359
317  19  2909
 39  -399  -5789
 49  33   200
255 33   -378

Here is how you could accomplish various tasks using one of the mentioned utilities. (Here "file" is assumed to be the filename. Recall that you can save the output of any of these commands to a file by appending a redirection like "> file.out", or page through it by appending "| less".)
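
As an illustration, here are a few such tasks carried out on the sample data above. This is only a sketch: it assumes that "sort" and "awk" are among the utilities in question, the choice of tasks is mine, and the output file names (by_col2.out, big.out) are hypothetical. The script is written for a Bourne-type shell (sh or bash); the sort and awk commands themselves work the same way in csh or tcsh.

```shell
# Work in a scratch directory and recreate the sample data as "file".
tmpdir=$(mktemp -d)
cd "$tmpdir"
cat > file <<'EOF'
123  398  17359
317  19  2909
 39  -399  -5789
 49  33   200
255 33   -378
EOF

# Sort the lines numerically by the second column.
sort -k2,2n file > by_col2.out

# Add up the third column.
total=$(awk '{ s += $3 } END { print s }' file)

# Find the largest value in the second column.
max2=$(awk 'NR == 1 || $2 > m { m = $2 } END { print m }' file)

# Keep only the lines whose first column exceeds 100.
awk '$1 > 100' file > big.out

echo "sum of column 3: $total"
echo "max of column 2: $max2"
```

awk splits each line on runs of blanks, so the uneven spacing in the data does not matter, and "sort -n" skips the leading blanks in a field in the same way.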

Miscellany


Last modified: Mon 20 Jul 2009 09:43:54 AM CDT A.J. Hildebrand