Tips and tricks for the Unix shell environment. Shell examples assume a non-csh-based shell, such as bash or zsh. Consult the manual for the commands in question if you see errors, as tools vary depending on the flavor of Unix.
Always check the error status of chdir, to avoid running commands in the wrong directory. Alternatively, use fully qualified paths to obviate the need for chdir(2).
#!/bin/sh
cd "$nosuchdir" || exit 1
rsync elsewhere:/foo .
Without || exit 1 to abort the script should cd fail, the subsequent rsync command could move files to the wrong location or fill up the wrong partition.
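Another way to keep a failed cd from affecting later commands is to confine the directory change to a subshell and chain the command with &&. The following is a minimal sketch; the function name run_in_dir is made up for illustration and is not a standard utility.

```shell
#!/bin/sh
# run_in_dir DIR COMMAND [ARGS...]
# Runs COMMAND inside DIR, but only if the cd succeeds. The parentheses
# create a subshell, so the caller's working directory never changes.
run_in_dir() {
    dir=$1
    shift
    ( cd "$dir" && "$@" )
}

# Example: run pwd in a directory that exists, then try one that does not.
run_in_dir / pwd
run_in_dir /nonexistent/dir pwd || echo "cd failed, command was not run" >&2
```

Because the cd happens in a subshell, there is also no need to cd back afterwards.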
Redirection can take place (almost) anywhere, not just at the end.
$ echo a b >c
$ echo >c a b
$ >c echo a b
Placing the filename at the beginning allows easier editing of the search term at the end of the command.
$ </var/log/messages grep foo
$ </var/log/messages grep bar
$ </var/log/messages grep user1
I use xargs(1) frequently to convert output from something (file, or another program) to arguments to another command. For instance, to commit only modified files in a cvs sandbox where there may be conflicted, new, or other troublesome files mixed in, use the following.
$ cvs up | perl -ne 'print if s/M //' | xargs cvs ci
Depending on the editor, one can use the concept above to open certain files for editing, for example, files in a cvs sandbox that have conflicts.
$ cvs up | perl -ne 'print if s/C //' | xargs vi
ex/vi: Vi's standard input and output must be a terminal
$ cvs up | perl -ne 'print if s/C //' | xargs emacs
emacs: standard input is not a tty
$ cvs up | perl -ne 'print if s/C //' | xargs bbedit
The bbedit utility is part of BBEdit for Mac OS X, and avoids terminal issues by sending the files to the BBEdit application. Using emacs in server/client mode may avoid this problem for emacs. The alternative is to use backticks to make the files available as arguments to the program, instead of feeding them in through xargs.
$ vi `cvs up | perl -ne 'print if s/C //'`
xargs can be chained with other programs. For instance, one may want to find perl scripts containing the text While and do something with them.
$ find . -name '*.pl' \
| xargs fgrep -l While \
| xargs perl -i -ple 's/While/while/g'
xargs will fail or do the wrong thing if passed filenames that contain spaces. This is common on filesystems that traditionally have allowed spaces in filenames (Mac OS), or where file trees have been uploaded to Unix from other platforms. If using find/xargs pairs, the spaces-in-filenames problem can be avoided as follows.
$ find . -type f -print0 | xargs -0 echo
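The effect can be checked in a scratch directory. The sketch below assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do), and also shows find's -exec ... {} + form, which passes the names directly to the command without a pipe. The variable names are arbitrary.

```shell
#!/bin/sh
# Build a scratch directory holding a filename with a space in it.
scratch=$(mktemp -d) || exit 1
touch "$scratch/foo bar" "$scratch/baz"

# NUL-delimited input keeps each filename intact, so ls sees two files.
count_nul=$(find "$scratch" -type f -print0 | xargs -0 ls -d | wc -l | tr -d ' ')

# find can also hand the names to the command itself, batching them
# much as xargs would.
count_exec=$(find "$scratch" -type f -exec ls -d {} + | wc -l | tr -d ' ')

echo "xargs -0 saw $count_nul files; -exec + saw $count_exec files"
rm -rf "$scratch"
```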
Dealing with files that have odd characters in their names can often be a chore on Unix, as one cannot type in the names in question. One could use a graphical file manager tool, but I find those cumbersome, ill suited to dealing with large numbers of files, and usually not installed on server systems.
To simply delete the bad filenames, there are a few options.
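Files that begin with a hyphen (such as a file -rf) will trigger option processing, as the command cannot tell such a filename from an option, whether the name is typed directly or produced by the shell expanding a wildcard. This can be avoided by either disabling option processing, or prefixing a directory name to the file path.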
$ ls
-rf
$ rm -rf
$ ls
-rf
$ rm *
$ ls
-rf
$ rm -- -rf
$ ls
$ touch ./-rf
$ ls
-rf
$ rm ./-rf
$ ls
$
The -- argument only works on systems whose getopt(3) library supports the syntax. On other systems, or for portability, the qualified path option must be used.
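For scripts that receive arbitrary filenames, the qualified path approach can be wrapped in a small helper. This is only a sketch; safe_name is a hypothetical function name, and it handles just the leading-hyphen case.

```shell
#!/bin/sh
# safe_name NAME
# Prints NAME with "./" prepended when it begins with a hyphen, so that a
# command receiving it cannot mistake it for an option.
safe_name() {
    case $1 in
        -*) printf '%s\n' "./$1" ;;
        *)  printf '%s\n' "$1" ;;
    esac
}

# Example: create and remove a file called -rf inside a scratch directory.
scratch=$(mktemp -d) || exit 1
(
    cd "$scratch" || exit 1
    touch ./-rf
    rm "$(safe_name -rf)"
)
remaining=$(ls -A "$scratch" | wc -l | tr -d ' ')
echo "files left: $remaining"
rm -rf "$scratch"
```

The same idea applies to wildcards: rm ./* expands to ./-rf rather than -rf.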
Each file on a Unix filesystem has an inode number associated with it; knowing the inode number of the bad file allows us to search for and delete it.
$ ls -i *
615383 foo
$ find . -inum 615383 -exec rm {} \;
If there are large numbers of files with wacky characters in their filenames, something more powerful than the shell is usually required to filter out the files in question. For instance, to list the inode number of files in the current directory with non-printable characters in their names, use perl.
$ ls -i | perl -nle 'print if /[[:^print:]]/' \
| while read inum name; do echo $inum; done
For situations where the mangled filenames are in deep directory trees, or where the mangling is consistent (uploaded filenames from a DNA sequencer come to mind), use File::Find and write a standalone script.
Directories with huge numbers of files will cause rm * to fail, as the wildcard expands to a list the shell cannot cope with. To delete all the files, remove the parent directory. If only deleting by a pattern, use readdir to loop over each file in turn, and apply a match to each filename.
$ rm -rf /the/bad/dir
$ perl -le 'chdir shift or die "$!"; opendir D, "." or die "$!";' \
-e 'while (defined($_ = readdir D)) { unlink if -f and m/\.doc/ }' /the/bad/dir
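find(1) is another way to delete by pattern without a shell wildcard, since find does the matching itself. The sketch below uses a scratch directory in place of /the/bad/dir. Unlike the readdir loop, find also descends into subdirectories.

```shell
#!/bin/sh
# Scratch directory standing in for /the/bad/dir.
dir=$(mktemp -d) || exit 1
touch "$dir/a.doc" "$dir/b.doc" "$dir/keep.txt"

# The pattern is quoted so the shell passes it to find unexpanded.
# -exec ... {} + batches the names into as few rm calls as the system
# argument limit allows.
find "$dir" -type f -name '*.doc' -exec rm -f {} +

left=$(ls "$dir")
echo "$left"
rm -rf "$dir"
```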
$ echo foo bar | sed 's/ /\
/g' | xargs -n 1 echo ls
Commands can be run over ssh(1), though how the shell handles more complex commands can cause problems.
client$ ssh example.org hostname
server.example.org
client$ ssh example.org sleep 3 && hostname
client.example.org
The && is handled by the local shell, not the remote server. Quoting can fix the problem.
client$ ssh example.org 'sleep 3 && hostname'
server.example.org
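ssh joins its command arguments with spaces and hands the resulting string to a shell on the remote side. The sketch below imitates only that step with a local sh -c, so the effect of quoting can be seen without a remote host. fake_ssh and the WHERE variable are made up for this illustration.

```shell
#!/bin/sh
# Marks the local side; fake_ssh overrides it for the "remote" shell.
WHERE=local

# fake_ssh HOST COMMAND...
# Hypothetical stand-in for ssh: ignores HOST, joins the remaining
# arguments with spaces, and runs the result with sh -c.
fake_ssh() {
    shift
    WHERE=remote sh -c "$*"
}

# Unquoted: the calling shell splits the line at &&, so echo runs locally.
unquoted=$(fake_ssh example.org true && echo "$WHERE")

# Quoted: the whole string reaches the inner shell, which expands $WHERE.
quoted=$(fake_ssh example.org 'true && echo $WHERE')

echo "unquoted ran on: $unquoted"
echo "quoted ran on:   $quoted"
```

Single quotes matter here for a second reason: inside double quotes the local shell would expand $WHERE before the command was ever sent.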
See "Shell Loop Interaction with ssh" for problems with ssh and the shell while builtin.
There are several ways one can empty the contents of an existing file without removing and touch(1)-ing the file in question. Using echo(1) is not portable, as some systems do not support the -n flag, such as Digital UNIX without CMD_ENV=bsd set. The shell null command : is a concise alternative, and saves typing.
$ cat /dev/null >file
$ echo -n >file
$ : >file
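A quick check that truncation empties the file in place rather than replacing it. This is a sketch using a temporary file; printf with an empty format string is shown as one more portable alternative to echo -n.

```shell
#!/bin/sh
f=$(mktemp) || exit 1
echo "some content" >"$f"
inode_before=$(ls -i "$f" | awk '{print $1}')

# Truncate in place; the file itself is never removed.
: >"$f"

size=$(wc -c <"$f" | tr -d ' ')
inode_after=$(ls -i "$f" | awk '{print $1}')
echo "size after truncation: $size"
echo "inode before: $inode_before, after: $inode_after"

# printf with an empty format string also writes nothing.
echo "more content" >"$f"
printf '' >"$f"
size_printf=$(wc -c <"$f" | tr -d ' ')
rm -f "$f"
```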
grep can match itself. To avoid this problem, the regular expression can be altered so grep does not match itself, which is easier than appending | grep -v grep to a command.
$ ps wwo pid,command | grep 'ss[h]'
The regular expression ss[h] cannot match the literal string ss[h] in the process listing, but will match any process name containing ssh. Another option: use commands such as pgrep(1).
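The behavior can be reproduced without a live process table. The sketch below feeds grep two canned listings that stand in for ps output; the process IDs and paths are invented for the example.

```shell
#!/bin/sh
# What ps might show while "grep ssh" is running.
listing_plain='  101 /usr/sbin/sshd
  202 grep ssh'

# What ps might show while "grep ss[h]" is running.
listing_bracket='  101 /usr/sbin/sshd
  202 grep ss[h]'

# The plain pattern also matches the grep command line itself.
plain=$(printf '%s\n' "$listing_plain" | grep -c 'ssh')

# The bracketed pattern matches sshd, but not the text "ss[h]".
bracket=$(printf '%s\n' "$listing_bracket" | grep -c 'ss[h]')

echo "plain pattern matched $plain lines; bracketed matched $bracket"
```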
Some utilities are controlled via command interfaces. Full control of command interfaces may require the expect utility, or Expect. Simple needs can be met by printing commands to the utility's standard input, then parsing the output. For example, the Mac OS X scutil can be queried for information:
SERVICE_GUID=`cat <<EOF | scutil | awk '/PrimaryService/{print $3}'
open
get State:/Network/Global/IPv4
d.show
EOF`
echo $SERVICE_GUID
If parsing a list of filenames from a file, spaces in filenames may cause shell interpolation problems. To work around this, convert the file to a NUL-delimited list and use xargs -0:
$ touch "foo bar"
$ echo foo bar > test
$ < test tr '\n' '\0' | xargs -0 file
foo bar: empty
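When the list holds one filename per line, a while read loop in the shell is another option. A minimal sketch, assuming no filename contains a newline; the scratch directory and variable names are arbitrary.

```shell
#!/bin/sh
dir=$(mktemp -d) || exit 1
touch "$dir/foo bar"
printf '%s\n' "$dir/foo bar" >"$dir/list"

# IFS= preserves leading and trailing whitespace, and -r keeps
# backslashes literal, so each line arrives as one intact filename.
found=0
while IFS= read -r name; do
    [ -e "$name" ] && found=$((found + 1))
done <"$dir/list"

echo "found $found of the listed files"
rm -rf "$dir"
```

Redirecting the file into the loop, rather than piping it in, keeps the loop in the current shell so the counter survives after done.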
Modern getopt(3) implementations support -- to stop option processing, allowing commands such as fgrep -- -search to work.