Unix Utility Invocation Overview

Unix utilities may enforce bizarre restrictions on where options can appear on the command line, or support any number of incompatible option formats. This charming mess results from open development on multiple branches of Unix, and a healthy “invent as need be” attitude. The Rosetta Stone for Unix does a great job mapping common tasks to the corresponding commands on various Unix flavors. This article presents a selection of common option processing methods with commentary and example uses.

  • ls(1) uses the typical command options items format. Options must always appear before the items. To distinguish between options and items that look like options, most Unix flavors now ship with a getopt(3) that supports -- to stop option processing.

    $ touch -- -a
    $ ls | grep a$
    -a
    $ rm -- -a

    Shell scripts may fail without --, for example when external input contains data that getopt(3) treats as an option. However, be aware that -- is not portable. In Perl, always use the list form of system or exec when calling external programs to avoid shell interpretation.

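    Adding an option to a previous command takes several editing steps: recall the command from history, jump to the beginning of the line, move forward one word, insert a space, then type the option.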
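
    As a minimal sketch of the script problem, the fragment below creates and removes a file named -a inside a scratch directory. The safe_remove helper and the reliance on mktemp -d are my own illustrative assumptions. Where -- is unavailable, prefixing the name with ./ is a common alternative.

    ```shell
    # Hypothetical helper: delete a file whose name may begin with a hyphen.
    # "--" ends option processing, so "$1" is always treated as an item.
    safe_remove() {
        rm -- "$1"
    }

    # Work in a scratch directory (mktemp -d is assumed to be available).
    workdir=$(mktemp -d) || exit 1
    cd "$workdir" || exit 1

    touch -- -a              # create a file literally named "-a"
    before=$(ls)             # the listing shows the new file
    safe_remove -a           # without "--", rm would parse -a as an option
    touch ./-b               # alternative spelling that needs no "--"
    rm ./-b
    after=$(ls)              # the directory is empty again

    cd / && rmdir "$workdir"
    ```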

  • find(1) employs yet another grammar: find global-options directories-to-search search-expression-options. The expression options may easily be confused with the seldom-used global options, and can be mixed with other find syntax, such as parentheses and glob patterns, that conflicts with shell metacharacters and so must be quoted or escaped.
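
    A small sketch of that grammar, with the quoting needed to get the expression past the shell. The file names and the scratch directory are made up for the example.

    ```shell
    workdir=$(mktemp -d) || exit 1
    touch "$workdir/a.txt" "$workdir/b.log" "$workdir/c.txt" "$workdir/d.dat"

    # Grammar: find <directories> <expression>.
    # The patterns are quoted so the shell does not expand the globs, and the
    # parentheses are backslash-escaped so the shell hands them to find as-is.
    found=$(cd "$workdir" && find . -type f \( -name '*.txt' -o -name '*.log' \) | sort)

    rm -r "$workdir"
    ```

    Here found holds the three matching names, one per line; d.dat matches neither pattern and is left out.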

  • cat(1) and other utilities support a trailing hyphen (sometimes optional, sometimes mandatory) to read from standard input instead of a named file.

    $ echo a line | cat
    a line
    $ echo a line | cat -
    a line

    Perl follows the optional hyphen syntax with the <> operator. However, be aware some (of the many) Perl Getopt::* modules may read a trailing - as an option and remove it from @ARGV.

    $ echo a line | perl -e 'print while <>'
    a line
    $ echo a line | perl -e 'print while <>' -
    a line
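
    The hyphen earns its keep when standard input has to be mixed with named files. A minimal sketch with cat(1), using throwaway file names of my own choosing:

    ```shell
    workdir=$(mktemp -d) || exit 1
    printf 'first\n' > "$workdir/head.txt"
    printf 'last\n'  > "$workdir/tail.txt"

    # The hyphen marks where standard input is spliced in between the named files.
    combined=$(printf 'middle\n' | cat "$workdir/head.txt" - "$workdir/tail.txt")

    rm -r "$workdir"
    ```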

  • tar(1), depending on the flavor, supports a plethora of options and option syntaxes. Perhaps the worst: the option list xvzfC followed by the arguments for the f and C options, as in tar xvzfC filename.tar directory-to-extract-to, optionally followed by filenames. This requires careful attention to the order of the option list and the corresponding values. Modern tar implementations thankfully allow -C directory-name or --directory directory.

    Long option names better document what a script does, and are no harder to look up in the man page than short options. The --option value syntax must be included in the options portion of the command, while --option=value presumably could appear anywhere on the command line. Additional confusion: some utilities use --version, and others -version. I usually determine the version of Java installed on the third try. (Tip: neither -v nor --version works.)
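
    A rough sketch of the friendlier form, assuming a tar that accepts -C (most current ones do). Compression is left out to keep the example small, and the archive and directory names are invented.

    ```shell
    workdir=$(mktemp -d) || exit 1
    mkdir "$workdir/src" "$workdir/out"
    echo hello > "$workdir/src/hello.txt"

    # Create an archive of hello.txt; -C changes directory before adding it.
    tar -cf "$workdir/demo.tar" -C "$workdir/src" hello.txt

    # The old bundled style would read: tar xfC demo.tar out
    # Here each option sits next to its own argument instead.
    tar -xf "$workdir/demo.tar" -C "$workdir/out"

    extracted=$(cat "$workdir/out/hello.txt")
    rm -r "$workdir"
    ```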

  • ps(1) suffers from historical differences and option creep between different Unix distributions. Older ps implementations fall into BSD and SysV flavors, each with their adherents. Modern ps implementations usually support all of the above and more. The documentation for ps thus becomes nearly unreadable, and new programs such as killall(1) (dangerous portability problems) and pgrep(1) (not widely available) see increased usage.

  • cvs(1) uses a common cvs global-options subcommand subcommand-options items syntax. The subcommand encapsulates many commands that would otherwise require separate compilation and installation, as done for the various RCS commands. I use this format in utility scripts that perform many related functions, and where writing and managing a new script for each would be excessive.

    This format suffers from confusion between the global options and subcommand options, and from the related difficulty of editing these option areas.
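
    For a utility script, the layout might be sketched as follows. The mytool name, its -v global option, and the greet and count subcommands are all hypothetical; this is one way to split the two option areas, not how cvs itself does it.

    ```shell
    # Hypothetical tool with the layout: mytool [-v] subcommand [arguments...]
    verbose=0

    mytool_greet() {
        if [ "$verbose" -eq 1 ]; then
            echo "greeting $# name(s)" >&2
        fi
        for name in "$@"; do
            echo "hello, $name"
        done
    }

    mytool_count() {
        echo "$#"
    }

    mytool() {
        verbose=0
        # Global options: everything before the subcommand name.
        while [ $# -gt 0 ]; do
            case $1 in
                -v) verbose=1; shift ;;
                -*) echo "mytool: unknown global option: $1" >&2; return 2 ;;
                *)  break ;;
            esac
        done
        if [ $# -eq 0 ]; then
            echo "usage: mytool [-v] subcommand [arguments...]" >&2
            return 2
        fi
        subcommand=$1
        shift
        # Whatever remains belongs to the subcommand.
        case $subcommand in
            greet) mytool_greet "$@" ;;
            count) mytool_count "$@" ;;
            *) echo "mytool: unknown subcommand: $subcommand" >&2; return 2 ;;
        esac
    }
    ```

    With these definitions, mytool greet world prints hello, world, and mytool -v greet world adds a diagnostic line on standard error.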

  • dd(1) requires key=value operands, which may appear in any order. This facilitates adding new options (simply append them), and allows shell “forward|back by word” to skip between operands. I favor this syntax style, as it avoids -- workarounds, and enforces no artificial position requirements for options versus items. However, this syntax does not suit the needs of every utility.
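
    A short sketch of the order independence, followed by one way a script could accept dd-style operands itself. The parse_operands function is a hypothetical helper, not something dd provides.

    ```shell
    # The same three operands in two different orders give the same result:
    # skip two one-byte blocks, then copy three.
    first=$(printf 'abcdefgh' | dd bs=1 skip=2 count=3 2>/dev/null)
    second=$(printf 'abcdefgh' | dd count=3 skip=2 bs=1 2>/dev/null)

    # Hypothetical parser for a script that wants key=value operands of its own.
    parse_operands() {
        for operand in "$@"; do
            case $operand in
                *=*) printf '%s -> %s\n' "${operand%%=*}" "${operand#*=}" ;;
                *)   printf 'item: %s\n' "$operand" ;;
            esac
        done
    }
    ```

    Both first and second end up holding cde. Anything without an equals sign falls through to the item branch, so no -- separator is needed.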

On a somewhat related note, Unix shell redirects (usually) need not appear at the end of the command line. The following commands have the same result as the usual trailing position for the redirect:

$ echo >target-file some text
$ >target-file echo some text
$ echo some >target-file text