
May 06, 2008

System Debugging

Computer system debugging benefits from both experience with and knowledge of the system. It also benefits from many questions being asked, until a cause is known, or at least potential causes being eliminated. As an example, a junior admin may note that a filesystem fails to unmount, and eventually ask a senior admin for help.

(As an aside, learning how to ask smart questions can help avoid time wasted over an “it doesn’t work” exchange.)


May 05, 2008

Lode Runner

Reaching level 150 in Lode Runner is possible; I never had the patience, nor the time, to leave the Apple //e running for the duration required. Thanks to a virtual machine, the game can be saved and resumed as necessary. Following level 150, the levels start over again, except the enemies are faster. This happens again at level 300.

Level 52 is tricky the second time through, and requires cutting corners to reach the falling enemy in time. The third time through, a different strategy is required, as the enemy is now too fast to reach in time. Instead, one must dig to get past the upper two enemies, then fall into the usual pattern.

Curtains fall at level 357, where the now twice sped up enemy always intercepts one before one can cross the first ladder. Only eight bits were allocated for the level counter, so it wraps after level 256; hence level 101 being displayed, as 357 - 256 = 101:

Lode Runner Level 357

Note: lode_runner1.po on the Asimov Apple //e archive is corrupt at level 130-something. I am using lode_runner.do.gz. If paranoid, advance through the levels with control+U or control+6 to confirm the levels look right.

Next, Championship Lode Runner.

May 01, 2008

Chart & Graph

Chart

Choice

No, not Visio. Visio drives me batty. I use OmniGraffle, except at work.

Graph

Graphjam.


January 04, 2008

Microsoft Apologists

States struggling to maintain antitrust against Microsoft. The antitrust charges miss the leading flaw of Microsoft: massive security problems from an inferior product that no consumer should rightly pay for.

Microsoft systems are the most compromised systems on the Internet. Millions of hacked Microsoft systems participate in zombie networks. Anti-virus software, forever out of date against new threats, tracks many thousands of viruses and malware for Microsoft. In response to these charges, Microsoft defenders always trot out the “well, if Linux had as large a market share as Windows, then Linux would have more malware and thereby zombie networks and so forth” argument.

This argument is inane. Notice the quick subject change from the clear and present danger of Microsoft systems to rampant speculation. The apologist hopes the conversation will be wasted arguing about a subjective claim of vulnerability equality in totally different code bases, not the fact that Microsoft systems have been and remain deeply and widely compromised.

Defenders, if pressed, will claim “well, you need to support Microsoft systems properly”. True, because Windows is needlessly insecure, and therefore a poor choice for consumers who do not know about these hidden costs. For HIPAA, SOX, and PCI compliance, Windows systems must run anti-virus software. Linux systems can bypass this requirement by virtue of not being Windows. On Linux, the primary reason to run anti-virus software is to keep the malware as far away from the flawed Windows systems as possible. This means, unless someone can present compelling evidence to the contrary, Microsoft systems are more expensive and less secure, by virtue of requiring antivirus software.

Worse, this cost of running Microsoft is also passed on to any Unix system that must also run antivirus software to keep the big bad Internet away from Windows, firewalls that protect Windows systems, and other expenses.

Why do people support Microsoft? The software is both insecure and expensive!

Windows is also “business ugly” (my summary of the overall Windows experience), and needlessly buggy (“oh, yeah, Outlook 2007 will freeze when you do that, you should check for a new service patch”), but that’s a totally different rant.

December 31, 2007

Old Games

Due to another retro game rumination and the ever curious winds of nostalgia, I fired up the old Apple IIe emulator, and played some Lode Runner. The game has color, though our monitor did not growing up, so I prefer the monochrome.

Lode Runner Level 23 screenshot

Despite the limited controls (left, right, dig left, dig right, up, down, and hold), the game requires thought and strategy on harder levels. Learning the behavior of the enemies takes time, as they will in some circumstances run away, or change their approach in response to a slight change in position. The variety of levels is also interesting, some with digging puzzles, while others require constant motion to evade enemies. If possible, I trap the enemies, despite the extra time and effort required.

November 13, 2007

A Brief Primer on Unix Environment Variables

The shell used will not make a significant difference, assuming one adheres to the Bourne—or ideally a Bourne derived—shell. At present, I favor ZSH.

    $ echo $SHELL is being used
    /bin/zsh is being used
    $ echo this shell uses $(tty), process id $$
    this shell uses /dev/ttyp8, process id 19741
    $ echo $MY_ENV_VAR

    $ MY_ENV_VAR="something of value"
    $ echo $MY_ENV_VAR
    something of value
    $ $SHELL
    $ echo this shell uses $(tty), process id $$
    this shell uses /dev/ttyp8, process id 19875
    $ echo $MY_ENV_VAR

    $ exit
    $ echo $MY_ENV_VAR
    something of value

Note that though MY_ENV_VAR shows a value in the first shell, it does not in the child shell (opened via the $SHELL command). After closing the child shell, the custom MY_ENV_VAR is still defined in the original shell. Moving on, under the same session:

    $ export MY_ENV_VAR
    $ echo $MY_ENV_VAR
    something of value
    $ $SHELL
    $ echo this shell uses $(tty), process id $$
    this shell uses /dev/ttyp8, process id 22585
    $ echo $MY_ENV_VAR
    something of value
    $ exit
    $ echo $MY_ENV_VAR
    something of value

The export builtin ensures child processes inherit the custom environment setting. The export only needs to be done once on the variable. In modern Bourne shells, this can either be done via a single export SOME_ENV=some_value command, or, as shown above, separate commands.
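The same behavior can be observed from outside an interactive shell. The following is only a sketch: it uses Python's subprocess module to run two small sh scripts, one that merely sets MY_ENV_VAR and one that also exports it, and reports what a child shell sees. The helper name child_view is made up for this example, and sh is assumed to live under /usr/bin or /bin.

```python
import subprocess

def child_view(setup):
    """Run `setup` in sh, then ask a child sh to print MY_ENV_VAR."""
    script = setup + "; sh -c 'echo \"$MY_ENV_VAR\"'"
    result = subprocess.run(
        ["sh", "-c", script],
        env={"PATH": "/usr/bin:/bin"},  # start clean so MY_ENV_VAR is unset
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()

# Set but not exported: the child shell sees nothing.
unexported = child_view('MY_ENV_VAR="something of value"')

# Set and exported: the child shell inherits the value.
exported = child_view('MY_ENV_VAR="something of value"; export MY_ENV_VAR')

print(repr(unexported), repr(exported))
```

Passing an explicit env to subprocess.run is itself an instance of a parent deciding what its children inherit.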

Processes may strip environment variables, usually for security reasons. This would account for otherwise exported variables not being present in child processes. For confirmation, run ktrace or strace on the process, and determine what (such as setenv(3), or the %ENV hash in Perl) manipulates the environment.


September 02, 2007

sed does not support \n

Only some modern variants of sed support \n and other useful features. A traditional sed(1) (as still preserved on OpenBSD) does not support -i “in-place editing” nor even \n. These versions require a literal newline be inserted into the command line or script:

    $ echo aaa | sed 's/a/\n/g'
    nnn
    $ echo aaa | sed 's/a/\\n/g'
    \n\n\n
    $ echo aaa | sed 's/a/\
    /g' | od -bc
    0000000   012 012 012 012
               \n  \n  \n  \n
    0000004

Tim Maher in Minimal Perl discusses the evolution of sed and awk, including command line features stolen from Perl, such as \n support.


August 22, 2007

Unix Hard vs. Soft Links

An exploration of Unix symbolic (soft) and hard links. Commands used: ln(1), ls(1), and touch(1).

    $ ls
    $ touch original
    $ ln original hardlink
    $ ln -s original softlink
    $ ls -li
    total 8
    4891103 -rw-rw-r--  2 jdoe  jdoe  0 Aug 17 23:18 hardlink
    4891103 -rw-rw-r--  2 jdoe  jdoe  0 Aug 17 23:18 original
    4891112 lrwxrwxr-x  1 jdoe  jdoe  8 Aug 17 23:18 softlink -> original
    $ touch four
    $ ln -s four longername
    $ ls -l four longername
    -rw-rw-r--  1 jdoe  jdoe  0 Aug 17 23:27 four
    lrwxrwxr-x  1 jdoe  jdoe  4 Aug 18 11:55 longername -> four

Observations:

  • The hardlink file uses the same inode number as the original.

  • Both the hardlink and original are empty, and share the same file type (a hyphen).

  • The softlink file is not empty, and has an l instead of a hyphen in the file type column.

  • The softlink file has a different inode number than the hard linked file.

  • The size of the symbolic link file matches the length of the filename it points to: 8 for original, 4 for four.

  • ls -l somehow knows how to display the file the symbolic link points to.
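The observations above can be checked programmatically. The sketch below repeats the experiment in a scratch directory using Python's os module; link_facts is a name invented for this example. It also hints at the last observation: the link target is recorded in the symbolic link itself and can be read back with readlink(2), exposed in Python as os.readlink.

```python
import os
import tempfile

def link_facts(directory):
    """Create a file, a hard link and a symlink; return what lstat reports."""
    original = os.path.join(directory, "original")
    hardlink = os.path.join(directory, "hardlink")
    softlink = os.path.join(directory, "softlink")

    open(original, "w").close()        # touch original
    os.link(original, hardlink)        # ln original hardlink
    os.symlink("original", softlink)   # ln -s original softlink

    # lstat examines the link itself rather than following it
    orig, hard, soft = (os.lstat(p) for p in (original, hardlink, softlink))
    return {
        "same_inode": orig.st_ino == hard.st_ino,
        "link_count": orig.st_nlink,
        "symlink_inode_differs": soft.st_ino != orig.st_ino,
        "symlink_size": soft.st_size,
        "symlink_target": os.readlink(softlink),
    }

with tempfile.TemporaryDirectory() as tmp:
    facts = link_facts(tmp)
print(facts)
```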


August 02, 2007

E-mail Administration Methods

Mail Transfer Agents (MTA) such as Sendmail relay e-mail from senders to recipients, except when things break. This and subsequent articles cover methods to debug e-mail delivery problems. Focus will remain on understanding and debugging SMTP and Unix MTA, though hopefully the methods will abstract to other systems.

A mail administrator should be able to answer the following low-level questions regarding MTA and SMTP. Knowledge of networking protocols and debugging Unix systems will be very helpful.

  • How do DNS and hostname settings affect e-mail, and MX records in particular?
  • In what ways does the MTA accept e-mail?
  • Where does it send the e-mail to?
  • Does it route or reject depending on the envelope addresses?
  • What is the difference between an envelope address, and a body address?
  • Where is the configuration for the MTA located?
  • How is the configuration updated? When does the MTA need to be restarted following configuration changes?
  • Where do the MTA logs go?
  • What do the MTA logs show? Can a message be traced across a system, and then looked up on the next and subsequent SMTP servers?
  • What commands show the state of the MTA queue directories?
  • What is the difference between a synchronous and an asynchronous bounce of a message?
  • How are rejected messages handled?

A mail administrator must be able to generate command line or SMTP test messages, and know how to vary the envelope and body content of these test messages. For example, a report of a mail server that “does not work” should prompt “can I send e-mail to it?” and “what happens to that sent e-mail?” questions easily answered by firing off test messages. Test messages can also narrow the scope of a problem. If a remote server across a Wide Area Network (WAN) has problems receiving e-mail from a server of a different type, a quick way to rule out the WAN would be to test the same set of SMTP software over a Local Area Network (LAN). If the LAN message also fails, the two servers are likely incompatible. If not, what is wrong with the WAN? Is there a firewall that mangles the message? Is the link too slow, or corrupting traffic? Something else?
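As one way to generate such test messages, the sketch below uses Python's smtplib and email.message modules. The function names, the host mail.example.org, and all addresses are placeholders invented for illustration. The point is that the envelope sender and recipient (MAIL FROM and RCPT TO) are passed separately from the From and To headers, so each can be varied independently.

```python
import smtplib
from email.message import EmailMessage

def build_test_message(header_from, header_to, subject="SMTP test"):
    """Build a message whose headers need not match the envelope."""
    msg = EmailMessage()
    msg["From"] = header_from
    msg["To"] = header_to
    msg["Subject"] = subject
    msg.set_content("Test message; please ignore.\n")
    return msg

def send_test(host, envelope_from, envelope_to, msg, port=25, timeout=30):
    """Deliver msg using envelope addresses independent of its headers."""
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        # from_addr and to_addrs become MAIL FROM and RCPT TO
        return smtp.send_message(
            msg, from_addr=envelope_from, to_addrs=[envelope_to]
        )

msg = build_test_message("header-sender@example.org", "header-rcpt@example.org")

# Not run here, as it needs a reachable server (hypothetical host):
# send_test("mail.example.org", "bounces@example.org",
#           "postmaster@example.org", msg)
```

send_message returns a dictionary of refused recipients, so an empty result means the server accepted the message for every envelope recipient; what happens afterwards is a question for the logs.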

A mail administrator should also understand the big picture at a site:

  • Does the site have any e-mail infrastructure? If not, what do they need?
  • What MTA does the site use?
  • If more than one MTA, how do they interact? Were any special settings made to support this interaction?
  • How does mail route? Is the system centralized, or decentralized?
  • How many different e-mail workflows are there? This would include both user e-mail, and any newsletter or mailing list type systems. Do different departments use different e-mail systems?
  • Where does bounce e-mail arrive? How is it handled? How does this e-mail feed into different departments that need to handle bounces?

Subsequent articles will cover debugging methods and big picture thoughts in more detail.


April 16, 2007

Incident Handling

Software applications issue logs, allowing log scanning software to detect and act on these events. For example, sec.pl can detect a disk full log message, and cut a trouble ticket. This article considers a different approach, one that does not rely on log scanners. The method best suits applications with low incident counts, those where new incidents appear on a weekly or longer basis.

Method Overview

Application software, upon detecting a fault, writes an *.error file into an incident directory. These files contain logs and other data concerning the fault. Monitoring software periodically checks the incident directory, and cuts a ticket if at least one error file exists. An operator investigates the issue, and after resolving the problem, moves the incident file aside.

Advantages include simplified monitoring software: alert if an incident file exists, rather than continuously scanning a logfile for rare events. By including relevant data in the error file, the operator need not delve through gigabytes of logs during the investigation. Monitoring software could also submit the entire error file as part of a ticket auto-cut, instead of simply alerting on the presence of error files.

Disadvantages include an operator mistakenly moving aside unresolved incident files, or where a major problem creates a flood of files, well beyond the low numbers this method assumes. I have not yet used this method in practice, so other disadvantages may exist.
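The monitoring half of this method might look something like the following sketch. Being untested in production like the method itself, it is illustrative only: the function names and the demo filenames are invented, and handing the result to a ticketing system is left out.

```python
import tempfile
from pathlib import Path

def pending_incidents(incident_dir):
    """Return the *.error files awaiting an operator, oldest first."""
    files = [p for p in Path(incident_dir).glob("*.error") if p.is_file()]
    return sorted(files, key=lambda p: (p.stat().st_mtime, p.name))

def ticket_body(incident_dir, max_chars=4096):
    """Build ticket text from the error files, or None if all is quiet."""
    incidents = pending_incidents(incident_dir)
    if not incidents:
        return None
    parts = []
    for path in incidents:
        text = path.read_text(errors="replace")[:max_chars]
        parts.append(f"== {path.name} ==\n{text}")
    return "\n".join(parts)

# Demo in a scratch directory: one fault reported, one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    quiet = ticket_body(tmp)  # nothing to report yet
    Path(tmp, "backup-job.error").write_text("disk full on /var\n")
    Path(tmp, "README").write_text("not an incident\n")
    ticket = ticket_body(tmp)
print(ticket)
```

Capping the amount read from each file is a small guard against the flood scenario noted above.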


April 06, 2007

Big / vs. Small /.*s

The LOPSA tech mailing list has a good ongoing discussion on partition layouts. I’m currently in the “one big partition” camp for most servers (user desktops, front-end webservers, and other similar throwaway systems) that can be reimaged and brought back up to speed with configuration management. Have not maintained custom servers in some time—databases, mail servers, and the like—that would justify time spent doing partition layout planning, creation, and maintenance.

More details over on my KickStart page.

April 04, 2007

Daylight Saving Time Considered Useless

Daylight Saving changes are useless at best. Deploying DST patches, restarting applications, and figuring out when the meetings Exchange botched actually took place wasted far too much time at work these last few months.

I say, abolish the useless time shift. If Congress really wanted to implement energy savings, perhaps they could tax the lumens Americans burn away nightly?

And I’m still trying to abolish useless uses of the localtime call at work, but that’s a different battle.

March 08, 2007

Sue Spammers

A non-technical method to eliminate spam: court victory for man who took on spammers. British man sues spammers, wins:

"The courts have sent a clear message that spam will not be tolerated," he said. "This is not a technical problem, as technical solutions last only so long and will always be overcome. It is a social and legal problem, and we can do something about it."

March 04, 2007

Slow Queue for Spammers

Slow Queue Setup with Packet Filter (PF) illustrates PF queuing rules and supporting scripts and configuration to relegate traffic from spammers into a slow traffic queue.

For more information on blocking spam, visit JunkBusters.com. This site includes resources and information on blocking telemarketing calls, junk mail, junk faxes, and more.

February 03, 2007

Debug Daylight Time Changes

The forum post how to test Daylight Saving Time settings for 2007 contains a script that tests daylight saving changes without altering the system time.

Like Y2K, the early daylight savings change this year gives management something to fuss over. I’m personally in favor of more gmtime(3) and less localtime(3) use, as a code review at work reveals many needless localtime uses. Unfortunately these uses will require a fair amount of work to convert to gmtime, especially where the data wanders off to other groups.
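To illustrate the gmtime preference, here is a minimal sketch in Python, whose time.gmtime and calendar.timegm wrap the same idea as gmtime(3). The function names are invented for this example. Timestamps written and parsed this way do not depend on the timezone of the system or of the user running the code.

```python
import calendar
import time

STAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def utc_stamp(epoch):
    """Format an epoch value as an ISO 8601 UTC string."""
    return time.strftime(STAMP_FORMAT, time.gmtime(epoch))

def parse_utc_stamp(stamp):
    """Inverse of utc_stamp; timegm treats the fields as UTC."""
    return calendar.timegm(time.strptime(stamp, STAMP_FORMAT))

# 2007-03-11 07:00:00 UTC, the instant U.S. Eastern clocks
# jumped from 02:00 to 03:00 under the new rules.
epoch = 1173596400
stamp = utc_stamp(epoch)
print(stamp, parse_utc_stamp(stamp) == epoch)
```

Conversion to local time can then be left to whatever displays the data to a human.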

February 01, 2007

URL Monitoring Thoughts

In addition to the HTTP status code, URL monitors must capture the request latency: a site may be responding, but slowly enough to impact customers. The following code outlines a URL monitor in Perl, using LWP::UserAgent to request the URL and Time::HiRes to measure latency:

    #!/usr/bin/perl -w
    use strict;

    my $url = shift || die "Usage: $0 url\n";

    use LWP::UserAgent ();
    use Time::HiRes qw(gettimeofday tv_interval);

    my %output;
    my $ua = LWP::UserAgent->new;

    my $start    = [gettimeofday];
    my $response = $ua->get($url);
    $output{latency} = sprintf '%.2f', tv_interval($start);
    $output{status}  = $response->code;

    # TODO template %output as demanded by
    # the monitoring system
    for my $key ( keys %output ) {
        print "$key=$output{$key}\n";
    }

Metrics & Alerts

Graph the HTTP status code along with the latency. This avoids the information loss produced by mapping the codes into arbitrary “good” or “bad” values. The alerting framework should handle translation of the code into an alert, as appropriate for the URL: >= 400 pages someone, while >= 300 only warns about the unexpected redirection response.
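On the alerting side, the translation described above could be as small as the following sketch. The function name and the expected parameter are assumptions made for this example; expected lets a URL that is known to redirect be treated as healthy.

```python
def alert_level(status, expected=(200,)):
    """Translate a raw HTTP status code into an alert level for one URL."""
    if status in expected:
        return "ok"
    if status >= 400:
        return "page"   # client or server error: wake someone up
    if status >= 300:
        return "warn"   # unexpected redirection
    return "ok"

for code in (200, 302, 404, 503):
    print(code, alert_level(code))
```

The raw code is still what gets graphed; only the alert uses the translated value.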

The latency graph provides clues into the problem: assuming a 200 HTTP status code in each case, a ledge at N seconds indicates a timeout of some sort (check for DNS problems), while a scattered graph points to network packet loss or similar load induced problem. Watch the average latency over time, then set an appropriate alert threshold, perhaps five seconds. Another option: alert should a “major” increase occur in the latency, perhaps one or two standard deviations above a moving average. This will catch sudden latency increases, but will fail to alert in the event latency slowly rises beyond the Service Level Agreement (SLA) threshold.

Latency ledge example

Example latency due to packet loss
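The two alert styles discussed above, a fixed threshold and a jump relative to a moving average, might be combined as in this sketch. The function name, the five second limit, and the two standard deviation default are assumptions taken from the figures suggested in the text, not tuned values.

```python
from statistics import mean, stdev

def latency_alert(history, latest, hard_limit=5.0, sigmas=2.0):
    """Classify the latest latency sample against recent history.

    history is a window of recent latencies in seconds (the moving window).
    Returns "threshold", "spike", or None.
    """
    if latest >= hard_limit:
        return "threshold"
    if len(history) >= 2:
        if latest > mean(history) + sigmas * stdev(history):
            return "spike"
    return None

steady = [0.20, 0.22, 0.19, 0.21, 0.20]
print(latency_alert(steady, 0.21))   # ordinary sample
print(latency_alert(steady, 0.90))   # sudden increase
print(latency_alert(steady, 6.00))   # over the fixed limit

# A slow climb stays within two deviations of its own recent past,
# so only the fixed limit will eventually catch it.
climbing = [1.0, 2.0, 3.0, 4.0]
print(latency_alert(climbing, 4.5))
```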

If possible, negotiate the SLA in advance, to ensure a proper solution can be developed. Also consider how far below the SLA alerts must be set, to allow triage before a system breaks SLA.

Check Multiple URL

A single script can check multiple related URL as a plugin for Nagios or comparable monitoring systems. Do not number the URL arguments in the output; instead, associate a short alias with each URL monitored. Numbered URL force the question “well, what URL is actually in error?” while aliases provide a hint while not exceeding any length limits on monitoring output. If the monitoring system allows, include the full URL in the output, so a reader can copy or click on the URL directly from the alarm message.

    #!/usr/bin/perl -w
    use strict;

    die "Usage: $0 alias.url [.. alias.url]\n" unless @ARGV;

    use LWP::UserAgent ();
    use Time::HiRes qw(gettimeofday tv_interval);

    my @results;
    my $ua = LWP::UserAgent->new;

    for my $target (@ARGV) {
        my %output;
        ( $output{alias}, $output{url} ) = split /\./, $target, 2;

        my $start    = [gettimeofday];
        my $response = $ua->get( $output{url} );
        $output{latency} = sprintf '%.2f', tv_interval($start);
        $output{status}  = $response->code;

        push @results, \%output;
    }

    for my $result (@results) {
        print join( ':', map { $result->{$_} } qw(alias status latency) ), "\n";
    }

Consider Deeper Content Checks

If necessary, also set up content checks that ensure websites contain the expected content. A site may return 200 status codes within the SLA, but contain no data if a software bug or caching problem omits some or all of the page content.
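A content check need not be elaborate. The sketch below assumes the page body has already been fetched, and looks for strings that a healthy page is known to contain; the function name and the sample strings are invented for illustration.

```python
def content_problems(status, body, required):
    """List reasons a fetched page looks wrong, even with a good status."""
    problems = []
    if status != 200:
        problems.append(f"status {status}")
    if not body.strip():
        problems.append("empty body")
    problems.extend(f"missing: {text}" for text in required if text not in body)
    return problems

print(content_problems(200, "<html><h1>Orders</h1></html>", ["Orders"]))
print(content_problems(200, "", ["Orders"]))
```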


January 27, 2007

\command

The \ character performs multiple functions in Unix shells. One lesser-known use is to bypass shell aliases. For example, I use srm(1) by default, and revert to the insecure (yet much faster) rm(1) for large files that do not require secure deletion (install packages, in particular).

    $ alias rm
    rm='srm -s -z'

Practice with echo(1) to get a feel for \command behavior.

    $ echo test
    test
    $ alias echo='echo foo'
    $ echo test
    foo test
    $ \echo test
    test
    $ unalias echo

However, nothing stops \command from being an alias itself:

    $ alias echo='echo foo'
    $ alias \\echo='echo bar'
    $ echo test
    foo test
    $ \echo test
    foo bar test
    $ \\echo test
    zsh: command not found: \echo

You can also create a \echo command, and so forth:

    $ chmod +x ~/bin/\\echo
    $ cat ~/bin/\\echo
    exec echo $@
    $ \\echo test
    test


January 11, 2007

Timezone surprise for U.S. systems

Thanks to the Energy Policy Act of 2005, United States computers must account for the altered daylight savings time shifts. Systems that use UTC (and the misnamed gmtime(3) library call) instead of a local timezone that wanders will not be affected. On Unix, the fix should be as simple as deploying an updated timezone file.

If possible, always run Unix systems in UTC, though converting existing systems or ensuring UTC is used properly by all applications may be time consuming and expensive to implement. One known conversion problem: a system runs in UTC, and starts an application in UTC. A user with a custom timezone set restarts the application via sudo, causing the application to run under their custom timezone. Hilarity ensues.

Tweaking sunrise and sunset to save energy, while providing kickbacks to the oil industry, all while vehicles still guzzle down insane amounts of fuel: your friendly local government hard at work.

On a somewhat related note, TLS (the protocol formerly known as SSL) stops working in 2038 due to the use of a 32-bit Unix time value. But that’s years away. No need to worry!

December 10, 2006

Backgrounding Shell Commands

Updated notes on running Unix shell commands in the background with references to other options, such as nohup, screen, and disown, and better log handling with httplog.


December 03, 2006

OpenSSL S/MIME

Wrote up new documentation on using the openssl smime command. This page includes mime-util to help remove MIME left behind by openssl smime -decrypt or openssl smime -verify, as well as test scripts to help encrypt and decrypt S/MIME data.


December 02, 2006

Commenting Changes

Thoughts on marking significant changes in computer systems with a somewhat formalized change notice format.

Change notices become increasingly relevant when multiple groups (Developers, Quality Assurance, Operations) manage an application. All information about the current state of systems and how they differ from the norm cannot be communicated, as the meetings would be onerous, and humans will forget things, especially at 04:20 following an hour of bad sleep. Detailed change information next to an important change, where the on call should eventually look, will clue them in, and save time wasted hunting down who made the change, or worse, enabling of jobs that should not run.

Significant changes must be commented, so another person can revert the changes to their normal state, or know why the value differs from the norm. Uncommented, the changes could be undone by someone attempting to solve another problem. This could include an application setting, such as a timeout, or whether or not special jobs run from crontab(5).

If the change will soon be reverted, include the previous values, as in “increase timeout to support large files (was 30)”, and sufficient instructions on how and when to revert the change. Periodic changes due to load, for example during the holiday season rush, may be grouped into known configuration blocks and switched between depending on the time. These blocks should use a common keyword easily searched for, or perhaps could be hosted on a wiki.


October 10, 2006

Daylight Savings vs. Cron

“In the U.S., clocks change at 2:00 a.m. local time. In spring, clocks spring forward from 1:59 a.m. to 3:00 a.m.; in fall, clocks fall back from 1:59 a.m. to 1:00 a.m.” — http://webexhibits.org/daylightsaving/b.html

Therefore, Unix systems using a United States timezone that migrates must not run cron(8) jobs between 01:00 and 02:59 in the morning on Sunday, unless those jobs can handle running twice or not at all when daylight savings changes. Consult the page above for rules in other countries, and upcoming changes in the United States.

Audit Unix systems for problematic crontab(5) entries via the following command. Note that hyphens or other special syntax may cause an early morning run without 1 appearing in the hours field for the entry.

    $ crontab -l | \
      perl -lane 'print "@F" if $F[1] =~ m/1/'
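A stricter audit would expand the hours field rather than pattern match on it. The sketch below is one possible approach, not a full crontab(5) parser: it assumes the hours field uses only numbers, lists, ranges, * and /step, it ignores the day fields (so it errs on the side of flagging too much), and the function names are invented for this example.

```python
DST_HOURS = {1, 2}  # 01:00 through 02:59

def hours_in_field(field):
    """Expand a crontab hours field into the set of hours it matches."""
    hours = set()
    for part in field.split(","):
        span, _, step = part.partition("/")
        step = int(step) if step else 1
        if span == "*":
            low, high = 0, 23
        elif "-" in span:
            low, high = (int(x) for x in span.split("-", 1))
        else:
            low = high = int(span)
        hours.update(range(low, high + 1, step))
    return hours

def risky_entries(crontab_text):
    """Return crontab lines that may run during the DST shift hours."""
    risky = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "@")):
            continue  # comments and @reboot style entries
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # environment settings such as MAILTO=...
        try:
            hours = hours_in_field(fields[1])
        except ValueError:
            continue  # syntax this sketch does not understand
        if hours & DST_HOURS:
            risky.append(line)
    return risky

sample = """\
# sample crontab
MAILTO=ops
0 1 * * * /a
15 0-3 * * * /b
30 */2 * * * /c
0 10 * * * /d
0 3 * * 0 /e
@reboot /f
"""
flagged = risky_entries(sample)
print("\n".join(flagged))
```

Feed it the output of crontab -l; in the sample, the range 0-3 and the step */2 are caught even though neither contains a literal 1 in a way that marks an early morning run, while hour 10 is correctly left alone.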


October 08, 2006

Character 2647

Browsing through the Unicode character charts recently, and ran across the ♇ character for Pluto. How should Unicode address the diminished status of Pluto? Maybe make the character a dwarf character? Just wondering.


September 15, 2006

Unix Utility Invocation Overview

Unix utilities may enforce bizarre restrictions on where options can appear on the command line, or support any number of incompatible option formats. This charming mess results from open development on multiple branches of Unix, and a healthy “invent as need be” attitude. The Rosetta Stone for Unix does a great job mapping common tasks to various commands on various Unix. This article presents a selection of common option processing methods with commentary and example uses.


September 09, 2006

OmniWeb 5.5 Released

OmniWeb 5.5 released by OmniGroup. My default browser on Mac OS X, mainly due to site specific options and sidebar tabs with preview instead of a row of tabs with nearly useless <title> text. New release much snappier, better JavaScript support for the few pages I allow that on (mainly Amazon and Bikely).

August 28, 2006

On Log Standards

Anton Chuvakin on Log Standards. Focus should be on content of log messages, not the format or transport or storage thereof. Making developers remove spurious log messages would be a great start.

Additional logging thoughts: System Logging on Unix.


August 23, 2006

Timezone Troubles

Ideally, all systems would use the UTC timezone, and avoid problems with daylight savings shifts and timezone conversions. However, in practice, systems will use the local timezone, even if the company later follows the British Empire. Deployed systems will not migrate to UTC, as other projects will take priority, and the timezone change would require extensive testing and likely reveal previously unknown bugs. This article contains thoughts on better handling time related data, such as logfile entries or database records.


August 11, 2006

Eliminate Spurious Errors

Software must not emit warnings that are not errors. Needless errors clog log files, increase data processing and storage costs, and greatly complicate log analysis. At best, a new hire will debug a script, and waste time asking “is this message normal?” Better sites might Wiki “ignore this log” and hope the new hire can find it. Best sites kill off the message (or lower the severity to notice or below), and the time is never wasted wondering, documenting, and retraining. Without clear mappings of log levels to actions, one enters expression hell, where long action lists evolve: warnings X, Y, and Z require action but not M, Q, or Y. Except on Tuesday. Maintaining such lists is both time consuming and error prone.

Instead make logs actionable: specific priorities must map to specific actions. For example, an emerg or alert syslog(3) message always results in a severity 1 (highest priority) ticket and a page, crit or err messages a severity 2 ticket and page, warning messages a severity 3 ticket but no page, and no action for any lower priority. Simple to code for, and easy to decide what sort of response (and therefore priority) a new log message requires. Actionable log levels also create automation. Under Tomcat, developers could mandate any FATAL logs mean the instance requires immediate restart. Easy to check for in a log file, and automatically thread dump, kill, and restart java should a FATAL turn up.
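The example mapping above is small enough to write down as a lookup table, which is rather the point. In this sketch the names ACTIONS and action_for are invented; the priorities and responses are the ones just listed.

```python
# syslog priority -> (ticket severity, page someone?)
ACTIONS = {
    "emerg":   (1, True),
    "alert":   (1, True),
    "crit":    (2, True),
    "err":     (2, True),
    "warning": (3, False),
}

def action_for(priority):
    """Return (ticket_severity, page) or None when no action is needed."""
    return ACTIONS.get(priority)

for priority in ("alert", "err", "warning", "notice"):
    print(priority, action_for(priority))
```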

In-house code benefits most from actionable logs. Vendor software may emit no logs, or use bizarre priority levels for trivial data: automount on Mac OS X used to log the automount version under daemon.err! Worse, stock syslogd(8) omits the facility and priority information by default. Use monitoring software such as Nagios to trigger actionable events where vendor logs lack good information, and reserve log-based actions generated by tools like sec.pl for well known errors, such as disk full or kernel panics.

No news is good news: also eliminate spam from cron(8) jobs. Larger sites with high turnover may end up with hundreds of daily notifications, mostly junk, mixed with a few critical messages. Identify the required messages, and direct their output to role based mailing lists (never root or directly to a user), then kill off everything else. If the notification message confirms something ran, instead write a low priority log message (or touch a last-ran-on status file), then have another utility warn if the message (or the last-ran-file) was last updated too long ago.
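The last-ran status file check might be sketched as follows. The job touches the file on success, and a separate utility complains if the file is missing or too old. The function name and the demo filename are assumptions made for this example.

```python
import os
import tempfile
import time

def is_stale(status_file, max_age_seconds, now=None):
    """True if the job has not touched its status file recently enough."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(status_file)
    except OSError:
        return True  # never ran, or the file was removed
    return now - mtime > max_age_seconds

# Demo: pretend the job last succeeded two hours ago.
with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "nightly-report.last-ran")
    never_ran = is_stale(marker, 3600)
    open(marker, "w").close()
    two_hours_ago = time.time() - 7200
    os.utime(marker, (two_hours_ago, two_hours_ago))
    overdue = is_stale(marker, 3600)      # hourly job: late
    within_day = is_stale(marker, 86400)  # daily job: fine
print(never_ran, overdue, within_day)
```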


August 05, 2006

Howto Disable Cron Jobs

Rather than edit a crontab(5) entry to disable a cron job, instead consider using a status file to disable the operation of a script. Various advantages include:

  • Risk of operator error eliminated as crontab(5) file not edited.
  • Ability to monitor the status file: trigger alarm if exists, record duration of existence for exact outage numbers.

Disabling the cron entry or disabling crond blurs the line between system configuration and system operation. Using a status file keeps two questions separate: whether the configuration is correct, and whether the script should not run for some external reason (planned outage, data center migration).

Use a filename based on the script name, perhaps somewhere under a custom /var directory. Also consider a global file to disable all scripts, or filenames that disable groups of scripts by function.
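A script might consult the status files as in the sketch below. Everything about the naming is an assumption made for illustration: the .disabled suffix, the ALL global file, the group. prefix, and the function name disabled_by.

```python
import os
import tempfile

def disabled_by(script_name, status_dir, groups=()):
    """Return the status file that disables this script, or None."""
    candidates = ["ALL"] + [f"group.{g}" for g in groups] + [script_name]
    for name in candidates:
        path = os.path.join(status_dir, name + ".disabled")
        if os.path.exists(path):
            return path
    return None

# Demo: disable every script in the (hypothetical) maintenance group.
with tempfile.TemporaryDirectory() as tmp:
    before = disabled_by("rotate-logs", tmp, groups=["maintenance"])
    open(os.path.join(tmp, "group.maintenance.disabled"), "w").close()
    after = disabled_by("rotate-logs", tmp, groups=["maintenance"])
print(before, after)
```

A cron job would call this first and exit quietly when a path comes back, leaving the crontab(5) entry untouched.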


August 01, 2006

Happy Mailman Day!

It’s that time of month again, when Mailman turns to notification spam. This time, a Perl script to automatically disable said spam. Silence is golden, and spam is not.

July 29, 2006

On Call Fun

Like the circus, except with more animals. On the down side, running systems blind to remote connectivity problems is a Bad Thing. On the plus side, Monorail Espresso serves a great shot on Saturdays, and port checks can be automated with a little bit of Perl and IO::Socket.

July 26, 2006

crontab Management Tips

  • Avoid early morning jobs

    Jobs scheduled to run between 2AM and 3AM may run afoul of daylight savings time shifts. This can be avoided by configuring systems to use the UTC timezone, or by more intelligent scripts that try to run multiple times. Something like: “try to run every hour, but only if last good run took place more than 24 hours ago”.

  • Personal crontab file backups

    If paranoid, back up crontab(5) files to your home directory. This allows a $HOME backup to catch the files, and helps transition jobs to new systems.

    # save crontab(5)
    23 23 * * * crontab -l > $HOME/.cron.`hostname`

    This scheme will work on multiple systems, but will fail if two users share the same home directory (bizarre and incredibly rare). Be sure to run Another option: wrap the crontab(1) command to save automatically into another repository. Also consider whether any at(1) jobs need to be backed up, or whether some other utility can regenerate them if lost from the system directory.

  • Use @reboot

    Some crond implementations support @reboot. This may allow jobs to be run when the system starts or reboots:

    @reboot $HOME/bin/do-something

  • Easy e-mail filtering

    If supported by crond, set a custom MAILTO e-mail address. This allows a single e-mail filter to catch all e-mail generated from cron. Check with your mail administrator for the proper mail syntax to use; some systems use username+detail, others username-detail, and others will need to create a custom e-mail alias.

    MAILTO=username+cron


July 15, 2006

Recover Deleted Files

Files on Unix may be deleted, but still held open by another process. While most Unix would require a utility to read a file by the filesystem and inode(5) number, the special /proc filesystem on Linux allows the recovery of deleted but held open files:

  1. Use lsof(1) to discover the deleted file, and record the Process ID (PID) and File Descriptor (FD) open to this file.
  2. Recover the file: cp /proc/$PID/fd/$FD /var/tmp/recovered

The deleted file should appear as a broken symbolic link under the /proc/$PID/fd directory. Despite this, /proc still allows the file to be copied elsewhere. For related information, see how to debug Unix systems.
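
The two steps above can be tried safely on a scratch file. The following is a minimal sketch for Linux only, under assumptions: the function name `recover_demo`, the file names, and the use of sleep(1) as the process holding the file open are all made up for the demonstration. Because the PID and descriptor are already known here, the lsof(1) step is skipped.

```shell
#!/bin/sh
# Linux only: relies on the /proc/$PID/fd layout.
# recover_demo, "victim" and "recovered" are hypothetical names.
recover_demo() {
    dir=$(mktemp -d) || return 1
    echo "important data" > "$dir/victim"

    exec 3< "$dir/victim"          # open the file on descriptor 3 ...
    sleep 60 >/dev/null 2>&1 &     # ... which the background process inherits
    pid=$!
    exec 3<&-                      # close our copy; only sleep holds it now

    rm "$dir/victim"               # deleted, but still held open by $pid

    # Same command as step 2, with PID=$pid and FD=3.
    cp "/proc/$pid/fd/3" "$dir/recovered"
    status=$?

    kill "$pid"
    [ "$status" -eq 0 ] && cat "$dir/recovered"
}
```

Calling `recover_demo` should print the contents of the deleted file. On a real system the PID and FD would come from lsof(1) output instead.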


July 14, 2006

Unix Filesystem Tips

On Unix, all filesystem objects are files, including directories, sockets, and other types. The stat(2) manual covers the various file types. Interesting consequences:

  • Directory sizes do not reflect the size of the files inside the directory.

    Instead, the size of a directory shown by ls(1) reflects the number of files contained by a directory (and the length of the filenames). Consider an empty directory versus one with 10,000 subdirectories:

    $ ls dir*
    ls: dir*: No such file or directory
    $ mkdir dir1 dir2
    $ (cd dir2 && perl -e 'mkdir $_ for 1..10000')
    $ ls -ld dir*
    drwxrwxr-x 2 jmates jmates 68 Jul 13 23:53 dir1
    drwxrwxr-x 10002 jmates jmates 340068 Jul 13 23:55 dir2
    $ rm -r dir*

  • If filenames were stored in the file itself, each file would need an array to hold its different hardlink names. Instead, these names are stored in the directory, and point to the same underlying inode:

    $ mkdir dir1
    $ cd dir1
    $ touch file1 file3
    $ ln file1 file2

    Dir-File-Links-2
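
To confirm that two names really point at the same inode, the inode numbers reported by `ls -i` can be compared. A small sketch, reusing the file names from the example above; the helper names `inode_of` and `links_demo` are hypothetical:

```shell
#!/bin/sh
# inode_of and links_demo are hypothetical helper names.
inode_of() {
    # ls -i prints "<inode> <name>"; keep only the inode number.
    ls -i "$1" | awk '{ print $1 }'
}

# The body runs in a subshell so the cd does not affect the caller.
links_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch file1 file3
    ln file1 file2

    if [ "$(inode_of file1)" = "$(inode_of file2)" ]; then
        echo "file1 and file2 share an inode"
    fi
    if [ "$(inode_of file1)" != "$(inode_of file3)" ]; then
        echo "file3 has its own inode"
    fi

    # First column is the inode number; the link count follows the mode.
    ls -li
)
```

Running `links_demo` should also show a link count of 2 for file1 and file2 in the `ls -li` listing, and 1 for file3.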

For more information, I recommend:


July 13, 2006

OS X vs. Windows XP

Initial thoughts on using Windows XP. Long time Mac, Mac OS X, and Unix user, now running with Windows XP laptop at job.

  • Command Line Interface (cmd.exe)

    Utter joke, compared to the Unix shell. Pitifully difficult to copy and paste data between shell and other utilities. Need to try the CLIP command. Luckily, found zsh for Windows. Downside: no idea where dotfiles hidden on Windows, so configuration will take some time.

    Will also need to patch any number of my 150+ utility ~/bin scripts (of which I really only use around 20 on a regular basis) to work under Windows, and somehow obtain the required Perl modules for them. On Unix, I simply rsync around the portable scripts and libraries, then use CPAN or the vendor package systems to obtain the required supporting data.

  • Firefox

    No anti-flash and make-the-page-hold-still plugins installed by default. Need to figure out the desktop system security policy, or otherwise get the help desk to install these.

  • Outlook 2003

    Calendar nice, though mostly unused by me, as I am not meeting bound. Somewhat easy to set up a Getting Things Done (GTD) method using Contacts and Tasks tricks. Recurring tasks a nice feature. E-mail interface mostly a pain, no threading support I've found yet, text editing clumsy compared with vim. Spent long time finding all the options required to make e-mail mostly readable, as default view emphasizes sender, not subject, and shows a short date.

    Rules support seems extensive, though much time-consuming clicking around required. Would prefer procmail style filtering, as have a decade of experience with it, and can easily pipe messages to arbitrary commands and various mailbox formats.

    Slow to display messages, several “Outlook must close now!” crashes. Cannot remember ever having vim or mutt crash on me. Outlook likes to unmark messages as read when moving between folders, or otherwise reluctant to quickly mark things as read. This wastes my time.

Finally found a ClearType utility, to make the fonts look less crummy. Nowhere near as nice as the fonts on Mac OS X. Still need to find a Quicksilver type application.

Dell laptop running Windows XP weak: trackpad pretty much unusable (especially with tap-to-click enabled by default, and no obvious way to disable it), and keyboard nub annoying. Unlike Mac OS X, no double-touch to scroll support. Slow to sleep and wake.

Overall: functional, but nothing special. Extreme virus and spyware threat on Windows stops me from running Windows at home.