Big Blobs
Perl coding can evolve towards the use of a Big Blob (a large structure of deeply nested data) once perldsc and perllol are mastered. That is, input from various sources is assembled into the Big Blob, any required munging is performed, and the data structure is iterated over to emit some sort of output or change. This method does work, though it suffers from a number of avoidable flaws. Code that navigates the blob tends to look like this:
    # a line holding only an opening brace starts a nested list of subrules
    if ( $line =~ m/^ \{ $/x ) {
        $rule_target[-1]->[-1]->{_subrules} = [];
        push @rule_target, $rule_target[-1]->[-1]->{_subrules};
    }
First, consider instead providing an object-oriented interface that hides the Big Blob. However, this may defeat a “well, I’m just trying to mangle X into Y, not waste time with class struggles” coding effort. Whether OO makes sense depends on the project. A standalone data conversion script probably does not justify OO. Code that other code will use, or a service interface, especially one used by other groups or users, will likely benefit from OO.
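As a minimal sketch of what hiding the blob could look like, the class below keeps a nested rules structure private and exposes a few methods instead. The `RuleSet` package and its method names are hypothetical and chosen only for illustration; the `_subrules` key mirrors the snippet above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class: callers never touch the nested structure directly.
package RuleSet;

sub new {
    my ($class) = @_;
    return bless { rules => [] }, $class;
}

# Add a top-level rule and return it so subrules can be attached later.
sub add_rule {
    my ( $self, $name ) = @_;
    my $rule = { name => $name, _subrules => [] };
    push @{ $self->{rules} }, $rule;
    return $rule;
}

# Attach a subrule to a rule previously returned by add_rule or add_subrule.
sub add_subrule {
    my ( $self, $rule, $name ) = @_;
    my $subrule = { name => $name, _subrules => [] };
    push @{ $rule->{_subrules} }, $subrule;
    return $subrule;
}

# Private helper: depth-first walk that collects indented rule names.
sub _walk {
    my ( $rules, $depth, $out ) = @_;
    for my $rule (@$rules) {
        push @$out, ( '  ' x $depth ) . $rule->{name};
        _walk( $rule->{_subrules}, $depth + 1, $out );
    }
    return;
}

# Return all rule names, indented two spaces per nesting level.
sub names {
    my ($self) = @_;
    my @out;
    _walk( $self->{rules}, 0, \@out );
    return @out;
}

package main;

my $set  = RuleSet->new;
my $rule = $set->add_rule('top');
$set->add_subrule( $rule, 'child' );
print "$_\n" for $set->names;
```

The calling code now reads as `add_rule` and `add_subrule` rather than chains of `->[-1]->{...}`, and the internal layout can change without touching the callers.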
Secondly, Big Blobs could be a solution looking for a problem. The coder knows how to parse data into the blob, then iterate over the mess, but never considers whether a blob should have been used.
    my %big_blob = load_from_file($filename);
    upload_to_database(\%big_blob);
In many cases, the entire data set need not be loaded into memory. Instead, only the minimum necessary data is held in memory, acted on, and then discarded:
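    while (<$fh>) {
        my %line_data;
        # ... parse line into line_data hash
        # upload line_data contents to database
        $db->...
    } continue {
        # commit once every 1000 input lines
        if ( $. % 1000 == 0 ) {
            $db->commit();
        }
    }
    # commit whatever remains from the final partial batch
    $db->commit();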
This method scales better because it is no longer bound by available memory, and it will not require DB_File or refactoring should the data set outgrow that memory. Asking “do I really need to parse all the data into memory, or is there a more efficient solution?” will help prevent inappropriate use of Big Blobs.
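To make the line-at-a-time approach concrete, here is a small runnable sketch. It assumes a made-up `key=value` input format, reads from an in-memory filehandle, and uses a hypothetical `FakeDB` stand-in that only counts calls; a real script would use an actual database handle instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a database handle: it only counts calls.
package FakeDB;

sub new {
    my ($class) = @_;
    return bless { rows => 0, commits => 0 }, $class;
}
sub insert { my ($self) = @_; $self->{rows}++;    return }
sub commit { my ($self) = @_; $self->{commits}++; return }

package main;

my $batch_size = 2;    # deliberately small so the demo commits several times
my $input      = "a=1\nb=2\nc=3\nd=4\ne=5\n";    # assumed key=value format
open my $fh, '<', \$input or die "cannot open in-memory file: $!";

my $db = FakeDB->new;
while ( my $line = <$fh> ) {
    chomp $line;

    # Only this one line's data is held in memory at a time.
    my %line_data;
    @line_data{qw(key value)} = split /=/, $line, 2;
    $db->insert( \%line_data );

    # $. is the current input line number; commit at each batch boundary.
    $db->commit if $. % $batch_size == 0;
}

# Commit the final partial batch, if the line count was not a multiple.
$db->commit if $. % $batch_size;
close $fh;

printf "%d rows, %d commits\n", $db->{rows}, $db->{commits};
```

With five input lines and a batch size of two, this prints `5 rows, 3 commits`: two full batches plus one final partial batch. Memory use stays flat however long the input is, since each `%line_data` goes out of scope at the end of its iteration.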
