Perl@PolettiX


Tue, 15 Dec 2009

Why should parsing be simple?

Following an article by osfameron (found thanks to Planet Perl Iron Man), I landed on the interesting analysis performed by Aldo Cortesi. I was not at all surprised to see yet another variant of the thorny old adage "Perl is dead, or at least does not feel very well".

I was a bit more surprised at seeing this comment:

At the risk of inflaming more Perl programmers to come and manfully defend their language on my blog, I think there's a reason why Python has a nice BNF grammar, and Perl has 5600 lines of ad-hoc parsing code:
http://www.perlmonks.org/?node_id=663393
I wonder which reason Aldo has in mind. IMHO, the reason is probably that Python development focuses more on language orthogonality, while Perl development focuses more on the programmer's ease, possibly at the expense of a more complicated compiler. But maybe that's just me.
Posted at 14:41:41 by Flavio Poletti
Sat, 5 Dec 2009

IO::Zlib and saved space

A few days ago a colleague asked me about using Perl for analysing some ASA firewall logs in order to spot how many public addresses are needed for NATting users towards the Internet. The basic regular expression to capture the bits of information that he needs is quite straightforward, but what was interesting is that the files he has to work on are gzipped, and he had already extracted a sample one to work on. I remembered that there is IO::Zlib and this is what I did:
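use strict;
use warnings;
use English qw( -no_match_vars );    # for $EVAL_ERROR and $OS_ERROR
use IO::Zlib;

for my $file (@ARGV) {
   eval {
      my $fh = _open($file);
      while (<$fh>) {
         # the last capture is \S+ because a trailing (.*?) would always match the empty string
         my ($inside, $outside) = /Built\ dynamic\ translation\ from\ inside:(.*?)\ to\ outside:(\S+)/mxs
            or next;
         # use $inside and $outside
      }
      close $fh;
      1;    # a successful run must return true, or the warning below fires
   } or warn "exception for '$file': $EVAL_ERROR";
}

# Open a file for reading, going through IO::Zlib when the name ends in .gz
sub _open {
   my ($file) = @_;
   my $fh;
   if ($file =~ /\.gz \z/mxs) {
      $fh = IO::Zlib->new();
      $fh->open($file, 'rb')
         or die "IO::Zlib complained: $OS_ERROR";
   }
   else {
      open $fh, '<', $file
         or die "open(): $OS_ERROR";
   }
   return $fh;
}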
It worked pretty well, so nothing to complain about. Just before blogging about it, I paid a due visit to the documentation and discovered that I had been more or less lucky: there are limitations in using the module, which basically boil down to $fh not being what you expect from a full-fledged filehandle. At least it should work out of the box if all you need is to read the file one line at a time.
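Going back to my colleague's original question, here is a minimal sketch of what "use $inside and $outside" could look like when the goal is to count public addresses. The sample lines are made up and shaped only after the regular expression in the script; real ASA log entries may well look different. The names @lines, %inside_for and $public_count are just for illustration, and both captures use \S+ so that neither can match the empty string.

```perl
use strict;
use warnings;

# Hypothetical sample lines (NOT real ASA output), shaped after the regex only
my @lines = (
   'Built dynamic translation from inside:10.0.0.1 to outside:192.0.2.10',
   'Built dynamic translation from inside:10.0.0.2 to outside:192.0.2.10',
   'Built dynamic translation from inside:10.0.0.3 to outside:192.0.2.11',
   'some unrelated line',
);

my %inside_for;    # outside address => { inside address => 1 }
for my $line (@lines) {
   my ($inside, $outside) = $line =~
      /Built\ dynamic\ translation\ from\ inside:(\S+)\ to\ outside:(\S+)/mxs
      or next;     # skip lines that are not translations
   $inside_for{$outside}{$inside} = 1;
}

# Each key is one distinct public (outside) address
my $public_count = scalar keys %inside_for;
print "distinct outside addresses: $public_count\n";
for my $outside (sort keys %inside_for) {
   my $users = scalar keys %{ $inside_for{$outside} };
   print "$outside used by $users inside address(es)\n";
}
```

In the real script the loop body would sit inside the while (<$fh>) loop, with %inside_for declared before the loop over @ARGV so that counts accumulate across files.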

The module isn't in the core distribution, but it's a common prerequisite so chances are that you already have it in your distro. It's a bit weird that it is known by corelist to have been included in 5.9.3:

IO::Zlib was first released with perl 5.009003
even though there is no trace of it in 5.10. Go figure. Anyway, it should be a bit more common to find than the alternative PerlIO::gzip, which would make the sub _open unneeded when substituted with this (after a use PerlIO::gzip; at the top):
   open my $fh, '<:gzip(autopop)', $file or die '...';
I wonder how widely Perl IO layers are used out there.
Posted at 23:27:28 by Flavio Poletti
Sun, 29 Nov 2009

Equality is reflexive... isn't it?

I read about Perl6::Junction in an article on blogs.perl.org and I was tickled. I quickly went to CPAN to see what the module was about beyond that post, and saw two enthusiastic reviews by two big names (at least, this is what I consider both of them).

I have to say that I was a bit disappointed to see that they both talked about very clear documentation, while it seemed a bit too minimal for my taste. I do agree that the test suite is complete, anyway, and it's a useful source of examples too! The tests are indeed quite extensive, and they also cover something that made me curious, i.e. "will it be possible to use junctions on both sides?". The answer turns out to be positive, and there are tests for those cases (see the T/join.t test file for details).

One funny thing in the module is that the two comparisons below give different results, even though they only differ in the order of the operands:

my $is_true = (all(3, 4) == any(3, 4));
my $is_false = (any(3, 4) == all(3, 4));

It actually makes sense: the first says "do all elements in the {3, 4} set have something equal to them in the {3, 4} set?". Of course they do, because 3 in the first set has 3 in the second, and 4 in the first set has 4 in the second. The second says "is any element in the {3, 4} set equal to all the elements in the {3, 4} set?". Of course there isn't, because 3 from the first set is equal to 3 in the second set but fails to be equal to 4, and likewise 4 fails to be equal to 3.

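Hence, the (numeric) equality operator does not maintain what I called the reflexive property in the title (strictly speaking it is symmetry that fails: swapping the operands changes the result), and it seems just... weird, even though it makes perfect sense.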
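The asymmetry can be reproduced without the module. The sketch below is not how Perl6::Junction is implemented; it only spells out the two readings given above using any and all from List::Util (assuming a version that provides them, i.e. 1.33 or later). The names @left, @right, $all_eq_any and $any_eq_all are made up for the illustration.

```perl
use strict;
use warnings;
use List::Util qw( any all );

my @left  = (3, 4);
my @right = (3, 4);

# all(@left) == any(@right): every left element has an equal on the right
my $all_eq_any = all {
   my $l = $_;
   any { $l == $_ } @right;
} @left;

# any(@left) == all(@right): some left element equals every right element
my $any_eq_all = any {
   my $l = $_;
   all { $l == $_ } @right;
} @left;

print 'all == any: ', ($all_eq_any ? 'true' : 'false'), "\n";   # true
print 'any == all: ', ($any_eq_all ? 'true' : 'false'), "\n";   # false
```

Swapping the quantifiers is what changes the answer: "for all x there is a y" is a weaker statement than "there is an x that works for all y".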

Posted at 00:15:32 by Flavio Poletti
Tue, 24 Nov 2009

RTFM, you geek!

After the previous post, I shipped the script and the related libraries to the recipient and slept like a baby, only to discover that they couldn't figure out how to use it properly!

The main issue is that geeks and technical people generally don't read the documentation. I admit that most of the time I don't either. In this case, my good colleague saw a Perl script, made the too-simple-to-be-true association between Perl and the Web and installed Apache! Well, as a matter of fact it is a web application, but I strove to make it self-contained and you end up with Apache?!?

Luckily, I was there some thirty minutes later to correct the problem and let Perl shine: "hey bro, it's a single program with an embedded web server!". That was the lucky part, anyway: the environment we're working in is particularly hostile, to the point that two machines in the very same network can't ping one another! Fortunately, when I used the program to run some tests the network connectivity was not that bad and we managed to complete them.

Another thing that annoyed me a bit was that the script was not fully self-contained. Yes, I have to ship a whole tar.gz file... so I thought that it would be great to distribute a single script and nothing more. One option is PAR, of course, but I have always been reluctant to use it because you basically have to compile for a target architecture (at least in my understanding). For Pure-Perl stuff this does not make me happy.

Some time ago I had a similar problem, which I solved with a script that bundles Pure-Perl modules inside a script. I put it in a repository on repo.org.cz, for anyone to have a good laugh.
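To give an idea of what bundling can mean, here is a minimal sketch of one common technique: keeping module sources in a hash and serving them through a code reference in @INC. This is not necessarily what my bundling script does; the module My::Greeter and the %bundled hash are invented for the example.

```perl
use strict;
use warnings;

# Source of the bundled module, keyed by the file name that require looks for.
# My::Greeter is a made-up module used only for this illustration.
my %bundled = (
   'My/Greeter.pm' => <<'END_OF_MODULE',
package My::Greeter;
use strict;
use warnings;
sub greet { my ($name) = @_; return "Hello, $name!"; }
1;
END_OF_MODULE
);

# A code reference in @INC is called by require with the file name wanted;
# returning a filehandle makes Perl read the module source from it.
unshift @INC, sub {
   my ($self, $filename) = @_;
   my $source = $bundled{$filename}
      or return;    # not one of ours: let Perl keep searching @INC
   open my $fh, '<', \$source
      or die "cannot open in-memory file for $filename: $!";
   return $fh;
};

require My::Greeter;    # resolved by the hook above, not from disk
my $greeting = My::Greeter::greet('world');
print "$greeting\n";
```

The hook is installed at run time, so the module is loaded with require rather than use; a use statement would need the hook set up inside a BEGIN block.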

Posted at 02:53:11 by Flavio Poletti
Thu, 19 Nov 2009

local::lib for distributions

local::lib is a love-it-or-hate-it module, with the additional feature that you don't get the hate-it part.

Recently, I had to develop a script to do a couple of HTTP redirections. I headed towards CPAN, quickly found HTTP::Server::Simple (and in particular HTTP::Server::Simple::CGI), and in some twenty minutes I had a working prototype. Forget that I changed my mind a couple of times before getting to what I eventually used for my test...

Now, I knew I had to go into an environment that could possibly prevent me from using my own machine to perform the test. As a matter of fact, I didn't know whether I could use the program anywhere at all, let alone what kind of Perl environment I would find. Nightmare!

Luckily enough, it turned out that I only needed modules that do not require compilation. I love Pure Perl modules! So now I had the problem of bundling all the needed non-core modules in a way that was convenient to deliver. This is where local::lib really saved the day, and in particular its --self-contained option. Well - yes - I've seen options that were documented way better... but at least the only reference in the synopsis made me curious enough to discover that it was hitting the nail right on the head.

On my machine I have my own compiled Perl version to tinker with, so I installed local::lib without the need to bootstrap anything. At this point, all I had to do was something along these lines:

shell$ perl -MCPAN -Mlocal::lib=--self-contained,my_lib \
          -e 'CPAN::install($_) for @ARGV' \
          HTTP::Server::Simple URI Log::Log4perl

Yes... I'm quite fond of Log::Log4perl, but that's another story.

The installation above went smoothly and installed all the modules, and their non-core dependencies, under the directory tree starting from my_lib. I checked that there were actually no compiled components (dependencies could play some tricks) and verified that I had been lucky. Yay!

The directory structure you end up with is more or less the following:

my_lib/bin
my_lib/lib/perl5/...
my_lib/man

I didn't need either the bin or the man subdirectory, so I just moved the contents of my_lib/lib/perl5 into a lib subdirectory, removed what remained of my_lib and... that's it! Well, wait a minute, I had to make a slight change to the code as well:

#...
use FindBin;
use lib "$FindBin::Bin/lib";    # the bundled modules live in lib, next to the script

OK, now that's it! The funny part? I was actually able to use my laptop, so I didn't need any of this...

Posted at 02:27:09 by Flavio Poletti
Sun, 15 Nov 2009

Perl.org has a new face!

Wow, I'm very pleased to see the new face of Perl.org. I have to thank Paul Fenwick for the hint!
Posted at 01:39:03 by Flavio Poletti
Thu, 29 Oct 2009

Italian Perl Workshop 2009

I certainly had fun! Seeing my Perl friends again, meeting new ones... I did notice one thing, though: there really seemed to be very few people when we were in the big hall!

The presentations I gave are on SlideShare:


Posted at 03:04:28 by Flavio Poletti
