Here is a small example of how to calculate the MD5 checksum of a file or a string in Perl. It is also useful if you want to save a large number of URLs in a database: the MD5 sum of each URL can be used as a key column. See my other article, store urls into database for later search.

To compute a file's checksum

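use Digest::MD5 qw(md5 md5_hex md5_base64);
my $ffname = "/tmp/testfile";
my $hash;
{
    # Slurp mode: read the whole file in one go, only inside this block
    local $/ = undef;
    open(my $fh, '<', $ffname) or die "Cannot open $ffname: $!";
    binmode($fh);    # treat the file as raw bytes
    my $data = <$fh>;
    close($fh);
    $hash = md5_hex($data);
}
printf("md5_hex:%s\n", $hash);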

Or this way, using the object interface, which reads from the file handle itself

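use Digest::MD5 qw(md5 md5_hex md5_base64);
my $ffname = "/tmp/testfile";
open(my $fh, '<', $ffname) or die "Cannot open $ffname: $!";
binmode($fh);    # treat the file as raw bytes
my $ctx = Digest::MD5->new;
$ctx->addfile($fh);    # reads the handle until end of file
my $hash = $ctx->hexdigest;
close($fh);
printf("md5_hex:%s\n", $hash);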
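Both approaches should give the same checksum for the same file. The sketch below is one way to check that. The helper names md5_file_slurp and md5_file_stream and the temporary file are made up for this illustration and are not part of Digest::MD5.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Temp qw(tempfile);

# Hypothetical helper: read the whole file into memory, then hash it.
sub md5_file_slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);
    local $/ = undef;    # slurp mode, limited to this sub
    my $data = <$fh>;
    close($fh);
    return md5_hex($data);
}

# Hypothetical helper: let Digest::MD5 read from the handle itself.
sub md5_file_stream {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $hex;
}

# Write a small throwaway file so the example does not depend on /tmp/testfile.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "hello world\n";
close($tmp_fh);

my $slurped  = md5_file_slurp($tmp_name);
my $streamed = md5_file_stream($tmp_name);

printf("slurp :%s\n", $slurped);
printf("stream:%s\n", $streamed);
print "both methods agree\n" if $slurped eq $streamed;
```

For large files the second helper is probably the better choice, since the file content never has to sit in a Perl variable.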

To compute a string's checksum

#!/usr/bin/perl -w

use warnings;
use strict;

use Digest::MD5  qw(md5 md5_hex md5_base64);

my $string = "https://sites.google.com/site/itmyshare/perl-tips-and-examples/how-to-calculate-md5-of-a-file-string-in-perl";
print "Calculating checksum for $string\n";
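my $crc;
# 22-character base64 string, without "==" padding
$crc = md5_base64($string);
printf("md5_base64:%s\n",$crc);
# 32-character lowercase hexadecimal string
$crc = md5_hex($string);
printf("md5_hex:%s\n",$crc);
# 16 raw bytes, not meant to be printed as text
$crc = md5($string);
printf("md5:%s\n",$crc);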
exit(0);

Run and see

$ ./crc.pl
Calculating checksum for https://sites.google.com/site/itmyshare/perl-tips-and-examples/how-to-calculate-md5-of-a-file-string-in-perl
md5_base64:w4wLOAUg4R6oiIMLk9kBFg
md5_hex:c38c0b380520e11ea888830b93d90116
md5:(16 raw bytes, shown as unreadable characters in the terminal)

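md5 returns the digest in binary form: 16 raw bytes, which is why it does not print as readable text.

md5_hex returns the same digest as a 32-character hexadecimal string, so there are 16^32 (that is, 2^128) possible values.

md5_base64 returns it as a 22-character string drawn from a 64-character alphabet. It is shorter than the hex form, but it encodes the same 128 bits, so the number of possible values is still 2^128, not more. Either way, the probability of getting the same checksum for two different URLs by chance is very low.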
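As a rough illustration of the three forms, the following sketch prints the length of each result and checks that the hex string is simply the raw bytes written out as hex digits. The variable names are only for this example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_hex md5_base64);

my $string = "https://sites.google.com/site/itmyshare/perl-tips-and-examples/how-to-calculate-md5-of-a-file-string-in-perl";

my $raw = md5($string);           # binary digest
my $hex = md5_hex($string);       # hexadecimal text
my $b64 = md5_base64($string);    # base64 text, unpadded

printf("md5:        %2d bytes\n",      length($raw));
printf("md5_hex:    %2d characters\n", length($hex));
printf("md5_base64: %2d characters\n", length($b64));

# unpack "H*" turns each byte into two hex digits, high nibble first.
my $hex_from_raw = unpack("H*", $raw);
print "hex form matches the raw bytes\n" if $hex_from_raw eq $hex;
```

The lengths should come out as 16, 32 and 22, whatever the input string is.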



For details about md5, md5_hex and the base64 form, see http://search.cpan.org/dist/Digest-MD5/MD5.pm. The relevant entries:



md5($data,...)

    This function will concatenate all arguments, calculate the MD5 digest of this "message", and return it in binary form. The returned string will be 16 bytes long.

    The result of md5("a", "b", "c") will be exactly the same as the result of md5("abc").
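The concatenation rule quoted above can be checked with a few lines. This is only a sketch, using md5_hex instead of md5 so that the results are printable, plus the add method of the object interface for comparison.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# One argument versus the same text split into several arguments.
my $joined   = md5_hex("abc");
my $in_parts = md5_hex("a", "b", "c");

# The object interface accumulates data the same way across add() calls.
my $ctx = Digest::MD5->new;
$ctx->add("a");
$ctx->add("b", "c");
my $incremental = $ctx->hexdigest;

printf("joined     :%s\n", $joined);
printf("in parts   :%s\n", $in_parts);
printf("incremental:%s\n", $incremental);
print "all three are identical\n"
    if $joined eq $in_parts && $joined eq $incremental;
```

One consequence worth keeping in mind: md5_hex("ab", "c") and md5_hex("a", "bc") are also identical, so argument boundaries carry no information.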

md5_hex($data,...)

    Same as md5(), but will return the digest in hexadecimal form. The length of the returned string will be 32 and it will only contain characters from this set: '0'..'9' and 'a'..'f'.

$md5->b64digest

    Same as $md5->digest, but will return the digest as a base64 encoded string. The length of the returned string will be 22 and it will only contain characters from this set: 'A'..'Z', 'a'..'z', '0'..'9', '+' and '/'.

    The base64 encoded string returned is not padded to be a multiple of 4 bytes long. If you want interoperability with other base64 encoded md5 digests you might want to append the string "==" to the result.
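To see what the padding note means in practice, this sketch compares the output of md5_base64 with what the core MIME::Base64 module produces for the same raw digest. The input string "abc" is an arbitrary choice for the example.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5 md5_base64);
use MIME::Base64 qw(encode_base64);

my $string = "abc";

# Digest::MD5 leaves the padding off.
my $unpadded = md5_base64($string);

# MIME::Base64 pads to a multiple of 4 characters. The empty second
# argument suppresses the newline it would otherwise append.
my $padded = encode_base64(md5($string), "");

printf("Digest::MD5 :%s\n", $unpadded);
printf("MIME::Base64:%s\n", $padded);
print "equal after appending \"==\"\n" if ($unpadded . "==") eq $padded;
```

If the checksum is only used as a key inside your own database, the unpadded 22-character form is enough. The padding matters mainly when comparing against values produced by other tools.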
