First published: 09th January 2010
Name pmailtomaildir.pl
A script to convert Pegasus Mail folders to Courier-IMAP Maildir+
Synopsis
pmailtomaildir.pl [-d] [-n] [-v] [-p (pegasus directory)] [-m (maildir directory)] username
Description
Basic usage is quite simple, give it the name of a user and the script searches the user's PMAIL directory for folders and inbox messages, and creates copies of them in the user's Maildir directory.
This script was developed after a migration from Pegasus Mail with a Mercury Mail server to Courier Mail Server. Pegasus Mail and Mercury are an email client and server, respectively, developed by David Harris, initially using an NLM on a Netware server and a DOS. More recently, Mercury and Pegasus has also been available on Windows. At the site in question, the Netware 5.0 network was being retired, and it was decided to replace it with a Samba 3 domain, running on Debian Linux. There were no Windows servers at the site, so a replacement for Mercury was required and it was decided to use Postfix delivering to users' Maildirs, with Courier-IMAP providing access to those Maildirs, with the clients using Thunderbird (or their choice of IMAP client). After the replacement, users requested the transfer of their Pegasus Mail folders to the new system.
Operation
Pegasus/Mercury stores a user's mail in a directory called PMAIL, under their home directory, and, similarly, the Maildir+ format stores mail in a directory called Maildir, again under the user's home directory. The script therefore take the username, looks up the user's home directory and locates the relevant directories below that.
Pegasus folders are files with a *.PMM extension (there is also an index file with the same name and a .PMI extension, but pmailtomaildir.pl ignores this). RFC822 Messages are stored in the file, separated by Ctrl-Z (\x1a), there is a 128-byte header at the start of the file, which the script does not process.
Pegasus uses a file called HIERARCH.PM to store the structure of a user's mail folders. In HIERARCH.PM, each line describes one element in the hierarchy: mailbox, filing tray or folder. There are 5 comma-separated fields (number H0 – H5 for reference). The root ("My mailbox") has H0=2, H1=1. Folders have H0=0, H1=0. Filing trays have H0=1 and H1=0, 1 or 3. H3, H4 and H5 are quoted with double-quotes.
H2 has 2 or 3 sub-fields (number G0 – G2 for reference), separated by colons. Folders have 3 sub-fields: G0 is a unique hexadecimal reference number; G1 is a 4-digit hexadecimal number; G2 is the base of the folder filename. Filing trays and the mailbox have two sub-fields: G0 is a unique hex. ref. no.; G1 is the name, truncated to 16 characters.
H3 specifies the parent of the item, it is empty for the mailbox. For folders and filing trays, it contains the H2 field of the parent.
H4 contains the visible name of the item.
Note that:
- an entry for a Recent Search is different,
- some lines (Copyself folder) have two extra fields
- entries for missing folders (foreign mailboxes) will say Name_Unavailable
The script will try to use the information in HIERARCH.PM to recreate the same folder structure in the Maildir. However, in some circumstances, a user's HIERARCH.PM file can become corrupt or lost, if this is the case, folders where the hierarchy cannot be found will be created at the top level of the Maildir tree.
The characters ":", "/" and "\" are removed from folder names, "." is replaced by "_".
Options
Option | Description |
---|---|
-d | Debug. Turns on debug messages. |
-v | Verbose. Turns on even more messages. |
-p (pegasus directory) | Location of the Pegasus Mail directory, relative to the user's home directory. Defaults to PMAIL |
-m (maildir directory) | Location of the Maildir+ directory, relative to the user's home directory. Defaults to Maildir. |
Script
#!/usr/bin/perl -w
#
# pmailtomaildir.pl
#
# A ultility to migrate folders from Pegasus Mail to Maildir
# Copyright 2009, 2010 Allan G. Dyer, Yui Kee Computing Ltd.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# The GNU General Public License is at http://www.gnu.org/licenses/gpl.txt
# See http://www.gnu.org/licenses/.
#
use warnings;
use strict;
use Getopt::Std;
use Sys::Hostname;
use Data::Dumper;
use vars qw( $opt_d $opt_p $opt_m $opt_n $opt_v $username $homedir $uid $gid
$pdir $mdir $hierarch $fnmindex $Q $mboxid);
$opt_d = undef; # Debug
$opt_v = undef; # Verbose
$opt_p = 'PMAIL';
$opt_m = 'Maildir';
$opt_n = undef; # Take no action (use with -d to report what would happen)
$Q= 0;
die "Invalid options" unless (getopts("dnvp:m:"));
die "Invalid number of parameters" unless (@ARGV == 1);
$username= shift @ARGV;
(undef, undef, $uid, $gid, undef, undef, undef, $homedir) = getpwnam($username);
die "User $username does not exist" unless $homedir;
$pdir= "$homedir/$opt_p";
$mdir= "$homedir/$opt_m";
print STDERR "Source: $pdir\tDestination: $mdir\n" if $opt_d;
die "Source directory $pdir does not exist" unless -d $pdir;
die "Destination directory $mdir does not exist" unless -d $mdir;
($hierarch, $fnmindex, $mboxid)= gethierarch();
# Read the source directory, *.PMM
opendir PM, $pdir or die "Unable to open $pdir: $!";
while (my $fldnm = readdir PM) {
next unless $fldnm =~ /^(.*)\.PMM$/;
# force upper case, DOS/Windows is case-insensitive
my $basenm= uc($1);
my $fld= $hierarch->{$fnmindex->{$basenm}};
print STDERR "Processing folder $basenm: $fld->{NAME}\n" if $opt_d;
my $flddir= createfolder($fld);
unless ($flddir) {
print STDERR "Unable to process folder $basenm\n";
next;
}
splitPMM( $fldnm, $flddir);
}
# Now migrate the inbox messages
opendir PM, $pdir or die "Unable to open $pdir: $!";
my $fld= $hierarch->{$mboxid};
my $flddir= createfolder($fld);
while (my $msgnm = readdir PM) {
next unless $msgnm =~ /^(.*)\.CNM$/;
my $infileF= "$pdir/$msgnm";
my $ofn= "$mdir/.$flddir/tmp/" . newfn();
print STDERR "Migrating $infileF to $ofn\n" if $opt_d;
unless (open CNM, "<$infileF") {
print STDERR "Unable to open $infileF: $!\n";
next;
}
if ($opt_n) {
print STDERR "Would write $ofn\n";
} else {
unless (open OUT, ">$ofn") {
print STDERR "Unable to open $ofn: $!\n";
next;
}
print STDERR "Writing $ofn\n" if $opt_d;
while () {
print OUT;
}
close OUT;
rename $ofn, "$mdir/.$flddir/new/" or print STDERR "Unable to rename $ofn: $!\n";
}
}
print STDERR "Finished. $Q messages migrated.\n";
exit;
# Read and hash the HIERARCH.PM file
sub gethierarch {
open HRC, "<$pdir/HIERARCH.PM" or die "Can't open $pdir/HIERARCH.PM : $!";
my $hierarch= {};
my $fnmindex= {};
my $mboxid= '';
# There might be more than one folder with the same name at the same level, e.g.
# if there's been a loss of HIERARCH.PM, but I don't want to force all the
# contents into a single folder in the Maildir, so make the name unique
my $NAMEindex= {};
while () {
# the fields might contain " or , so a simple split is not possible
if ( /^(\d),(\d),"(.*[^\\]?)","(.*[^\\]?)",(?:"(.*[^\\]?)"|(Name_Unavailable))(?:,\d,\d\d)?\r?$/ ) {
my $h0= $1;
my $h1= $2;
my $id= $3;
my $parent= $4;
my $name= $5;
# The first empty parent indicates the mailbox name
if (not $parent and not $mboxid) {
$mboxid= $id;
print STDERR "Mailbox id is $mboxid\n" if $opt_d;
}
unless ($name) {
if ($id =~ /:(.*)$/) {
$name= $1;
} else {
$name= 'unknown';
}
}
if (exists $NAMEindex->{"$name\n$parent"}) {
my $suff= 0;
while (exists $NAMEindex->{"$name-$suff\n$parent"}) {
$suff++;
}
print STDERR "Non-unique folder name: $parent $name-$suff\n"
if $opt_d and $opt_v;
$NAMEindex->{"$name-$suff\n$parent"}= $id;
$name= "$name-$suff";
} else {
$NAMEindex->{"$name\n$parent"}= $id;
}
$hierarch->{$id}= { H0=>$h0, H1=>$h1, ID => $id, PARENT => $parent,
NAME => $name };
# Create index of folder filenames
if ($id =~ /.+:.+:(.+)/) {
# force upper case, DOS/Windows is case-insensitive
$fnmindex->{uc($1)}= $id;
}
} else {
print STDERR "Error: cannot parse folder $_\n";
next;
}
}
close HRC;
print STDERR Dumper($hierarch) if $opt_d and $opt_v;
return $hierarch, $fnmindex, $mboxid;
}
# Create Maildir folder (recursive)
sub createfolder {
my ($fld)= @_;
my $flddir= $fld->{NAME};
print STDERR "Createfolder: ", Dumper($fld) if $opt_d and $opt_v;
$flddir =~ s/[:\/\\]//g;
$flddir =~ s/\./_/g;
print STDERR "\t$flddir\n" if $opt_d;
if ($fld->{PARENT} and exists ($hierarch->{$fld->{PARENT}})) {
$flddir= createfolder($hierarch->{$fld->{PARENT}}) . ".$flddir";
}
print STDERR "\tdirectory name .$flddir\n" if $opt_d;
unless (-d "$mdir/.$flddir") {
my $cmd= "sudo -u $username maildirmake -f \"$flddir\" \"$mdir\"";
if ($opt_n) {
print STDERR "Would create $mdir/.$flddir\n\t$cmd\n";
} else {
`$cmd`;
unless (-d "$mdir/.$flddir") {
print STDERR "Failed to create $mdir/.$flddir\n";
$flddir= "";
exit;
}
}
}
return $flddir;
}
# Split a pmail folder into messages and write to output folder
sub splitPMM {
my ($infile, $outdir) = @_;
my $infileF= "$pdir/$infile";
my $outdirF= "$mdir/.$outdir/tmp";
my $rendirF= "$mdir/.$outdir/new";
print STDERR "Splitting $infileF to $outdirF\n" if $opt_d;
unless (open PMM, "<$infileF") {
print STDERR "Unable to open $infileF: $!\n";
return;
}
my $outopen= '';
while () {
if (/\x1a/) {
if ($outopen and not $opt_n) {
print OUT $`;
closeout($outdirF, $rendirF, $outopen);
}
# ^Z separates messages in the folder
$outopen= openout($outdirF, $');
return unless $outopen;
next;
}
unless ($outopen) {
# Folders start with a 128 byte header, but I might have a corrupt folder
if (/^.{128}(\w+:.*)/) {
$outopen= openout($outdirF, $1);
} else {
print STDERR "Warning: incorrect folder header for $infileF\n";
$outopen= openout($outdirF, $_);
}
next;
}
print OUT if not $opt_n;
}
closeout($outdirF, $rendirF, $outopen);
}
sub closeout {
my ($odir, $rdir, $fn)= @_;
close OUT;
rename "$odir/$fn", "$rdir/$fn" or print STDERR "Unable to rename $odir/$fn: $!\n";
}
sub openout {
my ($outdirF, $start)= @_;
my $nfn= newfn();
my $ofn= $outdirF . "/$nfn";
if ($opt_n) {
print STDERR "Would write $ofn\n";
} else {
unless (open OUT, ">$ofn") {
print STDERR "Unable to open $ofn: $!\n";
return '';
}
print STDERR "Writing $ofn\n" if $opt_d;
print OUT $start;
chown $uid, $gid, $ofn;
}
return $nfn;
}
sub newfn {
# Generates a unique filename for a message
my $fn= time() . ".P${$}Q" . $Q++ . "." .hostname();
print STDERR "fn: $fn\n" if $opt_d;
return $fn;
}