Ruby script to sync email from any IMAP server to Gmail

Wednesday October 24, 2007 @ 02:45 PM (PDT)

Update (2009-03-16): This script has been superseded by Larch, a full-fledged Ruby application that does the same thing, only faster and more reliably.

Last night after Gmail began rolling out IMAP support, I started investigating ways to copy my huge email archive (thousands and thousands of messages dating back to 2003) from my IMAP server to Gmail’s IMAP server.

Copying the messages from one account to the other in Thunderbird works, but it’s glacially slow, needs babysitting, and is prone to creating duplicate messages unless the entire copy operation works right the first time. Great for copying a few messages, not so great for copying thousands.

I also investigated imapsync, a Perl script that’s somewhat faster and more reliable than Thunderbird and doesn’t create duplicate messages, but for some reason using imapsync results in the messages on Gmail being timestamped with the time they were imported rather than the time they were sent or received, which is unacceptable. I tried using the --syncinternaldates option to rectify this, but it didn’t work.

So, since the best way to get something done right is to do it yourself, I set about writing my own tool to transfer my email. Thanks to Ruby and Net::IMAP, this turned out to be pretty easy.

Here’s what I came up with. It’s not pretty, it’s not user friendly, and it doesn’t do much error checking, but it’s extremely fast, it works, and if it fails at any point you can just run it again and it’ll pick up where it left off. Share and enjoy.

#!/usr/bin/env ruby
require 'net/imap'

# Source server connection info.
SOURCE_NAME = 'username@example.com'
SOURCE_HOST = 'mail.example.com'
SOURCE_PORT = 143
SOURCE_SSL  = false
SOURCE_USER = 'username'
SOURCE_PASS = 'password'

# Destination server connection info.
DEST_NAME = 'username@gmail.com'
DEST_HOST = 'imap.gmail.com'
DEST_PORT = 993
DEST_SSL  = true
DEST_USER = 'username@gmail.com'
DEST_PASS = 'password'

# Mapping of source folders to destination folders. The key is the name of the
# folder on the source server, the value is the name on the destination server.
# Any folder not specified here will be ignored. If a destination folder does
# not exist, it will be created.
FOLDERS = {
  'INBOX' => 'INBOX',
  'sourcefolder' => 'gmailfolder'
}

# Maximum number of messages to select at once.
UID_BLOCK_SIZE = 1024

# Utility methods.
def dd(message)
  puts "[#{DEST_NAME}] #{message}"
end

def ds(message)
  puts "[#{SOURCE_NAME}] #{message}"
end

def uid_fetch_block(server, uids, *args)
  pos = 0

  while pos < uids.size
    server.uid_fetch(uids[pos, UID_BLOCK_SIZE], *args).each {|data| yield data }
    pos += UID_BLOCK_SIZE
  end
end

@failures = 0
@existing = 0
@synced   = 0

# Connect and log into both servers.
ds 'Connecting...'
source = Net::IMAP.new(SOURCE_HOST, SOURCE_PORT, SOURCE_SSL)

ds 'Logging in...'
source.login(SOURCE_USER, SOURCE_PASS)

dd 'Connecting...'
dest = Net::IMAP.new(DEST_HOST, DEST_PORT, DEST_SSL)

dd 'Logging in...'
dest.login(DEST_USER, DEST_PASS)

# Loop through folders and copy messages.
FOLDERS.each do |source_folder, dest_folder|
  # Open source folder in read-only mode.
  begin
    ds "Selecting folder '#{source_folder}'..."
    source.examine(source_folder)
  rescue => e
    ds "Error: select failed: #{e}"
    next
  end

  # Open (or create) destination folder in read-write mode.
  begin
    dd "Selecting folder '#{dest_folder}'..."
    dest.select(dest_folder)
  rescue => e
    begin
      dd "Folder not found; creating..."
      dest.create(dest_folder)
      dest.select(dest_folder)
    rescue => ee
      dd "Error: could not create folder: #{ee}"
      next
    end
  end

  # Build a lookup hash of all message ids present in the destination folder.
  dest_info = {}

  dd 'Analyzing existing messages...'
  uids = dest.uid_search(['ALL'])

  if uids.length > 0
    uid_fetch_block(dest, uids, ['ENVELOPE']) do |data|
      dest_info[data.attr['ENVELOPE'].message_id] = true
    end
  end

  dd "Found #{uids.length} messages"

  # Loop through all messages in the source folder.
  uids = source.uid_search(['ALL'])

  ds "Found #{uids.length} messages"

  if uids.length > 0
    uid_fetch_block(source, uids, ['ENVELOPE']) do |data|
      mid = data.attr['ENVELOPE'].message_id

      # If this message is already in the destination folder, skip it.
      if dest_info[mid]
        @existing += 1
        next
      end

      # Download the full message body from the source folder.
      ds "Downloading message #{mid}..."
      msg = source.uid_fetch(data.attr['UID'], ['RFC822', 'FLAGS',
          'INTERNALDATE']).first

      # Append the message to the destination folder, preserving flags and
      # internal timestamp.
      dd "Storing message #{mid}..."

      tries = 0

      begin
        tries += 1
        dest.append(dest_folder, msg.attr['RFC822'], msg.attr['FLAGS'],
            msg.attr['INTERNALDATE'])

        @synced += 1
      rescue Net::IMAP::NoResponseError => ex
        if tries < 10
          dd "Error: #{ex.message}. Retrying..."
          sleep 1 * tries
          retry
        else
          @failures += 1
          dd "Error: #{ex.message}. Tried and failed #{tries} times; giving up on this message."
        end
      end
    end
  end

  source.close
  dest.close
end

puts "Finished. Message counts: #{@existing} untouched, #{@synced} transferred, #{@failures} failures."
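
The "run it again and it'll pick up where it left off" behaviour comes from the Message-ID lookup: anything whose Message-ID is already in the destination folder gets skipped. Here's a minimal sketch of just that step, using plain arrays in place of the two servers. The `messages_to_copy` helper and the sample IDs are made up for illustration and aren't part of the script above.

```ruby
require 'set'

# Hypothetical helper: given the Message-IDs found on the source and on the
# destination, return the source IDs that still need to be copied.
def messages_to_copy(source_ids, dest_ids)
  seen = Set.new(dest_ids)
  source_ids.reject { |mid| seen.include?(mid) }
end

source_ids = ['<a@example.com>', '<b@example.com>', '<c@example.com>']
dest_ids   = ['<a@example.com>'] # left over from an interrupted run

pending = messages_to_copy(source_ids, dest_ids)
puts "#{pending.length} of #{source_ids.length} messages still to copy"
```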

Update: Now includes Steve K’s patch to fix BadResponseError exceptions. Thanks Steve!

Update (2009-03-02): Brought the script up to date with several bug fixes and enhancements (including those contributed in comments below). Thanks everyone!


Comments

Hi,

Nice script! Is there a way to get it to iterate through all folders automatically? I have quite a large number of folders and I would prefer not to have to list them all out. The other question is: if I do have to list them all out, how does one specify nested folders in your structure? Unfortunately I am no programmer, so help in its simplest form would be great!

Thanks again

Drumbo
Thursday October 25, 2007 @ 01:10 AM (PDT)

Hi, I received the errors below. I have ruby-1.8.6 and cyrus-imapd-2.3.9.

$ imapGmail
[localhost] connecting...
[localhost] logging in...
[imap.gmail.com] connecting...
[imap.gmail.com] logging in...
[localhost] selecting folder 'INBOX.HUP'...
[imap.gmail.com] selecting folder 'INBOX.HUP'...
[imap.gmail.com] analyzing existing messages...
/usr/local/lib/ruby/1.8/net/imap.rb:982:in `pick_up_tagged_response': Could not parse command (Net::IMAP::BadResponseError)
        from /usr/local/lib/ruby/1.8/net/imap.rb:973:in `get_tagged_response'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1031:in `send_command'
        from /usr/local/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1016:in `send_command'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1169:in `fetch_internal'
        from /usr/local/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1167:in `fetch_internal'
        from /usr/local/lib/ruby/1.8/net/imap.rb:716:in `uid_fetch'
        from /home/gyula/bin/imapGmail:83
        from /home/gyula/bin/imapGmail:53:in `each'
        from /home/gyula/bin/imapGmail:53

Gyula Blanka
Saturday October 27, 2007 @ 06:38 AM (PDT)

Hello,

imapsync 1.223 was buggy with dates and --syncinternaldates, 1.219 wasn't, 1.233 isn't.

Your Ruby code is nice. Is it GPL? Can I make a reference to it in the imapsync distribution?

Tuesday October 30, 2007 @ 06:21 AM (PDT)

Aha! I was indeed using version 1.223. Thanks.

Please consider this code public domain (and unsupported). You're more than welcome to refer to it if you'd like.

Tuesday October 30, 2007 @ 11:15 AM (PDT)

Man, I hate to use you as a source of tech support, but I just can't figure this out for myself, being a PHP guy and not a Ruby guy. I managed to synch my inbox just fine, but I'm having no luck with any other folders. For instance, this mapping:

imap/Receipts => Receipts

spits out the following error:

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/imap.rb:982:in `pick_up_tagged_response': Could not parse command (Net::IMAP::BadResponseError)

on my Leopard install. (It goes on from there, but I've clipped it for brevity's sake.) I'm under the impression that Leopard has a brilliant default Ruby setup, so I imagine that everything's A-OK there. Might you -- or some other commenter -- be able to point me in the right direction to figure out how to fix this?

Thursday November 01, 2007 @ 04:26 PM (PDT)

I too am getting the same thing. It seems to happen when the inbox is a hierarchical one. My inbox structure looks like this:

INBOX
INBOX.folder1
INBOX.folder2
INBOX.folder3

and so on, but only the main INBOX is migrated. The full output is below.

Anyone figured this out? Thanks

[imap.gmail.com] selecting folder 'INBOX.DHA'...
[imap.gmail.com] folder not found; creating...
[imap.gmail.com] analyzing existing messages...
/usr/local/lib/ruby/1.8/net/imap.rb:982:in `pick_up_tagged_response': Could not parse command (Net::IMAP::BadResponseError)
        from /usr/local/lib/ruby/1.8/net/imap.rb:973:in `get_tagged_response'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1031:in `send_command'
        from /usr/local/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1016:in `send_command'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1169:in `fetch_internal'
        from /usr/local/lib/ruby/1.8/monitor.rb:242:in `synchronize'
        from /usr/local/lib/ruby/1.8/net/imap.rb:1167:in `fetch_internal'
        from /usr/local/lib/ruby/1.8/net/imap.rb:716:in `uid_fetch'
        from ./rubyimapsync:79
        from ./rubyimapsync:50:in `each'
        from ./rubyimapsync:50

mp
Friday November 02, 2007 @ 08:13 AM (PDT)

Hey guys, I don't mean to be rude, but you're on your own when it comes to debugging these problems. I wrote the script to meet my own needs and shared it in the hope that other people might find it useful, but I don't have the time or the inclination to test it on every IMAP server and every folder configuration you might want to use it with.

If you can't get this script working and don't want to debug it yourself, I recommend you use the latest version of imapsync, which is far more robust and doesn't have the date problem I experienced in an earlier version.

Friday November 02, 2007 @ 09:58 AM (PDT)

A huge thank you to you for this code. It's the first imap copy tool I've found that does just what I need.

I was having the same problem as the people above with the BadResponseError exception. I finally tracked it down: the script assumes there is at least one message in every source and destination folder. If you hand an empty uid list to uid_fetch, the Ruby imap library happily issues a bogus EXAMINE request on your behalf, and then we get this exception. I have a patch to fix it; your blog is eating it as I'm trying to paste it in here though, so you can grab it here. Note I did diff -ub so the indentation/whitespace changes aren't generating tons of diff output.
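
The fix described here comes down to checking for an empty UID list before fetching. A minimal sketch of that guard follows. The `safe_uid_fetch` wrapper is a hypothetical name, and `FakeServer` is a stand-in for a real `Net::IMAP` connection so the example runs without a mail server.

```ruby
# Stand-in for a Net::IMAP connection so the sketch runs offline; it only
# records what it was asked to fetch.
class FakeServer
  attr_reader :requests

  def initialize
    @requests = []
  end

  def uid_fetch(uids, attrs)
    @requests << uids
    uids.map { |uid| { uid: uid, attrs: attrs } }
  end
end

# Hypothetical wrapper: never send a fetch for an empty UID list.
def safe_uid_fetch(server, uids, attrs)
  return [] if uids.nil? || uids.empty?
  server.uid_fetch(uids, attrs)
end

server = FakeServer.new
safe_uid_fetch(server, [], ['ENVELOPE'])        # empty folder: nothing is sent
safe_uid_fetch(server, [1, 2, 3], ['ENVELOPE']) # normal case: one request
puts "requests sent: #{server.requests.length}"
```

The script in the post does the same thing inline with `if uids.length > 0`.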

Finally I have my mail copying up to Gmail. Many thanks again!

Steve K
Friday November 02, 2007 @ 05:44 PM (PDT)

Thanks for the patch! I've updated the post to include your changes.

Friday November 02, 2007 @ 06:23 PM (PDT)

That was precisely the solution that I was hoping somebody would provide, Steve. Thank you for providing that patch, and thank you for working it into your original Ruby script, wonko.

Friday November 02, 2007 @ 07:25 PM (PDT)

Thank you for writing the script, and to Steve for the helpful patch. Unfortunately I am still getting errors on several other folders. Does anyone have any thoughts on these:

[mail.stmintz.com] selecting folder 'INBOX.Personal.Parents'...
[imap.gmail.com] selecting folder 'old-parents'...
[imap.gmail.com] analyzing existing messages...
c:/ruby/lib/ruby/1.8/net/imap.rb:949:in `receive_responses': * BYE [ALERT] Fatal error: max atom size too small: No such file or directory (Net::IMAP::ByeResponseError)
        from c:/ruby/lib/ruby/1.8/monitor.rb:238:in `synchronize'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:933:in `receive_responses'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:918:in `initialize'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:917:in `start'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:917:in `initialize'
        from C:/ruby/samples/gmail_imap_import.rb:44:in `new'
        from C:/ruby/samples/gmail_imap_import.rb:44
[mail.stmintz.com] selecting folder 'INBOX.Personal.Mailing-List-Crafty'...
[imap.gmail.com] selecting folder 'old-crafty-list'...
[imap.gmail.com] analyzing existing messages...
c:/ruby/lib/ruby/1.8/openssl/buffering.rb:178:in `syswrite': An existing connection was forcibly closed by the remote host. (Errno::ECONNRESET)
        from c:/ruby/lib/ruby/1.8/openssl/buffering.rb:178:in `do_write'
        from c:/ruby/lib/ruby/1.8/openssl/buffering.rb:219:in `print'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:1042:in `put_string'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:1016:in `send_command'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:1015:in `each'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:1015:in `send_command'
        from c:/ruby/lib/ruby/1.8/monitor.rb:238:in `synchronize'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:1012:in `send_command'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:1165:in `fetch_internal'
        from c:/ruby/lib/ruby/1.8/monitor.rb:238:in `synchronize'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:1163:in `fetch_internal'
        from c:/ruby/lib/ruby/1.8/net/imap.rb:712:in `uid_fetch'
        from C:/ruby/samples/gmail_imap_import.rb:95
        from C:/ruby/samples/gmail_imap_import.rb:56:in `each'
        from C:/ruby/samples/gmail_imap_import.rb:56

Thanks again.

Sean

Sunday November 04, 2007 @ 08:48 AM (PST)

Do the errors also occur on folders without hyphens in the name? I wonder if that has anything to do with it.

Sunday November 04, 2007 @ 09:07 AM (PST)

These work:

INBOX.old-inbox to old-inbox, INBOX.Personal.Livejournal to Livejournal, INBOX.Personal.NYU to old-nyu, INBOX.Personal.UH to old-uh

These don't work:

INBOX.Personal.Parents to old-parents, INBOX.Personal.Mailing-List-Crafty to old-crafty-list, INBOX.Sent to old-sent

Sean

Sunday November 04, 2007 @ 02:40 PM (PST)

On the folders that fail, I only get this error message on Linux:

[imap.gmail.com] analyzing existing messages...
/usr/lib/ruby/1.8/net/imap.rb:949:in `receive_responses': * BYE [ALERT] Fatal error: max atom size too small: File exists (Net::IMAP::ByeResponseError)
        from /usr/lib/ruby/1.8/monitor.rb:238:in `synchronize'
        from /usr/lib/ruby/1.8/net/imap.rb:933:in `receive_responses'
        from /usr/lib/ruby/1.8/net/imap.rb:918:in `initialize'
        from /usr/lib/ruby/1.8/net/imap.rb:917:in `start'
        from /usr/lib/ruby/1.8/net/imap.rb:917:in `initialize'
        from gmail_imap_import.rb:46:in `new'
        from gmail_imap_import.rb:46

Sean

Monday November 05, 2007 @ 07:41 AM (PST)

wonko, thanks for merging in that patch. My copy finished up without errors and I'm cut over completely to gmail imap now.

Saw this thing from Sean and couldn't stop myself from butting in... Sean, do those failing folders have a lot of messages in them? Googling for "max atom size too small" suggests it's a request size limit with (older?) Courier servers. You could try modifying the Ruby code here to break the lookups into chunks, or an easier, low-tech solution might be to break those folders up into several smaller folders using your favorite GUI mail app. Then try the script again.

Steve K
Monday November 05, 2007 @ 01:04 PM (PST)

Yes, I figured that as well. These are large folders and I am splitting them up as we speak.

Thanks

Sean

Monday November 05, 2007 @ 01:18 PM (PST)

For the benefit of others, splitting folders into chunks of roughly 2500 messages works without any problems.

Monday November 05, 2007 @ 07:38 PM (PST)

Hey,

I'm using this script to try and migrate a fairly large set of mailboxes from Dovecot to Gmail. I know the script is unsupported but there seems to be some interest in the comments here so I'm hoping somebody else might have run into this problem and have some suggestions.

I'm able to copy some of my smaller folders but when I try and copy larger folders I get this error:

/usr/lib/ruby/1.9/net/imap.rb:996:in `raise': Unable to append message to folder (Failure) (Net::IMAP::NoResponseError)
        from /usr/lib/ruby/1.9/net/imap.rb:996:in `get_tagged_response'
        from /usr/lib/ruby/1.9/net/imap.rb:1047:in `block in send_command'
        from /usr/lib/ruby/1.9/monitor.rb:190:in `mon_synchronize'
        from /usr/lib/ruby/1.9/net/imap.rb:1032:in `send_command'
        from /usr/lib/ruby/1.9/net/imap.rb:632:in `append'
        from /home/adam/bin/imap-to-gmail.rb:110:in `block (2 levels) in <main>'
        from /home/adam/bin/imap-to-gmail.rb:96:in `each'
        from /home/adam/bin/imap-to-gmail.rb:96:in `block in <main>'
        from /home/adam/bin/imap-to-gmail.rb:57:in `each'
        from /home/adam/bin/imap-to-gmail.rb:57:in `<main>'

Normally I can simply run the command again and it will make it a bit farther so I guess I could put it in a loop until it made it all the way through but that seems a little ugly :-)

When I was playing with imapsync I saw my account lose IMAP access a couple of times. I'm half wondering if Google is seeing me pound their IMAP service with sync requests and blocking me (though it always seems to come back an hour or so later).

Adam.

Tuesday November 06, 2007 @ 10:22 PM (PST)

FWIW ... if anyone is curious here's the imapsync command line I've found which "kind of" works for migrating from a Dovecot IMAP server to Gmail.

imapsync --syncinternaldates --host1 localhost --user1 adam --password1 password  --host2 imap.gmail.com --user2 adam@example.net --password2 password --authmech2 plain --port2 993 --ssl2 --authmech1 PLAIN --authmech2 LOGIN

I say "kind of" because while it appears to complete successfully it doesn't actually seem to do a full sync of your mail (from manual tests and also running imapsync with the "--justfolders" flag and comparing the results for each account).

However for smaller mailboxes it seems to work okay.

Adam.

Tuesday November 06, 2007 @ 10:28 PM (PST)

I'd like to thank you for this script! I've been eyeing imapsync and quite frankly, that's a lot of time and software installs just to get it working (I still don't have it right) and it's installing software I normally wouldn't use on my Mac.

This script worked right off the screen clip and I didn't have to install anything (Rails works on Leopard out of the box!)

Thanks!!!!

DWY
Friday November 09, 2007 @ 05:28 PM (PST)

Thanks for the script. It was taking me forever to do this manually in Mail.app.

One small change. I was getting occasional "Unable to append message to folder" errors from Google, so I wrapped the append command (line 103) to catch the exception and retry:

# Append the message to the destination folder, preserving flags and
# internal timestamp.
dd "storing message #{mid}..."
success = false
begin
  dest.append(dest_folder, msg.attr['RFC822'], msg.attr['FLAGS'], msg.attr['INTERNALDATE'])
  success = true
rescue Net::IMAP::NoResponseError => e
  puts "Got exception: #{e.message}. Retrying..."
  sleep 1
end until success
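
The loop above retries forever if the server keeps refusing a message. A bounded variant with a growing delay might look like the sketch below. `with_retries` and its parameters are hypothetical names, not part of Net::IMAP, and the failing block is simulated so the example runs offline.

```ruby
# Hypothetical helper: run the block, retrying up to max_tries times when
# one of the listed error classes is raised, waiting a little longer each time.
def with_retries(max_tries: 5, base_delay: 1, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on => e
    raise if tries >= max_tries # out of attempts: re-raise the last error
    puts "Attempt #{tries} failed: #{e.message}. Retrying..."
    sleep base_delay * tries
    retry
  end
end

# Demo with a block that fails twice and then succeeds.
attempts = 0
result = with_retries(max_tries: 5, base_delay: 0) do
  attempts += 1
  raise 'Unable to append message to folder' if attempts < 3
  :stored
end
puts "result=#{result} after #{attempts} attempts"
```

In the script itself, the block would wrap the `dest.append(...)` call, and `on:` would presumably be `[Net::IMAP::NoResponseError]`.
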
Tuesday December 11, 2007 @ 08:59 PM (PST)

I started hitting the "Fatal error: max atom size too small" error when I tried syncing my larger folders. This happens when you try to do a uid_fetch on a large number of uids. I added a uid_fetch_block method that breaks it up into UID_BLOCK_SIZE uids (1024 worked fine for me). Here's the diff of all my changes:

--- untitled
+++ (clipboard)
@@ -15,6 +15,8 @@
 DEST_USER = 'username@gmail.com'
 DEST_PASS = 'password'
 
+UID_BLOCK_SIZE = 1024 # max number of messages to select at once
+
 # Mapping of source folders to destination folders. The key is the name of the
 # folder on the source server, the value is the name on the destination server.
 # Any folder not specified here will be ignored. If a destination folder does
@@ -33,6 +35,14 @@
    puts "[#{SOURCE_HOST}] #{message}"
 end
 
+def uid_fetch_block(server, uids, *args)
+  pos = 0
+  while pos < uids.size
+    server.uid_fetch(uids[pos, UID_BLOCK_SIZE], *args).each { |data| yield data }
+    pos += UID_BLOCK_SIZE
+  end
+end
+
 # Connect and log into both servers.
 ds 'connecting...'
 source = Net::IMAP.new(SOURCE_HOST, SOURCE_PORT, SOURCE_SSL)
@@ -77,16 +87,18 @@
   
   dd 'analyzing existing messages...'
   uids = dest.uid_search(['ALL'])
+  dd "found #{uids.length} messages"
   if uids.length > 0
-    dest.uid_fetch(uids, ['ENVELOPE']).each do |data|
+    uid_fetch_block(dest, uids, ['ENVELOPE']) do |data|
       dest_info[data.attr['ENVELOPE'].message_id] = true
     end
   end
   
   # Loop through all messages in the source folder.
   uids = source.uid_search(['ALL'])
+  ds "found #{uids.length} messages"
   if uids.length > 0
-    source.uid_fetch(uids, ['ENVELOPE']).each do |data|
+    uid_fetch_block(uids, ['ENVELOPE']) do |data|
       mid = data.attr['ENVELOPE'].message_id
 
       # If this message is already in the destination folder, skip it.
Wednesday December 12, 2007 @ 06:31 AM (PST)

Thanks Brad, Steve, and Wonko - It is now working quite well yae :)

one small thing: I had to add the first argument 'source' to the second call of uid_fetch_block, e.g.:
  if uids.length > 0
    uid_fetch_block(source, uids, ['ENVELOPE']) do |data|
      mid = data.attr['ENVELOPE'].message_id

not sure why this was left off of your patch - possibly formatting??

Thursday December 20, 2007 @ 09:47 AM (PST)

Hello, thanks for publishing this IMAP transfer script. I've recently moved to Google Apps and have used this to transfer ~5k emails. I have made a few modifications, mainly for handling exceptions and server timeouts. If you're interested, have a look here

Hugh

Friday January 11, 2008 @ 01:23 AM (PST)

Does this script only allow transferring messages from one IMAP folder? How can I do this recursively for existing subfolders?

Saturday January 12, 2008 @ 02:31 AM (PST)

I've added some lines to implement the following features, in order to have a more complete sync tool:

1) Make a duplicate copy of ALL the folders (labels) included in the source

2) Remove from the destination any files that are no longer in the source

3) Remove from the destination any folders (labels) that are no longer in the source

you can have a look here

Thanks a lot for writing the rest :-) Diego

Friday February 08, 2008 @ 07:55 AM (PST)

I added the following lines to get all of the IMAP server’s subfolders and map them to Gmail folders (for example, INBOX.example maps to INBOX/example). It works in my situation.

Comment out the “FOLDERS” hash:
#FOLDERS = {
# 'INBOX' => 'INBOX'
#}

and add these lines under dd 'logging in...'
dest.login(DEST_USER, DEST_PASS)

### Added lines ##
FOLDERS = Hash.new
OLDLIST = source.list("", "*")
OLDLIST.each { |i|
  $NEWGMAIL_FOLDER = i.name
  $NEWGMAIL_FOLDER = $NEWGMAIL_FOLDER.gsub(/[.]/, '/')
  FOLDERS[i.name] = $NEWGMAIL_FOLDER
}
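
The same mapping can be written so it works in either direction (dot to slash or slash to dot). This is only a sketch: `map_folder_name` and `build_folder_map` are hypothetical helpers, and the folder names are passed in as plain strings so it runs without a server. With a real connection, each entry returned by `source.list("", "*")` has a `name` and a `delim`, so the source delimiter could be read from the server instead of being hard-coded.

```ruby
# Hypothetical helper: translate a folder path from the source server's
# hierarchy delimiter to the destination's.
def map_folder_name(name, source_delim, dest_delim)
  return name if source_delim.nil? || source_delim == dest_delim
  name.split(source_delim).join(dest_delim)
end

# Build a FOLDERS-style hash from a list of source folder names.
def build_folder_map(names, source_delim, dest_delim)
  names.each_with_object({}) do |name, map|
    map[name] = map_folder_name(name, source_delim, dest_delim)
  end
end

folders = build_folder_map(['INBOX', 'INBOX.Personal.Parents'], '.', '/')
folders.each { |src, dst| puts "#{src} => #{dst}" }
```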

Jack Ho
Friday February 22, 2008 @ 02:38 AM (PST)

Hi guys,

A solution for the “max atom size too small” problem is to require the ‘enumerator’ stdlib and to change the fetch iteration to:

uids.each_slice(1000) do |slice|
  source.uid_fetch(slice, ['ENVELOPE']).each do |data|

Note: I used slices of 1000 emails, but I think it supports more than 3000.
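
For anyone who wants the slicing idea in a complete, runnable form, here is a rough sketch. `fetch_in_slices` is a hypothetical helper and `RecordingServer` is a stand-in for a `Net::IMAP` connection so it can run offline. On Ruby 1.8.7 and later, `each_slice` is available without requiring 'enumerator'.

```ruby
# Stand-in for a Net::IMAP connection; records the size of each request.
class RecordingServer
  attr_reader :batch_sizes

  def initialize
    @batch_sizes = []
  end

  def uid_fetch(uids, attrs)
    @batch_sizes << uids.length
    uids.map { |uid| { 'UID' => uid } }
  end
end

# Fetch in slices so that no single command lists more than block_size UIDs.
def fetch_in_slices(server, uids, attrs, block_size = 1000)
  uids.each_slice(block_size) do |slice|
    # A real uid_fetch may return nil when nothing matches, hence the `|| []`.
    (server.uid_fetch(slice, attrs) || []).each { |data| yield data }
  end
end

server = RecordingServer.new
count = 0
fetch_in_slices(server, (1..2500).to_a, ['ENVELOPE']) { |_data| count += 1 }
puts "#{count} messages fetched in batches of #{server.batch_sizes.inspect}"
```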

Thursday April 17, 2008 @ 05:59 AM (PDT)

Thanks for the post… really good.

I’m getting this error: c:/ruby/lib/ruby/1.8/net/imap.rb:982:in `pick_up_tagged_response': CLOSE not allowed now. (Net::IMAP::BadResponseError)

Could this be caused by the Gmail server?

Mario Ruiz
Tuesday May 27, 2008 @ 03:14 AM (PDT)

I’ve been looking for an easy way to migrate a BUNCH of mail to GMail’s IMAP server. This solved it with some quick awk and friends to generate the mailbox lists.

Thanks a bunch!

Larry Rosenman
Saturday June 14, 2008 @ 09:39 PM (PDT)

First of all, many thanks for the script.
I modified it slightly (thanks also for the patches for recursive transfer and the others; I read them all and made my own mix of them with a few small additions) and it has transferred a lot of mail.

But I have a problem: what can be done if message_id in the envelope is nil (data.attr['ENVELOPE'].message_id)? According to the RFC (http://www.faqs.org/rfcs/rfc3501.html), it may be NIL.
In that case we cannot check whether the mail is already present in the destination hash. I suppose there is another way to compute a unique message hash. Does anyone know one?

I’m very new to Ruby, and this is my first experience with it, so please explain as simply as you can.
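
One possible approach, sketched below, is to fall back to a hash of other envelope fields when the Message-ID is missing. This is a heuristic and not a guaranteed unique identifier: two different messages with the same date and subject would collide. `dedupe_key` is a hypothetical helper, and `SketchEnvelope` is a small stand-in with only the fields used here, so the example runs without a server.

```ruby
require 'digest/sha1'

# Minimal stand-in for the envelope object; only the fields used here.
SketchEnvelope = Struct.new(:message_id, :date, :subject)

# Hypothetical helper: use the Message-ID when present, otherwise derive a
# key from the date and subject.
def dedupe_key(envelope)
  mid = envelope.message_id
  return mid unless mid.nil? || mid.strip.empty?
  'fallback:' + Digest::SHA1.hexdigest("#{envelope.date}\n#{envelope.subject}")
end

with_id    = SketchEnvelope.new('<abc@example.com>', 'Mon, 4 Aug 2008 10:00:00 +0000', 'Hello')
without_id = SketchEnvelope.new(nil, 'Mon, 4 Aug 2008 10:00:00 +0000', 'Hello')

puts dedupe_key(with_id)
puts dedupe_key(without_id)
```

In the script, the idea would be to use such a key in place of `message_id` both when filling `dest_info` and when checking source messages.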

Tuesday August 05, 2008 @ 06:21 AM (PDT)

great script. Worked like a dream!

Thanks a lot. Saved me a bunch of work!

Snorre Gylterud
Tuesday August 19, 2008 @ 08:01 PM (PDT)

I am trying to sync gmail to my local server as a backup. it seems if you have subfolders that this script chokes up…

here is an example output….

[imap.gmail.com] selecting folder 'example/sub folder'...
[localhost] selecting folder 'example/sub folder'...
[localhost] folder not found; creating...
[localhost] error: could not create folder: Invalid mailbox name
[localhost] /usr/lib/ruby/1.8/net/imap.rb:972:in `get_tagged_response'
/usr/lib/ruby/1.8/net/imap.rb:1023:in `send_command'
/usr/lib/ruby/1.8/monitor.rb:238:in `synchronize'
/usr/lib/ruby/1.8/net/imap.rb:1008:in `send_command'
/usr/lib/ruby/1.8/net/imap.rb:414:in `create'

anyone have a fix?

David
Thursday August 21, 2008 @ 10:22 PM (PDT)

fyi: my local server is CentOS 5.x using dovecot imap server with ~/Maildir configured, not mbox.

David
Thursday August 21, 2008 @ 10:23 PM (PDT)

I figured it out. dovecot stores sub dir separators as a ., not a slash. I reversed the code above with the gsub()

a note, gmail is my SOURCE, dovecot is my DEST

FOLDERS = Hash.new
OLDLIST = source.list("", "*")
OLDLIST.each { |i|
  $NEWGMAIL_FOLDER = i.name
  $NEWGMAIL_FOLDER = $NEWGMAIL_FOLDER.gsub(/\//, '.')
  FOLDERS[i.name] = $NEWGMAIL_FOLDER
}

David
Thursday August 21, 2008 @ 11:23 PM (PDT)

Does anyone know how to modify the script to mark the messages as unread in Gmail?
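
One thing that might work, untested against Gmail: strip the \Seen flag from the flag list before appending, since a message stored without \Seen should show up as unread. The sketch assumes Net::IMAP reports system flags as symbols such as `:Seen`; `unread_flags` is a hypothetical helper name.

```ruby
# Hypothetical helper: return the flag list without the \Seen flag, so a
# message appended with these flags would show up as unread.
def unread_flags(flags)
  (flags || []).reject { |flag| flag == :Seen }
end

puts unread_flags([:Seen, :Answered]).inspect
```

In the script this would mean passing `unread_flags(msg.attr['FLAGS'])` to `dest.append` instead of `msg.attr['FLAGS']`.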

Panderson
Friday September 05, 2008 @ 05:36 PM (PDT)

Hey, it looks like there has been quite a bit of collaboration put in to revise the original script. I’m curious if there is a copy of the script with all of the patches/tweaks? Has the version in the main part of the posting been updated with the improvements? Otherwise, I’m curious if somebody can share a copy of the patched version. I’ve never used Ruby before, but I imagine I could work my way through putting the patches into the script, but no reason to re-invent if you guys are willing to share.
Thanks.

Friday September 12, 2008 @ 08:56 AM (PDT)

I’ve applied all the above patches and cleaned up the code a bit. The script works like a charm thanks to the original and all subsequent contributors.

#!/usr/bin/env ruby
require 'net/imap'

# Source server connection info.
SOURCE_HOST = 'mail.example.com'
SOURCE_PORT = 993
SOURCE_SSL  = true
SOURCE_USER = 'username'
SOURCE_PASS = 'password'

# Destination server connection info.
DEST_HOST = 'imap.gmail.com'
DEST_PORT = 993
DEST_SSL  = true
DEST_USER = 'user@gmail.com'
DEST_PASS = 'password'

UID_BLOCK_SIZE = 1024 # max number of messages to select at once

# Mapping of source folders to destination folders. The key is the name of the
# folder on the source server, the value is the name on the destination server.
# Any folder not specified here will be ignored. If a destination folder does
# not exist, it will be created.
FOLDERS = {
  'INBOX' => 'INBOX',
  #'sourcefolder' => 'gmailfolder'
}

# Utility methods.
def dd(message)
   puts "[#{DEST_HOST}] #{message}"
end

def ds(message)
   puts "[#{SOURCE_HOST}] #{message}"
end

def uid_fetch_block(server, uids, *args)
  pos = 0
  while pos < uids.size
    server.uid_fetch(uids[pos, UID_BLOCK_SIZE], *args).each { |data| yield data }
    pos += UID_BLOCK_SIZE
  end
end

# Connect and log into both servers.
ds 'connecting...'
source = Net::IMAP.new(SOURCE_HOST, SOURCE_PORT, SOURCE_SSL)

ds 'logging in...'
source.login(SOURCE_USER, SOURCE_PASS)

dd 'connecting...'
dest = Net::IMAP.new(DEST_HOST, DEST_PORT, DEST_SSL)

dd 'logging in...'
dest.login(DEST_USER, DEST_PASS)

# Loop through folders and copy messages.
FOLDERS.each do |source_folder, dest_folder|
  # Open source folder in read-only mode.
  begin
    ds "selecting folder '#{source_folder}'..."
    source.examine(source_folder)
  rescue => e
    ds "error: select failed: #{e}"
    next
  end
  
  # Open (or create) destination folder in read-write mode.
  begin
    dd "selecting folder '#{dest_folder}'..."
    dest.select(dest_folder)
  rescue => e
    begin
      dd "folder not found; creating..."
      dest.create(dest_folder)
      dest.select(dest_folder)
    rescue => ee
      dd "error: could not create folder: #{ee}"
      next
    end
  end
  
  # Build a lookup hash of all message ids present in the destination folder.
  dest_info = {}
  
  dd 'analyzing existing messages...'
  uids = dest.uid_search(['ALL'])
  dd "found #{uids.length} messages"
  if uids.length > 0
    uid_fetch_block(dest, uids, ['ENVELOPE']) do |data|
      dest_info[data.attr['ENVELOPE'].message_id] = true
    end
  end
  
  # Loop through all messages in the source folder.
  uids = source.uid_search(['ALL'])
  ds "found #{uids.length} messages"
  if uids.length > 0
    uid_fetch_block(source, uids, ['ENVELOPE']) do |data|
      mid = data.attr['ENVELOPE'].message_id

      # If this message is already in the destination folder, skip it.
      next if dest_info[mid]
    
      # Download the full message body from the source folder.
      ds "downloading message #{mid}..."
      msg = source.uid_fetch(data.attr['UID'], ['RFC822', 'FLAGS',
          'INTERNALDATE']).first
    
      # Append the message to the destination folder, preserving flags and
      # internal timestamp.
      dd "storing message #{mid}..."
      success = false
      begin
        dest.append(dest_folder, msg.attr['RFC822'], msg.attr['FLAGS'], msg.attr['INTERNALDATE'])
        success = true
      rescue Net::IMAP::NoResponseError => e
        puts "Got exception: #{e.message}. Retrying..."
        sleep 1
      end until success

      dest.append(dest_folder, msg.attr['RFC822'], msg.attr['FLAGS'],
          msg.attr['INTERNALDATE'])
    end
  end
  
  source.close
  dest.close
end

puts 'done'
Ryan
Thursday October 23, 2008 @ 02:30 PM (PDT)

Hi Ryan,

Just wanted to drop by and say thanks for this script, it helped me merge my 3 errant IMAP inboxes (and respective folders) into one gmail inbox of goodness. I had to lower the max selected to 512 as 1024 was giving me problems (frequent disconnects, then having to rerun the script)

I made several copies of the script and then ran them all at once (multi-threading, woohoo!) as I noticed that running one script at a time was quite inefficient (it only uses 5kb/s on average, and I have a 100kb/s upload). This, combined with the 512 limit, allowed me to run them unattended.

You should submit this to lifehacker as it seems like something that would be published there.

Thanks again!

Tuesday December 02, 2008 @ 11:00 AM (PST)

I love the script, but I think Ryan accidentally included the following twice:

dest.append(dest_folder, msg.attr['RFC822'], msg.attr['FLAGS'], msg.attr['INTERNALDATE'])

If the script fails on the second dest.append it won’t retry, and it’s also a waste to upload the message twice…
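
A minimal sketch of how the retry could be bounded, so that a message that keeps failing does not stall the whole run. The helper name `with_retries` and its parameters are hypothetical and not part of the script above:

```ruby
# Hypothetical helper: run the block, retrying when it raises error_class.
# Gives up and re-raises the last error after max_tries attempts.
def with_retries(max_tries: 5, delay: 1, error_class: StandardError)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue error_class => e
    raise if tries >= max_tries
    puts "Got exception: #{e.message}. Retrying (#{tries}/#{max_tries})..."
    sleep delay
    retry
  end
end

# Usage inside the copy loop (needs require 'net/imap', as in the script):
#   with_retries(error_class: Net::IMAP::NoResponseError) do
#     dest.append(dest_folder, msg.attr['RFC822'], msg.attr['FLAGS'],
#         msg.attr['INTERNALDATE'])
#   end
```

With this shape the single `dest.append` call lives inside the block, so there is no second, unprotected upload.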

Karl Rosenbaum
Wednesday December 17, 2008 @ 12:34 PM (PST)

Ryan (et al),

Thanks for this script. Due to my good searching skills but complete lack of Ruby (or any) programming knowledge, I think I’m now running the script to clear out one Gmail account and dump the messages to another Gmail account.

I’m sure I’m doing something fantastically wrong, though, because it’s very slow.

I downloaded Ruby, ran SciTE, pasted the code, updated my server and folder info, and then hit ‘Go’. I chose ‘Go’ because things like ‘Compile’ were greyed out.

I don’t need a whole Ruby tutorial for just this one project, but if this is so absurdly wrong that it offends your sensibilities, please gimme a nudge in the right direction!

Greg J
Saturday January 03, 2009 @ 01:01 PM (PST)

The best way to run this script is from the command line, since that will allow you to see its output as it copies messages. Open a command prompt (on Windows, click the Start menu, then “Run…” and type cmd), then type ruby <path to script>.

Saturday January 03, 2009 @ 02:36 PM (PST)

Hi

Thanks a lot for your script, it helped me a lot. I did some checks regarding Gmail’s support for IMAP keywords: as far as I can tell, they are supported.

Based on your script, here’s another one in Ruby that some people might find useful (running as a cronjob, though that’s not optimal):

#!/usr/bin/env ruby

# I'm discovering Ruby...
# Many thanks to http://wonko.com/post/ruby_script_to_sync_email_from_any_imap_server_to_gmail
# on which this script is largely based

require 'net/imap'

# Gmail server connection info.
SOURCE_HOST = 'imap.gmail.com'
SOURCE_PORT = 993
SOURCE_SSL  = true
SOURCE_USER = 'USERNAME@gmail.com'
SOURCE_PASS = 'MY_GMAIL_PASSWORD'

# Mapping of folders to the label to apply
# For flexibility, this has to be defined by the user
# People with "Google mail" rather than "Gmail",
# people who don't want to put IMAP keywords on every Gmail label,
# people who want to set a different keyword than the Gmail label,
# this is for you.
# 
GMAILLABEL2IMAPKEYWORD = {
# Eg:
# '[Gmail]/Spam' => 'Spam',
  'MY_GMAIL_LABEL' => 'MY_IMAP_KEYWORD'
}

# Utility methods.
def ds(message)
   puts "[#{SOURCE_HOST}] #{message}"
end

# Connect and log into the server.
ds 'Connecting...'
source = Net::IMAP.new(SOURCE_HOST, SOURCE_PORT, SOURCE_SSL)

ds 'Logging in...'
source.login(SOURCE_USER, SOURCE_PASS)

# Loop through the labels and set the keyword on their messages.
GMAILLABEL2IMAPKEYWORD.each do |gmaillabel, imapkeyword|
  # Open source folder in read-write mode.
  begin
    ds "Selecting folder '#{gmaillabel}'..."
    source.select(gmaillabel)
  rescue => e
    ds "Error: select failed: #{e}"
    next
  end
  
  # Find all messages in the folder.
  ds "Setting the '#{imapkeyword}' keyword on existing messages..."
  uids = source.uid_search(['ALL'])

  # Set the specified keyword. Skip empty folders, since STORE needs at
  # least one UID.
  source.uid_store(uids, '+FLAGS', [imapkeyword]) unless uids.empty?
  
  source.close
end

ds 'Successfully completed'

Sylvain
Saturday January 03, 2009 @ 03:14 PM (PST)

Thanks, Ryan. Better yet, I found out accidentally that I can also just double-click on the script (if I save it with an .rb extension) and it’ll open its own DOS window!

Now I’ve got to figure out how to edit the script so my ‘Sent Mail’ folders can be copied…

Greg J
Saturday January 03, 2009 @ 05:51 PM (PST)

Ha. Turns out it’s simple:

'[Gmail]/Sent Mail' => '[Gmail]/Sent Mail'
Greg J
Saturday January 03, 2009 @ 05:54 PM (PST)

thanks for this fantastic script!

i’ve successfully used it for a number of smaller folders, but am now attempting to use it for two folders with 8000 and 17000 messages. when i run this i get a “word too long” error, which i believe is a problem with the imap server not allowing such a large request.

if absolutely necessary, i figure i can make some smaller folders and then transfer those. wondering, though, if there’s an easier/cleaner way to do this. i believe there are other possibilities for the uid_search instead of ['ALL'] that might be more successful, as documented here:

http://www.ruby-doc.org/stdlib/libdoc/net/imap/rdoc/classes/Net/IMAP.html

i’d be grateful for any suggestions. thanks.

[imap.gmail.com] analyzing existing messages...
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/imap (Net::IMAP::ByeResponseError): * BYE Fatal error: word too long
	from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/monitor.rb:242:in `synchronize'
	from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/imap.rb:937:in `receive_responses'
	from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/imap.rb:922:in `initialize'
	from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/imap.rb:921:in `start'
	from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/net/imap.rb:921:in `initialize'
	from ./gmail_cp.rb:38:in `new'
	from ./gmail_cp.rb:38
Wednesday January 07, 2009 @ 03:07 PM (PST)

Have you tried the updated version of the script posted by (the other) Ryan in this comment? I haven’t tried it myself, but it may fix the “word too long” problem.

Wednesday January 07, 2009 @ 03:41 PM (PST)

ryan,

thanks for the pointer to the complete updated script — apologies that i missed it in the comments. indeed the UID_BLOCK_SIZE seems to fix the problem!

i still occasionally receive an append error, but it looks like that may be a problem on google’s end.

thanks again for the terrific script and for your support.
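
As a rough illustration of why a block size helps: presumably the server rejects a single command that lists thousands of UIDs, so the UID list is split into smaller requests. The sketch below uses `each_slice`; the helper name `each_fetched` is an assumption, and the updated script's own `uid_fetch_block` does the equivalent with an index.

```ruby
# Maximum number of messages to request in one FETCH command.
UID_BLOCK_SIZE = 512

# Hypothetical helper: fetch the given attributes for all UIDs, one block at
# a time, yielding each response to the caller.
def each_fetched(server, uids, attrs, block_size = UID_BLOCK_SIZE)
  uids.each_slice(block_size) do |chunk|
    # uid_fetch may return nil when nothing matched, so fall back to [].
    (server.uid_fetch(chunk, attrs) || []).each { |data| yield data }
  end
end

# Usage, with `dest` being a logged-in Net::IMAP connection:
#   each_fetched(dest, uids, ['ENVELOPE']) do |data|
#     dest_info[data.attr['ENVELOPE'].message_id] = true
#   end
```

A smaller block size means more round trips but shorter commands, which matches the earlier report that 512 was more stable than 1024.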

Wednesday January 07, 2009 @ 09:48 PM (PST)

Greetings Ryan,

I’ve found this script very useful. Thank you.

I ran the script over-night and saw that it was indeed syncing, but in the morning I found the message “error: select failed: Mailbox does not exist, or must be subscribed to. done”

So I wondered if something went wrong, but it seems to me like nothing went wrong and that that message in fact means “Syncing completed. Nothing further to do”. Am I right?

I’m unable to compare the number of messages in the destination with that in the source because the destination uses the concept of “conversation” which combines many messages together.

I suggest reporting the number of messages synchronised at the end of the run.

Tuesday January 13, 2009 @ 04:58 AM (PST)

Ryan,

I just modified your source to add the counter and other numerical hints while working. You can find it at http://ahmad.gharbeia.org/files/imapsync.rb

Thank you for the code, and for making me write my first Ruby code ever.

Regarding the abrupt termination I mentioned earlier, I still need to investigate this, as the counters show me that some messages are indeed not copied. This could pertain to either of the servers, however.

Regards,

Tuesday January 13, 2009 @ 09:03 AM (PST)

#!/usr/bin/env ruby

require 'net/imap'

# Source server connection info.
SOURCE_HOST = 'mail.gharbeia.org'
SOURCE_PORT = 993
SOURCE_SSL  = true
SOURCE_USER = 'transit+gharbeia.org'
SOURCE_PASS = 'password'

# Destination server connection info.
DEST_HOST = 'imap.gmail.com'
DEST_PORT = 993
DEST_SSL  = true
DEST_USER = 'ahmad.gharbeia@gmail.com'
DEST_PASS = 'password'

# Mapping of source folders to destination folders. The key is the name of the
# folder on the source server, the value is the name on the destination server.
# Any folder not specified here will be ignored. If a destination folder does
# not exist, it will be created.
FOLDERS = {
  'Inbox.Sent' => 'Sent',
  'sourcefolder' => 'gmailfolder'
}

# Utility methods.
def dd(message)
  puts "[#{DEST_HOST}] #{message}"
end

def ds(message)
  puts "[#{SOURCE_HOST}] #{message}"
end

# Connect and log into both servers.
ds 'connecting...'
source = Net::IMAP.new(SOURCE_HOST, SOURCE_PORT, SOURCE_SSL)

ds 'logging in...'
source.login(SOURCE_USER, SOURCE_PASS)

dd 'connecting...'
dest = Net::IMAP.new(DEST_HOST, DEST_PORT, DEST_SSL)

dd 'logging in...'
dest.login(DEST_USER, DEST_PASS)

# Loop through folders and copy messages.
FOLDERS.each do |source_folder, dest_folder|
  # Open source folder in read-only mode.
  begin
    ds "selecting folder '#{source_folder}'..."
    source.examine(source_folder)
  rescue => e
    ds "error: select failed: #{e}"
    next
  end

  # Open (or create) destination folder in read-write mode.
  begin
    dd "selecting folder '#{dest_folder}'..."
    dest.select(dest_folder)
  rescue => e
    begin
      dd "folder not found; creating..."
      dest.create(dest_folder)
      dest.select(dest_folder)
    rescue => ee
      dd "error: could not create folder: #{ee}"
      next
    end
  end

  # Build a lookup hash of all message ids present in the destination folder.
  dest_info = {}

  dd 'analyzing existing messages...'
  uids = dest.uid_search(['ALL'])
  if uids.length > 0
    dest.uid_fetch(uids, ['ENVELOPE']).each do |data|
      dest_info[data.attr['ENVELOPE'].message_id] = true
    end
  end
  dd "found #{uids.length} messages"

  # Loop through all messages in the source folder.
  uids = source.uid_search(['ALL'])
  ds "found #{uids.length} messages"

  # Count copied messages; start at zero so empty folders report 0.
  copy_counter = 0
  if uids.length > 0
    puts 'start of copying...'
    source.uid_fetch(uids, ['ENVELOPE']).each do |data|
      mid = data.attr['ENVELOPE'].message_id

      # If this message is already in the destination folder, skip it.
      next if dest_info[mid]

      # Download the full message body from the source folder.
      ds "downloading message #{mid}..."
      msg = source.uid_fetch(data.attr['UID'], ['RFC822', 'FLAGS',
          'INTERNALDATE']).first

      # Append the message to the destination folder, preserving flags and
      # internal timestamp.
      dd "storing message #{mid}..."
      dest.append(dest_folder, msg.attr['RFC822'], msg.attr['FLAGS'],
          msg.attr['INTERNALDATE'])
      copy_counter = copy_counter + 1
    end
  end
  puts "copied #{copy_counter} messages"

  source.close
  dest.close
end

puts 'done'

Tuesday January 13, 2009 @ 09:17 AM (PST)

Hi,

I’m using the updated script posted in the comment above by Ryan and i’ve got 4 copies running copying from Google Mail to Google Mail for Domains.

All seems to be ok except for one small label (with 46 conversations) which it just doesn’t like – I get this:

[imap.gmail.com] analyzing existing messages…
[imap.gmail.com] found 0 messages
[imap.googlemail.com] found 87 messages
/usr/lib/ruby/1.8/net/imap.rb:3118:in `parse_error': unknown token - "\"Google" (Net::IMAP::ResponseParseError)
        from /usr/lib/ruby/1.8/net/imap.rb:3063:in `next_token'
        from /usr/lib/ruby/1.8/net/imap.rb:2980:in `lookahead'
        from /usr/lib/ruby/1.8/net/imap.rb:2878:in `nstring'
        from /usr/lib/ruby/1.8/net/imap.rb:2080:in `envelope'
        from /usr/lib/ruby/1.8/net/imap.rb:2067:in `envelope_data'
        from /usr/lib/ruby/1.8/net/imap.rb:2042:in `msg_att'
        from /usr/lib/ruby/1.8/net/imap.rb:2022:in `numeric_response'
        from /usr/lib/ruby/1.8/net/imap.rb:1964:in `response_untagged'
        from /usr/lib/ruby/1.8/net/imap.rb:1944:in `response'
        from /usr/lib/ruby/1.8/net/imap.rb:1870:in `parse'
        from /usr/lib/ruby/1.8/net/imap.rb:1002:in `get_response'
        from /usr/lib/ruby/1.8/net/imap.rb:926:in `receive_responses'
        from /usr/lib/ruby/1.8/net/imap.rb:919:in `initialize'
        from /usr/lib/ruby/1.8/net/imap.rb:918:in `start'
        from /usr/lib/ruby/1.8/net/imap.rb:918:in `initialize'
        from ./gmail_copy3.sh:47:in `new'
        from ./gmail_copy3.sh:47

I’ve never seen Ruby code before, but I’ve managed to put a load of debugging in and tracked it down to this line in uid_fetch_block:
server.uid_fetch(uids[pos, UID_BLOCK_SIZE], *args).each { |data| yield data }

Any ideas?

Thanks

Ian

ichilton
Sunday January 25, 2009 @ 05:12 PM (PST)

Hi,

Left 5 copies running overnight and they have all done a Segmentation Fault after about 6 hours.

I’ve set them all back off but one of them has Segmentation Faulted quite quickly.

Anyone else had similar problems?

Thanks

Ian

Ian Chilton
Monday January 26, 2009 @ 01:15 AM (PST)

Further to the message I posted above about the strange label which stops the script from running – I tried the original script from the original posting and that does exactly the same. I have however tracked it back to a single message causing the problem – it’s an old mail from Google with a subject of “Google Analytics New Version”, but when I view the message source, it’s in there as:
Subject: =?ISO-8859-1?Q?Google=20Analytics=20New=20Version=0A?=

Anyone any ideas how to improve the script to cope with that mail? – i’ve deleted it but there could well be others similar in there before i’m finished (only about a third of the way through ~80,000 messages).

The line it’s stopping on in the original script is:
source.uid_fetch(uids, ['ENVELOPE']).each do |data|

Thanks

Ian
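
A small sketch that decodes the Subject header quoted above, to show what is unusual about it. The helper `decode_q_word` is hypothetical and only handles a single "Q"-encoded word; it relies on `String#unpack('M')` for the quoted-printable part.

```ruby
# Hypothetical helper: decode one RFC 2047 "Q" encoded word such as
# "=?ISO-8859-1?Q?...?=". Anything else is returned unchanged.
def decode_q_word(encoded_word)
  match = encoded_word.match(/\A=\?([^?]+)\?[Qq]\?([^?]*)\?=\z/)
  return encoded_word unless match

  charset = match[1]
  payload = match[2]
  # In "Q" encoding "_" stands for a space; the rest is quoted-printable.
  payload.tr('_', ' ').unpack('M').first.force_encoding(charset)
end

subject = decode_q_word('=?ISO-8859-1?Q?Google=20Analytics=20New=20Version=0A?=')
puts subject.inspect
puts "ends with a line break: #{subject.end_with?("\n")}"
```

The `=0A` at the end decodes to a line feed, so the subject ends with a line break. One plausible reading of the parse error is that this line break ends up inside the ENVELOPE response, where Net::IMAP's parser does not expect it; that is a guess, not something confirmed here.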

Ian Chilton
Monday January 26, 2009 @ 01:33 AM (PST)

Hi there,

This script is great – thanks for sharing it! The only problem I have is that at the end, I get an error to do with the ‘existing’ variable. Since I know nothing of ruby, could you help me out here? The error is:

transfer_mail.ruby:155: undefined local variable or method `existing' for main:Object (NameError)

I copied and pasted the code at the top of this page into a file called transfer_mail.ruby and ran chmod +x transfer_mail.ruby and ./transfer_mail.ruby. The messages copy over great, but it just crashes on the print statement at the end. Any ideas?

Thanks!

Thomas
Thursday March 05, 2009 @ 03:24 PM (PST)

Sorry about that. Should be fixed now.

For what it’s worth, I’m working on a much more robust app for syncing IMAP mailboxes. I’ll announce it on this blog when it’s ready.

Thursday March 05, 2009 @ 04:10 PM (PST)

I’m trying to sync my mail to my gmail account… and started with the inbox… and I noticed some messages just never uploaded… Any ideas?

Ronald Vyhmeister
Thursday March 05, 2009 @ 09:00 PM (PST)

One reason may be that this script, for the sake of simplicity, assumes that all messages will have a valid Message-Id header that can be used to uniquely identify them. While this is the case for most messages, sometimes a message will be missing the header. When this happens, Gmail simply makes up a Message-Id, but most other IMAP servers don’t (and even if they did, it probably wouldn’t match Gmail’s), so this can cause problems.

Another reason could be that the message is malformed in some way that the source IMAP server doesn’t mind but that Gmail can’t handle, or perhaps it has an attachment Gmail doesn’t like.

This is why I’m working on a more robust application to perform reliable IMAP to IMAP sync operations. This script was a quick hack that I threw together to do a job that needed doing fast, but it won’t work perfectly for everyone since there are simply too many edge cases it doesn’t handle.
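
To get a feel for how many messages the Message-Id assumption affects, the envelopes can be checked before copying. This is only a sketch: `partition_by_message_id` is a hypothetical helper, and it assumes the responses come from a `uid_fetch` call that requested `ENVELOPE`, as in the script.

```ruby
# Hypothetical helper: split fetched envelope data into messages that have a
# usable Message-Id and messages that do not (nil, empty or whitespace only).
def partition_by_message_id(fetched)
  fetched.partition do |data|
    !data.attr['ENVELOPE'].message_id.to_s.strip.empty?
  end
end

# Usage, with `source` being a logged-in Net::IMAP connection:
#   with_id, without_id =
#     partition_by_message_id(source.uid_fetch(uids, ['ENVELOPE']))
#   without_id.each { |data| puts "no Message-Id: UID #{data.attr['UID']}" }
```

Messages in the second group cannot be matched against the destination by Message-Id, so they are the ones most likely to be skipped or duplicated.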

Thursday March 05, 2009 @ 10:08 PM (PST)

Hi Ryan,

Thanks for the fix! I’m running into a problem where the script hangs on a particular message:

[xxx@xxx] Downloading message <200811171532.mAHFWE6t009315@xxx>…
[yyy@yyy] Storing message <200811171532.mAHFWE6t009315@xxx>…
[xxx@xxx] Downloading message …
[yyy@yyy] Storing message …

Is there a quick fix for this?

Also, on what kind of timescale are you thinking of releasing your new app?

Thanks again!

Thomas

Thomas
Friday March 06, 2009 @ 10:06 AM (PST)

It looks like the message it’s hanging on doesn’t have a Message-Id, which is the biggest problem with this script. Making the following change should cause the script to skip the offending message and move on (caveat, I haven’t tested this).

Change this:

# If this message is already in the destination folder, skip it.
if dest_info[mid]
  @existing += 1
  next
end

To this:

# If this message has no Message-Id, skip it.
if mid.nil? || mid.to_s.strip.empty?
  ds "Skipping a message with no Message-Id"
  @failures += 1
  next
end

# If this message is already in the destination folder, skip it.
if dest_info[mid]
  @existing += 1
  next
end

I’m hoping to get the new app finished soon since I need to use it myself, but all I can tell you is that it’ll be released as soon as I get it working.

Friday March 06, 2009 @ 10:27 AM (PST)

When messages are sync’d from source to destination on Gmail my UNREAD messages show up as READ. Is this normal and if so how can I ensure the message status is transparently synchronized?

Btw, dude…awesome, yet simple script.

Steve
Friday March 06, 2009 @ 12:29 PM (PST)

I haven’t experienced this problem, but I’ve mostly tested with large sets of read messages. Message flags should be transferred properly from the source to the destination. However, if a message already exists on the destination but has different flags, the script will not modify it. I’m planning to make this an option in the upcoming app.

Friday March 06, 2009 @ 01:14 PM (PST)

Hi Ryan,

Thanks so much for the script, it’s fantastic. I do have one question though. Out of the 5k+ messages I transferred, a couple of them failed. I made sure to run the script while sending the output to a file, so I’ve got a log of the specific messages that didn’t transfer properly. My question is, how can I determine which message wasn’t transferred based on [xxx@xxx.com] Downloading message <abcdef-gh-ijklmnop@yyy.com>…?

Matt
Friday March 06, 2009 @ 02:14 PM (PST)

It’s definitely marking my UNREAD messages as READ. I did a sync between an Exchange IMAP (source) and a new Gmail account. I’ve verified via Mail.App (mac os) using IMAP that the messages are indeed UNREAD still on the source server.

Steve
Friday March 06, 2009 @ 04:13 PM (PST)

@Matt: The string between angle brackets in the script’s output is the Message-Id. If the source account happens to be Gmail, you can just paste that string into the search box to find the message. Otherwise, you’ll need to find an IMAP client that supports searching or filtering based on arbitrary headers.

I don’t think Thunderbird will search within Message-Id headers, but you may be able to create a custom filter and apply it to existing folders in order to find the message.
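
If no mail client is handy, the same lookup can be done from Ruby with the IMAP `SEARCH HEADER` key. A minimal sketch, assuming the folder has already been opened with `examine` or `select`; both helper names are hypothetical:

```ruby
# Hypothetical helper: pull the Message-Id (including the angle brackets)
# out of a line of the script's log output.
def message_id_from_log(line)
  line[/<[^<>\s]+>/]
end

# Hypothetical helper: return the UIDs of messages in the currently selected
# folder whose Message-ID header contains the given string.
def find_uids_by_message_id(server, message_id)
  server.uid_search(['HEADER', 'Message-ID', message_id])
end

# Usage, with `source` being a logged-in Net::IMAP connection:
#   source.examine('INBOX')
#   mid = message_id_from_log(log_line)
#   uids = find_uids_by_message_id(source, mid)
```

The returned UIDs can then be passed to `uid_fetch` with `['ENVELOPE']` to see the subject and sender of the message that failed.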

Friday March 06, 2009 @ 05:40 PM (PST)

@Steve: I haven’t tested this script at all with Exchange, but Exchange’s IMAP implementation is notoriously broken, so it’s possible that it may not be returning the correct message flags. Sorry. I do plan to test the new app with Exchange and work around as many of its quirks as possible, though.

Friday March 06, 2009 @ 05:42 PM (PST)

I would’ve thought a non-Microsoft IMAP client would’ve seen this problem in that case (UNREAD/READ flag not being recognized or returned correctly). Doesn’t your script read the same flags? Is there any way to turn on verbose debugging so I can see what’s happening? I’m sure this could help you with the new app.

Steve
Saturday March 07, 2009 @ 01:49 PM (PST)

If this is an Exchange issue, then there’s a good chance other clients work around it, whereas the script doesn’t.

I hope to be able to release a very early version of the new app (which I’m calling Larch) sometime tomorrow. Stay tuned. I’ll start testing it with Exchange shortly as well.

Saturday March 07, 2009 @ 11:42 PM (PST)

Anyone know if there is an easy way to modify the script for the case where the source is just Thunderbird mailboxes on local disk? In Thunderbird I set up an account with a dummy server, username, etc. Then I pointed [account]/server settings/Local directory: to my mailboxes on local disk. Any ideas?

troy
Saturday March 07, 2009 @ 11:43 PM (PST)

This script can’t read Thunderbird mailboxes. It should be possible to write a script that can read them (I believe Thunderbird uses an Mbox-like format), but I don’t plan to do it.
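
For anyone who wants to try it anyway, here is a rough sketch of the first step: splitting mbox-style text into individual raw messages. It assumes every message starts with a separator line beginning with `From ` and that such lines inside message bodies were escaped by whatever wrote the file; `split_mbox` is a hypothetical helper and has not been tested against real Thunderbird files.

```ruby
# Hypothetical helper: split mbox-style text into raw messages. The "From "
# separator lines themselves are dropped.
def split_mbox(text)
  messages = []
  current = nil
  text.each_line do |line|
    if line.start_with?('From ')
      messages << current if current
      current = ''.dup
    elsif current
      current << line
    end
  end
  messages << current if current
  messages
end

# Usage:
#   split_mbox(File.read('Inbox')).each do |raw|
#     dest.append('gmailfolder', raw)
#   end
```

Line endings may need converting to CRLF before the messages are passed to `append`, and flags and internal dates would have to be recovered from the headers separately.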

Sunday March 08, 2009 @ 12:12 AM (PST)

Some info that may help you…

1. Sync Exchange to Gmail with 1 UNREAD message → Returns FLAG “Unseen” (capital U and lowercase nseen). UNREAD status is NOT synchronized correctly. Message shows as READ on Gmail.
2. Sync Gmail to Dovecot with 1 UNREAD message → Returns no flags at all, but UNREAD status is correctly reflected in Dovecot.

Is the UNREAD/READ status coming from somewhere else possibly?

Steve
Sunday March 08, 2009 @ 07:35 AM (PDT)

Replacing RFC822 with “BODY.PEEK[]” in the source retrieval line seems to work for getting an accurate state from Exchange. Now I’m struggling with how to replace RFC822 on the append line. I’m a hack and don’t know Ruby…how would I make the appropriate change? I get an error with msg.attr['BODY.PEEK[]']. I know my syntax is the problem, just not sure what it should be. :)

Steve
Sunday March 08, 2009 @ 08:44 AM (PDT)

Got it to work! Here’s what I now have that correctly syncs READ/UNREAD from Exchange to Gmail. The key was BODY.PEEK in the FETCH and BODY in the append.

# Download the full message body from the source folder.
ds "Downloading message #{mid}..."
msg = source.uid_fetch(data.attr['UID'], ['BODY.PEEK[]', 'FLAGS',
    'INTERNALDATE']).first

# Append the message to the destination folder, preserving flags and
# internal timestamp.
dd "Storing message #{mid}..."
tries = 0
begin
  tries += 1
  dest.append(dest_folder, msg.attr['BODY[]'], msg.attr['FLAGS'],
      msg.attr['INTERNALDATE'])
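
Some background on why this works: in IMAP, fetching `RFC822` is equivalent to fetching `BODY[]`, which marks the message as seen, while `BODY.PEEK[]` returns the same data without touching the `\Seen` flag. The server labels the peeked data `BODY[]` in its response, which is why the attribute name differs between the fetch and the append. A minimal sketch that wraps this up; `fetch_message_peek` is a hypothetical helper:

```ruby
# Hypothetical helper: fetch body, flags and internal date for one UID
# without setting \Seen on the source. Returns nil if nothing came back.
def fetch_message_peek(source, uid)
  msg = (source.uid_fetch(uid, ['BODY.PEEK[]', 'FLAGS', 'INTERNALDATE']) || []).first
  return nil unless msg

  {
    :body  => msg.attr['BODY[]'],        # response key is BODY[], not BODY.PEEK[]
    :flags => msg.attr['FLAGS'],
    :date  => msg.attr['INTERNALDATE']
  }
end

# Usage inside the copy loop:
#   m = fetch_message_peek(source, data.attr['UID'])
#   dest.append(dest_folder, m[:body], m[:flags], m[:date]) if m
```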
Steve
Sunday March 08, 2009 @ 10:36 AM (PDT)

Ah, got it. Good catch; I’ll make sure this is fixed in Larch.

Sunday March 08, 2009 @ 01:01 PM (PDT)

An early pre-release version of Larch (my new IMAP syncing app) is ready for testing, if you’re interested. Source code is on GitHub, or you can install the test gem like so:

sudo gem sources -a http://gems.github.com
sudo gem install rgrove-larch

Type larch -h for an overview of the command-line options. Here’s an example that will work much like the script above:

larch --from imaps://mail.example.com --from-folder sourcefolder \
--to imaps://imap.gmail.com --to-folder gmailfolder

(you’ll be prompted for your usernames and passwords)

This test version doesn’t yet have all the features I plan to implement for the official release, but it already matches the features of the old script and should be faster and much more robust. Please try it out and let me know if you notice any problems.

Sunday March 08, 2009 @ 04:53 PM (PDT)

Larch has been released. I’d like to thank everyone for the invaluable feedback on this script, most of which has been incorporated into Larch.

As of now, please consider this script deprecated and use Larch instead.

Monday March 16, 2009 @ 11:41 PM (PDT)

thanks for the hard work on larch — it’s worked great so far for me. easier to use and more robust to failures than the original script.

Tuesday March 31, 2009 @ 06:49 PM (PDT)

Fantastic script and great coding (very easy to understand)

James Z
Monday May 04, 2009 @ 08:21 AM (PDT)

Is there a way to get it to iterate through all folders automatically? So I don’t have to list them. I think this question was asked before. This is a great script, I just need to figure out that piece. Any help would be great!
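
One possible approach, sketched under assumptions: ask the source server for its folder list with `list('', '*')` and build the `FOLDERS` hash from the result. `folder_mapping` is a hypothetical helper; it assumes the destination uses `/` as its hierarchy delimiter and skips folders the server marks as not selectable.

```ruby
# Hypothetical helper: turn a Net::IMAP LIST result into a source => dest
# folder mapping, translating the hierarchy delimiter along the way.
def folder_mapping(mailboxes, dest_delim = '/')
  mapping = {}
  (mailboxes || []).each do |box|
    # Folders flagged \Noselect cannot hold messages, so skip them.
    next if box.attr.include?(:Noselect)

    dest_name = box.delim ? box.name.split(box.delim).join(dest_delim) : box.name
    mapping[box.name] = dest_name
  end
  mapping
end

# Usage, replacing the hand-written FOLDERS hash:
#   FOLDERS = folder_mapping(source.list('', '*'))
```

Special folders such as Sent or Trash will probably still need manual entries, since their names differ between servers.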

Ned Adams
Friday May 22, 2009 @ 11:39 AM (PDT)

Larch works REALLY REALLY WELL!!

Ned Adams
Friday May 22, 2009 @ 02:18 PM (PDT)

Here is a somewhat improved version of the script. Thanks to the original author for the very good idea. Enjoy!

I am not sure how to post code perfectly, so I would assume it will not work from copy/paste.
Look through the code and fix the parts that are broken.


#!/usr/bin/env ruby

require 'getoptlong'
require 'net/imap'
require 'thread'
require 'pp'

HELP_INFO = '
 Synopsis:

   This script will allow you to migrate IMAP account(s) from one server to the other.
   It was originally created by Ryan Grove and can be found at http://wonko.com/post/ruby_script_to_sync_email_from_any_imap_server_to_gmail
   I have put in lots of my own modifications and made it more robust and flexible.
   By: Ian Matyssik (2009)

 Usage:

    IMAPmigrator [options]

 -h, --help :
    show help

 -s <server fqdn or IP>, --from-server <server fqdn or IP> :
    Specify server name you would like to migrate from.

 -p <port#>, --from-port <port#> :
    Specify port number on "from" server to connect to.
    Default: 143, 993 when --from-ssl is used.

 --from-ssl :
    Enable SSL when connecting to "from" server.

 -u <username>, --from-user <username> :
    Username to use when connecting to "from" server.

 -x <password>, --from-pass <password> :
    Password to use when connecting to "from" server.

 --from-pass-file <full path to the file with password> :
    If you do not want to show password on command line and would like to store password in the file.
    Please make sure that file contains only one password for the specified user.

 -S <server fqdn or IP>, --to-server <server fqdn or IP> :
    Specify server name you would like to migrate to.

 -P <port#>, --to-port <port#> :
    Specify port number on "to" server to connect to.
    Default: 143, 993 when --to-ssl is used.

 --to-ssl :
    Enable SSL when connecting to "to" server.

 -U <username>, --to-user <username> :
    Username to use when connecting to "to" server.

 -X <password>, --to-pass <password> :
    Password to use when connecting to "to" server.

 --to-pass-file <full path to the file with password> :
    If you do not want to show password on command line and would like to store password in the file.
    Please make sure that file contains only one password for the specified user.

 Filtering options:

 --from-prefix <string>
    If "from" server uses a prefix, please specify it here.
    Example: --from-prefix INBOX

 --from-delimiter <string>
    If "from" server uses different delimiter from the "to" server, then I suggest you to specify both delimiters: "to" and "from"
    Example: --from-delimiter "."

 --to-prefix <string>
    If "to" server uses a prefix, please specify it here.
    Example: --to-prefix INBOX

 --to-delimiter <string>
    If "to" server uses different delimiter from the "from" server, then I suggest you to specify both delimiters: "to" and "from"
    Example: --to-delimiter "/"

 --msg-since-days <number of days> :
    Specify a number of days into the past, counted from today.
    Only messages received within that many days will be analysed and transferred.

 --msg-before-days <number of days> :
    Specify a number of days into the past, counted from today.
    Only messages received more than that many days ago will be analysed and transferred.

 --accepted-flags <comma separated list of accepted flags> :
    Specify list of IMAP flags you would like to be synced in the following form:
    "Deleted,Seen,Flagged,Answered"
    Default is the following:
    "Seen,Deleted,Answered,Draft,Flagged"
    Be careful, some flags like ":Recent" will cause some servers to puke.

    Info:
    This script is not perfect by any means; please look out for mailbox names being changed. I have hard-coded a change from "/Junk" to "/Spam" in the script; if you do not need it, change it to what you need or delete it.

    Note: This script was originally released under the GPL and should stay that way.
    Also you should know that this script comes with no guarantee and no support.
    If it breaks your mail, server, house, health, etc., that is your own responsibility.

'

opts = GetoptLong.new(
      [ '--help', '-h', GetoptLong::NO_ARGUMENT ],
      [ '--from-server', '-s', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--from-port', '-p', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--from-ssl', GetoptLong::NO_ARGUMENT ],
      [ '--from-user', '-u', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--from-pass', '-x', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--from-pass-file', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--from-prefix', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--from-delimiter', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--to-server', '-S', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--to-port', '-P', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--to-ssl', GetoptLong::NO_ARGUMENT ],
      [ '--to-user', '-U', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--to-pass', '-X', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--to-pass-file', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--to-prefix', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--to-delimiter', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--msg-since-days', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--msg-before-days', GetoptLong::REQUIRED_ARGUMENT ],
      [ '--accepted-flags', GetoptLong::REQUIRED_ARGUMENT ]
    )

# Source server connection info.
source_host = ''
source_port = 0
source_ssl  = false
source_user = ''
source_pass = ''
source_delimiter = ''
source_prefix = ''

# Destination server connection info.
dest_host = ''
dest_port = 0
dest_ssl  = false
dest_user = ''
dest_pass = ''
dest_delimiter = ''
dest_prefix = ''

# Filtering options
search_criteria = Array.new
accepted_flags = Array.new

    opts.each do |opt, arg|
      case opt
        when '--help'
          print HELP_INFO
          exit
        when '--from-server'
            source_host = arg.to_s.strip
        when '--from-port'
            source_port = arg.strip.to_i
        when '--from-ssl'
            source_ssl  = true
        when '--from-user'
            source_user = arg.to_s.strip
        when '--from-pass'
            source_pass = arg.to_s.strip
        when '--from-pass-file'
            source_pass = File.read(arg.to_s).to_s.strip
        when '--from-prefix'
            source_prefix = arg.to_s.strip
        when '--from-delimiter'
            source_delimiter = arg.to_s.strip
        when '--to-server'
            dest_host = arg.to_s.strip
        when '--to-port'
            dest_port = arg.strip.to_i
        when '--to-ssl'
            dest_ssl  = true
        when '--to-user'
            dest_user = arg.to_s.strip
        when '--to-pass'
            dest_pass = arg.to_s.strip
        when '--to-pass-file'
            dest_pass = File.read(arg.to_s).to_s.strip
        when '--to-prefix'
            dest_prefix = arg.to_s.strip
        when '--to-delimiter'
            dest_delimiter = arg.to_s.strip
        when '--msg-since-days'
            tmp_since = Time.now - (arg.to_i * 60 * 60 * 24)
            search_criteria += [ 'SINCE' , tmp_since.strftime('%d-%b-%Y') ]
        when '--msg-before-days'
            tmp_before = Time.now - (arg.to_i * 60 * 60 * 24)
            search_criteria += [ 'BEFORE',  tmp_before.strftime('%d-%b-%Y') ]
        when '--accepted-flags'
            arg.to_s.split(%r{\s*,\s*}).each {|f| accepted_flags.push(:"#{f}")}
      end
    end

# Source server connection info.
if source_host.length == 0 then
  puts "Please specify --from-server, I do not know where to connect to!"
  exit
else
  SOURCE_HOST = source_host
end

SOURCE_SSL  = source_ssl

if source_ssl && source_port == 0 then
  SOURCE_PORT = 993
elsif !source_ssl && source_port == 0 then
  SOURCE_PORT = 143
else
  SOURCE_PORT = source_port
end

if source_user.length == 0 then
  puts "Please specify --from-user, I do not know whom to connect as!"
  exit
else
  SOURCE_USER = source_user
end

SOURCE_PASS = source_pass

SOURCE_DELIMITER = source_delimiter
SOURCE_PREFIX = source_prefix

# Destination server connection info.
if dest_host.length == 0 then
  puts "Please specify --to-server, I do not know where to connect to!"
  exit
else
  DEST_HOST = dest_host
end

DEST_SSL  = dest_ssl

if dest_ssl && dest_port == 0 then
  DEST_PORT = 993
elsif !dest_ssl && dest_port == 0 then
  DEST_PORT = 143
else
  DEST_PORT = dest_port
end

if dest_user.length == 0 then
  puts "Please specify --to-user, I do not know whom to connect as!"
  exit
else
  DEST_USER = dest_user
end

DEST_PASS = dest_pass

DEST_DELIMITER = dest_delimiter
DEST_PREFIX = dest_prefix

if search_criteria.length != 0 then
  SEARCH_CRITERIA = search_criteria
else
  SEARCH_CRITERIA = ['ALL']
end

if accepted_flags.length != 0 then
  ACCEPTED_FLAGS = accepted_flags
else
  ACCEPTED_FLAGS = [ :Seen, :Deleted, :Answered, :Draft, :Flagged ]
end

# Textual representation of "from" and "to" names
SOURCE_NAME = SOURCE_USER
DEST_NAME = DEST_USER
# Number of seconds to sleep between NOOPs to the server
NOOP_INTERVAL = 180
# Maximum number of messages to select at once.
UID_BLOCK_SIZE = 512

# Utility methods.
def dd(message)
  puts "[#{DEST_HOST}: #{DEST_NAME}] #{message}"
end

def ds(message)
  puts "[#{SOURCE_HOST}: #{SOURCE_NAME}] #{message}"
end

def uid_fetch_block(server, uids, *args)
  pos = 0

  while pos < uids.size
    server.uid_fetch(uids[pos, UID_BLOCK_SIZE], *args).each {|data| yield data }
    pos += UID_BLOCK_SIZE
  end
end

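# Keep a connection alive by sending a NOOP every `interval` seconds.
def server_send_noop(server, interval)
  loop do
    if !server.disconnected?
      server.noop
      puts 'Sent NOOP to the server ...'
    end
    # Sleep on every pass so a dropped connection does not turn this into a busy loop.
    sleep interval
  end
end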

@failures = 0
@existing = 0
@synced   = 0

# Connect and log into both servers.
ds 'Connecting...'
source = Net::IMAP.new(SOURCE_HOST, SOURCE_PORT, SOURCE_SSL)

ds 'Logging in...'
source.login(SOURCE_USER, SOURCE_PASS)

src_thread = Thread.new {server_send_noop(source,NOOP_INTERVAL)}

FOLDER_LIST = source.list("","*")
FOLDERS = Hash.new
FOLDER_LIST.each do |src_folder|
  # Open source folder in read-only mode.
  begin
    ds "Selecting folder '#{src_folder.name}'..."
    source.examine(src_folder.name)
    source.subscribe(src_folder.name)
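    # Escape the prefix and delimiter so characters such as "." match literally.
    new_folder = src_folder.name.sub(/^#{Regexp.escape("#{SOURCE_PREFIX}#{SOURCE_DELIMITER}")}/,"#{DEST_PREFIX}#{SOURCE_DELIMITER}")
    new_folder = new_folder.gsub(SOURCE_DELIMITER,DEST_DELIMITER)
    # XXX: site-specific hack that renames the Junk folder to Spam; remove if not needed.
    new_folder = new_folder.sub(/^\/Junk/,"/Spam")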
    FOLDERS[src_folder.name] = new_folder
  rescue => e
    ds "Error: select failed: #{e}"
    if source.disconnected?() then
      begin
        source = Net::IMAP.new(SOURCE_HOST, SOURCE_PORT, SOURCE_SSL)
        source.login(SOURCE_USER, SOURCE_PASS)
      rescue => ee
        ds "Error: reconnect failed: #{ee}"
      end
    end
    next
  end
end
pp FOLDERS
##################################
dd 'Connecting...'
dest = Net::IMAP.new(DEST_HOST, DEST_PORT, DEST_SSL)

dd 'Logging in...'
dest.login(DEST_USER, DEST_PASS)

dst_thread = Thread.start {server_send_noop(dest,NOOP_INTERVAL)}

# Loop through folders and copy messages.
FOLDERS.each do |source_folder, dest_folder|
  # Open source folder in read-only mode.
  begin
    ds "Selecting folder '#{source_folder}'..."
    source.examine(source_folder)
  rescue => e
    ds "Error: select failed: #{e}"
    next
  end

  # Open (or create) destination folder in read-write mode.
  begin
    dd "Selecting folder '#{dest_folder}'..."
    dest.select(dest_folder)
  rescue => e
    begin
      dd "Folder not found; creating..."
      dest.create(dest_folder)
      dest.select(dest_folder)
    rescue => ee
      dd "Error: could not create folder: #{ee}"
      next
    end
  end

  # Build a lookup hash of all message ids present in the destination folder.
  dest_info = {}

  dd 'Analyzing existing messages...'
  # SEARCH_CRITERIA always holds at least ['ALL'] (see above), so no fallback is needed.
  uids = dest.uid_search(SEARCH_CRITERIA)

  if uids.length > 0
    uid_fetch_block(dest, uids, ['ENVELOPE']) do |data|
      if data.attr['ENVELOPE'].message_id != nil then
        dest_info[data.attr['ENVELOPE'].message_id] = true
      else
        msg = dest.uid_fetch(data.attr['UID'], ['RFC822', 'FLAGS',
            'INTERNALDATE']).first
        dest_info[Digest::MD5.hexdigest(msg.attr['RFC822'])] = true
      end
    end
  end

  dd "Found #{uids.length} messages"

  # Loop through all messages in the source folder.
  # SEARCH_CRITERIA always holds at least ['ALL'] (see above), so no fallback is needed.
  uids = source.uid_search(SEARCH_CRITERIA)

  ds "Found #{uids.length} messages"

  if uids.length > 0
#### (LOOP) START MESSAGE TRANSFER ####
    uid_fetch_block(source, uids, ['ENVELOPE']) do |data|
      if data.attr['ENVELOPE'].message_id != nil then
        mid = data.attr['ENVELOPE'].message_id
      else
        tmp_msg = source.uid_fetch(data.attr['UID'], ['RFC822', 'FLAGS',
            'INTERNALDATE']).first
        mid = Digest::MD5.hexdigest(tmp_msg.attr['RFC822'])
      end

      # If this message is already in the destination folder, skip it.
      if dest_info[mid]
        @existing += 1
        next
      end
      # Download the full message body from the source folder.
      ds "[#{source_folder}] Downloading message #{mid}..."
      tries = 0
      begin
        tries += 1
        msg = source.uid_fetch(data.attr['UID'], ['RFC822', 'FLAGS',
            'INTERNALDATE']).first
      rescue Net::IMAP::Error => ex
        if tries < 10
          ds "Error: #{ex.message}. Retrying..."
          sleep 1 * tries
          retry
        else
          @failures += 1
          ds "Error: #{ex.message}. Tried and failed #{tries} times; giving up on this message."
          # Skip the append step: the download failed, so there is no message to store.
          next
        end
      end

      # Append the message to the destination folder, preserving flags and
      # internal timestamp.
      dd "[#{dest_folder}]Storing message #{mid}..."

      tries = 0

      begin
        tries += 1
        pp msg.attr['FLAGS']
        store_flags = msg.attr['FLAGS'] & ACCEPTED_FLAGS
        pp store_flags
        dest.append(dest_folder, msg.attr['RFC822'], store_flags,
            msg.attr['INTERNALDATE'])

        @synced += 1
      rescue Net::IMAP::Error => ex
        if tries < 10
          dd "Error: #{ex.message}. Retrying..."
          sleep 1 * tries
          retry
        else
          @failures += 1
          dd "Error: #{ex.message}. Tried and failed #{tries} times; giving up on this message."
        end
      end
    end
#### END MESSAGE TRANSFER ####
  end
  source.close
  dest.close
end

src_thread.exit
dst_thread.exit
puts "Finished. Message counts: #{@existing} untouched, #{@synced} transferred, #{@failures} failures."
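
The folder-name translation in the script (strip the source prefix, then swap the source delimiter for the destination one) is easy to get subtly wrong, so here is a small standalone sketch of that one step. The helper name `map_folder_name` and its parameters are made up for illustration and are not part of the script; it also assumes the source delimiter is a non-empty string.

```ruby
# Hypothetical helper, not part of the script above: translates a source
# folder name into the destination naming scheme.
def map_folder_name(name, src_prefix, src_delim, dest_prefix, dest_delim)
  # Regexp.escape keeps delimiters such as "." from acting as wildcards.
  pattern = /\A#{Regexp.escape(src_prefix + src_delim)}/
  # Block form so the replacement text is always taken literally.
  mapped = name.sub(pattern) { dest_prefix + src_delim }
  mapped.gsub(src_delim) { dest_delim }
end

# A Courier-style name with prefix "INBOX" and delimiter "." mapped to "/".
puts map_folder_name('INBOX.Lists.ruby', 'INBOX', '.', '', '/')  # prints "/Lists/ruby"

# Without escaping, "INBOX." would also match "INBOXx"; here it is left alone.
puts map_folder_name('INBOXxArchive', 'INBOX', '.', '', '/')     # prints "INBOXxArchive"
```

The leading "/" in the first result matches what the script produces when the destination prefix is empty, which is why its Junk-to-Spam rename looks for "/Junk".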

Friday December 11, 2009 @ 04:00 AM (PST)
Copyright © 2002-2010 Ryan Grove. All rights reserved.