Deduplicating Email with Mutt
mutt // email // technology
The good: Mutt can do basically everything. The bad: it's not very discoverable. The solution: here, have another blog post.
If it wasn't already apparent, this one is going to be a bit rambly... I mostly just wanted to record my thoughts and path to fixing a giant mess I created with my maildir.
Usually, self-hosting things means I get a sense of satisfaction out of figuring out how things work at a deeper level than I otherwise would have... but in return I'm also rewarded with being personally responsible for all sorts of new, weird, technological edge cases and software problems. I'd be lying if I said I don't get frustrated when an operating system upgrade magically causes my contacts and calendars to stop working.
Also sometimes I just break things on my own, accidentally. I did that with my mail recently. It's hosted by someone else (I'm not sure I'm crazy enough or brave enough to want to host my own mailserver), but I keep it local as well in a maildir, then use Mutt as my client.
So how did I mess it up? I duplicated every mail in my maildir 4x over. So the roughly 3000 mails in my maildir suddenly became roughly 15000 mails. Then I made it worse by not realizing it had happened, syncing with my cloud mail hoster, and propagating the duplication back to them. Super.........
I started looking up how to deduplicate mail in a maildir and found this surprisingly robust-looking project.
I tried it out and ran into a few issues, which I'm guessing I could have hacked through with some perseverance. But for some reason, at this point it occurred to me that I should check whether Mutt has any built-in deduplication ability. And of course it does. It's Mutt, for crying out loud. I found this article.
The TLDR is: assuming you have set duplicate_threads=yes as well as set sort=threads, you can hit D to delete messages matching a pattern, then enter ~= as the pattern.
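Spelled out as a config fragment, that might look something like the sketch below. The two settings are the ones mentioned above; the file location is an assumption (use wherever your Mutt config actually lives), and the keystrokes in the comments assume Mutt's default key bindings, so they may differ if you've rebound things.

```muttrc
# ~/.muttrc  (assumed location; adjust to your own setup)

# Thread the index, so Mutt can group related messages together.
set sort=threads

# When threading, treat messages that share a Message-ID as duplicates
# of each other. I believe this is already the default, but being
# explicit doesn't hurt.
set duplicate_threads=yes

# Then, in the index (default bindings assumed):
#   D   delete messages matching a pattern
#   ~=  the pattern to enter at the prompt: duplicated messages
#   $   sync the mailbox, which actually purges what was marked deleted
```

If deleting roughly 12000 mails in one go sounds scary, one option is to hit T (tag by pattern) with the same ~= pattern first, eyeball what got tagged, and only then delete. Nothing is really gone until the mailbox is synced, so there's a chance to undelete if the result looks wrong.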