Thematic background picture about Tech

Mail Mirror on NAS

The problem

In the past, I used to mirror my email accounts locally and use Notmuch with the alot email client. Every email was tagged appropriately, I would only see the tags I wanted in alot and it was generally a pretty smooth system that got me very, very close to inbox zero.

The problem was that this was only the case on my desktop PC. Every time I was out of the house or wanted to try out a new distro, the email I would see anywhere else was still an cluttered mess as the tagging was only local to that one machine.

The solution

  1. Move from Notmuch tags to folders
  2. Move the email mirroring off my main PC
  3. (bonus) Get Notmuch back, but not locally

My NAS is by definition always on and - should I want - I can access it from anywhere in the world (though the setup I chose doesn't depend on this at all). This also provides me with a place to keep backups of email that I might need in the future, but serve no purpose taking up space on my email provider's drive.

So, with a solution planned, let's get started.

Part 1: Move from Notmuch tags to folders

Tags are great, but we want every client to have access to organised email, so IMAP folders are the way to go here. IMAPFilter is a pretty well-established tool for moving email around on IMAP servers; it provides a good amount of filter options for selecting those emails and the fact that its configuration is done in Lua allows us to customise the procedure anyway we like.

Installing IMAPFilter

Since this is going to run on my NAS, a Docker container is the obvious way to go.

I've prepared a Docker image for IMAPFilter. Normally, I would use Alpine Linux as a base, but IMAPFilter is not yet in a stable release of Alpine (it's in edge, so it should happen eventually), because of this, the Docker image uses Debian as a base. This doesn't affect its use, but is good to know.

This image is set-up to be able to run IMAPFilter on a schedule. Now, IMAPFilter can stay connected to all defined IMAP accounts and listen for either RECENT or EXISTS events, which normally indicate that new mail has arrived. In theory, this would be the ideal way to run as we wouldn't have to keep polling the email server(s) and could just run our filters when needed. Unfortunately, many email providers only send these events once and only to one connected client, meaning IMAPFilter wouldn't get the notification if, for example, my email client on Android also happened to be connected at the time.

Instead, I'm using Supercronic to schedule IMAPFilter to be run every few minutes. I also set flags on messages that have already been seen by IMAPFilter so we don't have to check all mail in the inbox every time, but that's a setup detail for later.

I'm using the following crontab setup. You may chose to run less often, if you get less promotional email than I do:

1# Run every 2 minutes, starting from 0
20/2 * * * * /workdir/sync.sh imapfilter

The /workdir/sync.sh script can just be export IMAPFILTER_HOME=/workdir/.imapfilter; imapfilter -c /workdir/.imapfilter/config.lua, but we'll return to that later.

Just create two bind mounts for Docker, mounting the crontab to /etc/crontab and your configuration directory to /workdir/.imapfilter:

1docker run -d \
2  --name mail-imapfilter \
3  --mount type=bind,source="/path/to/your/confs/crontab",target=/etc/crontab,readonly \
4  --mount type=bind,source="/path/to/your/confs/imapfilter",target=/workdir/.imapfilter,readonly \
5  cybolic/imapfilter-isync:latest

This command is just to test and give a quick overview. Ideally you wouldn't run the docker command like this, but use a docker-compose.yml file or something like Portainer instead.

Setting up IMAPFilter's filters

IMAPFilter filters are usually written directly in Lua, but I like to keep my data separate from my code and filter definitions definitely fall under data for me. Instead, I wrote a small module that lets me write filters as Lua tables instead of code. This allows me to write some things more succinctly and also opens up the possibility to write an exporter for the filter rules, should I move away from IMAPFilter down the line.

Here's the module and here's an example of how I use it:

1messages = messages - rulesrunner.run_rules(messages, {
2  -- PayPal Subscription receipts
3  { subscription_paypal = {
4    { match_subject = 'receipt.*to Dropbox' },
5    { match_subject = 'receipt.*to Humble Bundle', contain_body = "Humble Choice" }
6  }, from = { contain_from = '@paypal' }, move_to = 'Official/Receipts/Subscription' },
7  -- Google Play Subscription receipts
8  { subscription_googleplay = {
9    { contain_subject = 'receipt' },
10    { contain_body = 'subscription' }
11  }, from = { contain_from = 'Google Play', contain_subject = 'Your Google Play' }, move_to = 'Official/Receipts/Subscription' },
12  -- Game purchases
13  { purchase_games = {
14    { contain_from = "@steampowered.com", contain_subject = { "Steam purchase", "thank you" } }
15  }, move_to = 'Official/Purchase/Games' },
16  -- Crowdfunding direct messages
17  { crowdfunding_messages = {
18    { contain_from = '@kickstarter.com', match_utf8_field = { 'subject', 'response needed|sent you a message|new message' } }
19  }, flag = true }
20}, accounts.posteo)
  • Each rule in the main table passed to run_rules is run in order and given the messages table with any results from previous rules subtracted.
  • Each filter (e.g. match_subject) is logically ANDed with the rest in the rule.
  • Multiple filters of the same type can be defined by providing their values in a table (e.g. contain_subject = { "Hello", "World" }) and are ANDed.
  • Filters that take more than one argument (like match_field and match_utf8_field) have their arguments defined in a table.
  • Two custom matchers, match_utf8_field and match_utf8_body decode Base64-encoded UTF8 text before matching (necessary for a lot of modern email).
  • The move_to key triggers moving the matches to the given destination (the account to move to is given in the final argument to run_rules). delete and flag are also supported.
  • The returned value is a table of all matches so I can subtract them from messages so the next run_rules block skips them.

If you want to do the same filtering without my module, this would be the code for just the PayPal example above:

1local results
2local subresults
3-- PayPal Subscription receipts
4local _messages = messages:contain_from('@paypal')
5-- for Dropbox
6subresults = _messages:match_subject('receipt.*to Dropbox')
7_messages = _messages - subresults
8results = Set(subresults)
9-- for Humble Choice
10subresults = _messages:match_subject('receipt.*to Humble Bundle') * _messages:contain_body("Humble Choice")
11_messages = _messages - subresults
12results = results + subresults
13results:move_messages(accounts.posteo['Official/Receipts/Subscription'])

Personally, I find the table version easier to read.

At the end of the filtering, I add a filtered flag to all messages that were matched so they aren't processed next time IMAPFilter runs:

1results:add_flags({ 'Filtered' })

And I get the initial list of messages to process as so (messages from the last 24 hours that haven't been processed):

1local messages = (accounts.posteo.INBOX:is_newer(1) + accounts.gmail.INBOX:is_newer(1)):has_unkeyword('Filtered')

You can find my full setup and rules here. Do note that this is the first time I've written anything in Lua, so this setup is not necessarily production quality, but it works for my use. The files in that repo are also not the exact ones I use on my NAS; these are my local dotfiles for my main PC, so consider it more of a staging area.

Part 2: Move the email mirroring off my main PC (setting up isync / mbsync)

The Docker image used above also has isync (also known as mbsync) installed, so we can just throw a config file in there and add it to the crontab.

I've set this up to do the synchronisation in stages:

  • Every 2 minutes: The inbox gets synced
  • Every 7 minutes: Sync mail sorted by IMAPFilter into folders that I care about every
  • Every 15 minutes: A full sync of all email

You'll see that IMAPFilter runs every two minutes (:00, :02, etc.) and isync runs every 2 + 1 minutes (:01, :03, etc.). This gives IMAPFilter time to sort the email before isync fetches it.

mbsyncrc:

1# Generic defaults
2
3Create Slave
4SyncState *
5CopyArrivalDate yes
6
7
8# Accounts
9
10IMAPAccount posteo
11CertificateFile /etc/ssl/certs/ca-certificates.crt
12SSLType IMAPS
13Host posteo.de
14User username@posteo.net
15PassCmd /workdir/get_passwd.sh posteo
16
17IMAPStore posteo-remote
18Account posteo
19
20MaildirStore posteo-local
21# The trailing "/" is important
22Path /workdir/data/posteo-account/
23Inbox /workdir/data/posteo-account/inbox
24Subfolders Verbatim
25
26
27# Channels
28
29Channel posteo-inbox
30Master :posteo-remote:INBOX
31Slave :posteo-local:inbox
32Sync All
33
34Channel posteo-sorted
35Master :posteo-remote:
36Slave :posteo-local:
37Patterns * !archived !drafts !sent !trash !Inbox !"Unimportant*" !"Official/Backups*" !"Promotional*"
38Sync All
39
40Channel posteo-non-urgent
41Master :posteo-remote:
42Slave :posteo-local:
43Patterns "Unimportant*" "Official/Backups*" "Promotional*"
44Sync All
45
46Channel posteo-archived
47Master :posteo-remote:Archived
48Slave :posteo-local:archived
49Sync All
50
51Channel posteo-sent
52Master :posteo-remote:Sent
53Slave :posteo-local:sent
54Sync All
55
56Channel posteo-drafts
57Master :posteo-remote:Drafts
58Slave :posteo-local:drafts
59Sync All
60
61Channel posteo-trash
62Master :posteo-remote:Trash
63Slave :posteo-local:trash
64Sync All
65
66
67# Groups
68
69Group inbox
70Channel posteo-inbox
71
72Group sorted
73Channel posteo-inbox
74Channel posteo-sorted
75
76Group full-without-inbox
77Channel posteo-non-urgent
78Channel posteo-drafts
79Channel posteo-sent
80Channel posteo-trash
81Channel posteo-archived

crontab:

1# Run every 2 minutes, starting from 0
20/2 * * * * /workdir/sync.sh imapfilter
3# Run every 2 minutes, starting from 1
41/2 * * * * /workdir/sync.sh isync inbox
5
6# Run every 7 minutes
7*/7 * * * * /workdir/sync.sh isync sorted
8
9# # Run every 15 minutes
10*/15 * * * * /workdir/sync.sh isync full-without-inbox

sync.sh:

1#!/usr/bin/env bash
2
3if [[ "$1" = "isync" ]]; then
4  mbsync --config /workdir/.mbsyncrc $2
5elif [[ "$1" = "imapfilter" ]]; then
6  export IMAPFILTER_HOME=/workdir/.imapfilter
7  imapfilter -c /workdir/.imapfilter/config.lua
8fi

Our Docker setup should now look something like this:

1docker run -d \
2  --name mail-imapfilter \
3  --mount type=bind,source="/path/to/your/confs/crontab",target=/etc/crontab,readonly \
4  --mount type=bind,source="/path/to/your/confs/imapfilter",target=/workdir/.imapfilter,readonly \
5  --mount type=bind,source="/path/to/your/confs/mbsyncrc",target=/workdir/.mbsyncrc,readonly \
6  --mount type=bind,source="/path/to/your/email/storage",target=/workdir/data \
7  cybolic/imapfilter-isync:latest

Part 3: Get Notmuch back, but not locally

Even though I'm no longer using Notmuch to sort my email, that doesn't mean it's not still a great program for very quickly searching one's email. Since Notmuch reads through all the email files, it's not really something you'd want reading the files over a networked connection to your NAS. Instead, it's much faster to run an instance of Notmuch on your NAS, let that create it's database and then let your local Notmuch instance just use that database.

For that, I've created another Docker image, cybolic/notmuch. Just like the previous one, it uses Supercronic, so you can set it up to run Notmuch every 30 minutes or so, depending on how fresh you want your search results.

Dockerfile:

1docker run -d \
2  --name mail-imapfilter \
3  --mount type=bind,source="/path/to/your/confs/crontab",target=/etc/crontab,readonly \
4  --mount type=bind,source="/path/to/your/confs/notmuch",target=/workdir/notmuch-config,readonly \
5  --mount type=bind,source="/path/to/your/email/storage",target=/workdir/data \
6  cybolic/notmuch:latest

crontab:

1# 15 and 45 past
215/30 * * * * notmuch new

notmuch-config:

1[database]
2path=/workdir/data
3
4[user]
5name=Christian Dannie Storgaard
6primary_email=address@host.com
7other_email=semicolon@separated.com;list@of.com;other@addresses.com
8
9[new]
10tags=new
11ignore=/[.]*.(off|old|bak)$/;/[.]*.json$/;.lock;/.*[.]mbsyncstate.*$/;.uidvalidity
12
13[search]
14exclude_tags=deleted;spam;trash;
15
16[maildir]
17synchronize_flags=true
18
19[index]
20headers.List=List-Id

Everything will just work as long as your local Notmuch config file's path variable points to the same files as your NAS instance uses. My email storage is mounted using NFS, so is set to /mnt/nas-mail in the local Notmuch config.

That's it! Your email is now sorted, backed up, searchable and available from your NAS.