IMAP IDLE Implementation: From Crashes to Production

Implementing real-time email notifications via IMAP IDLE seemed straightforward: hook up SMTP delivery to IMAP updates and let the server push changes to clients as they happen. What followed was a crash course in concurrent programming, library limitations, and production debugging.

This is the story of how three crashes, a major library upgrade, and careful optimization led to a production-ready instant email notification system.


The Goal: Instant Email Delivery

Before IMAP IDLE, email clients like Apple Mail had to poll the server every 1-5 minutes to check for new messages. This meant users could wait up to five minutes to see important emails.

IMAP IDLE (RFC 2177) allows the server to push updates to clients, enabling true real-time notifications. When SMTP receives a message for a local user, the IMAP server should immediately notify all connected clients watching that mailbox.

The architecture seemed simple:

SMTP receives email → Store to maildir → Notify IMAP → Clients see update

Let’s see how this simple idea crashed three times before reaching production.


Initial Implementation: First Attempt (Commit 117dd73)

The first implementation added a notification callback from SMTP to IMAP:

// cmd/mailserver/main.go
smtpBackend := smtpserver.NewBackend(cfg, authenticator, store, deliveryEngine, logger)

// Wire up SMTP -> IMAP notifications for instant email delivery
smtpBackend.SetLocalDeliveryNotifier(func(username, mailbox string) {
    imapBackend.NotifyMailboxUpdate(username, mailbox)
})

The IMAP backend created an UpdateHub that managed update distribution:

// internal/imap/backend.go
type UpdateHub struct {
    mu             sync.RWMutex
    clients        map[chan backend.Update]*clientState
    updateCh       chan backend.Update
    closed         atomic.Bool
    wg             sync.WaitGroup
    droppedUpdates int64
}

On the SMTP side, after delivering a message locally:

// internal/smtp/backend.go
func (s *Session) deliverToLocalRecipient(rcpt string, data []byte) error {
    // ... delivery logic ...

    // Notify IMAP clients about new message (for IDLE support)
    if s.backend.onLocalDelivery != nil {
        s.backend.onLocalDelivery(user.Email, "INBOX")
    }

    return nil
}

This looked correct. The SMTP backend calls the notifier, which sends an update to the IMAP backend’s UpdateHub, which distributes it to all listening clients.
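The callback wiring can be tried in isolation. The following is a minimal sketch, not the server's actual code: the `smtpBackend` and `imapBackend` types here are hypothetical stand-ins that keep only the notifier field and record notifications in a slice instead of talking to IMAP clients.

```go
package main

import "fmt"

// smtpBackend is a hypothetical stand-in for the real SMTP backend.
// It only knows about a callback, not about IMAP.
type smtpBackend struct {
	onLocalDelivery func(username, mailbox string)
}

// SetLocalDeliveryNotifier registers the callback invoked after local delivery.
func (b *smtpBackend) SetLocalDeliveryNotifier(fn func(username, mailbox string)) {
	b.onLocalDelivery = fn
}

// deliver pretends to store a message, then fires the notifier if one is set.
func (b *smtpBackend) deliver(username string) {
	// ... store the message in the maildir ...
	if b.onLocalDelivery != nil {
		b.onLocalDelivery(username, "INBOX")
	}
}

// imapBackend is a hypothetical stand-in that records what it was told.
type imapBackend struct {
	notified []string
}

// NotifyMailboxUpdate has the same shape as the callback, so it can be passed directly.
func (b *imapBackend) NotifyMailboxUpdate(username, mailbox string) {
	b.notified = append(b.notified, username+"/"+mailbox)
}

func main() {
	smtp := &smtpBackend{}
	imap := &imapBackend{}

	smtp.deliver("early@example.com") // no notifier yet: delivery still works

	smtp.SetLocalDeliveryNotifier(imap.NotifyMailboxUpdate)
	smtp.deliver("user@example.com")

	fmt.Println(imap.notified)
}
```

The nil check matters because it lets the SMTP side run without any IMAP server attached, which is also what makes it easy to test on its own.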

Then it crashed.


Crash 1: Goroutine Stealing Updates (Commit 9c25f0c)

The Symptom

Clients weren’t receiving IDLE updates. The logs showed updates being sent, but Apple Mail never saw them.

The Root Cause

The UpdateHub had a run() goroutine that consumed updates from the channel:

// @filename: main.go
func (h *UpdateHub) run() {
    defer h.wg.Done()

    for update := range h.updateCh {
        h.mu.RLock()
        for ch, state := range h.clients {
            if state.closed.Load() {
                continue
            }
            select {
            case ch <- update:
            default:
                // Client channel full, skip this update
            }
        }
        h.mu.RUnlock()
    }
}

But go-imap was ALSO reading from the same channel. The Updates() method returned h.updateCh directly:

// @filename: main.go
func (h *UpdateHub) Updates() <-chan backend.Update {
    return h.updateCh
}

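This created a race between two consumers: both the internal run() goroutine and the go-imap server were receiving from the same channel. A value sent on a Go channel is delivered to exactly one receiver, so every update that run() picked up was one that go-imap never saw. The run() goroutine was “stealing” updates.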
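The underlying channel behavior is easy to reproduce outside the mail server. This sketch uses two plain goroutines as stand-ins for run() and go-imap's reader (the function name `countReceipts` is made up for illustration) and counts how many values each one receives from a shared channel.

```go
package main

import (
	"fmt"
	"sync"
)

// countReceipts sends n values on a single channel that two goroutines
// read from, and returns how many values each reader received.
func countReceipts(n int) (a, b int) {
	ch := make(chan int)
	var wg sync.WaitGroup
	wg.Add(2)

	go func() { // stands in for the hub's run() goroutine
		defer wg.Done()
		for range ch {
			a++
		}
	}()
	go func() { // stands in for the library's own reader
		defer wg.Done()
		for range ch {
			b++
		}
	}()

	for i := 0; i < n; i++ {
		ch <- i
	}
	close(ch)
	wg.Wait() // both readers have exited, so a and b are safe to read
	return a, b
}

func main() {
	a, b := countReceipts(1000)
	fmt.Printf("reader A: %d, reader B: %d, total: %d\n", a, b, a+b)
}
```

The split between the two readers varies from run to run, but the total always equals the number of values sent: nothing is duplicated, so whatever one reader takes is lost to the other.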

The Fix

Remove the run() goroutine entirely. go-imap handles routing updates to IDLE clients internally. We just need to send updates to the channel:

// @filename: handlers.go
// Simplified UpdateHub - just a buffered channel
type UpdateHub struct {
    updateCh       chan backend.Update
    closed         atomic.Bool
    droppedUpdates int64
}

func NewUpdateHub() *UpdateHub {
    return &UpdateHub{
        updateCh: make(chan backend.Update, 10000), // Large buffer
    }
}

func (h *UpdateHub) Notify(update backend.Update) {
    if h.closed.Load() {
        return
    }

    select {
    case h.updateCh <- update:
    default:
        // Channel full, drop update
        count := atomic.AddInt64(&h.droppedUpdates, 1)
        if count == 1 {
            log.Printf("IDLE: Updates being dropped - channel full")
        }
    }
}

The change removed 101 lines of code.
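The drop-on-full behavior of Notify can be seen with a deliberately tiny buffer. This is a self-contained sketch with a hypothetical `hub` type carrying strings rather than backend.Update values; it keeps only the non-blocking send and the drop counter.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// hub is a hypothetical, stripped-down UpdateHub: one buffered channel
// plus a counter of updates that did not fit.
type hub struct {
	ch      chan string
	dropped atomic.Int64
}

func newHub(size int) *hub {
	return &hub{ch: make(chan string, size)}
}

// notify never blocks: it reports false and counts a drop when the buffer is full.
func (h *hub) notify(msg string) bool {
	select {
	case h.ch <- msg:
		return true
	default:
		h.dropped.Add(1)
		return false
	}
}

func main() {
	h := newHub(2) // small on purpose so the overflow is visible
	for i := 0; i < 5; i++ {
		h.notify(fmt.Sprintf("update-%d", i))
	}
	fmt.Println("queued:", len(h.ch), "dropped:", h.dropped.Load())
}
```

With nothing reading from the channel, two of the five updates are queued and three are dropped. The trade-off is that the sender is never stalled, at the cost of silently losing updates once the consumer falls behind, which is why the drop counter is worth exporting as a metric.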


Performance: Async Delivery & O(1) File Access (Commit 8a4e699)

Before the next crash, I added two critical performance optimizations.

Optimization 1: Async Notifications

The SMTP delivery was blocking on IMAP notification:

// @filename: main.go
// BEFORE: Blocking
if s.backend.onLocalDelivery != nil {
    s.backend.onLocalDelivery(user.Email, "INBOX")  // Blocks!
}

Changed to async:

// @filename: main.go
// AFTER: Non-blocking
if s.backend.onLocalDelivery != nil {
    go s.backend.onLocalDelivery(user.Email, "INBOX")
}

This ensures SMTP delivery isn’t held up by IMAP notification overhead.

Optimization 2: O(1) File Access

The maildir store was using O(n) directory scanning to find message files:

// @filename: main.go
// BEFORE: O(n) directory scan
for _, subdir := range []string{"cur", "new"} {
    subdirPath := filepath.Join(path, subdir)
    entries, err := os.ReadDir(subdirPath)  // Scan ALL files
    if err != nil {
        continue  // Subdirectory missing or unreadable
    }
    for _, entry := range entries {
        if strings.HasPrefix(entry.Name(), baseKey) {
            // Found it
        }
    }
}

With thousands of emails, this became slow. The fix: try direct file access first (O(1)), fall back to scan only if needed:

// @filename: main.go
// FAST PATH: Try direct file access first (O(1))
for _, subdir := range []string{"cur", "new"} {
    fullPath := filepath.Join(path, subdir, msg.MaildirKey)
    if f, err := os.Open(fullPath); err == nil {
        return f, nil  // Found in O(1)!
    }
}

// Try base key without flags
for _, subdir := range []string{"cur", "new"} {
    fullPath := filepath.Join(path, subdir, baseKey)
    if f, err := os.Open(fullPath); err == nil {
        return f, nil  // Found in O(1)!
    }
}

// SLOW FALLBACK: Only scan if direct paths fail
for _, subdir := range []string{"cur", "new"} {
    // ... directory scan logic ...
}

This reduced file access from O(n) to O(1) for 99% of cases.
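Here is a runnable sketch of the same fast-path/fallback idea against a temporary directory. It is an approximation, not the store's code: `openMessage` and `demoLookup` are hypothetical names, and the fallback assumes the usual maildir convention that flags follow a ":" in the file name, so a stale key can still be matched on the part before it.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// openMessage tries the exact file name first and scans only if that fails.
func openMessage(mailboxPath, key string) (*os.File, error) {
	subdirs := []string{"cur", "new"}

	// Fast path: one open call per subdirectory, independent of mailbox size.
	for _, sub := range subdirs {
		if f, err := os.Open(filepath.Join(mailboxPath, sub, key)); err == nil {
			return f, nil
		}
	}

	// Slow path: flags may have changed since the key was recorded,
	// so compare only the part of the name before ":".
	base, _, _ := strings.Cut(key, ":")
	for _, sub := range subdirs {
		entries, err := os.ReadDir(filepath.Join(mailboxPath, sub))
		if err != nil {
			continue
		}
		for _, e := range entries {
			name, _, _ := strings.Cut(e.Name(), ":")
			if name == base {
				return os.Open(filepath.Join(mailboxPath, sub, e.Name()))
			}
		}
	}
	return nil, os.ErrNotExist
}

// demoLookup builds a throwaway maildir and resolves one message twice:
// once by its exact name, once by a key with outdated flags.
func demoLookup() ([]string, error) {
	dir, err := os.MkdirTemp("", "maildir")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	for _, sub := range []string{"cur", "new"} {
		if err := os.Mkdir(filepath.Join(dir, sub), 0o755); err != nil {
			return nil, err
		}
	}
	stored := "1700000000.M1.host:2,S"
	if err := os.WriteFile(filepath.Join(dir, "cur", stored), []byte("hi"), 0o644); err != nil {
		return nil, err
	}

	var found []string
	for _, key := range []string{stored, "1700000000.M1.host:2,"} {
		f, err := openMessage(dir, key)
		if err != nil {
			return nil, err
		}
		found = append(found, filepath.Base(f.Name()))
		f.Close()
	}
	return found, nil
}

func main() {
	found, err := demoLookup()
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(found)
}
```

Both lookups resolve to the same stored file; only the second one pays for a directory scan. Strictly speaking the fast path costs one name lookup in the filesystem rather than a true constant, but it no longer grows with the number of entries read into memory.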

Buffer Size Increases

  • Update channel: 100 → 10,000
  • Client channel: 10 → 1,000

Large buffers prevent blocking under load.


Crash 2: MailboxStatus Removal (Commit 046905a)

The Symptom

Server crashed with nil pointer dereference when sending IDLE updates.

The Root Cause

The notification included a full MailboxStatus struct:

// @filename: main.go
func (b *Backend) NotifyMailboxUpdate(username, mailbox string) {
    ctx := context.Background()

    // Look up user
    user, err := b.authenticator.LookupUser(ctx, username)
    if err != nil {
        b.updates.Notify(&backend.MailboxUpdate{
            Update: backend.NewUpdate(username, mailbox),
        })
        return
    }

    // Get mailbox to find its ID
    mb, err := b.store.GetMailbox(ctx, user.ID, mailbox)
    if err != nil {
        b.updates.Notify(&backend.MailboxUpdate{
            Update: backend.NewUpdate(username, mailbox),
        })
        return
    }

    // Get current mailbox stats for accurate message count
    stats, err := b.store.GetMailboxStats(ctx, mb.ID)
    if err != nil {
        b.updates.Notify(&backend.MailboxUpdate{
            Update: backend.NewUpdate(username, mailbox),
        })
        return
    }

    // Send update with full status
    // (status: the MailboxStatus populated from stats; its construction is not shown)
    b.updates.Notify(&backend.MailboxUpdate{
        Update:        backend.NewUpdate(username, mailbox),
        MailboxStatus: status, // This caused nil pointer crashes!
    })
}

The MailboxStatus in the update was causing go-imap v1.2.1 to crash when processing unilateral responses.

The Fix

Remove the MailboxStatus from the update. Send a simple update and let go-imap handle the details:

// @filename: main.go
func (b *Backend) NotifyMailboxUpdate(username, mailbox string) {
    // Send simple update - go-imap handles the IDLE notification
    // Don't include MailboxStatus as it can cause nil pointer issues
    b.updates.Notify(&backend.MailboxUpdate{
        Update: backend.NewUpdate(username, mailbox),
    })
}

Then it crashed again.


Crash 3: go-imap v1.2.1 Library Bug (Commits efdd7a1, 2c660a0)

The Symptom

Server crashed in go-imap/internal/responses/select.go when sending unilateral mailbox updates.

The Debug Session

I added extensive logging and temporarily disabled IDLE to isolate the crash:

// @filename: main.go
func (b *Backend) NotifyMailboxUpdate(username, mailbox string) {
    log.Printf("IDLE: NotifyMailboxUpdate called for %s/%s (disabled for debugging)", username, mailbox)
    // TEMPORARILY DISABLED - investigating crash
    // The crash happens in go-imap's SELECT response writer
    // which suggests something is wrong with how IDLE updates trigger re-SELECT
}

After extensive debugging, the conclusion: this was a bug in go-imap v1.2.1.

The library’s unilateral response handling in responses/select.go had a nil pointer dereference that couldn’t be worked around from the application code.

The Decision

Two options:

  1. Fork go-imap and fix the bug
  2. Upgrade to go-imap v2

I chose option 2. The v2 version had:

  • Native IDLE support via MailboxTracker
  • Better error handling
  • Modern API design
  • Active maintenance

The Temporary Workaround

While planning the v2 upgrade, IDLE was disabled with a clear TODO:

// @filename: main.go
// NotifyMailboxUpdate notifies IDLE clients about a mailbox change (new message)
// NOTE: IDLE updates are disabled due to a crash in go-imap v1.2.1
// The crash occurs in responses/select.go when sending unilateral responses
// Apple Mail will still receive emails via its periodic polling (every 1-5 minutes)
// TODO: Upgrade to go-imap/v2 for proper IDLE support
func (b *Backend) NotifyMailboxUpdate(username, mailbox string) {
    log.Printf("IDLE: New message for %s/%s (IDLE disabled, will sync on next poll)", username, mailbox)
}

Emails still arrived via SMTP and were stored correctly. Users just didn’t get instant notifications.


Final Implementation: Native IDLE with MailboxTracker (Commit da86883)

The v2 upgrade was a complete rewrite of the IMAP server. The old backend/user/mailbox architecture was replaced with the imapserver.Session interface.

Key Changes

1. Server Structure

// @filename: main.go
type Server struct {
    authenticator *auth.Authenticator
    store         *maildir.Store
    imapServer    *imapserver.Server
    tlsConfig     *tls.Config

    // Mailbox trackers for IDLE notifications
    trackersMu sync.RWMutex
    trackers   map[int64]*imapserver.MailboxTracker
}

2. MailboxTracker Management

// @filename: main.go
// GetMailboxTracker returns or creates a tracker for a mailbox
func (s *Server) GetMailboxTracker(mailboxID int64) *imapserver.MailboxTracker {
    s.trackersMu.RLock()
    tracker, ok := s.trackers[mailboxID]
    s.trackersMu.RUnlock()

    if ok {
        return tracker
    }

    s.trackersMu.Lock()
    defer s.trackersMu.Unlock()

    // Double-check after acquiring write lock
    if tracker, ok = s.trackers[mailboxID]; ok {
        return tracker
    }

    // Create new tracker with initial message count
    tracker = imapserver.NewMailboxTracker(0)
    s.trackers[mailboxID] = tracker
    return tracker
}
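The read-lock, then write-lock, then re-check pattern in GetMailboxTracker can be exercised without go-imap. The sketch below uses a hypothetical `registry` of placeholder `tracker` values and calls the getter from many goroutines at once to check that only one tracker is ever created per ID.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is a placeholder for imapserver.MailboxTracker.
type tracker struct{ mailboxID int64 }

// registry is a hypothetical stand-in for the Server's tracker map.
type registry struct {
	mu       sync.RWMutex
	trackers map[int64]*tracker
}

// newRegistry initializes the map; writing to a nil map would panic.
func newRegistry() *registry {
	return &registry{trackers: make(map[int64]*tracker)}
}

// get returns the tracker for id, creating it at most once.
func (r *registry) get(id int64) *tracker {
	// Common case: the tracker exists and a shared read lock is enough.
	r.mu.RLock()
	t, ok := r.trackers[id]
	r.mu.RUnlock()
	if ok {
		return t
	}

	r.mu.Lock()
	defer r.mu.Unlock()
	// Another goroutine may have created it between the two locks.
	if t, ok = r.trackers[id]; ok {
		return t
	}
	t = &tracker{mailboxID: id}
	r.trackers[id] = t
	return t
}

// distinctTrackers asks for the same ID from n goroutines and counts
// how many different tracker pointers came back.
func distinctTrackers(n int) int {
	r := newRegistry()
	results := make([]*tracker, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = r.get(42)
		}(i)
	}
	wg.Wait()

	seen := make(map[*tracker]bool)
	for _, t := range results {
		seen[t] = true
	}
	return len(seen)
}

func main() {
	fmt.Println("distinct trackers:", distinctTrackers(100))
}
```

Without the second check under the write lock, two goroutines that both miss on the read path could each create a tracker, and sessions registered on the overwritten one would never be notified.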

3. Notification System

// @filename: main.go
// NotifyMailboxUpdate notifies all sessions watching a mailbox
func (s *Server) NotifyMailboxUpdate(mailboxID int64) {
    s.trackersMu.RLock()
    tracker, ok := s.trackers[mailboxID]
    s.trackersMu.RUnlock()

    if !ok {
        return
    }

    // Get current message count with timeout
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    stats, err := s.store.GetMailboxStats(ctx, mailboxID)
    if err != nil {
        log.Printf("IMAP v2: Failed to get mailbox stats for notification: %v", err)
        return
    }

    log.Printf("IMAP v2: Notifying IDLE clients of mailbox update (messages: %d)", stats.Messages)
    tracker.QueueNumMessages(uint32(stats.Messages))
}

4. Session IDLE Support

// @filename: main.go
// Idle handles IDLE command - the key to instant notifications!
func (s *Session) Idle(w *imapserver.UpdateWriter, stop <-chan struct{}) error {
    s.mu.RLock()
    tracker := s.tracker
    user := s.user
    s.mu.RUnlock()

    if tracker == nil {
        <-stop
        return nil
    }

    userEmail := "unknown"
    if user != nil {
        userEmail = user.Email
    }

    log.Printf("IMAP v2: IDLE started for %s", userEmail)
    defer log.Printf("IMAP v2: IDLE ended for %s", userEmail)

    return tracker.Idle(w, stop)
}

5. SessionTracker Integration

// @filename: main.go
func (s *Session) Select(name string, options *imap.SelectOptions) (*imap.SelectData, error) {
    // ... mailbox lookup ...

    s.mu.Lock()
    s.selected = mb
    // Create tracker for this mailbox
    if s.tracker != nil {
        s.tracker.Close()
    }
    s.tracker = s.server.GetMailboxTracker(mb.ID).NewSession()
    s.mu.Unlock()

    return &imap.SelectData{
        // ... select data ...
    }, nil
}
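To make the tracker/session relationship concrete, here is a conceptual model of a per-mailbox tracker that fans updates out to per-session queues. It is explicitly not how go-imap v2 implements MailboxTracker; the lowercase types and methods are hypothetical and exist only to show why closing the old session tracker in Select matters.

```go
package main

import (
	"fmt"
	"sync"
)

// mailboxTracker is a toy model: one per mailbox, holding its live sessions.
type mailboxTracker struct {
	mu       sync.Mutex
	sessions map[*sessionTracker]struct{}
}

// sessionTracker is a toy model: one per client session, with its own queue.
type sessionTracker struct {
	parent  *mailboxTracker
	updates chan uint32
}

func newMailboxTracker() *mailboxTracker {
	return &mailboxTracker{sessions: make(map[*sessionTracker]struct{})}
}

// newSession registers a session so it receives future updates.
func (m *mailboxTracker) newSession() *sessionTracker {
	s := &sessionTracker{parent: m, updates: make(chan uint32, 16)}
	m.mu.Lock()
	m.sessions[s] = struct{}{}
	m.mu.Unlock()
	return s
}

// close unregisters the session; it receives nothing afterwards.
func (s *sessionTracker) close() {
	s.parent.mu.Lock()
	delete(s.parent.sessions, s)
	s.parent.mu.Unlock()
}

// queueNumMessages copies the new count into every registered session's queue.
func (m *mailboxTracker) queueNumMessages(n uint32) {
	m.mu.Lock()
	defer m.mu.Unlock()
	for s := range m.sessions {
		select {
		case s.updates <- n:
		default: // this session's queue is full; skip it rather than block
		}
	}
}

// demo registers three sessions, closes one, and sends a single update.
func demo() (first, second, closed int) {
	m := newMailboxTracker()
	s1, s2, s3 := m.newSession(), m.newSession(), m.newSession()
	s3.close() // e.g. the client selected a different mailbox
	m.queueNumMessages(5)
	return len(s1.updates), len(s2.updates), len(s3.updates)
}

func main() {
	first, second, closed := demo()
	fmt.Println("queued per session:", first, second, closed)
}
```

Each live session ends up with its own copy of the update, which is the opposite of the shared-channel design from Crash 1, where one reader's gain was another's loss. A session that is never closed stays registered, so skipping the Close call on re-SELECT would leak an entry per mailbox switch.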

The Result

  • Instant notifications: Emails appear in Apple Mail within 100ms of delivery
  • No crashes: Native MailboxTracker handles all edge cases
  • Production ready: Proper error handling, timeouts, and logging

Performance Impact

Memory

  • Before: 149.65 MB (29% of 512MB limit)
  • After: 9.39 MB (1.8% of limit)
  • Improvement: 94% reduction

Latency

  • Email delivery to notification: <100ms
  • Client sees update: <200ms
  • Zero blocking on SMTP delivery

Throughput

The system can handle:

  • 500-1000 concurrent IMAP connections
  • 2000-5000 emails per hour
  • 500-1000 mailboxes

Lessons Learned

1. Understand Library Internals

go-imap v1’s update system wasn’t designed for our use case. Once it was clear that the library reads from the update channel itself, the competing run() goroutine was easy to identify and remove.

2. Log Everything

Extensive logging was critical to debugging:

  • When updates were sent
  • When clients connected/disconnected
  • Which goroutine consumed each update

3. Buffer Sizes Matter

Large buffers (10,000) prevent blocking under burst loads but use more memory. Small buffers (100) save memory but drop updates under load. Finding the right balance is crucial.

4. Async is Not Free

Async notifications improve latency but add complexity:

  • Goroutine management
  • Error handling (an unrecovered panic in a spawned goroutine takes down the whole process, and the caller cannot recover it)
  • Monitoring (goroutine leaks)

5. O(1) > O(n) for Hot Paths

File access is a hot path in email servers. The O(1) optimization for 99% of files provides massive performance gains with minimal complexity.

6. Library Bugs Happen

go-imap v1.2.1 had a crash bug in internal code. Sometimes the only solution is upgrading to a newer version.


Code Evolution Summary

Initial Implementation (v1)

// @filename: main.go
type UpdateHub struct {
    mu             sync.RWMutex
    clients        map[chan backend.Update]*clientState
    updateCh       chan backend.Update
    wg             sync.WaitGroup
    // plus a run() goroutine draining updateCh, which stole updates from go-imap
}

Issues:

  • Goroutine race condition
  • go-imap v1.2.1 crashes on MailboxStatus
  • Internal library bug in SELECT response

Final Implementation (v2)

// @filename: main.go
type Server struct {
    trackersMu sync.RWMutex
    trackers   map[int64]*imapserver.MailboxTracker
}

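func (s *Server) NotifyMailboxUpdate(mailboxID int64) {
    // Condensed: the full version adds a timeout and skips untracked mailboxes
    tracker := s.GetMailboxTracker(mailboxID)
    stats, err := s.store.GetMailboxStats(context.Background(), mailboxID)
    if err != nil {
        return
    }
    tracker.QueueNumMessages(uint32(stats.Messages))
}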

Benefits:

  • Native IDLE support
  • No goroutine management (handled by library)
  • No crashes
  • Production-ready error handling

Testing Checklist

Before deploying to production:

  • ✅ Concurrent client connections (10-100 simultaneous IDLE)
  • ✅ High load (1000 emails/minute)
  • ✅ Large mailboxes (10,000+ messages)
  • ✅ Client disconnection during IDLE
  • ✅ Server restart with active IDLE connections
  • ✅ Memory leaks (long-running tests)
  • ✅ Edge cases (empty mailbox, rapid delivery)
  • ✅ Timeout handling (GetMailboxStats with 5s timeout)

Conclusion

Implementing IMAP IDLE seemed simple at first. Three crashes and a library upgrade later, we achieved production-grade instant email notifications.

The key lessons:

  1. Concurrency is tricky - Goroutine race conditions can be subtle
  2. Library limitations exist - Sometimes you need to upgrade, not work around
  3. Performance matters - O(1) file access and async notifications made a 10x difference
  4. Logging is essential - Without detailed logs, debugging would have been impossible

The final implementation uses go-imap v2’s native MailboxTracker API, providing instant notifications with zero crashes and minimal memory overhead.



Article written based on actual production implementation at mail.fenilsonani.com
