
Messages Operations Guide

Practical operational guidance for monitoring messaging system health, debugging common issues, and handling production incidents.

System Health Monitoring

Key Metrics to Track

Message volume and read rate:

sql
SELECT
    DATE(created_at) as date,
    COUNT(*) as messages_sent,
    COUNT(CASE WHEN is_read = true THEN 1 END) as messages_read,
    ROUND(COUNT(CASE WHEN is_read = true THEN 1 END)::numeric / COUNT(*)::numeric * 100, 2) as read_rate
FROM messages
WHERE created_at >= NOW() - INTERVAL '7 days'
GROUP BY DATE(created_at)
ORDER BY date DESC;

Reminder effectiveness:

sql
SELECT
    recipient_type,
    level,
    COUNT(*) as total_reminders,
    COUNT(CASE WHEN sent_at IS NOT NULL THEN 1 END) as sent,
    COUNT(CASE WHEN cancelled_at IS NOT NULL THEN 1 END) as cancelled,
    ROUND(COUNT(CASE WHEN cancelled_at IS NOT NULL THEN 1 END)::numeric / COUNT(*)::numeric * 100, 2) as cancel_rate
FROM message_thread_reminders
WHERE created_at >= NOW() - INTERVAL '30 days'
GROUP BY recipient_type, level;

AI service health:

bash
# List failed AI analysis jobs
docker-compose exec -T wedissimo-api php artisan queue:failed | grep CheckMessageNeedsResponseJob

Queue depth:

bash
# Monitor queue status
docker-compose exec -T wedissimo-api php artisan queue:monitor default --max=100

Alerting Thresholds

Critical alerts:

  • Queue depth > 500 for more than 10 minutes
  • AI service failure rate > 5% over 1 hour
  • Message delivery failure rate > 2%

Warning alerts:

  • Average message response time > 48 hours
  • Unread message count > 1000 per vendor
  • Queue processing lag > 15 minutes
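
The rate-based thresholds above reduce to a simple ratio check. A minimal sketch, assuming you already have failure and total counts from your own metrics source; the function name `rate_exceeds` is hypothetical and not part of the Wedissimo tooling:

```shell
# Hypothetical helper: succeeds (exit 0) when failed/total, expressed as a
# percentage, is strictly greater than limit_pct. Integer inputs only.
rate_exceeds() {
    local failed="$1" total="$2" limit_pct="$3"
    # No traffic means no measurable failure rate, so never alert.
    [ "$total" -gt 0 ] || return 1
    # Cross-multiply to stay in integer arithmetic:
    # failed / total * 100 > limit_pct  <=>  failed * 100 > limit_pct * total
    [ $(( failed * 100 )) -gt $(( limit_pct * total )) ]
}
```

For example, `rate_exceeds 7 120 5 && echo "AI failure rate above 5%"` prints the warning, since 7 failures out of 120 jobs is about 5.8%.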

Common Issues & Solutions

Issue: Reminders Not Being Sent

Symptoms:

  • Users report not receiving reminder emails
  • Database shows reminders scheduled but not sent
  • sent_at column remains NULL past scheduled time

Diagnostic steps:

  1. Check scheduler is running:
bash
docker-compose exec -T wedissimo-api php artisan schedule:list | grep messages:send-reminders
  2. Verify command execution:
bash
docker-compose logs -f wedissimo-api | grep "messages:send-reminders"
  3. Check pending reminders:
sql
SELECT COUNT(*), recipient_type, level
FROM message_thread_reminders
WHERE scheduled_for <= NOW()
  AND sent_at IS NULL
  AND cancelled_at IS NULL
GROUP BY recipient_type, level;

Solutions:

  • Ensure Laravel scheduler cron is running
  • Restart queue workers: docker-compose restart wedissimo-api
  • Verify mail settings, then rebuild the cached config so changes take effect: php artisan config:cache
  • Manually run: php artisan messages:send-reminders
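
Laravel's scheduler only fires if something invokes `php artisan schedule:run` every minute, conventionally a cron entry. A small sketch for checking a crontab for that entry; `has_scheduler_cron` is a hypothetical helper, and it assumes the scheduler is driven by cron rather than by a long-running `schedule:work` process:

```shell
# Hypothetical helper: reads crontab text on stdin and succeeds if any
# non-comment line invokes "artisan schedule:run".
has_scheduler_cron() {
    # Drop commented-out lines first so a disabled entry does not count.
    grep -v '^[[:space:]]*#' | grep -q 'artisan schedule:run'
}
```

Usage: `crontab -l | has_scheduler_cron || echo "no schedule:run entry found"`. Whether the crontab lives on the host or inside the container depends on the deployment, so run the check wherever the cron daemon actually runs.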

Issue: AI Service Timeouts

Symptoms:

  • Slow message creation
  • Failed jobs in queue
  • Vertex AI timeout errors in logs

Diagnostic steps:

  1. Check recent failures:
bash
docker-compose exec -T wedissimo-api php artisan queue:failed | grep CheckMessageNeedsResponseJob
  2. Test AI service directly:
bash
docker-compose exec -T wedissimo-api php artisan tinker
php
$ai = app(\Modules\Messages\Services\VertexAiScanningService::class);
$result = $ai->scanForNeedsResponse("Can you help me with my wedding?");
dd($result);

Solutions:

  • Check config in modules/Messages/Config/config.php
  • The job has built-in retry with backoff (60s, 120s, 300s)
  • On failure, job defaults to scheduling reminders (fail-safe)
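
To estimate how long a message can wait before the fail-safe applies, the backoff schedule can be summed. A sketch under the assumption that the three delays listed above apply to the first, second, and third retry in order, and that any later retry reuses the last value (which is how Laravel treats an array backoff); both function names are hypothetical:

```shell
# Delay in seconds before retry number N (assumed mapping, see lead-in).
backoff_delay() {
    case "$1" in
        1) echo 60 ;;
        2) echo 120 ;;
        *) echo 300 ;;   # third and any later retry
    esac
}

# Total seconds spent waiting across the first N retries.
total_backoff() {
    local retries="$1" i=1 total=0
    while [ "$i" -le "$retries" ]; do
        total=$(( total + $(backoff_delay "$i") ))
        i=$(( i + 1 ))
    done
    echo "$total"
}
```

`total_backoff 3` prints 480, so under these assumptions the fallback is reached after roughly eight minutes of waiting, plus however long each attempt itself runs before timing out.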

Issue: Duplicate Reminders

Symptoms:

  • Users receiving multiple reminder emails
  • Database shows duplicate reminder records

Diagnostic steps:

  1. Check for duplicate records:
sql
SELECT thread_id, trigger_message_id, level, COUNT(*)
FROM message_thread_reminders
GROUP BY thread_id, trigger_message_id, level
HAVING COUNT(*) > 1;
  2. Check unique constraint exists:
sql
SELECT indexname FROM pg_indexes
WHERE tablename = 'message_thread_reminders'
AND indexname LIKE '%unique%';

Solutions:

  • The unique_reminder constraint prevents duplicates
  • MessageReminderService::sendReminder() uses pessimistic locking
  • Review if jobs are being dispatched multiple times

Issue: Reminders Not Cancelled When User Responds

Symptoms:

  • User responds but still receives reminders
  • cancelled_at not set after response

Diagnostic steps:

  1. Check event listener is registered:
bash
docker-compose exec -T wedissimo-api php artisan tinker
>>> app()->make('events')->getListeners(\Modules\Messages\Events\ParticipantRespondedToThread::class)
  2. Check if event was fired:
bash
docker-compose logs -f wedissimo-api | grep "ParticipantRespondedToThread"
  3. Check pending reminders for user:
sql
SELECT * FROM message_thread_reminders
WHERE recipient_id = 'user-uuid'
  AND sent_at IS NULL
  AND cancelled_at IS NULL;

Solutions:

  • Verify CancelRemindersOnResponse listener is registered in MessagesServiceProvider
  • Check MessageService::sendMessage() fires the event
  • Manually cancel: MessageReminderService::cancelPendingRemindersForRecipient()

Performance Optimization

Slow Reminder Queries

Problem: Reminder scheduler running slowly.

Solution - Verify partial index:

sql
-- Check the partial index exists
SELECT indexname, indexdef FROM pg_indexes
WHERE tablename = 'message_thread_reminders';

-- Should see reminders_due_idx with WHERE clause

Solution - Check query plan:

sql
EXPLAIN ANALYZE
SELECT * FROM message_thread_reminders
WHERE scheduled_for <= NOW()
  AND sent_at IS NULL
  AND cancelled_at IS NULL;

Queue Backlogs

Problem: Jobs are being queued faster than workers process them.

Solution - Scale workers:

bash
# Run additional worker
docker-compose exec -T wedissimo-api php artisan queue:work --queue=default --tries=3

Solution - Check queue size:

bash
# Report the current size of the default queue
docker-compose exec -T wedissimo-api php artisan queue:monitor default

Data Maintenance

Cleaning Up Old Reminders

Remove sent reminders after 6 months:

php
// Via tinker
MessageThreadReminder::query()
    ->whereNotNull('sent_at')
    ->where('sent_at', '<', now()->subMonths(6))
    ->delete();

Cancel stale pending reminders:

php
// Cancel reminders for messages older than 30 days
MessageThreadReminder::query()
    ->whereNull('sent_at')
    ->whereNull('cancelled_at')
    ->whereHas('triggerMessage', fn($q) => $q->where('created_at', '<', now()->subDays(30)))
    ->update([
        'cancelled_at' => now(),
        'cancellation_reason' => 'stale',
    ]);

Cleaning Up Old Messages

Soft-deleted message cleanup (after 90 days):

php
Message::onlyTrashed()
    ->where('deleted_at', '<', now()->subDays(90))
    ->forceDelete();

Incident Response

Message Delivery Failure

Immediate actions:

  1. Check mail service status (Mailpit/SES)
  2. Verify queue workers are running
  3. Review recent failed jobs
  4. Check notification channel configuration

Recovery steps:

  1. Retry failed jobs: php artisan queue:retry all
  2. Restart queue workers: docker-compose restart wedissimo-api
  3. Verify delivery with test message
  4. Monitor queue for 15 minutes
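
The final recovery step can be scripted as a fixed-interval poll. A minimal sketch; `poll` is a hypothetical helper that runs any command a set number of times with a pause between runs:

```shell
# Hypothetical helper: poll <interval_seconds> <count> <command...>
# Runs the command <count> times, sleeping <interval_seconds> between runs.
poll() {
    local interval="$1" count="$2" i=0
    shift 2
    while [ "$i" -lt "$count" ]; do
        "$@"
        i=$(( i + 1 ))
        # Skip the sleep after the final run.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `poll 60 15 docker-compose exec -T wedissimo-api php artisan queue:monitor default --max=100` takes 15 samples one minute apart, which covers about 15 minutes.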

AI Service Outage

Immediate actions:

  1. Jobs will fail and retry automatically
  2. After 3 retries, failed() method schedules reminders as fallback
  3. Monitor failed job count

Check fallback behavior:

php
// The job's failed() method schedules reminders even if AI fails
// This is the safe default - better to remind than miss a booking

Recovery:

  1. Test AI service restoration via tinker
  2. Retry failed jobs: php artisan queue:retry all
  3. Monitor AI success rate

Runbook Checklist

Daily checks:

  • [ ] Queue depth < 100
  • [ ] No failed AI jobs in last 24 hours
  • [ ] Scheduler running (check schedule:list)
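
The failed-job check can be reduced to a single number. A sketch, assuming `queue:failed` prints one line per failed job that includes the job class name (the same assumption the `grep` commands in this guide rely on); `count_failed_jobs` is a hypothetical helper, and it counts every retained failed job, not only those from the last 24 hours:

```shell
# Hypothetical helper: counts stdin lines containing the given job class name.
count_failed_jobs() {
    # grep -c prints 0 but exits non-zero when nothing matches; mask that
    # so the helper is safe to use under `set -e`.
    grep -cF -- "$1" || true
}
```

Usage: `docker-compose exec -T wedissimo-api php artisan queue:failed | count_failed_jobs CheckMessageNeedsResponseJob`.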

Weekly checks:

  • [ ] Review reminder cancellation rates (40-60% is healthy - means users are responding)
  • [ ] Check for message spam patterns
  • [ ] Review slow query logs
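
The 40-60% band for cancellation rates can be checked mechanically from the `cancelled` and `total_reminders` columns of the reminder effectiveness query. A sketch with a hypothetical `cancel_rate_band` helper; it treats both ends of the band as healthy, which is an assumption since the guide does not say whether the bounds are inclusive:

```shell
# Hypothetical helper: classifies cancelled/total against the 40-60% band.
# Prints one of: no-data, below, healthy, above.
cancel_rate_band() {
    local cancelled="$1" total="$2"
    if [ "$total" -le 0 ]; then
        echo "no-data"
        return 0
    fi
    # Integer comparison: cancelled/total*100 vs 40 and 60.
    if [ $(( cancelled * 100 )) -lt $(( 40 * total )) ]; then
        echo "below"
    elif [ $(( cancelled * 100 )) -gt $(( 60 * total )) ]; then
        echo "above"
    else
        echo "healthy"
    fi
}
```

For instance, `cancel_rate_band 45 100` prints `healthy`, while `cancel_rate_band 10 100` prints `below`, which would suggest recipients are not responding before reminders go out.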

Monthly checks:

  • [ ] Clean up old sent reminders
  • [ ] Archive messages > 2 years old
  • [ ] Review reminder effectiveness metrics

Key Commands

bash
# Send pending reminders manually
docker-compose exec -T wedissimo-api php artisan messages:send-reminders

# Check scheduler
docker-compose exec -T wedissimo-api php artisan schedule:list

# Check failed jobs
docker-compose exec -T wedissimo-api php artisan queue:failed

# Retry failed jobs
docker-compose exec -T wedissimo-api php artisan queue:retry all

# Rebuild config cache (use config:clear to only clear it)
docker-compose exec -T wedissimo-api php artisan config:cache

Support Resources

Laravel Queue Documentation: https://laravel.com/docs/queues

Internal Contacts:

  • Platform Team: #platform-support
  • AI Integration: #ai-engineering
