HTTPS / TLS Certificate Management - Team Guide
Last Updated: November 10, 2025
Target Audience: Operations Team, DevOps, Developers
Table of Contents
- Current Certificate Status
- How Certificates Are Generated
- How Certificates Are Renewed
- Monitoring & Alerts
- What To Do When Things Go Wrong
- Manual Certificate Hosts
Current Certificate Status
✅ Automated Let's Encrypt (90-day certs, auto-renewed 31 days before expiry)
Production, test, QA, sandbox, and infrastructure hosts use Let's Encrypt certificates that are obtained and renewed automatically, except for the hosts marked ❌ in the table below:
| Environment | Hosts | Cert Status | Last Renewed |
|---|---|---|---|
| Production | app3, app4 | ✅ Valid ~90 days | Nov 9-10, 2025 |
| Production | shopify3, shopify4 | ✅ Valid ~90 days | Nov 9, 2025 |
| Production | bigcommerce3, bigcommerce4 | ✅ Valid ~90 days | Nov 9, 2025 |
| Production | foxycart3, foxycart4 | ✅ Valid ~90 days | Nov 9, 2025 |
| Production | bot | ✅ Valid ~90 days | Nov 9, 2025 |
| Sandbox | sandbox3, sandbox4 | ✅ Valid ~90 days | Nov 9, 2025 |
| Sandbox | shopify-sandbox3, shopify-sandbox4 | ✅ Valid ~90 days | Nov 7-8, 2025 |
| Sandbox | bigcommerce-sandbox3, bigcommerce-sandbox4 | ✅ Valid ~90 days | Nov 7, 2025 |
| Sandbox | foxycart-sandbox3, foxycart-sandbox4 | ✅ Valid ~90 days | Nov 9, 2025 |
| Test | test | ✅ Valid ~90 days | Nov 7, 2025 |
| Test | shopify-test | ✅ Valid ~90 days | Nov 8, 2025 |
| Test | bigcommerce-test | ✅ Valid ~90 days | Nov 8, 2025 |
| Test | foxycart-test | ✅ Valid ~90 days | Nov 9, 2025 |
| Test | bot-test | ✅ Valid ~90 days | Nov 9, 2025 |
| Test | test.findaplace.xyz | ❌ No certificate | TBD |
| QA | qa | ✅ Valid ~90 days | Nov 7, 2025 |
| QA | qa.findaplace.xyz | ❌ No certificate | TBD |
| Sandbox | sandbox.findaplace.xyz | ❌ No certificate | TBD |
| Infrastructure | npm | ✅ Valid ~90 days | Nov 9, 2025 |
| Infrastructure | devops | ✅ Valid ~90 days | Nov 8, 2025 |
| Infrastructure | cassandra | ✅ Valid ~90 days | Nov 8, 2025 |
⚠️ Missing Certificates
Some of the hosts above (marked ❌) have not been issued a certificate. Setting one up is not zero cost, so I've listed them above for your information only. If you need a certificate for any of those hosts, or for a host that is not listed, please reach out.
How Certificates Are Generated
Initial Certificate Issuance
Certificates are obtained automatically during server deployment using Certbot (Let's Encrypt ACME client).
Process Flow:
- Deployment Triggered: When you run
  ansible-playbook -i tot -e hostGroup=<service> playbooks/deployService.yml
- Prerequisites Setup:
  - Nginx is deployed with ACME challenge support built into templates
  - Webroot directory created at /var/www/letsencrypt
  - ACME challenge endpoint configured: http://<domain>/.well-known/acme-challenge/
- Certificate Request:
  - Certbot contacts Let's Encrypt ACME servers
  - Let's Encrypt sends a challenge to verify domain ownership
  - Challenge file is placed in /var/www/letsencrypt/.well-known/acme-challenge/
  - Let's Encrypt validates by fetching http://<domain>/.well-known/acme-challenge/<token>
- Certificate Issued:
  - If validation succeeds, Let's Encrypt issues certificate (90-day validity)
  - Certificate stored in /etc/letsencrypt/live/<cert-name>/
  - Symlinks created in /etc/nginx/certs/ pointing to Let's Encrypt certs
  - Nginx reloaded with new certificates
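To confirm the challenge path end to end before a certificate is requested, a small probe can mimic what Let's Encrypt does: write a file under the webroot and fetch it over plain HTTP. The following is a minimal sketch, not part of the Ansible role. The function names and the probe token are hypothetical, and it assumes curl is installed and the caller can write to the webroot.

```shell
# Sketch: mimic the HTTP-01 check by serving a throwaway file from the webroot.
# Function names and the probe token are illustrative, not part of the certbot role.

WEBROOT="${WEBROOT:-/var/www/letsencrypt}"

# Print the filesystem path where a challenge token would be written.
acme_probe_path() {
  printf '%s/.well-known/acme-challenge/%s\n' "$WEBROOT" "$1"
}

# Print the URL Let's Encrypt would fetch for the same token.
acme_probe_url() {
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Write a probe file, fetch it over HTTP, compare, and clean up.
# Returns 0 only if the body served by Nginx matches what was written.
acme_probe() {
  domain="$1"
  token="probe-$$"
  path="$(acme_probe_path "$token")"
  mkdir -p "$(dirname "$path")" || return 1
  printf 'ok-%s' "$token" > "$path" || return 1
  body="$(curl -fsS --max-time 10 "$(acme_probe_url "$domain" "$token")")"
  rc=$?
  rm -f "$path"
  [ "$rc" -eq 0 ] && [ "$body" = "ok-$token" ]
}
```

From a root shell on the host, `acme_probe <domain> && echo reachable` would then indicate that DNS, port 80, and the Nginx location block all line up.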
Configuration Location
Certificates are configured in Ansible group variables:
# Example: group_vars/foxycart-sandbox.yaml
certbot_enabled: true
certbot_email: operations+letsencrypt@tokenoftrust.com
certbot_install_method: apt
certbot_renewal_days: 31
certbot_domains:
  - domains:
      - "foxycart-sandbox.tokenoftrust.com"
      - "foxycart-sandbox3.tokenoftrust.com"
      - "foxycart-sandbox4.tokenoftrust.com"
    webroot: "/var/www/letsencrypt"
    cert_name: "foxycart-sandbox"
certbot_services:
  - name: "foxycart-sandbox"
    cert_source: "foxycart-sandbox"
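To make the mapping from these variables to a certificate request concrete, the sketch below prints a certbot command built from one certbot_domains entry. It is an illustration only: the helper name is hypothetical, and the certbot role may pass different or additional flags. The flags shown are standard certbot options.

```shell
# Sketch: print the certbot command that corresponds to one certbot_domains entry.
# The helper name is hypothetical; the real role may invoke certbot differently.
# Usage: certbot_issue_cmd <cert_name> <webroot> <email> <domain> [<domain>...]
certbot_issue_cmd() {
  cert_name="$1"
  webroot="$2"
  email="$3"
  shift 3
  cmd="certbot certonly --webroot -w $webroot --cert-name $cert_name"
  # One -d flag per entry in the domains list; all end up on a single certificate.
  for d in "$@"; do
    cmd="$cmd -d $d"
  done
  cmd="$cmd --email $email --agree-tos --non-interactive"
  printf '%s\n' "$cmd"
}
```

For the example above, `certbot_issue_cmd foxycart-sandbox /var/www/letsencrypt operations+letsencrypt@tokenoftrust.com foxycart-sandbox.tokenoftrust.com foxycart-sandbox3.tokenoftrust.com foxycart-sandbox4.tokenoftrust.com` prints a single command with three -d flags.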
How Certificates Are Renewed
Automated Renewal Process
Certificates are automatically renewed 31 days before expiration using a daily cron job.
Renewal Flow:
- Daily Check (3:00 AM + random jitter):
  - Cron job runs on each server: /usr/local/sbin/tot-ops-monitor --only cert_renewal_<service>
  - Script checks certificate expiration via TLS handshake
  - Jitter (0-15 minutes) prevents simultaneous renewals across servers
- Expiration Check:
  - If certificate expires in < 31 days → Trigger renewal
  - If certificate expires in ≥ 31 days → No action, exit silently
- Renewal Attempt:
  certbot renew --cert-name <cert-name> --quiet
  - Certbot performs ACME challenge (same as initial issuance)
  - Let's Encrypt validates domain ownership
  - New certificate issued (90-day validity)
- Nginx Reload:
  - Certbot deploy hook (/etc/letsencrypt/renewal-hooks/deploy/<service>.sh) triggered
  - Script validates Nginx config: nginx -t
  - If valid → Reload Nginx: systemctl reload nginx
  - If invalid → Abort reload, log error
- Notification Sent (see Monitoring & Alerts)
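The expiration check in the flow above comes down to one comparison: days remaining versus the 31-day threshold. The sketch below shows that decision in isolation. It is not the actual cert_renewal.sh monitor. The function names and the RENEWAL_DAYS variable are hypothetical, and cert_expiry_epoch assumes GNU date and openssl are available.

```shell
# Sketch of the "renew if fewer than N days remain" decision.
# Names are illustrative; the real tot-ops-monitor script may differ.
RENEWAL_DAYS="${RENEWAL_DAYS:-31}"   # mirrors certbot_renewal_days

# Whole days between two Unix timestamps, truncated toward zero.
days_remaining() {
  now="$1"
  expiry="$2"
  echo $(( (expiry - now) / 86400 ))
}

# Exit status 0 means "renew now".
should_renew() {
  [ "$1" -lt "$RENEWAL_DAYS" ]
}

# Read notAfter from the live endpoint via a TLS handshake and convert it to epoch seconds.
# Assumes GNU date, which parses openssl's "Feb  7 20:58:29 2026 GMT" format.
cert_expiry_epoch() {
  domain="$1"
  not_after="$(echo | openssl s_client -connect "$domain:443" -servername "$domain" 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2)"
  [ -n "$not_after" ] && date -u -d "$not_after" +%s
}
```

With these helpers, a certificate with 30 days left triggers a renewal and one with exactly 31 days left does not, matching the < 31 and ≥ 31 rules above.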
Renewal Safety Features
- Rate Limit Protection: Certbot respects Let's Encrypt rate limits (no forced renewals)
- Flock Protection: Prevents concurrent renewals on same server
- Config Validation: Nginx config tested before reload to avoid downtime
- Idempotent: Safe to run multiple times
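The flock protection can be pictured as a wrapper that refuses to start when another renewal already holds the lock. This is a minimal sketch assuming the util-linux flock utility. The function name, the lock file path, and the exit code 75 are illustrative choices, not taken from the monitor script.

```shell
# Sketch: run a command under an exclusive, non-blocking lock.
# Usage: run_locked <lockfile> <command> [args...]
run_locked() {
  lockfile="$1"
  shift
  (
    # fd 9 stays open on the lock file for the lifetime of this subshell,
    # so the lock is released automatically when the command finishes.
    flock -n 9 || { echo "renewal already in progress, skipping" >&2; exit 75; }
    "$@"
  ) 9>"$lockfile"
}
```

A cron entry could then call something like `run_locked /var/lock/cert-renewal.lock certbot renew --cert-name <cert-name> --quiet` (lock path hypothetical). A second invocation that overlaps the first exits immediately instead of racing it.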
Certificate Storage
/etc/letsencrypt/
├── live/<cert-name>/
│ ├── fullchain.pem → Certificate + intermediate chain (for Nginx ssl_certificate)
│ ├── chain.pem → Intermediate chain only (for Nginx ssl_trusted_certificate)
│ ├── cert.pem → Certificate only
│ └── privkey.pem → Private key (for Nginx ssl_certificate_key)
├── renewal/
│ └── <cert-name>.conf → Renewal configuration
└── renewal-hooks/
    └── deploy/
        └── <service>.sh → Post-renewal script (reloads Nginx)
/etc/nginx/certs/
├── <service>.fullchain.crt → symlink to /etc/letsencrypt/live/<cert-name>/fullchain.pem
├── <service>.ca-bundle.crt → symlink to /etc/letsencrypt/live/<cert-name>/chain.pem
└── <service>.key → symlink to /etc/letsencrypt/live/<cert-name>/privkey.pem
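Because Nginx reads the certificate through these symlinks, a broken or stale link is a common reason a renewed certificate never reaches clients. The sketch below checks the three links against the layout shown above. The function name and the two override variables are hypothetical, and it assumes a readlink that supports -f (GNU coreutils).

```shell
# Sketch: verify that the three Nginx cert symlinks resolve into the Let's Encrypt live directory.
# Usage: check_cert_links <service> <cert-name>
NGINX_CERT_DIR="${NGINX_CERT_DIR:-/etc/nginx/certs}"
LE_LIVE_DIR="${LE_LIVE_DIR:-/etc/letsencrypt/live}"

check_cert_links() {
  service="$1"
  cert_name="$2"
  status=0
  # Each pair is <nginx suffix>:<letsencrypt file>, as in the layout above.
  for pair in fullchain.crt:fullchain.pem ca-bundle.crt:chain.pem key:privkey.pem; do
    link="$NGINX_CERT_DIR/$service.${pair%%:*}"
    want="$LE_LIVE_DIR/$cert_name/${pair##*:}"
    # Compare fully resolved paths so chained symlinks (live/ -> archive/) are handled.
    if [ ! -e "$want" ] || [ "$(readlink -f "$link")" != "$(readlink -f "$want")" ]; then
      echo "MISMATCH: $link does not resolve to $want"
      status=1
    fi
  done
  return "$status"
}
```

Run as root (the live directory is not world-readable), `check_cert_links <service> <cert-name>` prints nothing and returns 0 when all three links are intact.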
Monitoring & Alerts
How You Get Notified
All certificate events send email notifications to Trello.
Notification Destinations
Primary: darrin84+u863fd0rvj35nrdbfiy2@boards.trello.com
From: operations@tokenoftrust.com
Emails are delivered to a Trello board via email-to-board integration, creating a card for each event.
Alert Types
1. ✅ Successful Renewal
Subject: [SUCCESS] <hostname> – Cert renewed for <domain>
Content:
Certificate for <domain> renewed successfully.
Previous expiry: <date> (<X> days remaining)
New expiry: <date> (~90 days)
Cert name: <cert-name>
Renewed at: 2025-11-09T20:58:29Z
Nginx has been reloaded with the new certificate.
Action Required: None. This is a success confirmation.
2. ❌ Renewal Failure
Subject: [ERROR] <hostname> – Cert renewal FAILED for <domain>
Content:
Certificate renewal for <domain> FAILED.
Current expiry: <date> (<X> days remaining)
Cert name: <cert-name>
Renewal attempted at: 2025-11-09T20:58:29Z
Error output:
<certbot error details>
Manual intervention required. Check:
1. Nginx is serving /.well-known/acme-challenge/ from /var/www/letsencrypt
2. DNS correctly points to this server
3. Port 80 is accessible externally
4. Certbot logs: /var/log/letsencrypt/letsencrypt.log
Troubleshooting commands:
curl -I http://<domain>/.well-known/acme-challenge/
nginx -t
certbot certificates
tail -100 /var/log/letsencrypt/letsencrypt.log
Action Required: Immediate investigation (see What To Do When Things Go Wrong)
3. ℹ️ Expiration Warning (Manual Certs Only)
For manually managed certificates, a separate monitor checks expiration:
Subject: [WARNING] <hostname> – Certificate expires in <X> days
Action Required: Manual renewal needed (see Manual Certificate Hosts)
Where Alerts Are Logged
In addition to Trello notifications:
Cron Logs:
sudo grep -i certbot /var/log/syslog
System Logs (on each server):
sudo journalctl -t certbot-deploy
sudo tail -f /var/log/letsencrypt/letsencrypt.log
What To Do When Things Go Wrong
Scenario 1: Certificate Renewal Failed
Symptoms: You receive an [ERROR] Cert renewal FAILED notification
Troubleshooting Steps:
1. SSH to the affected server:
   ssh ubuntu@<hostname>
2. Check certificate status:
   sudo certbot certificates
3. Check ACME challenge endpoint:
   curl -I http://<domain>/.well-known/acme-challenge/
   # Should return 404 (nginx is serving the path) or 200 (if test file exists)
4. Verify Nginx configuration:
   sudo nginx -t
5. Check Certbot logs:
   sudo tail -100 /var/log/letsencrypt/letsencrypt.log
6. Test renewal manually (dry-run, safe to run):
   sudo certbot renew --cert-name <cert-name> --dry-run
7. Force renewal (if dry-run succeeds):
   sudo certbot renew --cert-name <cert-name> --force-renewal
Common Issues:
| Error | Cause | Fix |
|---|---|---|
| DNS problem: NXDOMAIN | DNS not resolving | Check DNS records, wait for propagation |
| Connection refused | Port 80 blocked | Check firewall, security groups |
| 404 on ACME challenge | Nginx not configured | Re-run deployService.yml playbook |
| Rate limit exceeded | Too many renewal attempts | Wait 1 hour, then retry |
| Nginx reload failed | Bad nginx config | Fix config, run nginx -t, then reload |
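When triaging from the error output in a failure notification, the table can be applied mechanically. The sketch below maps error text to the fix column. The function name is hypothetical and the patterns follow the table's wording (plus the ACME rateLimited error type), so real certbot messages may need additional patterns.

```shell
# Sketch: map certbot error text to the fix listed in the Common Issues table.
# Patterns are illustrative; actual certbot wording varies by version and failure.
suggest_fix() {
  case "$1" in
    *NXDOMAIN*)                       echo "Check DNS records, wait for propagation" ;;
    *"Connection refused"*)           echo "Check firewall, security groups" ;;
    *404*)                            echo "Re-run deployService.yml playbook" ;;
    *[Rr]"ate limit"*|*rateLimited*)  echo "Wait 1 hour, then retry" ;;
    *"reload failed"*)                echo "Fix config, run nginx -t, then reload" ;;
    *)                                echo "No known match; read /var/log/letsencrypt/letsencrypt.log" ;;
  esac
}
```

For example, `suggest_fix "$(sudo tail -20 /var/log/letsencrypt/letsencrypt.log)"` gives a first hint before reading the log in full.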
Scenario 2: Certificate Expired
Symptoms: Browser shows "Certificate Expired" error, or monitoring alert for < 0 days remaining
Immediate Actions:
1. Force renewal immediately:
   ssh ubuntu@<hostname>
   sudo certbot renew --cert-name <cert-name> --force-renewal
2. If renewal fails, request a new certificate:
   sudo certbot certonly --webroot -w /var/www/letsencrypt \
     -d <domain> --cert-name <cert-name> --force-renewal
3. Reload Nginx:
   sudo nginx -t && sudo systemctl reload nginx
4. Verify new certificate:
   echo | openssl s_client -connect <domain>:443 -servername <domain> 2>/dev/null \
     | openssl x509 -noout -dates
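During an outage it helps to have the recovery commands chained so that the fallback request only runs when the forced renewal fails. The following is a minimal sketch of that sequence, assuming it runs on the affected host. The function name and the RUN and WEBROOT variables are hypothetical. Setting RUN=echo previews the commands without executing them.

```shell
# Sketch: expired-certificate recovery using the commands from this scenario.
# RUN is the command prefix; it defaults to sudo and can be set to "echo" to preview.
RUN="${RUN:-sudo}"
WEBROOT="${WEBROOT:-/var/www/letsencrypt}"

# Usage: recover_expired_cert <cert-name> <domain>
recover_expired_cert() {
  cert_name="$1"
  domain="$2"
  # Force a renewal of the existing certificate first.
  if ! $RUN certbot renew --cert-name "$cert_name" --force-renewal; then
    # Fall back to requesting the certificate again via the webroot.
    $RUN certbot certonly --webroot -w "$WEBROOT" \
      -d "$domain" --cert-name "$cert_name" --force-renewal || return 1
  fi
  # Reload Nginx only if its configuration is valid.
  $RUN nginx -t && $RUN systemctl reload nginx
}
```

After it returns 0, the openssl s_client check from this scenario confirms that the new expiry date is actually being served.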
Scenario 3: Missing Certificate (QA Hosts)
Affected Hosts:
- shopify-qa.tokenoftrust.com
- bigcommerce-qa.tokenoftrust.com
- foxycart-qa.tokenoftrust.com
- bot-qa.tokenoftrust.com
Resolution:
These hosts need certbot configuration added to their group_vars. To fix:
1. Verify group_vars file exists:
   ls -la ~/src/tot/servers/ansible/group_vars/*-qa.yaml
2. Add certbot configuration (if missing):
   certbot_enabled: true
   certbot_email: operations+letsencrypt@tokenoftrust.com
   certbot_install_method: apt
   certbot_renewal_days: 31
   certbot_domains:
     - domains:
         - "<service>-qa.tokenoftrust.com"
       webroot: "/var/www/letsencrypt"
       cert_name: "<service>-qa"
   certbot_services:
     - name: "<service>-qa"
       cert_source: "<service>-qa"
3. Deploy certbot configuration:
   cd ~/src/tot/servers/ansible
   ansible-playbook -i tot -e hostGroup=<service>-qa --ask-become-pass \
     playbooks/deployService.yml
4. Verify certificate obtained:
   ansible -i tot <service>-qa -b -a "certbot certificates"
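Before editing anything, a quick scan shows which QA group_vars files already enable certbot. This is a minimal sketch using a plain line match rather than a YAML parser. The function names are hypothetical, and the *-qa.yaml naming is taken from the ls command in this scenario.

```shell
# Sketch: report which *-qa.yaml group_vars files enable certbot.
# Uses a simple line match, not a YAML parser, so unusual formatting may be missed.

# Exit status 0 if the file exists and sets certbot_enabled to true.
certbot_enabled_in() {
  [ -f "$1" ] && grep -Eq '^certbot_enabled:[[:space:]]*true' "$1"
}

# Usage: report_qa_group_vars <group_vars directory>
report_qa_group_vars() {
  dir="$1"
  for f in "$dir"/*-qa.yaml; do
    # An unmatched glob stays literal, so check that the file really exists.
    [ -e "$f" ] || { echo "no *-qa.yaml files in $dir"; return 1; }
    if certbot_enabled_in "$f"; then
      echo "OK       $f"
    else
      echo "MISSING  $f"
    fi
  done
}
```

Running `report_qa_group_vars ~/src/tot/servers/ansible/group_vars` lists each QA file with OK or MISSING. Only the MISSING ones need the configuration block added.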
Scenario 4: Nginx Won't Reload After Renewal
Symptoms: Renewal succeeds but Nginx fails to reload
Troubleshooting:
1. Check Nginx config syntax:
   sudo nginx -t
   - If config invalid, check for common issues:
     - Missing semicolons
     - Incorrect file paths in ssl_certificate directives
     - Syntax errors in recent changes
2. Check deploy hook logs:
   sudo journalctl -t certbot-deploy -n 50
3. Manually reload Nginx (after fixing config):
   sudo systemctl reload nginx
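For orientation while debugging, the validate-then-reload behaviour of the deploy hook can be sketched as below. This is not the contents of the deployed <service>.sh hook. The function name and the NGINX_BIN and SYSTEMCTL_BIN overrides are hypothetical. RENEWED_LINEAGE is the environment variable certbot sets for deploy hooks.

```shell
# Sketch of a deploy hook: validate the Nginx config, then reload.
# NGINX_BIN and SYSTEMCTL_BIN are overridable so the logic can be exercised without Nginx.
NGINX_BIN="${NGINX_BIN:-nginx}"
SYSTEMCTL_BIN="${SYSTEMCTL_BIN:-systemctl}"

reload_nginx_safely() {
  # Never reload on a broken config; the old certificate keeps being served instead.
  if ! "$NGINX_BIN" -t >/dev/null 2>&1; then
    echo "nginx -t failed; aborting reload"
    return 1
  fi
  if ! "$SYSTEMCTL_BIN" reload nginx; then
    echo "systemctl reload nginx failed"
    return 1
  fi
  # certbot sets RENEWED_LINEAGE to the live directory of the renewed certificate.
  echo "nginx reloaded for ${RENEWED_LINEAGE:-unknown lineage}"
}
```

Piping the output through `logger -t certbot-deploy` would make these messages show up under the journalctl -t certbot-deploy query used in this scenario.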
Manual Certificate Hosts
Hosts NOT Using Let's Encrypt
The following hosts still use manually managed certificates (not automated):
| Host | Certificate Type | Renewal Process |
|---|---|---|
| None currently | N/A | All hosts migrated to Let's Encrypt |
Note: As of November 2025, no host uses a manually managed certificate. Hosts that have a certificate use automated Let's Encrypt; hosts without one are marked ❌ under Current Certificate Status.
Quick Reference Commands
Check Certificate Status
# On any server
sudo certbot certificates
# Remote check
echo | openssl s_client -connect <domain>:443 -servername <domain> 2>/dev/null \
| openssl x509 -noout -dates -issuer
Test Renewal (Dry Run)
sudo certbot renew --dry-run
Force Immediate Renewal
sudo certbot renew --cert-name <cert-name> --force-renewal
Check Monitoring Cron
sudo crontab -l -u ubuntu | grep cert_renewal
View Recent Renewal Logs
sudo tail -100 /var/log/letsencrypt/letsencrypt.log
sudo journalctl -t certbot-deploy -n 50
Test ACME Endpoint
curl -I http://<domain>/.well-known/acme-challenge/
Additional Resources
- Ansible Playbooks: ~/src/tot/servers/ansible/playbooks/
  - deployService.yml - Full service deployment (includes certbot)
  - letsencrypt-setup.yml - Standalone certificate management
- Certbot Role Documentation: ~/src/tot/servers/ansible/roles/certbot/README.md
- Monitor Scripts: ~/src/tot/servers/ansible/roles/tot_ops_monitor/files/monitors/
  - cert_renewal.sh - Automatic renewal + notification
  - cert_expiry.sh - Expiration warnings
- Let's Encrypt Rate Limits: https://letsencrypt.org/docs/rate-limits/
  - 50 certificates per domain per week
  - 5 duplicate certificates per week
Questions?
Operations Team: operations@tokenoftrust.com
Managed By: Ansible role certbot + tot_ops_monitor
Code Repository: ~/src/tot/servers/ansible