Implementing a Rollback Strategy
#rollback #deployment #database #devops #cicd #release-management
When a deployment goes wrong, you need a clear way to reverse it. For application code, rollback is often straightforward: redeploy the previous build or artifact. For databases and other stateful systems, reverting is harder because of persistent storage and migration history. This post outlines two patterns—rolling back and rolling forward—and when to use each.
Rollback scripts are not a substitute for backup
Rollback (undo) scripts can reverse some schema changes, but they are not a full replacement for a backup and restore strategy. For example:
- Drops — If you drop a table or column, an undo script cannot bring the data back; you need to restore from backup.
- Data loss — Only backup/restore guarantees recovery of lost data.
So: design a rollback path and maintain tested backups. Use rollback scripts for controlled reversal of migrations where possible; restore from backup when data or structure has been lost and scripts cannot recover it.
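To make the limitation concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The `customers` table and its rows are hypothetical and exist only for illustration. The undo script restores the structure, but the rows are gone:

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the point.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

# Errant migration: drops the table together with its rows.
conn.execute("DROP TABLE customers")

# Undo script: can only recreate the structure.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")

rows_after_undo = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(rows_after_undo)  # 0: the schema is back, the data is not
```

Recovering the two rows would require a restore from a backup taken before the drop.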
Rolling back vs rolling forward
There are two ways to return the database (or schema) to a previous state:
| Approach | What it does | Version view |
|---|---|---|
| Rolling back | Restores a previous schema state as if the later deployment never happened. | After rolling back from v4 to v3, the database considers itself at v3. The v4 deployment is effectively undone. |
| Rolling forward | Treats the bad deployment as having happened, then applies a new deployment that restores the desired state. | You go from v4 to v5, where v5 is a new version whose schema matches what v3 had. History stays: v3 → v4 → v5. |
Example: You deployed v4, which added a column, and the migration went wrong. You want the schema to look like v3 again.
- Roll back: Revert to v3. The database is at v3; v4 is “erased” from the version history. You can later change or remove the v4 migration scripts so they are never run again.
- Roll forward: Deploy v5, where v5’s migrations undo the v4 changes (e.g. drop the column v4 added). The database ends up with the same structure as v3, but the audit trail is v3 → v4 → v5.
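The difference between the two approaches can be sketched in a few lines of Python with `sqlite3`. This is an illustration under stated assumptions, not how any particular migration tool works: the `schema_history` table, the `orders` and `order_notes` tables, and the helper functions are all hypothetical. The v4 change here adds a table rather than a column so the sketch runs on any SQLite version.

```python
import sqlite3

def new_db_at_v4():
    """Build an in-memory database that has been deployed up to v4."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical version-tracking table; real tools keep their own.
    conn.execute(
        "CREATE TABLE schema_history (version INTEGER PRIMARY KEY, description TEXT)"
    )
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute("INSERT INTO schema_history VALUES (3, 'create orders')")
    # v4: the deployment we want to reverse.
    conn.execute("CREATE TABLE order_notes (id INTEGER PRIMARY KEY, note TEXT)")
    conn.execute("INSERT INTO schema_history VALUES (4, 'add order_notes')")
    return conn

def history(conn):
    rows = conn.execute("SELECT version FROM schema_history ORDER BY version")
    return [row[0] for row in rows]

def tables(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name != 'schema_history' ORDER BY name"
    )
    return [row[0] for row in rows]

def roll_back(conn):
    """Undo v4 and remove it from the history: the database is at v3 again."""
    conn.execute("DROP TABLE order_notes")
    conn.execute("DELETE FROM schema_history WHERE version = 4")

def roll_forward(conn):
    """Apply a new v5 that reverses v4: the history keeps every step."""
    conn.execute("DROP TABLE order_notes")
    conn.execute(
        "INSERT INTO schema_history VALUES (5, 'remove order_notes (reverts v4)')"
    )

rolled_back = new_db_at_v4()
roll_back(rolled_back)
rolled_forward = new_db_at_v4()
roll_forward(rolled_forward)

print(history(rolled_back), tables(rolled_back))        # [3] ['orders']
print(history(rolled_forward), tables(rolled_forward))  # [3, 4, 5] ['orders']
```

Both databases end up with the same structure; only the recorded history differs.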
When to roll forward
As a general rule, for live production databases, rolling forward is preferred:
- Simpler — It’s just another deployment (v5) that reverses the bad change. No special “rewind” tooling.
- Audit trail — You keep a full deployment history (v3 → v4 → v5), which helps compliance and debugging. You never pretend a deployment didn’t happen.
The exception is when the errant migration leaves the database in a bad or dangerous state and you do not want that migration to run again anywhere, for example on other production databases or when standing up new dev/test environments. In that case, roll back and remove or fix the problematic migration scripts so that future deployments never pass through that state.
When to roll back
Rolling back is “back in time”: the database version goes backward (e.g. v4 → v3), and you can modify or delete migration scripts for versions above the current one. Use it when:
| Situation | Why roll back |
|---|---|
| Production never went live | The database was upgraded (e.g. to v4) but is still in a downtime window and no new transactional data has been written. Reverting to v3 is safe and keeps the story simple. |
| Destructive errant migration | The migration is so bad that you don’t want any future deployment (other prod DBs, new dev/test DBs) to run it. Roll back, then fix or remove the bad scripts so no environment ever transitions through that state again. |
| Dev or test databases | You need an environment at an older version for troubleshooting or to test a forward migration again. Rolling back is practical and doesn’t affect production audit. |
| Testing recoverability | In automation: upgrade a test DB, roll it back, then upgrade again, and assert the schema is correct at each step. This validates that your rollback path works. |
After a rollback, you can alter or remove the migration scripts that were “undone” so they don’t cause problems in future runs.
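The "testing recoverability" case in the table above can be automated with a small check. The following is a rough sketch using `sqlite3` and an in-memory test database; the `MIGRATIONS` dictionary, its `up`/`down` scripts, and the table names are assumptions made for the example, and a real pipeline would call its migration tool instead.

```python
import sqlite3

# Hypothetical migration registry: version -> up/down SQL.
MIGRATIONS = {
    4: {
        "up": "CREATE TABLE order_notes (id INTEGER PRIMARY KEY, note TEXT)",
        "down": "DROP TABLE order_notes",
    },
}

def table_names(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {row[0] for row in rows}

def upgrade(conn, version):
    conn.execute(MIGRATIONS[version]["up"])

def rollback(conn, version):
    conn.execute(MIGRATIONS[version]["down"])

def check_recoverability():
    """Upgrade, roll back, upgrade again, asserting the schema at each step."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")  # v3 baseline
    schema_v3 = table_names(conn)

    upgrade(conn, 4)
    schema_v4 = table_names(conn)
    assert schema_v4 == schema_v3 | {"order_notes"}

    rollback(conn, 4)
    assert table_names(conn) == schema_v3  # rollback restores the v3 schema

    upgrade(conn, 4)
    assert table_names(conn) == schema_v4  # the upgrade is repeatable after rollback
    return True

print(check_recoverability())  # True
```

Comparing table names is a deliberately coarse schema check; a fuller test would also compare columns, indexes, and constraints.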
Summary
| Goal | Prefer | Reason |
|---|---|---|
| Revert a live production deployment | Roll forward | Keeps audit trail; just another deployment (e.g. v5) that undoes v4. |
| Revert while still in downtime, no live traffic | Roll back OK | Database hasn’t seen new data; going “back to v3” is clean. |
| Bad migration must never run again | Roll back + fix scripts | Prevents other DBs or new environments from hitting that state. |
| Dev/test at an older version | Roll back | Simple for local or test use. |
| Compliance / audit | Roll forward | Full history (v3 → v4 → v5) is preserved. |
Always pair rollback strategy with backup and restore for true data recovery (e.g. after drops or data loss). Use rollback scripts and roll-forward migrations for controlled, repeatable reversal of schema changes.