quyennv.com

Senior DevOps Engineer · Healthcare, Singapore

Implementing a Rollback Strategy

#rollback #deployment #database #devops #cicd #release-management

When a deployment goes wrong, you need a clear way to reverse it. For application code, rollback is often straightforward: redeploy the previous build or artifact. For databases and other stateful systems, reverting is harder because the data persists between deployments and every schema change is recorded in the migration history. This post outlines two patterns, rolling back and rolling forward, and when to use each.

Rollback scripts are not a substitute for backup

Rollback (undo) scripts can reverse some schema changes, but they are not a full replacement for a backup and restore strategy. For example:

  • Drops: If you drop a table or column, an undo script can recreate the structure but cannot bring the data back; you need to restore from backup.
  • Data loss: If a migration deletes or overwrites rows, only backup/restore can recover them.

In short: design a rollback path and maintain tested backups. Use rollback for controlled reversal of migrations where possible; use backup/restore when data or structure has been lost and cannot be recovered by scripts.
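To make the limitation concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table name `legacy_notes` and the `UP`/`DOWN` statements are hypothetical, chosen only for illustration; a real migration tool would manage these scripts for you.

```python
import sqlite3

# Hypothetical migration pair (illustrative names, not from a real project).
UP = "DROP TABLE legacy_notes"
DOWN = "CREATE TABLE legacy_notes (id INTEGER PRIMARY KEY, body TEXT)"


def row_count(conn, table):
    """Count the rows currently stored in a table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(DOWN)  # start from the pre-migration schema
conn.executemany(
    "INSERT INTO legacy_notes (body) VALUES (?)",
    [("first",), ("second",), ("third",)],
)
rows_before = row_count(conn, "legacy_notes")

conn.execute(UP)    # the deployment drops the table
conn.execute(DOWN)  # the undo script recreates the structure
rows_after = row_count(conn, "legacy_notes")

print(f"rows before drop: {rows_before}, rows after undo: {rows_after}")
```

The undo script brings the table back, but it comes back empty. Recovering the rows themselves would require a restore from backup.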

Rolling back vs rolling forward

Two ways to get the database (or schema) back to a previous state:

| Approach | What it does | Version view |
| --- | --- | --- |
| Rolling back | Restores a previous schema state as if the later deployment never happened. | After rolling back from v4 to v3, the database considers itself at v3. The v4 deployment is effectively undone. |
| Rolling forward | Treats the bad deployment as having happened, then applies a new deployment that restores the desired state. | You go from v4 to v5, where v5 is a new version whose schema matches what v3 had. History stays: v3 → v4 → v5. |

Example: You deployed v4 (say it added a column) and the migration went wrong. You want the schema to look like v3 again.

  • Roll back: Revert to v3. The database is at v3; v4 is “erased” from the version history. You can later change or remove the v4 migration scripts so they are never run again.
  • Roll forward: Deploy v5, where v5’s migrations undo the v4 changes (e.g. drop the column v4 added). The database ends up with the same structure as v3, but the audit trail is v3 → v4 → v5.
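The difference between the two outcomes can be sketched with a small version-history table. This is only an illustration under stated assumptions: the `schema_history`, `orders`, and `order_flags` tables and all function names are hypothetical, and real migration tools track versions in their own way. Here v4 adds a table rather than a column, so the example runs on any SQLite version.

```python
import sqlite3


def setup_v3(conn):
    """Create the v3 baseline plus a hypothetical version-history table."""
    conn.execute("CREATE TABLE schema_history (version INTEGER PRIMARY KEY, note TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute("INSERT INTO schema_history VALUES (3, 'baseline')")


def deploy_v4(conn):
    """The deployment that later turns out to be bad."""
    conn.execute("CREATE TABLE order_flags (order_id INTEGER, flag TEXT)")
    conn.execute("INSERT INTO schema_history VALUES (4, 'add order_flags')")


def roll_back_to_v3(conn):
    """Undo v4 and erase it from the history: the database is at v3 again."""
    conn.execute("DROP TABLE order_flags")
    conn.execute("DELETE FROM schema_history WHERE version = 4")


def roll_forward_to_v5(conn):
    """Undo v4 with a new deployment: the history keeps v4 and gains v5."""
    conn.execute("DROP TABLE order_flags")
    conn.execute("INSERT INTO schema_history VALUES (5, 'undo v4: drop order_flags')")


def describe(conn):
    """Return (application tables, recorded versions) for comparison."""
    tables = [name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name != 'schema_history' ORDER BY name")]
    versions = [v for (v,) in conn.execute(
        "SELECT version FROM schema_history ORDER BY version")]
    return tables, versions


def run(strategy):
    conn = sqlite3.connect(":memory:")
    setup_v3(conn)
    deploy_v4(conn)
    strategy(conn)
    return describe(conn)


rolled_back = run(roll_back_to_v3)
rolled_forward = run(roll_forward_to_v5)
print("roll back:   ", rolled_back)
print("roll forward:", rolled_forward)
```

Both strategies end with the same tables; only the recorded history differs, which is the distinction the audit-trail argument rests on.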

When to roll forward

As a general rule, for live production databases, rolling forward is preferred:

  • Simpler — It’s just another deployment (v5) that reverses the bad change. No special “rewind” tooling.
  • Audit trail — You keep a full deployment history (v3 → v4 → v5), which helps compliance and debugging. You never pretend a deployment didn’t happen.

The exception is when the errant migration leaves the database in a bad or dangerous state and you do not want that migration to run again anywhere, for example on other production databases or when standing up new dev/test environments. In that case, roll back and then remove or fix the problematic migration scripts, so that future deployments never pass through that state.

When to roll back

Rolling back is “back in time”: the database version goes backward (e.g. v4 → v3), and you can modify or delete migration scripts for versions above the current one. Use it when:

| Situation | Why roll back |
| --- | --- |
| Production never went live | The database was upgraded (e.g. to v4) but is still in a downtime window and no new transactional data has been written. Reverting to v3 is safe and keeps the story simple. |
| Destructive errant migration | The migration is so bad that you don’t want any future deployment (other prod DBs, new dev/test DBs) to run it. Roll back, then fix or remove the bad scripts so no environment ever transitions through that state again. |
| Dev or test databases | You need an environment at an older version for troubleshooting or to test a forward migration again. Rolling back is practical and doesn’t affect production audit. |
| Testing recoverability | In automation: upgrade a test DB, roll it back, then upgrade again, and assert the schema is correct at each step. This validates that your rollback path works. |
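The "testing recoverability" row can be sketched as an automated check: upgrade, roll back, upgrade again, and assert the schema at each step. The `MIGRATIONS` dictionary, the table names, and the helper functions are assumptions made for this example; in practice the up and down scripts would come from your migration tool.

```python
import sqlite3

# Hypothetical up/down scripts keyed by target version (illustrative only).
MIGRATIONS = {
    4: {
        "up": "CREATE TABLE order_flags (order_id INTEGER, flag TEXT)",
        "down": "DROP TABLE order_flags",
    },
}


def schema(conn):
    """Return the sorted list of table names as a simple schema fingerprint."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [name for (name,) in rows]


def upgrade(conn, version):
    conn.execute(MIGRATIONS[version]["up"])


def rollback(conn, version):
    conn.execute(MIGRATIONS[version]["down"])


def check_recoverability():
    """Upgrade, roll back, upgrade again, asserting the schema at each step."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")  # v3 baseline
    v3_schema = schema(conn)

    upgrade(conn, 4)
    v4_schema = schema(conn)
    assert v4_schema != v3_schema, "upgrade should change the schema"

    rollback(conn, 4)
    assert schema(conn) == v3_schema, "rollback should restore the v3 schema"

    upgrade(conn, 4)
    assert schema(conn) == v4_schema, "re-upgrade should reproduce the v4 schema"
    return v3_schema, v4_schema


v3_tables, v4_tables = check_recoverability()
print("v3:", v3_tables, "v4:", v4_tables)
```

Comparing table names is a deliberately coarse fingerprint; a fuller check would also compare columns, indexes, and constraints.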

After a rollback, you can alter or remove the migration scripts that were “undone” so they don’t cause problems in future runs.

Summary

| Goal | Prefer | Reason |
| --- | --- | --- |
| Revert a live production deployment | Roll forward | Keeps audit trail; just another deployment (e.g. v5) that undoes v4. |
| Revert while still in downtime, no live traffic | Roll back OK | Database hasn’t seen new data; going “back to v3” is clean. |
| Bad migration must never run again | Roll back + fix scripts | Prevents other DBs or new environments from hitting that state. |
| Dev/test at an older version | Roll back | Simple for local or test use. |
| Compliance / audit | Roll forward | Full history (v3 → v4 → v5) is preserved. |
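One way to read the table above is as a small decision function. The sketch below is a simplification, not a policy: the function name, its parameters, and the order in which the conditions are checked are assumptions made for illustration, and real incidents usually involve more judgment than three flags.

```python
def choose_strategy(is_production: bool, live_traffic: bool, must_never_rerun: bool) -> str:
    """Map a situation to the strategy the summary table suggests."""
    if must_never_rerun:
        # A destructive migration should not reach any other environment.
        return "roll back + fix scripts"
    if not is_production:
        # Dev/test databases have no production audit trail to protect.
        return "roll back"
    if live_traffic:
        # New data may have been written; keep the history and deploy a fix.
        return "roll forward"
    # Production, but still inside the downtime window with no new data.
    return "roll back"


print(choose_strategy(is_production=True, live_traffic=True, must_never_rerun=False))
print(choose_strategy(is_production=True, live_traffic=False, must_never_rerun=False))
```

Checking `must_never_rerun` first reflects the exception described earlier: a destructive migration overrides the general preference for rolling forward.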

Always pair rollback strategy with backup and restore for true data recovery (e.g. after drops or data loss). Use rollback scripts and roll-forward migrations for controlled, repeatable reversal of schema changes.
