AWS Database Migration Service (AWS DMS) is a managed migration and replication service that helps move your database and analytics workloads to AWS quickly, securely, and with minimal downtime and zero data loss. AWS DMS supports migration between 20-plus database and analytics engines, such as Oracle to Amazon Aurora MySQL-Compatible Edition, MySQL to Amazon Relational Database Service (Amazon RDS) for MySQL, Microsoft SQL Server to Amazon Aurora PostgreSQL-Compatible Edition, MongoDB to Amazon DocumentDB (with MongoDB compatibility), Oracle to Amazon Redshift, and Amazon Simple Storage Service (Amazon S3).
In software engineering, a schema migration (also database migration, database change management) refers to the management of version-controlled, incremental and reversible changes to relational database schemas. A schema migration is performed on a database whenever it is necessary to update or revert that database's schema to some newer or older version.
Migrations are performed programmatically by using a schema migration tool. When invoked with a specified desired schema version, the tool automates the successive application or reversal of an appropriate sequence of schema changes until the schema is brought to the desired state.
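The successive application or reversal of changes can be illustrated with a minimal sketch. It assumes an in-memory SQLite database, a hypothetical list of three migrations with contiguous version numbers, and SQLite's user_version pragma as the place where the current schema version is stored. Real migration tools differ in how they record versions and order changes.

```python
import sqlite3

# Hypothetical migrations: (version, upgrade SQL, downgrade SQL).
# Versions are assumed to be contiguous and to start at 1.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)", "DROP TABLE users"),
    (2, "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)", "DROP TABLE orders"),
    (3, "CREATE INDEX idx_orders_user ON orders (user_id)", "DROP INDEX idx_orders_user"),
]

def current_version(conn):
    # SQLite's user_version pragma serves as the version store in this sketch.
    return conn.execute("PRAGMA user_version").fetchone()[0]

def migrate(conn, target):
    """Apply or reverse migrations until the schema is at version `target`."""
    version = current_version(conn)
    if target > version:
        # Upgrade: apply every migration above the current version, in order.
        steps = [(v, up) for v, up, _ in MIGRATIONS if version < v <= target]
    else:
        # Downgrade: reverse migrations from newest to oldest.
        steps = [(v - 1, down) for v, _, down in reversed(MIGRATIONS)
                 if target < v <= version]
    for new_version, sql in steps:
        conn.execute(sql)
        conn.execute(f"PRAGMA user_version = {int(new_version)}")
    conn.commit()
    return current_version(conn)

def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]
```

Calling migrate(conn, 3) on an empty database applies all three changes in order, and a later migrate(conn, 1) reverses the last two, leaving only the users table.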
Most schema migration tools aim to minimize the impact of schema changes on any existing data in the database. Despite this, preservation of data in general is not guaranteed because schema changes such as the deletion of a database column can destroy data (i.e. all values stored under that column for all rows in that table are deleted). Instead, the tools help to preserve the meaning of the data or to reorganize existing data to meet new requirements. Since the meaning of the data often cannot be encoded, the configuration of the tools usually needs manual intervention.
Applying a schema migration to a production database is always a risk. Development and test databases tend to be smaller and cleaner. The data in them is better understood or, if everything else fails, the amount of data is small enough for a human to process. Production databases are usually huge, old and full of surprises. The surprises can come from many sources:
Schema migrations may take a long time to complete. For systems that operate 24/7, it is important to be able to perform database migrations without downtime. This is usually done with the help of feature flags and continuous delivery.
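One way a feature flag can support a migration without downtime is sketched below. The sketch assumes a phased approach (add a new column, write to both columns, backfill, then switch reads with a flag), which is a common pattern but is not prescribed by the text above. The table, the column names and the FLAGS dictionary are hypothetical.

```python
import sqlite3

# Hypothetical in-process feature flag store.
FLAGS = {"read_full_name": False}

def expand(conn):
    # Phase 1: additive change that already-deployed code can ignore.
    conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

def save_user(conn, user_id, name):
    # Phase 2: new application code writes both the old and the new column.
    conn.execute(
        "INSERT OR REPLACE INTO users (id, name, full_name) VALUES (?, ?, ?)",
        (user_id, name, name),
    )

def backfill(conn):
    # Phase 3: copy values for rows written before the expand step.
    conn.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")

def load_user(conn, user_id):
    # The flag decides which column is read, so the switch needs no deploy.
    column = "full_name" if FLAGS["read_full_name"] else "name"
    row = conn.execute(
        f"SELECT {column} FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0] if row else None
```

The flag should only be turned on after the backfill has finished. Otherwise rows written before the expand step read as empty, which is why the old column is kept until the switch has proven safe.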
When developing software applications backed by a database, developers typically develop the application source code in tandem with an evolving database schema. The code typically has rigid expectations of what columns, tables and constraints are present in the database schema whenever it needs to interact with one, so only the version of database schema against which the code was developed is considered fully compatible with that version of source code.
In software testing, developers may mock the presence of a compatible database system for unit testing. For any level of testing higher than this (e.g. integration testing or system testing), it is common for developers to test their application against a local or remote test database that is schematically compatible with the version of source code under test. In advanced applications, the migration itself can be subject to migration testing.
With schema migration technology, data models no longer need to be fully designed up-front, and are more capable of being adapted with changing project requirements throughout the software development lifecycle.
Under good software testing practice, schema migrations can be performed on test databases to ensure that their schema is compatible with the source code. To streamline this process, a schema migration tool is usually invoked as a part of an automated software build as a prerequisite of the automated testing phase.
Schema migration tools can be said to solve versioning problems for database schemas just as version control systems solve versioning problems for source code. In practice, many schema migration tools rely on a textual representation of schema changes (such as files containing SQL statements) so that the version history of schema changes can be stored alongside program source code within a VCS. This approach ensures that the information necessary to recover a compatible database schema for a particular code branch is available from the source tree itself. Another benefit of this approach is the handling of concurrent conflicting schema changes; developers may simply use their usual text-based conflict resolution tools to reconcile differences.
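As a small illustration of how text files in a source tree can define the order of schema changes, the sketch below assumes a hypothetical naming convention of a numeric version prefix followed by a description, such as 0001_create_users.sql. It orders the files by version and rejects two files that claim the same version, which is one way a concurrent conflicting change can surface after a merge.

```python
import re

# Hypothetical convention: <zero-padded version>_<description>.sql
FILENAME = re.compile(r"^(\d+)_([a-z0-9_]+)\.sql$")

def plan(filenames):
    """Return (version, filename) pairs in application order.

    Raises ValueError for a file that does not follow the convention
    or for two files that use the same version number.
    """
    seen = {}
    for filename in filenames:
        match = FILENAME.match(filename)
        if match is None:
            raise ValueError(f"not a migration file: {filename}")
        version = int(match.group(1))
        if version in seen:
            raise ValueError(
                f"version {version} used by both {seen[version]} and {filename}")
        seen[version] = filename
    return [(version, seen[version]) for version in sorted(seen)]
```

In a real project the filenames would come from listing a migrations directory that is checked in alongside the application code.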
Developers no longer need to remove the entire test database in order to create a new test database from scratch (e.g. using schema creation scripts from DDL generation tools). Further, if generation of test data costs a lot of time, developers can avoid regenerating test data for small, non-destructive changes to the schema.
This document introduces concepts, principles, terminology, and architecture of near-zero downtime database migration for cloud architects who are migrating databases to Google Cloud from on-premises or other cloud environments.
Database migration is the process of migrating data from one or more source databases to one or more target databases by using a database migration service. When a migration is finished, the dataset in the source databases resides fully, though possibly restructured, in the target databases. Clients that accessed the source databases are then switched over to the target databases, and the source databases are turned down.
A database migration service runs within Google Cloud and accesses both source and target databases. Two variants are represented: (a) shows the migration from a source database in an on-premises data center or a remote cloud to a managed database like Cloud Spanner; (b) shows a migration to a database on Compute Engine.
database migration: A migration of data from source databases to target databases with the goal of turning down the source database systems after the migration completes. The entire dataset, or a subset, is migrated.
heterogeneous migration: A migration from source databases to target databases where the source and target databases are of different database management systems from different providers.
data migration process: A configured or implemented process executed by the data migration system to transfer data from source to target databases, possibly transforming the data during the transfer.
database replication: A continuous transfer of data from source databases to target databases without the goal of turning down the source databases. Database replication (sometimes called database streaming) is an ongoing process.
In a database migration, you move data from source databases to target databases. After the data is completely migrated, you delete source databases and redirect client access to the target databases. Sometimes you keep the source databases as a fallback measure if you encounter unforeseen issues with the target databases. However, after the target databases are reliably operating, you eventually delete the source databases.
With database replication, in contrast, you continuously transfer data from the source databases to the target databases without deleting the source databases. Sometimes database replication is referred to as database streaming. While there is a defined starting time, there is typically no defined completion time. The replication might be stopped or become a migration.
Database migration is understood to be a complete and consistent transfer of data. You define the initial dataset to be transferred as either a complete database or a partial database (a subset of the data in a database) plus every change committed on the source database system thereafter.
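The idea of an initial dataset plus every change committed afterwards can be sketched as an initial load followed by the replay of a change log. The dictionary-based snapshot and the (operation, key, value) change format are assumptions made for illustration and do not reflect the format of any particular migration service.

```python
def apply_change(target, change):
    """Apply one committed change to the target dataset in place."""
    operation, key, value = change
    if operation == "delete":
        target.pop(key, None)
    else:
        # "insert" and "update" both set the latest value for the key.
        target[key] = value

def migrate_with_changes(snapshot, changes):
    """Copy the initial dataset, then replay changes in commit order."""
    target = dict(snapshot)      # initial load
    for change in changes:       # changes committed after the snapshot
        apply_change(target, change)
    return target
```

For example, a snapshot {1: "a", 2: "b"} followed by an update of key 1, an insert of key 3 and a delete of key 2 leaves the target with keys 1 and 3 only, matching the state of the source after those commits.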
A homogeneous database migration is a migration between the source and target databases of the same database technology, for example, migrating from a MySQL database to a MySQL database, or from an Oracle database to an Oracle database. Homogeneous migrations also include migrations between a self-hosted database system such as PostgreSQL to a managed version of it such as Cloud SQL (a PostgreSQL variant).
In a homogeneous database migration, the schemas for the source and target databases are likely identical. If the schemas are different, the data from the source databases must be transformed during migration.
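A transformation during migration can be as simple as a function applied to each row on its way from source to target. The sketch below assumes a hypothetical source schema (id, first_name, last_name, signup date as DD/MM/YYYY) and a hypothetical target schema (id, full_name, signup date as YYYY-MM-DD). Neither schema comes from the text above.

```python
def transform_row(source_row):
    """Map one row from the assumed source schema to the assumed target schema."""
    user_id, first_name, last_name, signup = source_row
    day, month, year = signup.split("/")
    return (user_id, f"{first_name} {last_name}", f"{year}-{month}-{day}")

def migrate_rows(source_rows):
    """Transform every source row; the result is ready to load into the target."""
    return [transform_row(row) for row in source_rows]
```

Keeping the transformation as a pure function of one row makes it straightforward to test separately from the data transfer itself.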
Heterogeneous database migration is a migration between source and target databases of different database technologies, for example, from an Oracle database to Spanner. Heterogeneous database migration can be between the same data models (for example, from relational to relational), or between different data models (for example, from relational to key-value).
Migrating between different database technologies doesn't necessarily involve different data models. For example, Oracle, MySQL, PostgreSQL, and Spanner all support the relational data model. However, multi-model databases like Oracle, MySQL, or PostgreSQL support different data models. Data stored as JSON documents in a multi-model database can be migrated to MongoDB with little or no transformation necessary, as the data model is the same in the source and the target database.
Although the distinction between homogeneous and heterogeneous migration is based on database technologies, an alternative categorization is based on database models involved. For example, a migration from an Oracle database to Spanner is homogeneous when both use the relational data model; a migration is heterogeneous if, for example, data stored as JSON objects in Oracle is migrated to a relational model in Spanner.
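To make the JSON-to-relational case concrete, the sketch below flattens one JSON document into a parent row and a list of child rows, which is the kind of restructuring a change of data model requires. The document shape (an order with an id, a customer and a list of items) and the two target tables it implies are hypothetical.

```python
import json

def to_rows(document_json):
    """Split one JSON order document into relational rows.

    Returns a row for an assumed `orders` table and a list of rows for an
    assumed `order_items` table, keyed by order id and item position.
    """
    document = json.loads(document_json)
    order_row = (document["id"], document["customer"])
    item_rows = [
        (document["id"], position, item["sku"], item["qty"])
        for position, item in enumerate(document["items"])
    ]
    return order_row, item_rows
```

The nested list becomes a separate table with a foreign key back to the parent, so one source object turns into several target rows.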
Categorizing migrations by data model more accurately expresses the complexity and effort required to migrate the data than basing the categorization on the database system involved. However, because the commonly used categorization in the industry is based on the database systems involved, the remaining sections are based on that distinction.