Large-Scale VMware Migration to Google Cloud

Terraform

Author

Samuel Hernandez

Date Published

Overview

Stabilized a drifted Google Cloud environment and completed a large-scale migration of more than 50 VMware virtual machines to GCP. Addressed foundational gaps caused by lost Terraform state and incomplete governance, networking, and backup configurations before proceeding with the migration. Re-established a reliable infrastructure baseline, migrated workloads in a controlled and repeatable manner, and validated systems post-migration. All workloads were brought back under infrastructure-as-code and protected with tested backup and disaster recovery policies, enabling the client to move critical systems to the cloud with restored operational confidence and a scalable foundation for future growth.



My role

I served as the technical expert for the engagement, supporting the project manager by guiding the client’s engineering team through key architectural decisions and cloud best practices. I identified infrastructure risks early, advised on corrective actions, and helped define a stable foundation before migrations began. I was responsible for designing and implementing the infrastructure using Terraform, including project and network structures, firewall rules, DNS, and backup and disaster recovery configurations on GCP. Throughout the migration, I worked closely with the client to validate workloads, ensure systems were brought back under infrastructure-as-code, and support a smooth transition to cloud operations.


Tech Stack & Architecture Summary

  • Cloud & Platform: Google Cloud Platform (GCP)
  • Infrastructure as Code: Terraform with remote state stored in GCS
  • Compute & Migration: Google Compute Engine (GCE) for the migrated Linux and Windows virtual machines; Migrate to Virtual Machines for moving on-prem VMware workloads to GCP with controlled cutovers and minimal downtime
  • Networking & Security: VPC networks, firewall rules, Cloud DNS
  • Data Protection & Resilience: Backup and DR (GCP)


Situation & Challenge

The client set out to modernize their infrastructure by reducing their on-premises data center footprint and migrating workloads to Google Cloud. The goal was not only to move virtual machines, but to establish a secure, compliant, and well-governed cloud environment that could support critical services with high availability and reliable operations. Infrastructure-as-code using Terraform was a contractual requirement, intended to provide consistency and long-term maintainability.

However, when the migration initiative began, the existing cloud foundations had already drifted from the Terraform codebase. Terraform state files had been lost, parts of the environment had been created or modified manually, and several foundational resources no longer matched the code. While the environment initially appeared functional, even small Terraform changes produced destructive plans, revealing deeper inconsistencies. Proceeding with a large-scale migration on top of this foundation would have compromised the entire infrastructure-as-code strategy and made day-to-day operations increasingly fragile.

At the same time, the migration itself carried significant complexity. The workloads included a mix of Linux and Windows virtual machines, production and development systems, and dependencies such as newly provisioned database servers required by migrated applications. Downtime had to be carefully controlled through maintenance windows aligned with business hours. The client’s engineering team was relatively new to GCP and Terraform, under time pressure, and already frustrated after months of effort without seeing workloads successfully running in the cloud.

The challenge was not only to migrate more than 50 virtual machines, but to restore confidence in the cloud platform itself by re-establishing strong foundations, aligning the environment with infrastructure-as-code best practices, and enabling a migration approach that was safe, repeatable, and operationally sustainable.


Solution

The solution began by re-establishing a stable infrastructure-as-code foundation. The existing GCP environment was realigned with Terraform by correcting project structures, networking, firewall rules, DNS, and access controls, ensuring that all foundational resources were consistently managed through code. This step eliminated destructive Terraform plans, reduced operational risk, and restored confidence in making changes safely.
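One way to check that a plan is no longer destructive is to export it as JSON (`terraform plan -out=tfplan`, then `terraform show -json tfplan`) and look for resources whose actions include a delete. The sketch below is illustrative only and is not the tooling used on the engagement. The function name and the sample resource addresses are hypothetical. The `resource_changes[].change.actions` structure follows Terraform's documented JSON plan output.

```python
def destructive_changes(plan: dict) -> list:
    """Return (address, actions) for every resource the plan would delete or replace.

    `plan` is the parsed output of `terraform show -json <planfile>`.
    A replacement appears as ["delete", "create"] or ["create", "delete"],
    so checking for "delete" covers both deletes and replacements.
    """
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if "delete" in actions:
            flagged.append((change["address"], actions))
    return flagged


# Hypothetical plan excerpt; the resource addresses are made up for illustration.
sample_plan = {
    "resource_changes": [
        {"address": "google_compute_network.shared_vpc", "change": {"actions": ["no-op"]}},
        {"address": "google_compute_firewall.allow_ssh", "change": {"actions": ["delete", "create"]}},
        {"address": "google_dns_managed_zone.internal", "change": {"actions": ["update"]}},
        {"address": "google_compute_subnetwork.apps", "change": {"actions": ["delete"]}},
    ]
}

for address, actions in destructive_changes(sample_plan):
    print(f"DESTRUCTIVE: {address} -> {'/'.join(actions)}")
```

A check like this could run before any apply and stop the change when the list is non-empty. In-place updates are deliberately not flagged, since they do not destroy the resource.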

With the foundations stabilized, the migration strategy was designed to support both scale and reliability. Custom Terraform modules were introduced to standardize how workloads were deployed, including project creation, subnet configuration, firewall policies, and service dependencies. This made each migrated workload predictable and repeatable, while keeping governance and security controls consistent across environments.

Virtual machine migrations were executed using GCP’s Migrate to Virtual Machines tooling, allowing workloads to be tested, validated, and cut over during planned maintenance windows. Linux and Windows systems were handled according to their specific requirements, and new database servers were provisioned to support application dependencies. After each successful migration, the virtual machines were imported into Terraform state to maintain full infrastructure-as-code ownership.
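Importing many migrated VMs by hand is error-prone, so the import step lends itself to generation. The sketch below assumes Terraform 1.5 or later, which supports declarative `import` blocks, and the Google provider's `projects/<project>/zones/<zone>/instances/<name>` import ID format for `google_compute_instance`. The helper names, project ID, zone, and VM names are hypothetical, and the case study does not state whether imports were done this way or with the `terraform import` command.

```python
import re


def resource_label(vm_name: str) -> str:
    """Derive a conservative Terraform resource label from a VM name.

    Anything other than letters, digits, and underscores becomes an underscore,
    and a prefix is added if the result would start with a digit.
    """
    label = re.sub(r"[^0-9A-Za-z_]", "_", vm_name)
    if label and label[0].isdigit():
        label = f"vm_{label}"
    return label


def import_block(project: str, zone: str, vm_name: str) -> str:
    """Render one Terraform `import` block for a Compute Engine instance."""
    return (
        "import {\n"
        f"  to = google_compute_instance.{resource_label(vm_name)}\n"
        f'  id = "projects/{project}/zones/{zone}/instances/{vm_name}"\n'
        "}\n"
    )


# Hypothetical inventory of migrated VMs.
migrated_vms = ["app-web-01", "app-db-01"]
for name in migrated_vms:
    print(import_block("demo-project", "us-central1-a", name))
```

Each generated block still needs a matching `resource "google_compute_instance"` definition in the configuration. `terraform plan -generate-config-out=<file>` can draft those definitions for review before the import is applied.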

Finally, backup and disaster recovery were implemented and validated using GCP Backup and DR, ensuring that migrated workloads were protected from day one. By combining a corrected IaC foundation, a structured migration process, and built-in reliability controls, the client was able to move from stalled progress to a migration workflow that was both fast and operationally sound, enabling continued migrations with confidence and momentum.


Results & Impact

  • Successfully migrated 50+ virtual machines from VMware to GCP with minimal downtime, enabling the client to complete a long-stalled cloud migration initiative.
  • Restored full infrastructure-as-code ownership by stabilizing Terraform configurations and eliminating unsafe or destructive plans, allowing teams to make changes confidently.
  • Re-established trust in the platform and delivery process, transforming a high-risk, drifted environment into a predictable and governable cloud foundation.
  • Enabled a repeatable migration workflow, allowing future VM migrations to be executed faster, with less risk and clearer operational procedures.
  • Improved operational reliability and resilience through validated backups and disaster recovery using GCP Backup and DR.
  • Reduced ongoing operational risk by standardizing networking, firewall rules, DNS, and project structures across the environment.
  • Accelerated post-migration momentum, shifting the organization from months of stagnation to steady, scalable cloud adoption.