Case Study: Media Server
Background
The Media Server was a crucial part of Pearson's digital infrastructure, giving content producers the ability to upload, manage, and deliver digital content worldwide. As demand for digital content grew, however, the limitations of the legacy architecture, which relied on physical servers and NetApp storage clusters, became apparent. Scalability, performance, and reliability issues began to affect the platform's ability to meet evolving business needs.
To address these challenges, Pearson initiated a strategic project to migrate the Media Server to Amazon Web Services (AWS), leveraging cloud technology to ensure the platform could handle future growth and operational requirements.
As the Product Manager, I led this migration project, overseeing execution, stakeholder communication, and risk mitigation, and ensuring alignment with Pearson’s strategic goals.
Project Overview
Objective:
The primary objective of the Media Server migration was to transition from a legacy physical infrastructure to AWS, with the following key goals:
- Scalability: Use AWS's elastic infrastructure to scale resources based on demand, particularly during peak periods.
- Performance Improvement: Leverage AWS’s global network and services to reduce latency and improve content delivery speed.
- Reliability and Availability: Utilize AWS’s fault-tolerant infrastructure to increase the reliability and availability of the Media Server.
- Cost Efficiency: Shift to a pay-as-you-go model to reduce capital expenditure and optimize costs by eliminating the need for physical infrastructure.
Challenges
1. Data Migration and Integrity:
One of the largest hurdles was migrating over 200 million assets, totaling approximately 30 TB of data, to AWS. Ensuring no data loss or corruption during this process was paramount.
2. Minimizing System Downtime:
Keeping downtime to an absolute minimum during the migration was critical to ensure continuous availability for Pearson’s content producers and end-users. Any significant service interruption could result in a loss of productivity.
3. Compatibility with Legacy Systems:
The new AWS-based infrastructure needed to integrate seamlessly with the existing workflows, tools, and systems used by content producers. Ensuring backward compatibility was essential to avoid disruption.
4. Security and Compliance:
Maintaining robust security and compliance was a top priority, given the sensitive nature of Pearson's digital content. AWS’s security protocols had to align with Pearson’s existing compliance frameworks.
Solution
Phased Migration Approach:
To mitigate risk, the migration was executed in phases. We began by migrating non-critical components to AWS, followed by more critical systems. This phased approach allowed for comprehensive testing and validation, reducing the risk of data loss or system failure. A dual environment was maintained during the migration, allowing content producers to continue using the legacy system while the AWS-based system was tested, refined, and gradually adopted.
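The validation step in each phase can be illustrated with a checksum comparison between the legacy store and the migrated copy. The following is a minimal sketch, not the tooling used on the project: the manifest format (asset key mapped to SHA-256 digest), the function names, and the sample assets are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of an asset's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_mismatches(source: Dict[str, str], migrated: Dict[str, str]) -> List[str]:
    """Return asset keys that are missing from, or differ in, the migrated manifest.

    Both arguments map an asset key to its checksum (hypothetical manifest format).
    """
    return sorted(
        key for key, checksum in source.items()
        if migrated.get(key) != checksum
    )


# Illustrative data only: three assets, one corrupted and one missing after the copy.
source_manifest = {
    "a/intro.mp4": sha256_of(b"intro"),
    "a/cover.png": sha256_of(b"cover"),
    "b/notes.pdf": sha256_of(b"notes"),
}
migrated_manifest = {
    "a/intro.mp4": sha256_of(b"intro"),
    "a/cover.png": sha256_of(b"c0ver"),  # simulated corruption
}

mismatches = find_mismatches(source_manifest, migrated_manifest)
print(mismatches)  # ['a/cover.png', 'b/notes.pdf']
```

Running a check like this per batch means a phase is only signed off when the mismatch list is empty, and any listed keys can be re-copied before the next phase begins.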
AWS Services Utilized:
- Amazon S3: Core store for media files, providing security and scalability with high availability and durability.
- Amazon CloudFront: AWS’s CDN was used to accelerate global content distribution, reducing latency for end-users.
- Amazon EC2: EC2 instances ran the web servers, with compute capacity scaled up or down based on real-time traffic demands.
- Amazon RDS: A managed database service for secure, reliable access to Pearson’s databases, with automatic backups and patch management.
- AWS Lambda: Serverless computing was used for background tasks, reducing server management needs and lowering operational overhead.
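The demand-based scaling described for EC2 can be reasoned about with a simple capacity calculation. This is a rough sketch of the idea only; it does not call any AWS API, and the per-instance capacity and the minimum and maximum fleet sizes are hypothetical values, not figures from the project.

```python
import math


def desired_instances(requests_per_sec: float,
                      capacity_per_instance: float,
                      min_instances: int = 2,
                      max_instances: int = 20) -> int:
    """Return how many instances are needed for the current load.

    The result is the load divided by per-instance capacity, rounded up,
    then clamped to the [min_instances, max_instances] range.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Assumed capacity of 100 requests/sec per instance (illustrative only).
print(desired_instances(950, 100))   # 10
print(desired_instances(50, 100))    # 2  (floor keeps a minimum fleet)
print(desired_instances(5000, 100))  # 20 (ceiling caps cost)
```

The floor keeps a baseline available during quiet periods, while the ceiling bounds spend during unexpected spikes.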
Automation and CI/CD:
Automation played a key role in the migration. We developed scripts to handle the installation of necessary packages, configuration files, and environment setup on AWS. CI/CD pipelines were implemented to streamline the deployment process, ensuring quick and reliable updates with minimal manual intervention. This approach also allowed for faster rollbacks in case of issues during deployment.
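The rollback behaviour mentioned above can be sketched as a deploy step guarded by a health check. This is an assumption-laden illustration rather than the actual pipeline: `activate` and `is_healthy` are hypothetical callables standing in for whatever the deployment tooling provides.

```python
from typing import Callable, List


def deploy_with_rollback(new_version: str,
                         current_version: str,
                         activate: Callable[[str], None],
                         is_healthy: Callable[[], bool]) -> str:
    """Activate new_version and return the version left live.

    If the health check fails after activation, current_version is
    re-activated and returned instead.
    """
    activate(new_version)
    if is_healthy():
        return new_version
    activate(current_version)  # roll back to the known-good version
    return current_version


# Simulated run: the health check fails, so the deploy is rolled back.
history: List[str] = []
live = deploy_with_rollback("v2", "v1", history.append, lambda: False)
print(live, history)  # v1 ['v2', 'v1']
```

Keeping the previous version identifier at hand is what makes the rollback quick: no rebuild is needed, only a second activation.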
Monitoring and Maintenance:
Once the migration was completed, monitoring tools such as Grafana and New Relic were used to continuously track system performance. These tools provided real-time insights, allowing for proactive issue resolution. A triage checklist was created to ensure stability during high-usage periods, maintaining performance under heavy traffic.
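One way to make a triage checklist actionable is to express it as thresholds that are checked against current metrics. The sketch below is illustrative only: the metric names and limits are hypothetical, and it does not use the API of either monitoring tool named above.

```python
from typing import Dict, List

# Hypothetical limits; real values would come from the team's own checklist.
THRESHOLDS: Dict[str, float] = {
    "p95_latency_ms": 500.0,
    "error_rate_pct": 1.0,
    "cpu_pct": 85.0,
}


def triage(metrics: Dict[str, float],
           thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """Return the names of metrics that exceed their threshold.

    Metrics missing from the input are treated as 0.0 and never flagged.
    Results follow the order in which thresholds are defined.
    """
    return [
        name for name, limit in thresholds.items()
        if metrics.get(name, 0.0) > limit
    ]


# Illustrative snapshot taken during a high-usage period.
snapshot = {"p95_latency_ms": 620.0, "error_rate_pct": 0.4, "cpu_pct": 91.0}
print(triage(snapshot))  # ['p95_latency_ms', 'cpu_pct']
```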
Stakeholder Communication:
Regular updates were provided to stakeholders throughout the migration process. Detailed timelines, risk assessments, and mitigation strategies ensured all parties were informed and prepared for each phase of the transition.
Outcomes
1. Enhanced Scalability:
The Media Server now scales to meet peak demand, particularly during back-to-school seasons and other high-traffic periods. AWS’s elastic infrastructure allows for dynamic resource allocation, keeping performance consistent as load changes.
2. Improved Performance:
Delivering content through Amazon CloudFront improved delivery speed and reduced latency, providing users with faster access to digital content.
3. Increased Reliability and Availability:
AWS’s fault-tolerant infrastructure provided higher availability, significantly reducing downtime. By distributing content across multiple AWS regions, the system became resilient to regional outages, ensuring continuous global access to content.
4. Cost Efficiency:
The pay-as-you-go model helped Pearson optimize operational costs, eliminating physical infrastructure expenses and reducing capital expenditures. The result was a more predictable and manageable cost structure.
Lessons Learned
1. Importance of a Phased Migration:
Migrating in phases allowed for testing, validation, and resolution of issues before they impacted end-users. This approach significantly reduced the risk of failure and data loss.
2. Managing Data Integrity:
The successful migration of over 200 million assets required meticulous planning and coordination between development, system architecture, and operations teams. Ensuring data integrity was essential to the migration’s success.
3. Balancing Modernization and Compatibility:
While modernizing the infrastructure, ensuring backward compatibility with legacy systems and workflows was key to avoiding disruption to content producers and maintaining productivity.
4. Continuous Monitoring and Automation:
Implementing continuous monitoring tools and automating processes ensured the system's stability post-migration. This approach allowed for proactive issue detection and timely resolution, maintaining the Media Server’s performance.
Conclusion
The migration of Pearson’s Media Server to AWS was a strategic success, resolving the scalability, performance, and reliability challenges posed by the legacy infrastructure. By leveraging AWS’s cloud services, Pearson enhanced the Media Server’s capabilities, ensuring it could support the growing demand for digital content.
This case study highlights how cloud infrastructure can transform large-scale digital operations, enabling organizations to achieve greater scalability, cost efficiency, and performance, while ensuring reliable and secure service delivery to users worldwide.