While I was at Vox, the company acquired multiple products and CMSs running in different AWS accounts. As we onboarded these new products, they needed to communicate with services across various AWS accounts, increasing network complexity. Additionally, some services in Google Cloud Platform (GCP) required access to AWS resources. To address these challenges, I built out a centralized networking solution to simplify routing, inter-account connections, and the existing Cato VPN setup. That setup relied on running the Cato client on EC2 instances in multiple AWS accounts; accounts that did not have Cato deployed were not accessible via VPN, creating connectivity gaps and limiting network visibility across the infrastructure.
To implement this, I created a dedicated AWS account for networking and deployed AWS Cloud WAN along with a new Cato VPN instance. By centralizing the Cato VPN in the networking account, I ensured that all AWS accounts, regardless of whether they had previously run Cato, could securely communicate over the VPN.
Cloud WAN enabled seamless VPC attachments across multiple AWS accounts by leveraging AWS Resource Access Manager (RAM) to share the Cloud WAN resource. Once attached, private subnet routes were automatically configured, allowing me to replace individual Transit Gateways and directly connect VPCs to designated network segments (staging, production, or shared). This setup allowed services across AWS accounts to communicate as if they were within the same account, provided security groups permitted access. Additionally, I established a VPN connection between the networking AWS account and GCP project VPCs, integrating GCP services into Cloud WAN for seamless cross-platform communication.
Since Cloud WAN was relatively new at the time, I updated our Terraform modules to automate VPC attachments, ensuring proper route propagation and preventing IP conflicts. This improvement streamlined the onboarding of new accounts and products, laying the foundation for a unified, scalable infrastructure.
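As a rough illustration of what those modules automated, here is a minimal Terraform sketch of attaching a member account's VPC to the shared Cloud WAN core network. The variable names, the tag-based segment selection, and the values are assumptions for illustration; the real modules also handled route propagation and IP-conflict checks.

```hcl
variable "core_network_id" {
  description = "Cloud WAN core network ID, shared into this account via AWS RAM"
  type        = string
}

variable "vpc_arn" {
  type = string
}

variable "private_subnet_arns" {
  type = list(string)
}

# Attach this account's VPC to the shared Cloud WAN core network. The
# "segment" tag is assumed to be matched by the core network policy's
# attachment rules to place the VPC in the staging, production, or shared segment.
resource "aws_networkmanager_vpc_attachment" "this" {
  core_network_id = var.core_network_id
  vpc_arn         = var.vpc_arn
  subnet_arns     = var.private_subnet_arns

  tags = {
    segment = "staging" # placeholder; e.g. staging, production, or shared
  }
}
```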
As our developers added more complex pre-deployment and testing jobs to our GitHub Actions workflows, the pipelines began to slow down significantly, and the hosted runners became costly. I built a custom AMI for running self-hosted runners, using Packer and the AWS CLI together from GitHub Actions. I created the Packer AMI builder and the GitHub Actions workflows as templates that other organizations and AWS accounts could reuse by filling in variable values and adding their own pre-run scripts. The builder would build the AMI, update the staging Auto Scaling group (ASG), and deploy staging runners to test with. Once testing was complete and the PR for the build was merged, it deployed the updated AMI to the production ASG and refreshed the running instances, allowing jobs to pick up the updated runners. This sped up the pipelines and resulted in some cost reduction. We had to create scheduled ASG min/max sizes for the time of day and days of the week, as true 'auto' scaling was not possible this way.
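As a rough sketch of that scheduling workaround, the following hypothetical Terraform shows time-based min/max sizing for a runner ASG. The ASG name, sizes, and cron expressions are placeholders, not the values we actually used.

```hcl
variable "runner_asg_name" {
  description = "Name of the existing self-hosted runner Auto Scaling group"
  type        = string
}

# Scale the runner fleet up for the working day and back down at night,
# since demand-based autoscaling wasn't possible with this EC2/ASG setup.
resource "aws_autoscaling_schedule" "runners_scale_up" {
  scheduled_action_name  = "runners-weekday-scale-up"
  autoscaling_group_name = var.runner_asg_name
  min_size               = 4
  max_size               = 20
  desired_capacity       = 4
  recurrence             = "0 13 * * 1-5" # UTC cron: weekday mornings (example)
}

resource "aws_autoscaling_schedule" "runners_scale_down" {
  scheduled_action_name  = "runners-evening-scale-down"
  autoscaling_group_name = var.runner_asg_name
  min_size               = 0
  max_size               = 2
  desired_capacity       = 0
  recurrence             = "0 2 * * 2-6" # UTC cron: late evenings (example)
}
```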
Once I completed the centralized network described above, we were able to run the self-hosted runners on a single EKS cluster for all AWS accounts across the organization. This significantly reduced costs compared with running the runners on EC2 instances or on the hosted runners, and it also allowed true autoscaling. All of this significantly improved the overall speed and cost of our pipelines.
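One common way to run GitHub Actions runners on EKS is actions-runner-controller; the sketch below assumes that controller and Terraform's Helm provider, since the write-up above doesn't name the specific controller we used. The release name, namespace, and chart details are assumptions.

```hcl
# Assumed setup: actions-runner-controller installed on the shared EKS cluster
# via the Helm provider; runner deployments are then scaled on demand by the
# controller instead of by ASG schedules.
resource "helm_release" "actions_runner_controller" {
  name             = "actions-runner-controller"
  repository       = "https://actions-runner-controller.github.io/actions-runner-controller"
  chart            = "actions-runner-controller"
  namespace        = "actions-runner-system"
  create_namespace = true

  # GitHub App or PAT credentials for the controller would be supplied here
  # (e.g. via set_sensitive or a values file) so runners can register with the org.
}
```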
At Tune, we relied heavily on MySQL replication running on EC2 instances. When developers needed a new database, we had to manually provision and configure it, slowing down development and introducing operational overhead. Additionally, setting up MySQL replication on EC2 was time-consuming, and managing master-slave replication posed challenges—databases could quickly become too large, and outages often required extensive effort to fix replication issues or recover missing/corrupt data. To improve reliability, scalability, and efficiency, we decided to migrate our MySQL databases to AWS Aurora.
I led the effort to migrate our large, replication-based databases to AWS Aurora, which provided better performance, higher availability, and significantly reduced setup complexity. Once the migration was complete, I developed a Terraform module that allowed developers to self-provision their own Aurora databases. Instead of waiting for manual database creation, they could simply submit a pull request with the necessary parameters, and we would review, approve, and apply the changes. This shift enabled a faster, self-service model that developers appreciated, while also ensuring consistency across all database instances.
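As a rough sketch of that self-service flow, the following hypothetical Terraform module call shows the kind of block a developer might add in a pull request; the module path, input names, and values are illustrative, not the actual interface we shipped.

```hcl
# Hypothetical developer-facing module call: the developer opens a PR adding a
# block like this, and the platform team reviews, approves, and applies it.
module "orders_service_db" {
  source = "./modules/aurora-mysql" # assumed module path

  cluster_name   = "orders-service"
  engine_version = "8.0.mysql_aurora.3.04.0" # example version only
  instance_class = "db.r6g.large"
  instance_count = 2
  database_name  = "orders"
  environment    = "production"

  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnet_ids
}
```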
With this automation in place, we eliminated the need for a dedicated DBA or Database Reliability Engineering team. As a result, each of us transitioned into different roles within the company—this is when I moved into the DevOps/SRE team, officially beginning my career as an SRE!
I have completed online courses on various topics including:
I have experience with multiple programming languages, including Ruby, Node.js, Python, and Shell. I specialize in cloud networking, infrastructure development, and observability, with additional expertise in database management systems such as MySQL and PostgreSQL.
Additional skills:
Cloud Platform Engineering, Amazon CloudWatch, AWS Cloud WAN, AWS Managed Services, AWS RDS Aurora, SQL, MySQL, Postgres, Kubernetes (Kops/EKS), Software Configuration Management (Ansible, Puppet, Chef), Docker, Git, SVN, Linux, Shell Scripting, API, AWS API Gateway, Start-ups, GitLab, Pipelines, GitHub Actions, Fastly, Systems Administration, Scalability, Database Administration, Continuous Integration, IaC/Terraform, Datadog, Prometheus, Observability, Spinnaker, Cloud Networking (AWS/GCP), VPC, Network Design, Traffic and Routing Management, Unix, TCP/IP, Networking, DNS, HTTP(s), TLS/SSL, Troubleshooting