Introduction

Hi from Espira, Platform Engineer | Specializing in DevOps and Site Reliability Solutions

I craft and perfect innovative infrastructure solutions, driven by a deep passion for DevOps and SRE. I love making technology work seamlessly!

8+ years of experience

30+ projects completed on different technologies

About

Every great platform has a story; let's create an even better one

Since starting my career in DevOps and site reliability over six years ago, I have built and managed infrastructure for a range of organizations, from startups to established research institutes. My remote work has spanned global agencies and direct consultations that tailored technology to specific business needs. I approach my work with quiet confidence and natural curiosity, always looking to deepen my expertise and solve complex infrastructure challenges, one project at a time.

Resume

Education & Experience

Research Software Engineer, Infrastructure

EcoHealth Alliance

  • Architected serverless infrastructure on AWS, enhancing scalability and reducing costs by 30%.
  • Implemented robust systems for data collection and warehousing tailored to the needs of field researchers.
  • Managed high-performance computing environments with Docker and container orchestration, improving system reliability during peak periods and under large workloads.
  • Led the application of Large Language Models (LLMs) on GPU setups, boosting machine learning capabilities.
  • Automated infrastructure deployment processes, enhancing operational transparency and reducing workload by 40%.

SRE Engineer

Sport Server

  • Managed scale challenges unique to sports servers on GCP, maintaining 99.9% uptime and 99.95% data integrity.
  • Enhanced service lifecycle from design to refinement, ensuring robust and scalable architecture.
  • Integrated Google Cloud tools to enhance data observability and real-time quality checks.

DevOps Tech Lead

Sarami · Contract · Nairobi, Kenya

  • Developed software release management strategies and migrated applications from Heroku to AWS.
  • Implemented autoscaling, load balancing, and managed DNS with AWS Route 53.
  • Increased automation using GitHub Actions and Jenkins, and managed deployments to Kubernetes via GitOps.
  • Built secure systems on AWS, utilizing services like Amplify, EC2, S3, ELB, and CloudFormation.
  • Used AWS SDK in Python to interact with AWS services, reducing configuration errors.
  • Improved application performance with microservices architecture.
  • Enhanced system observability and monitored services using APM tools.

System Engineer

ICIPE - International Centre of Insect Physiology and Ecology

  • Upgraded AWS-based computing platforms for enhanced scalability and efficiency in scientific computations.
  • Streamlined big data analysis and deployment processes using Slurm and Terraform.
  • Developed and maintained genomic data analysis pipelines, improving data reusability and research efficiency.
  • Collaborated with technology vendors to minimize system failures and enhance infrastructure reliability.
  • Implemented performance monitoring solutions with Grafana, crucial for maintaining system health and uptime.

Data Systems Consultant

ICIPE - International Centre of Insect Physiology and Ecology

  • Optimized data processes for scalability, enhancing the efficiency of data operations.
  • Developed analytical tools to generate actionable insights, supporting strategic decision-making.
  • Integrated and built data analytics platforms tailored for entomological research.
  • Designed machine learning models for real-time monitoring of insect populations, enhancing research accuracy.

DevOps Engineer

ICIPE - International Centre of Insect Physiology and Ecology

  • Architected and deployed comprehensive CI/CD pipelines using Jenkins and CircleCI, enhancing real-time monitoring and cloud integration.
  • Developed and managed Docker and Singularity containers, orchestrating deployments across Kubernetes clusters to ensure scalable infrastructure.
  • Administered high-performance computing (HPC) environments, supporting complex data processing tasks.
  • Optimized batch queuing systems in a massively parallel production setting, enhancing processing efficiency and throughput.
  • Conducted system utilization analysis to ensure over 90% uptime and robust system health.
  • Led infrastructure redesign initiatives, automating core processes and enhancing scalability, thereby reducing operational overhead.
  • Constructed sophisticated SQL-based infrastructure for data extraction and loading, developing analytical tools to transform data insights into actionable intelligence.
  • Collaborated closely with executive, product, data, and design teams to support infrastructure needs and address technical challenges.
  • Implemented and continually refined data management procedures, enhancing data system functionality to support ongoing research and operational needs.

Education & Certifications

Education & Credentials

2013 - 2018

BSc in Computer Technology

Jomo Kenyatta University of Agriculture and Technology

Professional Certifications

View my verified achievements on Credly Badges.

This link leads to the Credly platform where I have listed all my professional certifications and badges that showcase my specialized skills.

Services

My Specializations

Cloud Computing

I deploy and manage scalable cloud solutions across AWS and GCP, with expertise spanning a wide range of cloud technologies. I align my work with the pillars of the Well-Architected Framework: operational excellence, security, reliability, performance efficiency, and cost optimization.

Multiple Projects

Kubernetes & Containerization

Proficient in advanced container orchestration techniques with Kubernetes, enhancing cloud-native applications' scalability and resilience.

Multiple Projects

Observability & Monitoring

I specialize in designing high-performance data observability frameworks, using current technologies for comprehensive system monitoring that improves visibility and operational insight across cloud environments.

Multiple Projects

Big Data & Data Engineering

Specialized in big data technologies and data engineering, optimizing data processes for scalability and efficiency across distributed systems.

Multiple Projects

CI/CD Pipelines

Skilled in improving software development and delivery by automating deployments with CI/CD pipelines, keeping processes streamlined and efficient across development environments. I incorporate rigorous security checks to maintain high standards of software integrity.

Multiple Projects

Large Scale Distributed Computing

Experienced in managing complex, large-scale distributed computing environments to support intensive data processing and analysis tasks.

Multiple Projects

Machine Learning & AI

Skilled in deploying and integrating Large Language Models (LLMs) and machine learning solutions in GPU-accelerated environments, improving performance and enabling new data processing capabilities.

Multiple Projects

Infrastructure as Code

Skilled and experienced in applying Infrastructure as Code (IaC) methodologies across diverse platforms, from cloud environments to bare metal clusters. My approach automates the provisioning and management of resources, ensuring consistency, scalability, and compliance across all operational environments.

Multiple Projects

My Skills

My Technical Proficiencies

Programming Languages

  • Python, Java, Golang, JavaScript, R, Rust

Containerization & Orchestration

  • Docker, Singularity, Kubernetes, Docker Swarm

Infrastructure as Code

  • Terraform, Ansible, Helm

Cloud Platforms

  • AWS, GCP

CI/CD Tools

  • Jenkins, CircleCI, GitHub Actions

Monitoring and Logging

  • Grafana, Prometheus, ELK Stack, OpenTelemetry, Honeycomb

Database Management

  • PostgreSQL, MongoDB, MySQL, Cassandra

Server Management & Version Control

  • Linux, Windows, GitHub, GitLab

Project Management Tools

  • Asana, Jira, Confluence, Opsgenie

Portfolio

Featured Projects

Advanced Data Observability

Developed comprehensive observability solutions using Google Cloud Platform (GCP) tools such as BigQuery, Cloud Logging, and Stackdriver. These solutions enabled real-time data analytics, log management, and monitoring to ensure system reliability and performance. Implemented automated alerting and dashboards using Grafana to provide actionable insights and improve operational efficiency.

Tools Used: GCP, BigQuery, Cloud Logging, Stackdriver, Grafana, Python, Java.

Microservices Monitoring with OpenTelemetry

Implemented OpenTelemetry for distributed tracing and monitoring across a microservices architecture. Utilized application performance management (APM) tools to collect and analyze telemetry data, providing deep insights into system performance and enhancing reliability. Integrated with existing monitoring solutions like Prometheus and Grafana to visualize and manage service health.

Tools Used: OpenTelemetry, Prometheus, Grafana, Python, Docker, Kubernetes.

NLP and Large Language Model Deployment

Focused on leveraging large language models (LLMs) and natural language processing (NLP) techniques to extract and structure data from unstructured text sources. Conducted intensive model training on localized GPU setups to enhance processing efficiency and accuracy. Developed pipelines for fine-tuning models to specific datasets and tasks, improving their applicability in real-world scenarios.

Tools Used: Python, TensorFlow, PyTorch, GPU Computing, Docker, Kubernetes.

Enterprise Cloud Migration

Managed a major cloud migration project to Amazon Web Services (AWS), utilizing Infrastructure as Code (IaC) tools such as Terraform to ensure a smooth and efficient transition. Implemented automated deployment pipelines using CI/CD tools like Jenkins and GitHub Actions. Enhanced operational efficiency and scalability by leveraging AWS services such as EC2, S3, and RDS.

Tools Used: AWS, Terraform, Jenkins, GitHub Actions, Python, Docker.

High-Performance Computing Setup

Designed and optimized a Kubernetes-based High-Performance Computing (HPC) environment for complex data analysis tasks. Managed container orchestration using Docker and Kubernetes to ensure efficient resource utilization and scalability. Integrated with parallel computing frameworks and batch scheduling systems for optimized performance.

Tools Used: Kubernetes, Docker, Slurm, Terraform, Python, R, Ansible.

Data Security Enhancement Initiative

Led the development and implementation of advanced security measures to protect sensitive data. Utilized AWS security services, encryption techniques, and continuous monitoring to ensure data integrity and compliance with industry standards. Implemented automated security checks and incident response protocols to enhance overall security posture.

Tools Used: AWS Security Services, Python, Terraform, Jenkins, ELK Stack.

Prompt-Engineering for Open-Source LLMs

Developed prompt-engineering techniques to improve the performance of open-source large language models (LLMs). Designed and iteratively refined prompts to improve the accuracy and applicability of LLMs in use cases such as text summarization, sentiment analysis, and information extraction.

Tools Used: Python, TensorFlow, PyTorch, OpenAI GPT, Docker.

Dockerized Jupyter Notebooks Setup

Created a Dockerized environment for running Jupyter notebooks, enabling easy sharing and consistent setups across teams. Utilized Docker Compose for managing multiple containerized environments, simplifying the deployment and maintenance of Jupyter notebooks for data science and research projects.

Tools Used: Docker, Docker Compose, Jupyter, Docker Swarm, EKS, Python.

AWS ParallelCluster Setup

Implemented AWS ParallelCluster for managing High-Performance Computing (HPC) clusters, supporting various instance types and job schedulers like AWS Batch and Slurm. Developed and maintained HPC infrastructure to enable scalable and efficient computational workflows for research and data-intensive applications.

Tools Used: AWS ParallelCluster, AWS Batch, Slurm, Python, Terraform.

Serverless Architecture Migration

Migrated applications to a serverless architecture built on AWS Lambda, CloudFront, Cognito, and API Gateway, with monitoring through AWS CloudWatch and X-Ray. The migration aimed to improve scalability and performance while achieving cost efficiency, high availability, and enhanced security for cloud-based applications.

Tools Used: AWS Lambda, CloudFront, Cognito, API Gateway, CloudWatch, X-Ray, Terraform, Python, MongoDB.

Contact

Let's Create Something Great!

Every great platform has a story. Let’s build yours together.

Are you ready to transform your ideas into reality? Whether it’s a cutting-edge cloud infrastructure, robust data observability solution, or innovative machine learning project, I’m here to help you every step of the way. Let's collaborate and craft an amazing story for your platform.

Email: masundeespira@gmail.com

Phone: +5518041964

Reach out today and let's start building your next big success!