Platform / SRE Engineer

Location: Singapore
Discipline: Financial Technology
Job type: Permanent
Contact name: Lewis Piper
Contact email: lewis.piper@venturesearch.com
Job ref: 3736
Published: about 5 hours ago

A high-performance trading technology firm is hiring a Platform/Site Reliability Engineer to join its Infrastructure team. The firm develops all core systems in-house and operates large-scale, latency-sensitive research and production platforms.

The Role

This is a senior, hands-on SRE position with ownership of cloud and on-prem infrastructure supporting research, batch compute, and production workloads. You will work closely with engineering and research teams to design, operate, and evolve highly reliable systems, while helping embed a strong SRE culture across the organisation.

What You’ll Do

  • Build and operate observability platforms (monitoring, logging, tracing, alerting) for high availability and rapid incident response

  • Architect and maintain scalable infrastructure across cloud and on-prem environments

  • Support and evolve research compute clusters, including batch and workflow-driven workloads

  • Investigate and resolve live production issues end-to-end

  • Improve CI/CD pipelines, tooling, and developer experience in partnership with engineers

  • Drive SRE best practices and operational excellence

What We’re Looking For

  • 8+ years of experience in SRE / Platform / Infrastructure engineering

  • Background in trading, quantitative research, or other performance-critical environments

  • Strong Kubernetes experience (design and operations)

  • Practical knowledge of GitOps and modern CI/CD workflows

  • Experience supporting batch, workflow, or HPC-style systems

  • Solid cloud fundamentals (AWS or GCP)

  • Proficiency in Python and/or Go

  • Comfort with owning production systems and incidents

  • Strong communication skills and an ownership mindset

Nice to Have

  • Kubernetes operators

  • Bare-metal or hybrid infrastructure

  • Containerisation and configuration management tools

  • Security-aware infrastructure engineering

  • Observability tooling (Prometheus, ELK)

  • Enterprise Linux (RedHat / CentOS)

  • CI/CD tooling (GitLab CI, Jenkins)

  • Open-source contributions