Site Reliability Engineer | High-Performance Trading Systems
Location: London (Hybrid)
About the Role
A cutting-edge quantitative trading firm is seeking a Site Reliability Engineer (SRE) to join its high-frequency trading (HFT) technology team. This is a hands-on engineering role focused on designing, building, and maintaining performant, secure, and scalable infrastructure across multiple cloud and bare-metal environments. You’ll work in a fast-paced setting alongside developers, traders, and researchers, ensuring the reliability and efficiency of global trading systems.
Key Responsibilities
- Build, monitor, and optimise production environments across cloud and on-premises infrastructure.
- Collaborate closely with traders and developers to support a dynamic trading platform.
- Design and implement automation, orchestration, and CI/CD pipelines for reliable deployments.
- Introduce and integrate new technologies to improve system performance and scalability.
- Manage multi-region environments with a focus on low latency, high availability, and security.
- Drive improvements in observability, monitoring, and incident response processes.
What You’ll Bring
- Degree in Computer Science, Engineering, or related technical field.
- 5+ years of experience in Site Reliability, DevOps, or Platform Engineering.
- Strong expertise with Kubernetes (including Helm and related ecosystem tools).
- Proficient with AWS and infrastructure-as-code tools such as Terraform.
- Hands-on experience with Ansible, developing OS-agnostic roles and managing complex inventories.
- Excellent Linux administration skills (Ubuntu/Debian preferred).
- Proficiency in Python for automation, integration, and system tooling.
- Experience with Git, CI/CD, and modern collaborative development practices.
- Mindset of ownership and accountability, with a passion for innovation and reliability.
Nice to Have
- Experience in low-latency or trading infrastructure environments.
- Familiarity with Grafana, Prometheus, Loki, and OpenTelemetry.
- Knowledge of network optimisation and multi-cloud connectivity.
- Background in bare-metal provisioning or hybrid cloud architectures.
- Experience managing databases (PostgreSQL, ClickHouse) and streaming systems (Kafka).
- Exposure to VPN technologies (WireGuard, Tailscale, Zero Trust) and security tooling.
- Contributions to open-source projects or a demonstrated passion for infrastructure innovation.