HFT Site Reliability Engineer

Location: London
Discipline: Financial Technology
Job type: Permanent
Contact name: Lewis Piper

Contact email: lewis.piper@venturesearch.com
Job ref: 3596
Published: about 7 hours ago

Site Reliability Engineer | High-Performance Trading Systems

Location: London (Hybrid)

About the Role

A cutting-edge quantitative trading firm is seeking a Site Reliability Engineer (SRE) to join its high-frequency trading (HFT) technology team. This is a hands-on engineering role focused on designing, building, and maintaining performant, secure, and scalable infrastructure across multiple cloud and bare-metal environments. You’ll work in a fast-paced setting alongside developers, traders, and researchers, ensuring the reliability and efficiency of global trading systems.


Key Responsibilities

  • Build, monitor, and optimise production environments across cloud and on-premises infrastructure.

  • Collaborate closely with traders and developers to support a dynamic trading platform.

  • Design and implement automation, orchestration, and CI/CD pipelines for reliable deployments.

  • Introduce and integrate new technologies to improve system performance and scalability.

  • Manage multi-region environments with a focus on low latency, high availability, and security.

  • Drive improvements in observability, monitoring, and incident response processes.


What You’ll Bring

  • Degree in Computer Science, Engineering, or a related technical field.

  • 5+ years of experience in Site Reliability, DevOps, or Platform Engineering.

  • Strong expertise with Kubernetes (including Helm and related ecosystem tools).

  • Proficiency with AWS and infrastructure-as-code tools such as Terraform.

  • Hands-on experience with Ansible, developing OS-agnostic roles and managing complex inventories.

  • Excellent Linux administration skills (Ubuntu/Debian preferred).

  • Proficiency in Python for automation, integration, and system tooling.

  • Experience with Git, CI/CD, and modern collaborative development practices.

  • A mindset of ownership and accountability, with a passion for innovation and reliability.


Nice to Have

  • Experience in low-latency or trading infrastructure environments.

  • Familiarity with Grafana, Prometheus, Loki, and OpenTelemetry.

  • Knowledge of network optimisation and multi-cloud connectivity.

  • Background in bare-metal provisioning or hybrid cloud architectures.

  • Experience managing databases (PostgreSQL, ClickHouse) and streaming systems (Kafka).

  • Exposure to VPN technologies (WireGuard, Tailscale), Zero Trust networking, and security tooling.

  • Contributions to open-source projects or a demonstrated passion for infrastructure innovation.