Job Description

Senior Site Reliability / Infrastructure Platform Engineer

(Virtualization, distributed systems, Linux performance, and service reliability)

Responsibilities

Act as senior escalation point for service outages, platform failures, and complex distributed systems incidents.
Own the architecture, deployment, and reliability of virtualization platforms, storage clusters, and service infrastructure.
Design and maintain higher-level service architecture, including load balancing, clustering strategies, dependency management, and failure domain modeling.
Build, operate, and scale virtualization and storage environments (compute clusters, hypervisors, and software-defined storage).
Design, deploy, and maintain distributed database platforms including SQL clusters and in-memory data stores.
Perform deep Linux systems engineering including kernel, scheduler, memory, IO, and network-stack optimization.
Develop and maintain infrastructure automation, CI pipelines, and Git-driven operational workflows.
Build and operate backup, snapshot, replication, and disaster-recovery systems. Design recovery procedures and regularly validate restoration paths.
Perform capacity planning, performance modeling, and saturation analysis across compute, memory, storage, and network layers.
Utilize observability platforms to detect early signals of service degradation and latent reliability risks.
Collaborate with application, network, and data center engineering teams to deliver end-to-end resilient platforms.
Produce architecture documents, runbooks, failure analyses, and low-level operational design documentation.
Lead incident response, root-cause analysis, and reliability improvement initiatives.

Qualifications

Strong background as a senior Linux systems engineer, SRE, or infrastructure platform engineer.
Proven experience designing and operating large virtualization and storage clusters.
Hands-on experience with distributed databases (Galera/Patroni/MySQL/Postgres clusters, Redis/KeyDB, etc.).
Strong understanding of service architecture, clustering models, load balancers, and high-availability patterns.
Deep Linux expertise including CPU/NUMA tuning, memory management, disk IO pipelines, and network optimization.
Experience building and maintaining CI/CD pipelines and Git-based infrastructure workflows.
Demonstrated ownership of backup, disaster recovery, and service continuity systems.
Strong troubleshooting skills across OS, platform, and application interaction layers.
Ability to translate business and service requirements into resilient technical architectures.
Strong documentation, communication, and cross-team collaboration skills.
Ability to operate effectively during outages, incident response, and recovery scenarios.

Nice to Have

Experience with Ceph, ZFS, NVMe-oF, or large-scale software-defined storage platforms.
Experience with high-performance or low-latency Linux environments.
Familiarity with container platforms or hybrid virtualization/container environments.
Experience supporting high-bandwidth media, streaming, or real-time service platforms.
Exposure to infrastructure-focused AI tooling or automation frameworks.

Please be advised

A technical assessment covering Linux systems engineering, distributed systems concepts, and platform reliability may be conducted before candidates progress to the interview stage.

Company Description

Nextologies has the world's largest broadcast video delivery network, specializing in award-winning, broadcast-grade video connectivity for broadcasters and content owners across the globe, with instant access to over 65,000 linear TV channels downlinked from 90+ globally placed satellites.

In addition, Nextologies is a leader in signal acquisition and delivery, providing fiber, IP, and custom end-to-end solutions for IPTV and OTT platforms and for video-centric applications across all platforms.
Learn more at www.nextologies.com.

10TX by Nextologies is a leading signal transmission company trusted by professional sports leagues, broadcasters, content producers, and entertainment companies to deliver live events and pay-per-view programming worldwide.

