High Compute Engineer
Company: Leidos
Location: Arlington
Posted on: July 16, 2025
Job Description:
At Leidos, we deliver innovative
solutions through the efforts of our diverse and talented people
who are dedicated to our customers’ success. We empower our teams,
contribute to our communities, and operate sustainable practices.
Everything we do is built on a commitment to do the right thing for
our customers, our people, and our community. Our Mission, Vision,
and Values guide the way we do business. Employees enjoy career
enrichment opportunities available through mobility and development
and experience rewarding relationships with supportive supervisors
and talented colleagues and customers. Your most important work is
ahead. If this sounds like the kind of environment where you can
thrive, keep reading!

We are seeking a forward-leaning High Compute
Engineer to lead the design, optimization, and integration of
GPU-centric high-performance compute environments. The ideal
candidate will be responsible for managing existing NVIDIA A100 and
DGX-1 systems while designing scalable architectures to incorporate
emerging GPU hardware as mission demands evolve. This role is
critical to our advanced compute initiatives, where performance,
stability, and future-readiness drive every architectural decision.
You'll work cross-functionally with data scientists, AI/ML
developers, cybersecurity experts, and infrastructure teams to
create a robust, secure, and performant GPU compute ecosystem.

This is a 100% on-site position. All work must be performed at the
customer site in Bethesda at the Intelligence Community Campus.

Responsibilities:
- Manage, optimize, and monitor existing high-performance GPU
systems, including NVIDIA A100s and DGX-1 platforms.
- Architect integration plans for scaling GPU compute
infrastructure, including newer platforms (e.g., H100, Grace
Hopper, AMD Instinct).
- Collaborate with data science teams to fine-tune GPU workloads
for AI/ML pipelines.
- Design and implement high-speed networking (InfiniBand/RDMA) and
storage solutions optimized for GPU data flow.
- Develop automation workflows using infrastructure-as-code (IaC)
tools (e.g., Ansible, Terraform, SaltStack).
- Ensure system security, compliance, and patch management in
alignment with NIST, RMF, or agency-specific controls.
- Analyze compute performance metrics and provide strategic
recommendations for system enhancements.
- Maintain documentation on system architectures, configurations,
and operational procedures.

You Bring:
- Bachelor's or higher degree in
Computer Engineering, Computer Science, or a related field with at
least 12 years of related technical experience. Additional years of
experience may be considered in lieu of a degree.
- 5 years of experience supporting GPU compute environments in
mission-critical or enterprise settings.
- Proficiency with NVIDIA technologies: A100, DGX-1, CUDA, cuDNN,
NCCL.
- Strong background in Linux (RHEL/CentOS/Ubuntu), kernel tuning,
and HPC stack deployment.
- Experience with containerized GPU workloads using Docker,
Kubernetes, and NVIDIA GPU Operator.
- Familiarity with distributed compute frameworks (e.g., SLURM,
Kubernetes, Ray).
- Strong scripting skills: Bash, Python, or similar.
- Proven ability to plan and execute large-scale system upgrades
and migrations.
- Candidate must, at a minimum, meet DoD 8570.11 IAT Level II
certification requirements (currently Security+ CE, CCNA-Security,
GICSP, GSEC, or SSCP, along with an appropriate computing
environment (CE) certification). An IAT Level III certification
would also be acceptable (CASP, CCNP Security, CISA, CISSP, GCED,
GCIH, CCSP).

Clearance:
- Active TS/SCI clearance with Polygraph required, OR active TS/SCI
and willingness to obtain and maintain a Poly.
- US Citizenship is required due to the nature of the government
contracts we support.

Preferred Qualifications:
- Experience with hybrid cloud GPU environments (AWS, GCP, or Azure
with NVIDIA support).
- Familiarity with AI/ML tooling such as PyTorch, TensorFlow, ONNX,
and RAPIDS.
- Experience integrating GPUs with storage systems (e.g., Lustre,
BeeGFS, Ceph).
- Exposure to hardware acceleration platforms (e.g., FPGA, custom
ASIC).

Why Join Us:
- Shape the future of high-performance computing within a
cutting-edge technical team.
- Influence procurement and system design decisions for future GPU
investments.
- Work alongside industry leaders in machine learning, cyber
operations, and advanced analytics.
- Access to premier NVIDIA hardware in real production
environments.

Original Posting: June 24, 2025

For U.S.
Positions: While subject to change based on business needs, Leidos
reasonably anticipates that this job requisition will remain open
for at least 3 days with an anticipated close date of no earlier
than 3 days after the original posting date as listed above.

Pay Range: $126,100.00 - $227,950.00

The Leidos pay range for
this job level is a general guideline only and not a guarantee of
compensation or salary. Additional factors considered in extending
an offer include (but are not limited to) responsibilities of the
job, education, experience, knowledge, skills, and abilities, as
well as internal equity, alignment with market data, applicable
bargaining agreement (if any), or other law.

Keywords: Leidos, Reston, High Compute Engineer, IT / Software / Systems, Arlington, Virginia