Careers at Rackspace

Senior Systems Engineer HPC - R-21841

Full Time
6 Oct 2025
India - Gurgaon

About the job

Responsibilities:

    • System Administration & Maintenance: Install, configure, and maintain HPC clusters (hardware, software, operating systems), perform regular updates/patching, manage user accounts and permissions, and troubleshoot/resolve hardware or software issues.
    • Performance & Optimization: Monitor and analyse system and application performance, identify bottlenecks, implement tuning solutions, and profile workloads to improve efficiency.
    • Cluster & Resource Management: Manage and optimize job scheduling, resource allocation, and cluster operations using tools such as Slurm, LSF, Bright Cluster Manager / Base Command Manager, OpenHPC, and Warewulf.
    • Networking & Interconnects: Configure, manage, and tune Linux networking (TCP/IP, DNS, routing) and high-speed HPC interconnects (InfiniBand, Ethernet) to ensure low-latency, high-bandwidth communication.
    • Storage & Data Management: Implement and maintain large-scale storage and parallel file systems (Lustre, Ceph, GPFS), ensure data integrity, manage backups, and support disaster recovery.
    • Security & Authentication: Implement security controls, ensure compliance with policies, and manage authentication and directory services such as LDAP and Active Directory.
    • DevOps & Automation: Use configuration management and DevOps practices (Ansible, Terraform, Jenkins, Git) to automate deployments, application packaging (RPM/DEB), and system configurations.
    • User Support & Collaboration: Provide technical support, documentation, and training to researchers; collaborate with scientists, HPC architects, and engineers to align infrastructure with research needs.
    • Planning & Innovation: Contribute to the design and planning of HPC infrastructure upgrades, evaluate and recommend hardware/software solutions, and explore cloud-based HPC solutions where applicable.

Qualifications:

    • Bachelor’s degree in Computer Science, Engineering, or a related field (equivalent experience may substitute for degree).
    • Minimum of 10 years of systems experience, including at least 5 years working specifically with HPC.
    • Strong knowledge of Linux operating systems (e.g., Rocky Linux, Ubuntu) with a fundamental understanding of Linux internals, system administration, and performance tuning.
    • Experience building and managing RPM and DEB packages.
    • Experience with cluster management tools such as Bright Cluster Manager, OpenHPC stack, or Warewulf.
    • Proficiency with job schedulers and resource managers such as Slurm and LSF.
    • Strong understanding of Linux networking (e.g., TCP/IP, DNS, routing) and HPC interconnects (e.g., InfiniBand, Ethernet) including performance tuning.
    • Knowledge of parallel file systems such as Lustre, Ceph, or GPFS.
    • Working knowledge of Linux authentication and directory services such as LDAP and Active Directory.
    • Proficiency in scripting languages (e.g., Python, Bash, R); familiarity with MPI libraries for parallel and distributed computing is nice to have.
    • Strong experience with DevOps and configuration management tools, including Ansible, Terraform, Jenkins, and Git.
    • Knowledge of HPC in cloud environments (e.g., AWS, Azure, GCP HPC offerings) is a plus.
    • Strong knowledge of Linux security, compliance standards, and data protection best practices.
    • Excellent communication, interpersonal, and problem-solving skills.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.

