Careers at Etched

Reliability Engineer

7 Sep 2024
Cupertino, CA, USA
Verified by Turrior

Content + Source + Freshness • 11 Dec 2025 • 95% confidence

80 / 100

Offer value

Strong opportunity in reliability engineering focusing on AI technology, appealing to candidates with a robust background in reliability and a strong analytical mindset.

  • Key role in maintaining reliability for advanced AI systems
  • Collaborative opportunities with top engineering talent
  • In-demand role in a growing market sector
  • Requires significant reliability engineering experience
Pros
  • Critical role in ensuring high reliability standards
  • Collaboration opportunities with diverse engineering teams
  • Favorable outlook in a growing sector with increased demand
Cons
  • Requires extensive experience, which limits the candidate pool
  • Potential high-stress environment due to rigorous standards
  • In-person requirement may deter remote candidates

Who it's for

Mid to Senior Level • In-person

Good fit
  • Mid to senior-level reliability engineers
  • Candidates focused on data-driven engineering roles
  • Professionals eager to work in tech-focused environments
Not recommended for
  • Entry-level candidates without relevant experience
  • Individuals uncomfortable with data analysis
  • Those prioritizing remote work options

Motivation fit

  • Commitment to ensuring product reliability
  • Interest in collaborating with cross-functional teams
  • Desire to solve complex engineering problems

Key skills

  • Understanding of reliability standards
  • Data analysis and interpretation
  • Strong communication skills with stakeholders
  • Project management and multitasking abilities
Score: 80/100 (AI-verified analysis)

About the job

About Etched

Etched is building AI chips that are hard-coded for individual model architectures. Our first product (Sohu) only supports transformers, but has an order of magnitude more throughput and lower latency than a B200. With Etched ASICs, you can build products that would be impossible with GPUs, like real-time video generation models and extremely deep chain-of-thought reasoning.

Reliability Engineer

We are seeking a skilled and detail-oriented Reliability Engineer to join our team. As a Reliability Engineer at Etched, you will play a critical role in ensuring that all components and systems meet our rigorous reliability standards, essential for our datacenter applications. This position requires a deep understanding of reliability engineering principles, as well as experience working with suppliers, ODMs, and JDMs.

Representative Projects:

  • Lead the development, implementation, and management of reliability standards for all suppliers working with Etched. Ensure that all components and systems meet or exceed the required reliability benchmarks.
  • Review and verify reliability reports from suppliers, ensuring accuracy and adherence to Etched’s standards. Provide guidance and feedback to suppliers to ensure continuous improvement in reliability performance.
  • Collaborate with cross-functional teams to review and recommend component selection criteria based on reliability performance. Ensure that all selected components are capable of meeting the long-term reliability requirements of our datacenter applications.
  • Evaluate and approve reliability test plans proposed by external vendors. Ensure that the test methodologies and conditions are sufficient to validate long-term reliability under expected operating conditions.
  • Conduct in-depth analysis of reliability data provided by suppliers and vendors. Identify trends, potential issues, and areas for improvement to enhance overall reliability.
  • Work closely with ODMs (Original Design Manufacturers) and JDMs (Joint Design Manufacturers) to ensure that all products meet Etched’s quality and reliability standards. Provide technical guidance and support to maintain maximum operational uptime and long-term reliability.
  • Review and establish reliability metrics and standards for silicon components, ensuring they meet the stringent requirements for long-term reliability in datacenter environments.

You may be a good fit if you have:

  • Bachelor’s or Master’s degree in Reliability Engineering, Electrical Engineering, or a related field.
  • 5+ years of experience in reliability engineering, with a focus on datacenter applications preferred.
  • Strong understanding of reliability standards, testing methodologies, and data analysis techniques. DFMEA / PFMEA / SPC Engineering analysis experience desired.
  • Experience working with suppliers, ODMs, and JDMs in a high-tech environment.
  • Excellent communication skills, with the ability to convey complex technical concepts to diverse stakeholders.
  • Proven ability to manage multiple projects and deliver results in a fast-paced environment.

We encourage you to apply even if you do not believe you meet every single qualification.

How we’re different:

Etched believes in the Bitter Lesson. We think most of the progress in the AI field has come from using more FLOPs to train and run models, and the best way to get more FLOPs is to build model-specific hardware. Larger and larger training runs encourage companies to consolidate around fewer model architectures, which creates a market for single-model ASICs.

We are a fully in-person team in Cupertino, and greatly value engineering skills. We do not have boundaries between engineering and research, and we expect all of our technical staff to contribute to both as needed.

Benefits:

  • Full medical, dental, and vision packages, with 100% of premium covered, 90% for dependents
  • Housing subsidy of $2,000/month for those living within walking distance of the office
  • Daily lunch and dinner in our office
  • Relocation support for those moving to Cupertino
