High Performance Computing Architect

Mount Sinai Health System

Job Description

Roles & Responsibilities:

The Scientific Computing and Data group at the Icahn School of Medicine at Mount Sinai partners with scientists to accelerate scientific discovery. To achieve these aims, we support a cutting-edge high-performance computing and data ecosystem along with MD/PhD-level support for researchers. The group is composed of a high-performance computing team, the research clinical data warehouse team and a research data services team.

The HPC Architect, High Performance Computational and Data Ecosystem, is responsible for the architecture, design and deployment of Scientific Computing’s computational and data science ecosystem. This ecosystem includes high-performance computing (HPC) systems, clinical research databases, and a software development infrastructure for local and national projects. To meet Sinai’s scientific and clinical goals, the Architect has a deep technical understanding of the best practices for computational, data and software development systems along with a strong focus on customer service for researchers. The HPC Architect is an expert troubleshooter and productive team member. The incumbent is a productive partner for researchers and technologists throughout the organization and beyond. This position reports to the Director for Computational & Data Ecosystem in Scientific Computing. Specific responsibilities are listed below.

  • Responsible for technical operations, including the architecture, design, expansion, monitoring, support, and maintenance of Scientific Computing’s computational and data science ecosystem, consistent with best practices. Key components include ~30,000 cores with high-bandwidth, low-latency interconnects, GPUs, large shared-memory nodes, databases, scientific workflows and 30+ petabytes of storage in production, as well as a clinical data warehouse and a software development environment.
  • Maintains, tunes and manages computational, data, cloud technologies and workflow systems for ISMMS researchers, scientists and their external collaborators. Defines and deploys a comprehensive computational and data vision. Identifies and communicates system advantages/disadvantages and tradeoffs.
  • Troubleshoots, isolates and resolves application, system and other technical problems (hardware, software, and network). Actively monitors the systems.
  • Designs, develops and implements all system administration tasks, including hardware and software configuration, configuration management, system monitoring (including the development and maintenance of regression tests), usage reporting, system performance (file systems, scheduler, interconnect, high availability, etc.), security, networking and metrics.
  • Ensures that the design and operation of the HPC ecosystem is productive for research.
  • Collaborates effectively with research and hospital system IT, compliance, HIPAA, security and other departments to ensure compliance with all regulations and Sinai policies.
  • Partners with other peers regionally, nationally and internationally to discover, propose and deploy a world-class research infrastructure for Mount Sinai.
  • Participates in the integration of HPC resources with laboratory equipment such as sequencers, clinical and research data resources and systems, etc. Incorporates and links data and compute resources.
  • Researches, deploys and optimizes resource management and scheduling software and policies, and actively monitors them.
  • Designs, tunes, manages and upgrades parallel file systems, storage and data-oriented resources.
  • Researches, deploys and manages security infrastructure, including development of policies and procedures.
  • Assists in developing and writing system design for research proposals.
  • Works effectively and productively with other team members within the group and across Mount Sinai.
  • Provides after-hours support in case of a critical system issue.

Requirements:

  • Bachelor’s degree in computer science, engineering or another scientific field. Master's or PhD preferred
  • 6 years of progressive HPC system administration and operations experience (preferably in a Red Hat/CentOS Linux, batch HPC cluster environment)
  • Must be an expert troubleshooter; must be a team player and customer focused
  • Experience with configuration management systems such as xCAT, Puppet and/or Ansible
  • Experience with networking and security
  • Experience with InfiniBand and Gigabit Ethernet
  • Experience with the LSF scheduler and with GPFS (Spectrum Scale) parallel file systems and storage
  • Excellent communication skills, analytical ability, strong judgment and management skills, and the ability to work effectively as a liaison between research and technology teams.
  • Strong written, oral, and interpersonal communication skills
  • Scripting and programming experience

Preferred Experience

  • Experience with archival storage and tape libraries (TSM) is highly preferred
  • Experience with databases and web services is highly preferred
  • Compliance, HIPAA
  • Experience with managing web access to HPC resources (such as Open OnDemand)
  • Singularity and/or Docker containers
  • Academic and/or healthcare research setting
  • Nagios

Strength Through Diversity

The Mount Sinai Health System believes that diversity, equity and inclusion are drivers for excellence. We share a common devotion to delivering exceptional patient care. Yet we’re as diverse as the city we call home: culturally, ethnically, in outlook and lifestyle. When you join us, you become a part of Mount Sinai’s unrivaled record of achievement, education, and advancement as we revolutionize medicine together. You will participate actively as a leader within the Mount Sinai Health System by:

  • Serving as the primary resource management representative of the Mount Sinai leadership teams, committees, etc., and acting as the primary executive leader interface between Mount Sinai and key executives from the health systems’ vendors and partners.
  • Engaging with relevant thought leaders and policy-makers at the federal and state levels, and representing the Health System as assigned.
  • Using a lens of equity in establishing and promoting policies and procedures and providing opportunities for all to thrive.
  • Confronting racist, sexist or other inappropriate behavior, challenging exclusionary organizational practices, and serving as a role model to promote anti-racist behaviors.
  • Inspiring and fostering an environment of anti-racist behaviors among and between departments and co-workers.

We work hard to acquire and retain the best people, and to create a welcoming, nurturing work environment where you can develop professionally. We share the belief that all employees, regardless of job title or expertise, can make an impact on quality patient care.

Explore more about this opportunity and how you can help us write a new chapter in our story!

EOE Minorities/Women/Disabled/Veterans

Compensation

The Mount Sinai Health System (MSHS) provides a salary range to comply with the New York City Law on Salary Transparency in Job Advertisements. The salary range for the role is $120,000.00 - $180,060.00 Annually. Actual salaries depend on a variety of factors, including experience, education, and hospital need. The salary range or contractual rate listed does not include bonuses/incentive, differential pay or other forms of compensation or benefits.

Company Info.

Mount Sinai Health System

The Mount Sinai Health System is a hospital network in New York City. It was formed in September 2013 by merging the operations of Continuum Health Partners and the Mount Sinai Medical Center. The Health System includes more than 6,600 primary and specialty care physicians and 13 ambulatory surgical centers. It has ambulatory practices throughout the five boroughs of New York City, Westchester County, and Long Island, along with more than 30 aff


