Senior High Performance Computing Specialist - Scientific Computing & Data

Mount Sinai Health System

Job Description

As a member of the MSSM Scientific Computing Group, the Senior HPC Specialist partners on and plans joint biomedical projects that leverage MSSM HPC (high performance computing) clusters, data science tools and other modern computational and data science techniques. This individual understands research needs and best practices well enough to decide the path forward for new applications and technologies. Working together with the whole HPC team, the Specialist will be responsible for troubleshooting and engineering the HPC system and providing direct researcher support. The incumbent is a customer-focused leader who sets a positive, productive and courteous tone for the Scientific Computing group and operates under little to no supervision. This position reports to the Director for Computational & Data Ecosystem in 

Responsibilities

  • Partners directly with scientific and research staff to lead projects that use MSSM HPC clusters and other modern data science tools and infrastructure efficiently and effectively. Leads the effort to develop needs requirements and improves researcher productivity through highly optimized computational and data workflows.
  • Leads the research, analysis and deployment of new applications, systems and other technologies such as GPUs, web applications, databases, containers and cloud computing.
  • Assists in maintenance and system administration of HPC clusters with over 30,000 cores with InfiniBand, 30+ petabytes of data storage and databases in production. Supports day-to-day operations of the Linux HPC clusters and storage systems. Designs and develops scripts for system administration and usage reporting. Proactively monitors, analyzes, and resolves all technical issues.
  • Researches, deploys and optimizes resource management and scheduling software and policies.
  • Troubleshoots, isolates and resolves application and system problems as the most senior level of support. Resolves user-reported issues and manages support ticket requests.
  • Collects usage metrics across all aspects of the systems and develops reports.
  • Coordinates and leads software and environment benchmarking efforts.
  • Develops and implements automated regression tests for the HPC environment. Evaluates and approves application programming environments.
  • Oversees the installation infrastructure and management of numerical libraries and application performance tools. Applies best practices in software engineering, delivering projects on time, on budget, and with excellent quality.
  • Oversees and directly contributes to significant ongoing HPC integrations with laboratory equipment such as sequencers.
  • Mentors and trains junior staff, cross training peers and users.
  • Manages the Mount Sinai Data Commons - Data Ark server space and all data sets, creating and maintaining an efficient file structure and providing user access according to the specific restrictions of each data set and the user support needed.
  • Conducts user surveys and leads the allocations process. Leads the chargeback/fee recovery analysis and provides suggestions to make operations sustainable.
  • Periodically recommends innovative technologies and science- and research-driven customer service approaches to improve the effectiveness, efficiency, and customer satisfaction of the resources. Communicates with clinicians, researchers and other team members to understand and develop use cases, jointly develop plans and proposals, and provide effective technological solutions.
  • Writes, publishes and presents papers at national and international conferences.
  • Assists in developing and writing proposals and reports. Creates and provides clear documentation on all items.
  • Works as a strong team player within the group, within Mount Sinai, and externally.
  • Provides after-hours support in case of a critical issue.
  • Other relevant duties as assigned.

Qualifications

  • Ph.D. or equivalent in a scientific domain required.
  • 7+ years of experience (more preferred) in a scientific/academic computing environment, or 10 years of experience with a Master's degree.
  • Experience in batch scientific computing HPC cluster environment with a parallel file system
  • Experience with Linux administration, preferably on Red Hat/CentOS Linux; monitoring software administration (Nagios, Grafana, etc.); configuration management (Ansible preferred)
  • Experience with storage solutions and networking is preferred
  • Strong experience installing and supporting bio, genomics, chemistry codes and AI programs such as GATK, CryoSPARC, R, Python, Shiny, Jupyter, RStudio, NAMD, AMBER, MATLAB, GROMACS, Schrödinger, PyTorch, TensorFlow, etc.
  • Experience supporting applications based on CUDA, OpenCL, MPI, OpenMP and numerical libraries
  • Experience with job schedulers such as LSF or Slurm
  • Strong programming and scripting skills and excellent attention to detail; proficiency in at least C, C++, Python and Bash
  • Experience with scientific workflows in an academic or research community environment
  • Experience with data transfer tools such as Globus and cloud CLIs
  • Experience with data science, containers and cloud environments
  • Experience with instrumenting and optimizing application codes
  • Experience with Open OnDemand is preferred
  • Experience with web applications such as Python Flask, Shiny, R Markdown, etc., and databases such as MySQL or MongoDB is preferred
  • Experience with HIPAA compliance is preferred

Employer Description

Strength Through Diversity

The Mount Sinai Health System believes that diversity, equity, and inclusion are key drivers for excellence. We share a common devotion to delivering exceptional patient care. When you join us, you become a part of Mount Sinai’s unrivaled record of achievement, education, and advancement as we revolutionize medicine together. We invite you to participate actively as a part of the Mount Sinai Health System team by:

  • Using a lens of equity in all aspects of patient care delivery, education, and research to promote policies and practices to allow opportunities for all to thrive and reach their potential.
  • Serving as a role model confronting racist, sexist, or other inappropriate actions by speaking up, challenging exclusionary organizational practices, and standing side-by-side in support of colleagues who experience discrimination.
  • Inspiring and fostering an environment of anti-racist behaviors among and between departments and co-workers.

We work hard to acquire and retain the best people and to create an inclusive, welcoming and nurturing work environment where all feel they are valued, belong and are able to advance professionally. We share the belief that all employees, regardless of job title or expertise, contribute to the patient experience and quality of patient care.

Company Info.

Mount Sinai Health System

The Mount Sinai Health System is a hospital network in New York City. It was formed in September 2013 by merging the operations of Continuum Health Partners and the Mount Sinai Medical Center. The Health System includes more than 6,600 primary and specialty care physicians and 13 ambulatory surgical centers. It has ambulatory practices throughout the five boroughs of New York City, Westchester County, and Long Island, along with more than 30 aff

Mount Sinai Health System is currently hiring High Performance Computing Engineer Jobs in United States with average base salary of $128,390 - $192,585 / Year.