Data Manager, Biostatistics

Yale University

Job Description

Position Focus:

This position will function as a lead data analytics and bioinformatics expert in the laboratory who will lead the processing, quality checks, all stages of data analyses, interpretation and visual representation of data that emerge from proteomics (mass spectrometry) AND transcriptomics (bulk and single cell RNAseq) studies performed in the laboratory. Bioinformatics analyses of these two types of data have distinct requirements and training, BOTH of which will be expected for this position.

The position will also interface with biologists in the team to address their data needs, provide input on analytical plans and execute these plans in a timely manner. The position will require excellent multitasking abilities and deep, hands-on knowledge of programming languages and statistical software including, but not limited to, R, Python, and SPSS.

The individual will be responsible for data management, data storage, cataloging of all data, data cleanup and will also ensure that datasets are appropriately disseminated to the public (via deposition in public repositories) after scientific publications.

The position will also participate in grant writing, manuscript preparation, scientific presentations and lab meetings. The individual needs to have a demonstrated ability to constantly improve and keep abreast of the rapidly evolving fields of bioinformatics and systems biology, and will therefore be expected to attend data science seminars, conferences and workshops.

The individual will also take on mentoring roles focused on improving data analytical skills among all laboratory members, including senior members, technicians and trainees (pre- and post-doctoral).

Essential Duties

  • Serves as the biostatistician and the primary data manager for the division in a bioinformatics leadership capacity, as well as the administrator for a large departmental data repository currently in the planning stages of development, which will house all current and future research study data.
  • Writes policy and standard operating procedures for data monitoring, data management, data repository development and data verification, and oversees compliance with information security and related requirements.
  • Designs data architecture, establishes data dictionaries, uses statistical programming and transitions old databases into the new REDCap system.
  • Manages and constructs limited use de-identified datasets for permitted, shared use, overseeing proper authorization for data use agreements and secure release of information from the repository according to regulatory policy.
  • Works with research project teams to determine needs for coding conventions and data summary sheets, and develops scoring algorithms for raw data sets.
  • Converts data stored in existing databases (e.g., Excel, MS Access, SQL) into one universal format for REDCap database use.
  • Uses statistical computer programs (e.g. SAS, SPSS) and existing database programs to construct and manage limited datasets relevant for immediate analyses in CND projects.
  • Develops data conventions for managing incomplete data sets, runs query checks, resolves edits and out-of-range values with study teams, documents all processes and oversees this work when done by others.
  • Works in conjunction with statisticians at collaborating research institutions (e.g., Strong Star, Mount Sinai, Yale) to ensure uniformity in data entry and storage methods for shared projects across sites.
  • Conducts interim statistical analyses on study data to assist investigators with go/no-go decision making for their research projects, as well as final analyses for publication of results.
  • Provides expert advice with regard to data management strategies, data storage strategies and forms development to division staff, overseeing this work done by others. 
  • Serves as subject matter expert and data liaison for all aspects of the division’s data collection and management, including: statistical sections of study protocols, final study reports, final statistical analyses for research projects and data repository utilization. 
  • Summarizes complex neuroimaging, EEG, laboratory, behavioral and other statistical data into tables and graphs for papers, presentations and meetings. 
  • Supervises two staff members directly and indirectly influences and oversees the work standards of all research staff within the division who enter, manage, verify and monitor data.
  • Coordinates and oversees the set-up and utilization of standardized electronic assessment tools and cognitive testing batteries on VA computers.
  • Develops and maintains data systems designed to ensure that databases used for research projects comply with administrative policies, procedures and requirements set forth by the incumbent, the Research Office, Institutional Review Boards and the Office of Research Oversight, and with the data security guidelines outlined by the Information Security and Compliance Offices; also oversees proper authorization for data use agreements and secure release of information from the data repository according to regulatory policy.
  • Prepares regular workload reports for Core review and performs other related duties as assigned.

Required Education and Experience

Master’s Degree in statistics, bioinformatics, epidemiology or a related field and 5 years of related experience or an equivalent combination of education and experience.

Required Skill/Ability 1:

Familiarity with RNAseq pipelines (including, but not limited to, the DESeq2, edgeR and limma packages). Completion of at least one course on single cell RNAseq data analysis (with at least 8 contact hours). Familiarity with proteomics pipelines (including, but not limited to, MaxQuant, Proteome Discoverer, MSFragpipe).

Required Skill/Ability 2:

Ability to independently troubleshoot analytic pipelines and custom-build pipelines to suit experimental needs. Ability to adopt existing code/pipelines from open-source forums (e.g., GitHub) and implement them for in-house use.

Required Skill/Ability 3:

Demonstrated ability in bioinformatics pipeline development for RNAseq and proteomics analyses, including differential expression analysis, network analysis, gene set enrichment and pathway analysis. Quantitative academic focus and >2 years of data analysis/programming experience; or equivalent combination of education and experience.

Required Skill/Ability 4:

Extensive knowledge of and ability to apply standard software development principles, theories, concepts and techniques to data analysis. Strong programming skills in R and/or Python. Demonstrated ability with data handling and processing using Amazon Web Services (AWS). Experience managing large data sets.

Required Skill/Ability 5:

Excellent written and oral communication skills. Ability to effectively communicate with bioinformatics personnel/experts from other institutions to independently plan analytic pipelines. Ability to assemble high-quality figures for grants and publications. Ability to assemble slides and posters for talks/presentations.

Preferred Education, Experience and Skills:

Master’s Degree in Biostatistics or Statistics AND two years of experience; or equivalent combination of education and experience, in the analyses of mass spectrometry proteomics and RNAseq data, including bulk RNAseq and single cell RNAseq.

Company Info.

Yale University

Yale University is a private Ivy League research university in New Haven, Connecticut. Founded in 1701, Yale is the third-oldest institution of higher education in the United States and one of the nine colonial colleges chartered before the American Revolution.

  • Industry
    Education
  • No. of Employees
    5,118
  • Location
    New Haven, CT, USA

Yale University is currently hiring Biostatistician Jobs in New Haven, CT, USA with average base salary of $120,000 - $250,000 / Year.
