Degree in Computer Engineering
AWS, C++, Database, Deep Learning, Git, Google Cloud Platform (GCP), Machine learning techniques, Nextflow, Python Programming, R Programming, Snakemake, SQL
Key to insitro’s approach to rethinking drug development is the use of machine learning on high-content human data to build models of biological states and understand the effect of perturbations on those states. A particular focus area is molecular measurements of biological states, including levels of gene expression (transcriptomics) and protein expression (proteomics).
As an applied Machine Learning Scientist for Molecular Omics, you will develop and apply cutting-edge machine learning and bioinformatic methods to analyze high-content omics data, build models of biological state as manifested in these data, and uncover new disease biology. We acquire such data both from human samples and from our high-throughput wet lab, where we build and assay iPSC-derived cellular disease models under genetic and chemical perturbation, using both single-cell and bulk RNA-seq. You will devise new and meaningful representations for these omics data sets that reveal underlying biological processes and the effect of diverse factors and interventions on those processes.
Your work will involve developing and deploying cutting-edge methods in classical genomics and machine learning, including deep learning. Our data pose challenges such as distribution shift, experimental artifacts, data sparsity, and small sample sizes, among others. You will need to develop fit-for-purpose approaches that draw on methods such as self-supervised learning, multi-task learning, few-shot learning, and network models. You will collaborate with the software engineering team to develop these methods as robust, reusable platform components.
You will work closely with biological collaborators to design and analyze in-house experiments, ensuring that the experimental designs produce data that are fit for purpose for machine learning. You will also provide input to our corporate development team on initiatives to acquire or construct data from external sources. Finally, you will integrate data with patient genetics and diverse clinical and cellular phenotypes (including microscopy) to identify molecular targets for impactful therapeutics.
You will be joining an agile, fast-growing biotech startup that has long-term stability due to significant funding. You will have ample opportunities for significant impact. You will work closely with a very talented team, learn a broad range of skills, and help shape insitro’s culture, strategic direction, and outcomes. Join us, and help make a difference to patients!
About You
Nice to Have
Compensation & Benefits at insitro
Our target starting salary for successful US-based applicants for this role is $160,000 - $215,000. To determine starting pay, we consider multiple job-related factors including a candidate’s skills, education and experience, the level at which they are actually hired, market demand, business needs, and internal parity. We may also adjust this range in the future based on market data.
This role is eligible for participation in our Annual Performance Bonus Plan (based on company targets by role level and annual company performance) and our Equity Incentive Plan, subject to the terms of those plans and associated policies.
In addition, insitro also provides our employees:
insitro is a data-driven drug discovery and development company using machine learning and data at scale to transform the way that drugs are discovered and developed for patients. insitro is developing predictive machine learning models to discover underlying biologic state based on human cohort data and in-house generated cellular data at scale. These predictive models can be brought to bear on key bottlenecks in pharmaceutical R&D.
South San Francisco, CA, USA
2-4 years