GrowSquares is searching for a data scientist intern to assist in the development and maintenance of our data science engine. You’re ready to explore new concepts and build within an agile, sprint-driven process of product iteration. You ideally have an interest in Smart Cities, urban agriculture, GIS mapping, and networked systems.
You’re interested in deep learning, A/B and multivariate testing methodologies, statistical analysis, and high-quality prediction systems. We're a small team, so you'll need to be excited by the opportunity to take an idea from theory to deployment and to work in a dynamic environment of discovery.
What you will be doing:
As a data science intern, you will be key to developing the plant recommendation and support engines behind our digital products and services. You will be responsible for defining the data structures we use to determine which plants work best, which includes training models on acquired data, model selection (comparing, validating, and choosing parameters and models), model-relevant feature selection, and performing predictive analytics.
You will collaborate with marketing, researchers, software developers, and business leaders to define product requirements and provide analytical support. As we grow, you will also be primarily responsible for improving the engine behind plant success, developing tests that isolate susceptibility to various diseases and fungi, and improving plant yield.
- Conducting exploratory data analysis and performing data wrangling and cleaning
- Writing high-quality code, actively participating in code reviews, and consistently helping to ship software
- Working closely with the researcher and product manager to parse relevant academic and industry papers in order to better develop solutions
- Building visualization dashboards for internal and external stakeholders
- Developing end-to-end machine learning pipelines focusing on predictive models and time-series analysis
- Deploying solutions to the cloud and working with the developer to integrate them into our relevant platforms (mobile application, website, etc.)
- Working with clients in order to properly integrate needed/requested solutions into our product offerings
- Improving upon existing methodologies by developing new data sources, testing model enhancements, and fine-tuning model parameters
Qualifications:
- Bachelor's degree in Computer Science, Statistics, Data Science, Engineering, or another quantitative discipline (Master’s degree preferred)
- 1+ years of relevant working experience in an analytical role involving data extraction, analysis, and communication
- 2+ years of experience with data querying, scripting, and statistical/mathematical languages (e.g. Python, R, SQL, MATLAB)
- Direct experience with both supervised learning methods (linear and logistic regression, time-series modeling, generalized linear models, decision trees, random forests, support vector machines, etc.) and unsupervised learning methods (K-means, hierarchical clustering)
- Understanding of tools for working with large data sets, e.g. NoSQL databases, Elasticsearch, MongoDB
- Working knowledge of cloud computing technologies (preferably GCP and AWS) and how to leverage them for use in data science
- Experience with REST APIs and the Flask framework
- Excellent verbal and written communication skills, and familiarity with tools like Slack and Asana
- Demonstrable track record of dealing well with ambiguity, prioritizing needs, and delivering results in a dynamic environment (previous startup experience)
- Familiarity with the agricultural industry and basic plant biology knowledge
GrowSquares engineers outdoor gardens that are fun, productive, and easy to use. Our system leverages a mobile application able to identify the best plants for a user's environment as well as the components enabling improved plant growth. We package these components (nutrients, bacteria, minerals, etc.) along with the seeds of a user's selected plants into modular, biodegradable square toppers, delivering them directly to users. Once a garden