Posted on: 10 Jan 2023
Java Programming, Python Programming, C++, C Programming, SQL, Apache Hadoop, Scala Programming, Machine learning techniques, Data science techniques, PyTorch, TensorFlow, MapReduce, R Programming
- Design and build data models to support structured and unstructured data
- Design, build and deploy scalable high-volume data pipelines to move data across systems
- Lead architecture and implementation of batch and real-time data pipelines with instrumentation
- Design and build data transformations, metrics and KPIs in line with data governance and data privacy policies
- Build a centralized data lake, data warehouse and visualizations that support multiple use cases across different products for engineering and enterprise
- Work with Product team to deliver features on time
- Build data subject matter expertise and own data quality
- Design and develop software and data solutions that help product, engineering and business teams make data-driven decisions
- Own existing processes running in production, including problem solving and optimization
- Partner with data science team to provide quality data for model development and productionizing machine learning models
- Partner with analytics team to build datasets that support visualizations
- Conduct design and code reviews to deliver production quality code
- 4+ years of experience in the data warehouse space
- 4+ years of experience in custom ETL/ELT design, patterns for efficient data integration, change data capture, implementation, and maintenance
- 4+ years of experience in query writing (SQL & NoSQL), schema design, normalized data models and dimensional models
- 2+ years of experience in Python, Spark, API, Git, CI/CD, and AWS Cloud
- 2+ years of experience in any MPP databases (AWS Redshift, Snowflake, etc.) and RDBMS (PostgreSQL, MySQL)
- Experience processing a variety of data sources: Structured, Unstructured, Semi-Structured, SQL, PubSub, API and Event based, in cloud-based infrastructure and data services
- Experience in Airflow, S3, DBT
- Excellent communication and collaboration skills
- Strong coding skills in Python
- A passion for building flexible data sets that enable current and future use cases
- Experience analyzing large volumes of data to provide data-driven insights and identify gaps
- Experience using development environments such as Docker and Kubernetes
Appen is a global leader in the development of high-quality, human-annotated datasets for machine learning and artificial intelligence. Appen brings over 20 years of experience capturing and enriching a wide variety of data types including speech, text, image and video. With deep expertise in more than 180 languages and access to a global crowd of over 1 million skilled contractors, Appen partners with technology, automotive and eCommerce companies — as well as governments worldwide — to help them develop, enhance and use products that rely on natural languages and machine learning.
Appen Limited is a publicly traded data company listed on the Australian Securities Exchange under the code APX. Appen provides or improves data used for the development of machine learning and artificial intelligence products.
Appen Limited is currently hiring for Senior Data Engineer roles in the United States, with an average base salary of $160,000 - $240,000 per year.