Artificial intelligence, DevOps, Effective communication skills, Java Programming, JavaScript, Leadership Skills, Machine learning techniques, Python Programming, Rust, Teamwork
[AI Lab]
The Hyperconnect AI Lab innovates the user experience by discovering and solving problems in services that connect people, problems that are difficult to approach with existing technologies but can be solved with machine learning. To this end, we develop numerous models across domains including video, voice, natural language, and recommendation, and we solve the problems encountered while serving them reliably on mobile devices and cloud servers, so that the technology created by the AI Lab contributes to the growth of real services. With this goal, the Hyperconnect AI Lab has spent several years developing machine learning technologies that contribute to Hyperconnect's products, including Azar.
[Introducing the ML Software Engineering Team]
The ML Software Engineering team under the AI Lab aims to apply all of Hyperconnect's AI technologies to products to create business impact, and to develop sustainable systems and platforms that accelerate the application of those technologies. To achieve this goal, the team works on the following areas (Interview).
[Machine Learning-Based Backend Application Design and Implementation]
We develop a variety of machine-learning-based backend services that improve the quality of the services operated by Hyperconnect and Match Group (related tech blog). We focus mainly on personalized recommendation systems and search systems. These services are designed with performance in mind so that they can run in real time at global scale, and the microservices the team operates handle some of the highest traffic in the company.
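To make the real-time constraint concrete, here is a minimal sketch of the two-stage pattern common to such recommendation backends: a cheap retrieval pass narrows the corpus, and an expensive ranking model scores only the survivors. All names and the scoring function are illustrative assumptions, not Hyperconnect's actual system.

```python
# Illustrative two-stage recommender: retrieve cheaply, rank expensively.
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    region: str
    popularity: float


def retrieve(items, user_region, limit=100):
    """Cheap candidate generation: filter by region, keep the most popular."""
    candidates = [i for i in items if i.region == user_region]
    return sorted(candidates, key=lambda i: i.popularity, reverse=True)[:limit]


def rank(candidates, score_fn, k=10):
    """Expensive stage: apply a model score only to the small candidate set."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


items = [
    Item("a", "us", 0.9),
    Item("b", "us", 0.5),
    Item("c", "kr", 0.8),
]
# score_fn stands in for a learned ranking model.
top = rank(retrieve(items, "us"), score_fn=lambda i: i.popularity, k=2)
```

Keeping the ranking model off the retrieval path is what makes per-request latency manageable at global traffic levels.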
[Developing a real-time data pipeline for model inference]
We develop pipelines (Apache Flink, KSQL) that process real-time events and feed them into model inference. We design systems (e.g., streaming applications, feature stores) that collect, process, and serve features quickly and reliably. In the course of building these pipelines, we sometimes proactively discover features that improve model performance. For more detailed information, please refer to the following content.
- Operating an event-based live streaming recommendation system
- Data storage technology for machine learning applications
- Deview 2023 - Feature Store Implementation for Real-Time Recommendation Systems
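The kind of feature such a streaming job maintains can be sketched in a few lines. Below is a hypothetical per-user sliding-window event counter in plain Python, standing in for what a Flink windowed aggregation would compute before the result is written to a feature store; the class and its semantics are illustrative, not the team's actual pipeline.

```python
# Toy stand-in for a streaming windowed aggregation: count each user's
# events over a sliding time window, evicting expired events on insert.
from collections import defaultdict, deque


class SlidingWindowCounter:
    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = defaultdict(deque)  # user_id -> timestamps (ascending)

    def add(self, user_id, ts):
        q = self.events[user_id]
        q.append(ts)
        # Evict events that fell out of the window.
        while q and ts - q[0] > self.window:
            q.popleft()

    def count(self, user_id):
        return len(self.events[user_id])


c = SlidingWindowCounter(window_seconds=60)
for ts in (0, 10, 50, 100):
    c.add("u1", ts)
# After the event at t=100, only t=50 and t=100 remain inside the 60s window.
```

In a real deployment the window state lives in the stream processor and the resulting counts are served from a low-latency feature store at inference time.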
[Development of a model serving platform]
We provide a unified serving platform built on custom Kubernetes operators and NVIDIA Triton. This lets us quickly deploy ML models trained with various deep learning frameworks (TensorFlow, PyTorch) across many domains to production. We optimize inference latency and throughput through software and hardware improvements, and control costs through continuous monitoring and high-efficiency compute resources such as AWS Neuron. We currently operate more than 50 models in production and solve the complex technical challenges that arise at this scale.
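The value of a unified platform is that callers see one interface regardless of the framework behind each model. The toy registry below illustrates that idea only; it is loosely analogous to what Triton plus a custom operator provides, and every name in it is a made-up example.

```python
# Illustrative "unified serving" registry: models from any framework are
# registered as callables under (name, version) and served via one API.
class ModelRegistry:
    def __init__(self):
        self._models = {}

    def register(self, name, version, predict_fn):
        """predict_fn wraps a framework-specific model behind a plain callable."""
        self._models[(name, version)] = predict_fn

    def infer(self, name, version, inputs):
        try:
            predict = self._models[(name, version)]
        except KeyError:
            raise LookupError(f"model {name}:{version} is not deployed")
        return predict(inputs)


registry = ModelRegistry()
# A stand-in "model"; in practice this would wrap a TF or PyTorch session.
registry.register("uppercase-demo", "1", lambda texts: [t.upper() for t in texts])
out = registry.infer("uppercase-demo", "1", ["hi", "bye"])
```

A production platform adds what this sketch omits: versioned rollouts, batching, GPU scheduling, and health checks, which is where the operational complexity at 50+ models comes from.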
[ML Ops Infrastructure Construction and Tool Development]
We build on-premise GPU clusters (NVIDIA DGX systems) and high-speed distributed storage, and manage and operate them with Ansible to reduce the human and material cost of research (Reference: Building an Ultra-High-Performance Deep Learning Cluster Part 1). We also develop developer portals, CLI tools, and other interfaces for controlling and using the MLOps components and serving platforms provided by the ML Platform team. In addition, we run PoCs of rapidly evolving new MLOps technologies and apply them to production when appropriate.
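Managing GPU nodes with Ansible typically means declaring the desired node state in a play like the hypothetical sketch below. The host group, package names, and service are illustrative assumptions, not the team's actual configuration.

```yaml
# Hypothetical Ansible play keeping GPU nodes in a consistent state.
- hosts: gpu_nodes          # assumed inventory group
  become: true
  tasks:
    - name: Install GPU driver and toolkit packages (names are examples)
      ansible.builtin.apt:
        name:
          - nvidia-driver-535
          - nvidia-cuda-toolkit
        state: present

    - name: Ensure the metrics exporter is running
      ansible.builtin.service:
        name: node_exporter
        state: started
        enabled: true
```

Declaring state this way is what makes a cluster of identical DGX nodes cheap to operate: re-running the play converges any drifted node back to spec.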
[Building a Continuous Learning Pipeline]
We build an automated virtuous cycle (AI Flywheel) that uses data obtained from our products to retrain, evaluate, and deploy models, which in turn improve the products again. We provide MLOps components for each stage of the ML pipeline (data processing, model training, model deployment) so that researchers can easily build ML pipelines by combining them. On top of our MLOps workflow tooling, we are also developing data pipelines that collect data from various domains, a data platform that seamlessly connects cloud storage to training environments, and the cloud infrastructure and tools needed to configure automated training workflows, while exploring new areas so the platform can serve both experiments and production pipelines.
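The flywheel's stages compose naturally as functions, with a deployment gate in between. The sketch below is a hedged illustration of that control flow only; the stage bodies and the toy metric are stand-ins, not the real pipeline.

```python
# Illustrative AI-Flywheel loop: collect -> train -> evaluate -> deploy,
# where deployment happens only if the new model beats the baseline.
def run_flywheel(collect, train, evaluate, deploy, baseline):
    data = collect()
    model = train(data)
    score = evaluate(model, data)
    if score > baseline:
        deploy(model)
        return score  # becomes the baseline for the next cycle
    return baseline


deployed = []
new_baseline = run_flywheel(
    collect=lambda: [1, 2, 3],                          # stand-in for product data
    train=lambda data: {"mean": sum(data) / len(data)}, # stand-in "model"
    evaluate=lambda model, data: model["mean"],         # toy metric
    deploy=deployed.append,
    baseline=1.5,
)
```

The evaluation gate is the important part: it is what turns "retrain on fresh data" into a virtuous cycle rather than a way to silently ship regressions.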
[Development of an inference engine that operates on mobile devices]
We research and develop an inference-engine SDK that runs Hyperconnect's on-device models using TFLite, PyTorch Mobile, and similar runtimes. Together with the AI organization, we handle mobile-model conversion, quantization, SIMD optimization, and development environment setup.
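Quantization, one of the conversion steps mentioned above, maps floating-point tensors to small integers via a scale and zero point. The plain-Python sketch below shows the standard affine uint8 arithmetic on toy values; it illustrates the math used by TFLite-style converters, not the SDK itself.

```python
# Affine uint8 quantization arithmetic on toy values:
#   q = clamp(round(x / scale) + zero_point), x ~ (q - zero_point) * scale
def quantize(xs, num_bits=8):
    lo, hi = min(xs), max(xs)
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax
    zero_point = round(-lo / scale)  # integer that represents real 0.0
    q = [max(0, min(qmax, round(x / scale) + zero_point)) for x in xs]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    return [(v - zero_point) * scale for v in q]


vals = [-1.0, 0.0, 1.0]
q, scale, zp = quantize(vals)
approx = dequantize(q, scale, zp)  # close to vals, within one quantization step
```

Shrinking weights from 32-bit floats to 8-bit integers is what makes models small and fast enough for mobile inference, at the cost of this bounded rounding error.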
Responsibilities
Required Qualifications
Match Group is an American internet and technology company headquartered in Dallas, Texas. It owns and operates the largest global portfolio of popular online dating services, including Tinder, Match.com, Meetic, OkCupid, Hinge, PlentyOfFish, and OurTime, among a total of over 45 global dating companies. The company was owned by IAC until July 2020, when Match Group was spun off as a separate, public company.
- West Hollywood, CA, USA (2-4 years)
- Los Angeles, CA, USA (2-4 years)
- San Francisco, CA, USA (2-4 years)
- Palo Alto, CA, USA (2-4 years)
- New York, NY, USA (8-10 years)