Job Details
Senior LLM Developer for YandexGPT Reasoning Team
Senior LLM Developer on the YandexGPT Reasoning Team. The role focuses on improving the reasoning abilities of the Alice AI LLM family by combining reasoning, external function usage, and search capabilities. Tasks include developing YandexGPT's core reasoning, integrating signals into a unified model, and experimenting with RL algorithms.
Our team is improving the reasoning capabilities of the Alice AI LLM family. We aim to build a unified model that combines the strengths of reasoning, external function usage, and the search capabilities of our services.

What tasks await you:

• Development of YandexGPT's core reasoning capabilities
The immediate goal is to learn how to scale RLVR and RLHF training of reasoning models more effectively for B2C tasks that serve a wide range of users. You will run experiments to improve the model's general properties (usefulness, logic, structure) by raising the quality of reasoning chains and reinforcing reasoning patterns (backtracking, verification) on tasks across various topics.

• Integration of signals into a unified model
We aim to create new alignment stages and reward systems that let us combine signals from mathematical and scientific tasks with signals from creative ones, improving the quality of reasoning in each area compared to training on them independently. You will conduct research on new reward models, training stages, and distillation within a unified model.

• Experiments with RL algorithms
You will look for SOTA approaches that benefit the real product: experiment with RL algorithms and test methods for stabilizing the training of Dense and MoE architectures.
We expect you to:

• Have an excellent understanding of NLP and classical ML
• Be proficient in Python
• Understand how modern LLMs work, and have either solved applied problems with them or gained relevant research experience
We care about the children of Yandex employees and organize children's days at the office. This is only one of the benefits; the full list is here: https://yandex.ru/jobs/pages/benefits?utm_campaign=ya_nanimaet