Updates
February 2026 - Released the paper Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets, accepted to EACL 2026 MME!
January 2026 - Very honored to be included in Forbes Ukraine's Top 20 AI Leaders in Ukraine list!
December 2025 - I have joined the ETH AI Center as a Research Engineer to work on post-training for the next versions of the Apertus open model as part of the Swiss AI Initiative!
November 2025 - Presented a lecture on adapting LLMs to EEU languages at the Lviv Polytechnic University Data Science Club.
October 2025 - Presented our work on multilingual models for EEU languages at the GIST workshop in Geneva.
September 2025 - Released MamayLM v1.0, a Ukrainian-focused multimodal LLM.
July 2025 - Joined Hugging Face as a Machine Learning Research Engineer Intern to work on data science coding agents.
May 2025 - Finished my master's thesis under the supervision of Prof. Vechev and released the resulting Ukrainian-focused MamayLM v0.1 model.
October 2024 - Our paper A Synthetic Dataset for Personal Attribute Inference was accepted to the NeurIPS 2024 Datasets and Benchmarks Track.
Work experience
ETH AI Center
Research Engineer
December 2025 - current
Post-training (SFT), multilinguality and alignment for the Apertus model series
Hugging Face 🤗
Machine Learning Research Engineer Intern
July 2025 - September 2025
◆ Improving data sourcing and training methods for data science coding agents
◆ Testing multi-node multilingual training
ETH Zurich
Research Assistant
May 2025 - July 2025
◆ Research Assistant at the Secure, Reliable, and Intelligent Systems Lab (SRI)
◆ Improving multilingual training, alignment and safety for EEU languages
ETH Zurich
Research Assistant
Nov 2023 - May 2024
◆ Project Intern at the Secure, Reliable, and Intelligent Systems Lab (SRI)
◆ Documenting capabilities and vulnerabilities of state-of-the-art large language models
◆ Contributing to the LVE (Language Model Vulnerabilities and Exposures) project
◆ SynthPAI: A Synthetic Dataset for Personal Attribute Inference: semester project under the supervision of Prof. Martin Vechev (co-supervised by Robin Staab and Mark Vero)
Fractal Analytics
Junior Data Scientist
Sep 2021 - Jul 2022
◆ Completed a 3-month internship with intensive training in statistics, machine learning techniques, data engineering and cloud (Azure)
◆ Executed an end-to-end Market Mix Modelling project for a particular segment of one of the world's biggest CPG companies, including methods research, EDA, developing a statistical model and fine-tuning it
Education
ETH Zurich
Statistics MSc
2022-2025
⭐ Main courses: Natural Language Processing, Large Language Models, Interactive Machine Learning: Visualization & Explainability, Probabilistic AI, Big Data for Engineers, AI4Good.
⭐ Master's thesis "Enhancing Mid-Resource Language Performance in Large Language Models": end-to-end pipeline recipe for efficient bilingual LLM training and alignment (under the supervision of Prof. Vechev).
⭐ Extracurricular activities:
◆ Statistics representative at Seminar für Statistik (SfS): organizing and leading events for students of the Statistics MSc program
◆ Statistics MSc mentor: mentoring incoming first-year students of the program
◆ Member of VMP (student organization of the ETH D-MATH department)
National Technical University of Ukraine
Economic Cybernetics MSc
2021-2022
⭐ Master's thesis: 'Modeling the investment portfolio of E-commerce companies':
◆ Twitter sentiment analysis of E-commerce stock tickers
◆ Stock prediction using Generative Adversarial Networks (GAN)
◆ Investment portfolio modeling (option hedge fund, stock portfolio prediction)
National Technical University of Ukraine
Economic Cybernetics BSc
2017-2021
⭐ Grade: 94/100
◆ Graduated with honors
◆ Bachelor's thesis: 'Modeling an equity investment fund using financial derivative management and hedging strategies'
Projects and publications
Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets
EACL 2026 MME
Efficient pipeline for automated translation of benchmarks and datasets, resolving translation issues for Eastern and Southern European languages.

MamayLM v1.0
INSAIT Institute
The first open multimodal Ukrainian LLM.

Jupyter Agent
Hugging Face 🤗
Multi-step pipeline to generate synthetic Jupyter notebooks with custom scaffolding to fine-tune efficient data science coding agents.

MamayLM v0.1
INSAIT Institute/ETH Zurich
An efficient bilingual LLM with cutting-edge performance in Ukrainian and English.

SynthPAI: A Synthetic Dataset for Personal Attribute Inference
NeurIPS D&B 2024
An LLM-generated collection of synthetic texts that enables privacy-preserving research on benchmarking personal attribute inference by large language models.

LVE Project
ETH Zurich
An open-source repository of Language Model Vulnerabilities and Exposures (LVEs).

Urban Planning Project
ETH Zurich (Interactive Machine Learning: Visualization & Explainability Course FS23)
A project for the ETH Zurich course 'Interactive Machine Learning: Visualization & Explainability', spring semester 2023.
