About
Hi! I’m Nurdaulet (you can call me Nur).
I’m a first-year PhD student at MBZUAI, where I’m fortunate to be advised by Prof. Kentaro Inui and Prof. Preslav Nakov. My research focuses on AI Safety and Mechanistic Interpretability. I’m particularly interested in understanding how large language models work internally and in developing techniques to make them more reliable, safe, and aligned with human values.
Previously, I collaborated with Prof. Nils Lukas and Prof. Hanan Aldarmaki on SPIRIT, a paper addressing jailbreak attacks in speech language models. I’ve also contributed to the development of Llama-3.1-Sherkala-8B-Chat and KazMMLU, two initiatives aimed at advancing Kazakh language technology.