Mohamed Arbi Nsibi
Tag: #finetuning

2 posts

CUDA Agent paper notes

Notes on CUDA Agent and how it uses an agentic RL workflow with tools for CUDA kernel optimization.

ReFT: The Future of LLM Fine-Tuning

A new method called Reinforced Fine-Tuning (ReFT) uses reinforcement learning to help large language models solve complex math problems more effectively than traditional supervised fine-tuning.