Mistral's first model trained for reasoning, built for chain-of-thought and task planning. Trained entirely with reinforcement learning (RL), it improves instruction following and multimodal reasoning.