Paper accepted at NeurIPS 2025 FoRLM workshop.

We stress-test the monitorability of chain-of-thought reasoning in language models, investigating whether reasoning models can obfuscate their own reasoning processes.

Read the paper on arXiv →