Linus Torvalds Gets Coding To Improve Linux’s User-Mode Stack Expansion

Linux creator Linus Torvalds doesn’t write as much actual kernel code as he used to. These days he’s usually busy overseeing the upstream kernel development community: reviewing code, managing releases, and chiming in on mailing list discussions. Once in a while, though, he gets down and dirty with some low-level kernel hacking, as he has now done for Linux 6.5 by improving the user-mode stack expansion code.

The rework, just shy of 500 lines of code, improves the user-mode stack expansion code. He explained in the merge message:

This modifies our user mode stack expansion code to always take the mmap_lock for writing before modifying the VM layout.

It’s actually something we always technically should have done, but because we didn’t strictly need it, we were being lazy (“opportunistic” sounds so much better, doesn’t it?) about things, and had this hack in place where we would extend the stack vma in-place without doing the proper locking.

And it worked fine. We just needed to change vm_start (or, in the case of grow-up stacks, vm_end) and together with some special ad-hoc locking using the anon_vma lock and the mm->page_table_lock, it all was fairly straightforward.

That is, it was all fine until Ruihan Li pointed out that now that the vma layout uses the maple tree code, we really don’t just change vm_start and vm_end any more, and the locking really is broken. Oops.

It’s not actually all that horrible to fix this once and for all, and do proper locking, but it’s a bit painful. We have basically three different cases of stack expansion, and they all work just a bit differently.

For Linux 6.5 this should now be addressed, thanks to Linus Torvalds taking on the coding challenge himself. More details are in the merge.