2007-12-25

comp.lang.lisp needs to stop complaining about overcommit

or: How I got sucked back into SBCL hacking on Christmas.
: david@radon:~; ps -o pid,vsz,rss,comm -p `pidof sbcl`
  PID    VSZ   RSS COMMAND
 1019 570424  4424 sbcl
: david@radon:~; ps -o pid,vsz,rss,comm -p `pidof sbcl`
  PID    VSZ   RSS COMMAND
 1019 963320 404204 sbcl
: david@radon:~; ps -o pid,vsz,rss,comm -p `pidof sbcl`
  PID    VSZ   RSS COMMAND
 1019 3988328 825804 sbcl
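
Read as a time series, the three samples show the virtual size (VSZ) growing along with the resident size (RSS). A minimal Python sketch that redoes the arithmetic, assuming ps reports both columns in KiB as procps does on Linux; the `samples` list and the `kib_to_mib` helper are names made up for this illustration:

```python
# (VSZ, RSS) pairs copied from the three ps samples above, in KiB.
samples = [(570424, 4424), (963320, 404204), (3988328, 825804)]


def kib_to_mib(kib):
    """Convert a KiB count, as printed by ps, to MiB."""
    return kib / 1024


for vsz, rss in samples:
    # Share of the mapped address space that is actually resident.
    print(f"VSZ {kib_to_mib(vsz):7.1f} MiB  RSS {kib_to_mib(rss):6.1f} MiB  "
          f"resident {rss / vsz:6.1%}")
```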

4 comments:

Anonymous said...

Context?

Unknown said...

Well, I definitely used the word overcommit on comp.lang.lisp in a recent thread.

But I was basically complaining about Linux's vm.overcommit_memory (a sysctl) "feature", not SBCL as such. No amount of SBCL hacking can fix the Linux overcommit "feature"; at most it can try, in self-defense*, to mitigate the effects of the braindamaged default of heuristic overcommit (=0).

David L. is apparently working on an SBCL branch that does dynamic/incremental allocation, which is a nice feature in itself and, depending on how it works, could at least lessen the impact of Linux's overcommit behaviour on SBCL.

(* Assuming you don't turn overcommit off, which is after all a trivial sysctl tweak. There are other factors to worry about if you do turn overcommit off (i.e. =2), though. The situation might have changed since 2005, but if not, you might need a lot of swap with overcommit off, since MAP_NORESERVE won't be handled as well as it could be. You could just run with lots of swap, though; disk is cheap. If Linux had (has? I could be out of date) a MAP_NORESERVE that worked as suggested in that thread, at least when overcommit is off (i.e. SIGSEGV-perhaps/OOM-Kill-never rather than SIGSEGV-never/OOM-Kill-never or SIGSEGV-perhaps/OOM-Kill-perhaps), that could be nice for SBCL, Java and others on Linux.)
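
For readers who want to check which mode a machine is in, here is a minimal Python sketch. It assumes a Linux system that exposes the sysctl at /proc/sys/vm/overcommit_memory; the function names and the wording of the descriptions are invented for this example.

```python
from pathlib import Path

# Meanings of the vm.overcommit_memory values on Linux.
MODES = {
    0: "heuristic overcommit (the default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def describe_overcommit(raw):
    """Turn the text read from the sysctl file into a description."""
    return MODES.get(int(raw.strip()), "unknown mode")


def current_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Describe this machine's mode, or return None if it cannot be read."""
    try:
        return describe_overcommit(Path(path).read_text())
    except (OSError, ValueError):
        # Not Linux, no /proc, or unexpected file contents.
        return None


print(current_overcommit())
```

Changing the mode is the "trivial sysctl tweak" mentioned above, e.g. `sysctl -w vm.overcommit_memory=2` run as root.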

David Lichteblau said...

Hi David...

Sorry if I made it sound like I was referring to your article, which certainly was one of the more interesting articles in that thread.

I was more referring to the thread as a whole. Javier's original post seemed rather arrogant, not just because of the subject line, but also because the size of his data set seems chosen specifically so that an implementation with a contiguous heap cannot handle it on a 32 bit architecture, while a different implementation happens to just cope. Perhaps it was an honest question after all, but it sounded more like a pointless provocation.

That is probably also why nobody who usually works on the code related to memory management in SBCL bothered to even reply to him and point out the real issue underlying all of this:

He's using a data set nearly too large for a 32 bit architecture, and wouldn't have any problems at all if he switched to a 64 bit kernel. CMUCL/SBCL's use of a contiguous heap worked fine back when 4 GB seemed huge. Today, it works fine /again/, now that we have 64 bit address spaces with plenty of contiguous areas to use. So the problem Javier complained about is merely a symptom of the end of an era, with 32 bit architectures beginning to cause trouble.

That said, I have indeed worked on a patch for incremental allocation, which avoids overcommit and allows a non-contiguous dynamic space.

As for overcommit itself, my patch makes SBCL run smoothly with overcommit=2, which I think is a feature in itself.

Not sure how many other applications really depend on MAP_NORESERVE. I was disappointed to find out that Linux doesn't interpret it in a useful manner. But with my patch, we don't need it anymore.
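
The patch itself is not shown here. Purely as an illustration of what "incremental allocation with a non-contiguous space" can mean, here is a toy Python sketch. It is not SBCL's code, and `IncrementalHeap` and its methods are invented for this example. It asks the kernel for one anonymous mapping at a time instead of reserving the whole heap up front, so the chunks need not be adjacent in the address space.

```python
import mmap


class IncrementalHeap:
    """Toy bump allocator that maps memory chunk by chunk, on demand."""

    def __init__(self, chunk_size=1 << 20):
        # Round the chunk size up to a whole number of pages.
        page = mmap.PAGESIZE
        self.chunk_size = -(-chunk_size // page) * page
        self.chunks = []        # independent anonymous mappings
        self.used_in_last = 0   # bytes handed out from the newest chunk

    def allocate(self, nbytes):
        """Return (chunk_index, offset) of a fresh region of nbytes."""
        if nbytes <= 0 or nbytes > self.chunk_size:
            raise ValueError("request must fit in one chunk")
        if not self.chunks or self.used_in_last + nbytes > self.chunk_size:
            # Only now ask the kernel for another chunk of memory.
            self.chunks.append(mmap.mmap(-1, self.chunk_size))
            self.used_in_last = 0
        offset = self.used_in_last
        self.used_in_last += nbytes
        return len(self.chunks) - 1, offset

    def mapped_bytes(self):
        """Total size of everything mapped so far."""
        return len(self.chunks) * self.chunk_size

    def close(self):
        """Unmap all chunks."""
        for chunk in self.chunks:
            chunk.close()
        self.chunks.clear()
```

Because nothing is mapped until it is needed, a scheme like this never asks for more than it is about to use, which is the property that matters under strict accounting.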

Anonymous said...

[I'm the "David" who just commented (sorry about the useless nickname); I need to work out how to get Blogger/Google to use a more distinctive one.]

My confusion arising from your title was probably just terminological:

It's the kernel that overcommits the available memory resources, on the assumption that userspace has requested more than it will actually use, like a budget airline overcommitting seats because it knows that someone will probably cancel. A person (SBCL) booking a seat isn't doing the overcommitting; they'll just be infuriated if they get to check-in late, all the passengers did in fact show up, and the seat (memory page) they booked isn't available because the airline (Linux) assumed they had "overrequested" it.

That said, I don't actually know if you can use "overcommit" to mean the userspace's overrequest too. I've tried to avoid that usage myself to date, anyway.