Improve memory instrumentation #1790
This increases the accuracy of crash dumps and the upcoming allocation tagging feature.
This may be of interest in crash dumps and allows the upcoming allocation tagging feature to track allocations on a per-NIF basis. Note that this is only updated when user code calls a NIF; it's not altered when the emulator calls NIFs during code upgrades or tracing.
This commit replaces the old memory instrumentation with a new implementation that scans carriers instead of wrapping erts_alloc/erts_free. The old implementation could not extract information without halting the emulator, had considerable runtime overhead, and the memory maps it produced were noisy and lacked critical information. Since the new implementation walks through existing data structures there's no longer a need to start the emulator with special flags to get information about carrier utilization/fragmentation. Memory fragmentation is also easier to diagnose as it's presented on a per-carrier basis which eliminates the need to account for "holes" between mmap segments. To help track allocations, each allocation can now be tagged with what it is and who allocated it at the cost of one extra word per allocation. This is controlled on a per-allocator basis with the +M<S>atags option, and is enabled by default for binary_alloc and driver_alloc (which is also used by NIFs).
I just ended up reaching for this and it turns out it's gone :( What this new rewrite lacks is the most important feature: the ability to search which terms are in memory and how many references they have. This helps quickly identify what garbage is lying about; for example, this module helped me identify the binary heap leak inside jiffy. Now it just tells you information but nothing detailed. (You got a pointer to where the term was in memory, then you could just read it from the process's memory.) This new version completely removes this feature. Any plans to perhaps add it back?
No, you've always needed a debugger to do what you've described and the rewrite merely changes how you use it. The gdb commands below will get you more or less the same information as before, and it's easily extended to include a lot more than the old instrument module did.
# printf at the start of the block scan loop in gather_ahist_scan. Your
# line numbers may differ.
## *After* `block = SBC2BLK(allocator, carrier);`
dprintf erl_alloc_util.c:7638,"SB: %p, %p\n", (((char*)(block)) + sizeof(Block_t)), block->bhdr
## At `UWord block_size = MBC_BLK_SZ(block);`
dprintf erl_alloc_util.c:7653,"MB: %p, %p\n", (((char*)(block)) + sizeof(Block_t)), block->bhdr
# Log the above to a file
set logging overwrite on
set logging file allocations.txt
set logging redirect on
set logging on
set pagination off
# Break out of the debugger, and then run instrument:allocations().
# Don't forget to turn off logging and/or redirect when you're done.
Thanks, I will check that out, but it's not correct. You did not need a debugger because you can just read back from /proc/<pid>/mem and use os:getpid/0 to get the runtime pid. And the memory_data provided the ref counts as well as addresses. So I do not see the need to use a debugger to extract this information unless I am misunderstanding something.
I'd say that's a debugger by a different name, you're reading the raw memory of another process without its consent.
It didn't as far as I can tell, maybe I'm missing something: otp/erts/emulator/beam/erl_instrument.c Lines 637 to 831 in 6a99831
In either case it's rather trivial to modify the above example to provide the reference count for binaries etc. The allocation tag tells you what type the object is and where it came from, so you can just set a breakpoint after the tag is read and print the reference count if it's a binary.
Still, doing it inside managed code makes things much easier, and you also get the actual pid that did the allocation. Using GDB not only raises the skill level needed to debug memory issues, it also creates the requirement to write a bunch of custom tooling around GDB if you want to get more information out of the running system (vs. just dumping clear-text logs). Before, this was all accessible from inside the emulator, without leaving it.
I have not used it in a while; maybe it gave you the address of the allocated block the term was in (for shared binary heap terms at least)? Yeah, but IMO it was a very useful tool as it was for figuring out the hard stuff. Things like Recon (and this new instrument module) do not really cut it, as they tell you there is a problem but you cannot really pinpoint what exact code is causing it. So you need to cut to GDB, but maybe that makes sense: pinpointing issues like this is a task for GDB, otherwise it gets hard to maintain. It's out of scope for managed code to be able to do stuff like this. GDB also seems to be the go-to tool for other issues like scheduler locking around ets tables.
Wrangling I don't get it. It's like using a fingernail to drive in a screw, why not use the right tool for the job?
In return you had no idea where it originated and had to resort to reading block contents with a debugger to figure that out, which is pretty much only feasible with binaries that contain recognizable data. The pids weren't particularly useful either since they were only present when a process was directly responsible for the allocation (ports, system tasks, async jobs, etc were hopeless), and they could be long dead by the time you inspected them.
It doesn't take long to pick up the basics and there are countless tutorials on the net to help you. I don't think it's too much to ask.
No, it just prints the block address, type, and size. It doesn't say anything about the contents nor do they have anything to do with Erlang terms. For example, the It's just a pile of bytes and you can't even say what the reference count is without knowing the struct layout. With
This basically sums it up. For what it's worth, I'm sorry this PR derailed your workflow.
Would it be possible to extend this module so it can show all the terms in a process's eheap? For example, I know I have a memory issue: eheap grows insanely fast if some unhappy path is taken. Up to 10MB is allocated for eheap, then it stops growing (maybe this is some kind of internal limit after which GC is forced?). If the unhappy path, which has a 1% chance or so, is not taken, everything is fine. Great, so how do I now find the unhappy path that is triggering the allocations? One way would be if I could dump all the terms in the eheap or at least get a pointer to it. That way I could easily and quickly see. Of course the best way would be to tie every term in the eheap to the line number in the source code that allocated it. But this might be complex due to all the optimizations with memory copying?
No, things like that are out of scope and you can already do this with
This PR replaces the old memory instrumentation with a new implementation that scans carriers instead of wrapping `erts_alloc`/`erts_free`. The old implementation could not extract information without halting the emulator, had considerable runtime overhead, and the memory maps it produced were noisy and lacked critical information.
Since the new implementation walks through existing data structures there's no longer a need to start the emulator with special flags to get information about carrier utilization/fragmentation. Memory fragmentation is also easier to diagnose as it's presented on a per-carrier basis which eliminates the need to account for "holes" between `mmap` segments.
To help track allocations, each allocation can now be tagged with what it is and who allocated it at the cost of one extra word per allocation. This is controlled on a per-allocator basis with the `+M<S>atags` option, and is enabled by default for `binary_alloc` and `driver_alloc` (which is also used by NIFs).
The old interface has been removed since it can't be adapted to the new implementation and the module has always been experimental. It could coexist with the new one if we kept the old implementation around, but it will be removed unless someone makes a strong case for keeping it.
There's still some work left on the documentation, but with rc1 coming soon I figured it's best to get feedback as soon as possible.
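To make the "one extra word per allocation" cost concrete, here is a minimal C sketch of a tagged allocator. It is not the ERTS implementation: the bit layout, the `tagged_alloc`/`tagged_free` helpers, and the use of plain `malloc` are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tag layout (NOT the ERTS one): the low 16 bits hold an
 * allocation type id, the remaining bits identify who allocated it. */
#define TAG_TYPE_BITS 16
#define TAG_TYPE_MASK (((uintptr_t)1 << TAG_TYPE_BITS) - 1)

static uintptr_t make_tag(uintptr_t origin, uintptr_t type)
{
    return (origin << TAG_TYPE_BITS) | (type & TAG_TYPE_MASK);
}

/* Allocates `size` bytes preceded by one word that stores the tag.
 * Note: the payload is only word-aligned in this sketch. */
static void *tagged_alloc(size_t size, uintptr_t origin, uintptr_t type)
{
    uintptr_t *block = malloc(sizeof(uintptr_t) + size);
    if (block == NULL)
        return NULL;
    block[0] = make_tag(origin, type);
    return block + 1; /* caller sees the memory right after the tag */
}

/* Reads back the tag word stored just before the payload. */
static uintptr_t tag_of(const void *ptr)
{
    return ((const uintptr_t *)ptr)[-1];
}

static uintptr_t tag_type(uintptr_t tag)   { return tag & TAG_TYPE_MASK; }
static uintptr_t tag_origin(uintptr_t tag) { return tag >> TAG_TYPE_BITS; }

/* Frees a block obtained from tagged_alloc. */
static void tagged_free(void *ptr)
{
    if (ptr != NULL)
        free((uintptr_t *)ptr - 1);
}
```

A scanner that walks blocks can then call `tag_of` on each payload address to group allocations by type and origin, which is the kind of summary the tagging feature is described as enabling.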
Examples:
`instrument:carriers/0-1` return a list with information about each carrier's total size, combined allocation size, allocation count, whether it's in the migration pool, and a histogram of free block sizes (log2, the first interval ends at 512 by default).
In the example below, the `ll_alloc` carrier has no free blocks at all, the `binary_alloc` and `eheap_alloc` ones look healthy with a few very large blocks, and the `fix_alloc` carrier is somewhat fragmented with 22 free blocks smaller than 512 bytes (although this is not a problem for that allocator type).
`instrument:allocations/0-1` return an allocation summary with histograms of allocated block sizes grouped by their origin and allocation type. The example below is taken with `+Muatags true` to track allocations on all allocator types.
`instrument:allocations/1` and `instrument:carriers/1` let you tweak the histograms and which allocators to look in.
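For readers unfamiliar with log2 histograms, the following C sketch shows one way block sizes could be mapped to intervals. It assumes the first interval covers sizes below 512 (matching the "smaller than 512 bytes" wording above), that each later interval is twice as wide as the previous one, and that the last interval is open-ended. `HIST_START`, `HIST_WIDTH`, and `hist_slot` are hypothetical names, and the interval count of 14 is an assumption, not something stated in this PR.

```c
#include <stddef.h>

#define HIST_START 512 /* assumed upper bound of the first interval */
#define HIST_WIDTH 14  /* assumed number of intervals */

/* Returns the histogram slot for a block of the given size.
 * Slot 0 holds sizes below HIST_START, slot 1 holds sizes in
 * [HIST_START, 2 * HIST_START), and so on; anything too large
 * for the earlier slots lands in the last one. */
static size_t hist_slot(size_t block_size)
{
    size_t slot = 0;
    size_t upper = HIST_START;

    while (block_size >= upper && slot < HIST_WIDTH - 1) {
        upper <<= 1; /* each interval doubles the previous bound */
        slot++;
    }
    return slot;
}
```

Under these assumptions a carrier with many entries in slot 0 has lots of small free blocks, which is the fragmentation pattern described for the `fix_alloc` carrier above.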