Tenerife Skunkworks

Trading and technology

Tracking IO Patterns in Memory-mapped Dynamic Libraries

The Mac OS X dynamic linker uses mmap to load dynamic libraries into memory. The memory range occupied by a library is then backed by a set of virtual memory pages, each 4096 bytes in size.

Using mmap is efficient because pages are lazily loaded from disk as needed and the virtual memory pager is free to evict them behind the scenes when memory is needed for something else.
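
To make the idea concrete, here is a minimal Python sketch of reading a single page of a memory-mapped file. It is only an illustration of mmap in general, not of what dyld does internally; the helper names touch_page and make_demo_file are made up for this example, and the page size is taken from the running system rather than assumed to be 4096.

```python
import mmap
import os
import tempfile

# Page size of the running system (4096 on the machines discussed here,
# but it may differ elsewhere).
PAGE_SIZE = mmap.PAGESIZE


def make_demo_file(num_pages):
    """Create a temporary file whose i-th page starts with byte value i % 256."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_pages):
            f.write(bytes([i % 256]) + b"\0" * (PAGE_SIZE - 1))
    return path


def touch_page(path, page_index):
    """Map a file read-only and return the first byte of one page.

    Mapping the file does not read it; the kernel brings a page in when
    it is first touched (and may choose to read neighbouring pages too).
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[page_index * PAGE_SIZE]
```

Calling touch_page(path, 5) on a file built by make_demo_file(8) only needs the sixth page to be resident, which is the behaviour the dynamic linker relies on.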

Each page holds the code of functions that live in the dynamic library, and a page that is not yet resident is fetched from disk when a function in it is called.

Pages should be accessed sequentially for best performance, but how do you find out if this is the case? The only way is to plug into the virtual memory manager and track when the pages backing a particular dynamic library are paged in from disk.
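
One way to put a number on "sequential" is sketched below in Python. It assumes we already have the page-in events as (offset, size) pairs in the order they happened, and it counts an event as sequential when it starts exactly where the previous one ended. Both the function name and this definition are my own assumptions for illustration, not something the kernel reports.

```python
def sequential_fraction(events):
    """Return the fraction of page-ins that directly follow the previous one.

    events: list of (offset, size) tuples in the order they were observed.
    A page-in counts as sequential if its offset equals the previous
    offset plus the previous size.
    """
    if len(events) < 2:
        return 1.0  # nothing to compare, treat as trivially sequential
    sequential = 0
    for (prev_offset, prev_size), (offset, _size) in zip(events, events[1:]):
        if offset == prev_offset + prev_size:
            sequential += 1
    return sequential / (len(events) - 1)
```

A value close to 1.0 would mean the library is being read front to back; a value close to 0.0 means the accesses jump around the file.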

This is doable but a tad complicated as it requires access to internal kernel structures. As always, DTrace comes to the rescue! Note that the following DTrace script will only work on Snow Leopard. Let’s take it apart and see how it works…

First we define an alias for an internal kernel type. This is not necessary but saves on typing.

typedef struct nameidata* nameidata_t;

A dynamic library is usually opened with a call to dlopen. The first argument (arg0) is the library path.

pid$target::dlopen:entry
/arg0/
{
 self->ppath = arg0;
 self->dylib = 1;
}

The Mac OS X dynamic linker (dyld) does not use dlopen on dynamic libraries, though, so we need a different approach for those.

pid$target::dyld??loadPhase5*:entry,
pid$target::dyld??loadPhase4*:entry
/!self->dylib && arg0/
{
 self->path = copyinstr(arg0);
 self->func = probefunc;
 self->dylib = 1;
}

We want to trigger subsequent probes only if we entered via one of the two entry points above. We need to reset the flag once we return and this is what we do below.

pid$target::dlopen:return
{
 self->dylib = 0;
}

pid$target::dyld??loadPhase5*:return,
pid$target::dyld??loadPhase4*:return
/self->func != 0 && self->func == probefunc/
{
 self->dylib = 0;
}

It often happens that memory we want to access in DTrace is not available at the function entry point, typically because it has not been faulted in yet. This sometimes happens with the dlopen entry probe. The open call is triggered by dlopen, so by then we can copy the file path into kernel space using the self->ppath pointer we saved earlier.

/* file name memory should be wired in by now */

pid$target::open:entry
/self->dylib && self->ppath && self->path == 0/
{
 self->path = copyinstr(self->ppath);
}

The Mac OS X virtual memory manager identifies a file by its virtual node (vnode) in the virtual filesystem (VFS) layer. We need to somehow match up the dynamic library name with its vnode, and this is where the vn_open_auth entry probe comes in.

Note that, unlike the earlier probes, which used the DTrace pid$target provider, here we use the Function Boundary Tracing (FBT) provider, since the virtual filesystem layer lives in the kernel.

fbt::vn_open_auth:entry
{
 self->ndp = (nameidata_t)arg0;
}

This clause is not absolutely necessary but I’m populating self->curpath as a shortcut, to be used a few times in later probes.

/* wait to make sure ndp and vnode are fully populated */

fbt::vn_open_auth:return
/self->path != 0/
{
 self->curpath = stringof((self->ndp)->ni_pathbuf);
}

It’s not obvious how best to match up a vnode with a file name, so I cheated and studied the XNU kernel source code, which Apple makes available. The vnode is not populated until vn_open_auth returns, so we have to wait until that happens to fetch the path.

We are almost done with our task. We just need to save the mappings from library name to library path and from library name to vnode, provided the path being opened matches the one we recorded.

/* make sure we are opening the same file */

fbt::vn_open_auth:return
/self->curpath != 0 && self->path == self->curpath/
{
 this->vp = (vnode_t)(self->ndp)->ni_vp;
 this->lib = stringof((this->vp)->v_name);
 self->lib[this->lib] = self->path;
 self->vnode[this->lib] = this->vp; 
}

We do need to clean up sometime and so we do.

fbt::vn_open_auth:return
{
 self->path = 0;
 self->ppath = 0;
 self->curpath = 0;
 self->ndp = 0;
 self->func = 0;
 self->dylib = 0;
}

We use the same Function Boundary Tracing (FBT) DTrace provider to track page-ins. We print the file offset of the data being paged in, as well as its size.

fbt::vnode_pagein:entry
{
 self->v_name = stringof(((vnode_t)arg0)->v_name);
}

/* vnode pointers should match but v_name seems more secure */

fbt::vnode_pagein:entry
/self->lib[self->v_name] != 0/
{
 printf("vnode_pagein: %s, offset: %u, size: %u\n", 
   self->v_name, arg3, arg4);
}

It does look from the output that we are loading multiple pages at the same time, e.g. size 1019904 corresponds to 249 pages. Page access looks quite random, though, which is killing us.
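
The page arithmetic is easy to check mechanically. The Python sketch below parses lines in the format produced by the printf in the script and converts each size into a page count, assuming the 4096-byte page size mentioned at the start. The function name parse_pageins is my own, and the library name used in the usage note is made up; only the size 1019904 comes from the actual output.

```python
import re

PAGE_SIZE = 4096  # page size stated earlier in the article

# Matches the printf format used by the vnode_pagein probe.
LINE_RE = re.compile(
    r"vnode_pagein: (?P<name>.+), offset: (?P<offset>\d+), size: (?P<size>\d+)"
)


def parse_pageins(text):
    """Return a list of (name, offset, size, pages) tuples from script output."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip DTrace headers and unrelated output
        offset = int(m.group("offset"))
        size = int(m.group("size"))
        pages = -(-size // PAGE_SIZE)  # ceiling division
        events.append((m.group("name"), offset, size, pages))
    return events
```

Feeding it a line such as "vnode_pagein: libfoo.dylib, offset: 8192, size: 1019904" yields a page count of 249, since 249 * 4096 = 1019904.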

Now that we know what the access pattern is, we should try to first identify the symbols that are being accessed in each set of pages and then make the linker rearrange the code such that page access is more sequential and less random.
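
The first step could look roughly like the Python sketch below. It assumes we have somehow obtained a list of (file offset, symbol name) pairs for the library, sorted by offset, and that each symbol extends up to the start of the next one. How to build that list is outside the scope of the sketch, and the function name symbols_in_range is hypothetical.

```python
from bisect import bisect_left, bisect_right


def symbols_in_range(symbols, offset, size):
    """Return names of symbols whose code overlaps [offset, offset + size).

    symbols: list of (start_offset, name) tuples sorted by start_offset.
    Each symbol is assumed to end where the next one starts.
    """
    starts = [start for start, _name in symbols]
    # The symbol containing `offset` is the last one starting at or before it.
    first = bisect_right(starts, offset) - 1
    if first < 0:
        first = 0  # the range begins before the first symbol
    # Symbols starting at or after the end of the range do not overlap it.
    last = bisect_left(starts, offset + size)
    return [name for _start, name in symbols[first:last]]
```

Running this over every page-in event would give the set of functions touched by each read, which is the input needed to decide on a better ordering.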

Note that you are not likely to see page-ins when running this DTrace script unless you have just restarted your Mac and are running Firefox for the first time. This is because the dynamic libraries will be stored in the Unified Buffer Cache (UBC) after first access and there won’t be any subsequent disk access until they are evicted from the cache.

I wrote about hacking the Unified Buffer Cache before, but the same technique does not work with mmap-ed data since the virtual memory manager and the cache are the same thing. Evicting the libraries would involve allocating at least twice as much virtual memory as there’s RAM and then touching each page to make sure it’s cached. This is unlikely to correspond to normal use of Firefox, though.
