Files that are read and written get cached in the Unified Buffer Cache (UBC) on Mac OS X.
The UBC was hindering me because I was processing a huge file in chunks, writing out each processed chunk and then throwing it away, never to be reused. I would see gigabytes of memory get eaten up by the UBC until the system started swapping and became unresponsive.
UBC cannot be limited to a given maximum amount of memory.
UBC cannot be inspected programmatically.
UBC can be cleared by running ‘purge’, which allocates a lot of memory to force the cache to clear.
The following bit of code can be used to turn caching off for a particular file:
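A minimal sketch in C of what such a call might look like, assuming the mechanism is `fcntl` with the `F_GLOBAL_NOCACHE` command from `<sys/fcntl.h>` on Mac OS X. That choice is an assumption based on the behaviour described here (the setting sticks to the file rather than to one descriptor); the documented per-descriptor variant is `F_NOCACHE`. The helper name `disable_file_caching` is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to stop caching data for `path`.
 * Returns 0 on success, -1 if the file cannot be opened, the call fails,
 * or F_GLOBAL_NOCACHE is not available on this platform. */
int disable_file_caching(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

#ifdef F_GLOBAL_NOCACHE
    /* Assumed to be the relevant command: a non-zero argument turns
     * data caching off for the file itself, not just this descriptor. */
    int rc = fcntl(fd, F_GLOBAL_NOCACHE, 1);
#else
    /* Not a Mac OS X build: nothing to do, report failure. */
    int rc = -1;
#endif

    /* The descriptor is only needed to issue the call. */
    close(fd);

    /* fcntl reports failure as -1; any other value is treated as success. */
    return rc == -1 ? -1 : 0;
}
```

Calling `disable_file_caching("huge.dat")` once before the chunked processing starts would be enough under these assumptions; `huge.dat` is a placeholder name.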
This can be done in any process and the file can be closed afterwards. The setting persists throughout the lifetime of the file. If the file is removed and re-created, the setting is lost.
How can you tell if the setting took hold and the file is indeed NOT being cached?
I tried reading the file three times, and the speed was about the same each time.
What about a regular file that’s cached by default?
Notice that reading from the cache is much faster the second time around.
Kudos to Dominic Giampaolo from Apple for explaining all this to me!