Thursday, October 30, 2014

Proposal: Static Memory-Management Cost Reasoning in Garbage Collected Languages

While the convenient features of dynamic languages tend to have costs attached, a large number of real-world analyses and optimizations have shown that large programs in languages such as Python, JavaScript, and Ruby are frequently much slower than they need to be, because the syntactic cheapness of rapid allocation hides its cost. This is in stark contrast to explicit memory-management systems, in which the operations likely to slow a function down through memory management tend to require a significant amount of verbosity.

These problems can be caught by profiling, but that process tends to be time-consuming and prone to addressing only the problems on the hottest path.

Given information about how a language’s performance degrades as the heap fills, and the relative costs of read/write barriers and inter-generational pointers, the total cost of a series of allocations can be estimated. The call graph can be used to statically determine the relative ages of a function’s arguments and thereby infer inter-generational pointers. When a data structure is abnormally short-lived and expensive to allocate, the system could recommend that the programmer consider solving the problem with more primitive structures, or find a way to persist some of the information between calls. When garbage cycles are created and dropped in the same block in a reference-counted language, the programmer could be gently informed that the cycle has a cost.
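To make the idea concrete, here is a minimal sketch of the kind of static allocation-cost estimate described above, using Python’s `ast` module. The cost weights and the set of “allocating” constructs are illustrative assumptions, not measured numbers, and a real tool would need far more care about aliasing and object lifetimes.

```python
# Sketch only: hypothetical relative costs for a plain allocation vs. a write
# that is likely to create an inter-generational pointer.
import ast

ALLOC_COST = 1.0
INTERGEN_COST = 3.0  # assumed penalty for write-barrier / remembered-set work

ALLOCATING_NODES = (ast.List, ast.Dict, ast.Set, ast.ListComp,
                    ast.DictComp, ast.SetComp, ast.GeneratorExp)

def estimate_function_costs(source: str) -> dict:
    """Return a rough allocation-cost score per function in `source`."""
    tree = ast.parse(source)
    costs = {}
    for func in (n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)):
        score = 0.0
        for node in ast.walk(func):
            if isinstance(node, ALLOCATING_NODES):
                score += ALLOC_COST
            elif isinstance(node, ast.Call):
                # Constructor-like calls are assumed to allocate.
                score += ALLOC_COST
            elif isinstance(node, ast.Assign):
                # Attribute stores are treated as potential writes into an
                # older object, i.e. a possible inter-generational pointer.
                if any(isinstance(t, ast.Attribute) for t in node.targets):
                    score += INTERGEN_COST
        costs[func.name] = score
    return costs

if __name__ == "__main__":
    example = """
def hot_loop(items):
    out = []
    for i in items:
        out.append({"key": i, "pairs": [(i, i + 1)]})
    return out
"""
    print(estimate_function_costs(example))
```

Even a crude score like this can rank functions by how much pressure they put on the allocator, which is all the reasoning tool needs to decide where to draw the programmer’s attention.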

As this is a reasoning tool, static understanding of all memory accesses is not required for it to be useful or correct. Furthermore, I believe it is a good platform for experimentation with static analysis of memory-management activity. A compiler might one day use the heuristics developed here to suggest refactorings that improve GC performance.

I propose that this be implemented for the CPython interpreter as a reference implementation. Because CPython is reference counted, every pointer store already updates reference counts, which acts much like a write barrier, and a separate cycle collector is used to break reference cycles. Cycle-collecting pause times have been considered an almost deal-breaking headache when relying on asynchronous IO for lower latency. Since the cycle collector uses a generational scheme, the analysis might require a degree of ‘interpretation’ of the allocation/deallocation statements in a block to determine how likely that block is to cause or hasten the next expensive collection.
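For illustration (this snippet is not part of the proposed tool), here is the pattern such a tool would flag in CPython: a block that creates reference cycles, which reference counting alone cannot reclaim, so they accumulate until the generational cycle collector runs. The standard `gc` module exposes the counters and thresholds involved.

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

def make_cycles(n):
    for _ in range(n):
        a, b = Node(), Node()
        a.partner = b
        b.partner = a  # cycle: unreachable after this iteration, but only
                       # freed once the cycle collector visits it

gc.disable()              # isolate the effect for demonstration
before = gc.get_count()   # (gen0, gen1, gen2) collection counters
make_cycles(10_000)
after = gc.get_count()
print("thresholds:", gc.get_threshold())
print("counters before/after:", before, after)
print("unreachable objects found:", gc.collect())
gc.enable()
```

A static report pointing at the `b.partner = a` line, with an estimate of how much collector work the surrounding loop generates, is exactly the kind of feedback the proposal aims for.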

In order to make this information most useful, I would create bindings to this tool from emacs/vim. I am familiar with the scripting languages of both editors, and believe this to be the easier part.
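One plausible integration path, sketched here as an assumption rather than a design decision, is to have the analyzer emit plain `file:line:col: message` diagnostics, a format that existing compilation-mode and quickfix tooling in both editors already knows how to parse. The findings below are placeholders standing in for real analysis output.

```python
import sys

def emit_diagnostics(path, findings):
    """findings: iterable of (line, col, message) tuples from the analysis."""
    for line, col, message in findings:
        print(f"{path}:{line}:{col}: warning: {message}")

if __name__ == "__main__":
    emit_diagnostics(
        sys.argv[1] if len(sys.argv) > 1 else "example.py",
        [(12, 5, "reference cycle created and dropped in the same block"),
         (27, 9, "short-lived dict allocated on every iteration")])
```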

I believe that much is to be gained by adding automated reasoning about memory management to the workflow of developers.
