Speculative data load mitigation through register tagging and data provenance

Modern microarchitectural attacks often depend upon data value speculation using values passed between contexts through general purpose registers (GPRs). A value is passed from one privilege level (e.g. userspace, EL0) into another (e.g. a kernel, EL1), or from one sub-context to another within the same privilege level (e.g. within a JIT runtime). In either case, the registers can be automatically tagged to identify the provenance of the value, and that tag can be tracked through the microarchitectural implementation, in particular the speculative backend.

Modern machines provide a certain number of general purpose registers (GPRs). These are used for interim calculations, often as part of a “load-store architecture” in which values are loaded from memory, data is transformed, and results are stored back to memory. Sophisticated contemporary computer architectures typically provide 31–32 GPRs and place few constraints upon their use, while less sophisticated legacy architectures usually provide fewer registers and impose more constraints, but their function is similar.

These GPRs are not traditionally banked by exception level. That is to say that you get one flat file of registers (at the architecture level) that is shared by the different exception levels. An implementation may have a greater number of registers in its physical register file and use out-of-order execution techniques such as register renaming to map these onto the architectural register file, but there still remains no distinction between the differing exception levels.

As an example of contemporary use, suppose I have an ABI between userspace and kernel that implements a system call interface. The values supplied by userspace are written into registers and then a system call instruction is executed to transition into kernel context. The registers are then read by the kernel. No distinction is made between any of the registers (other than by software) as to who set them, or whether those values are trusted.

The problem comes when speculative backends are introduced. An implementation might aggressively speculate data loads based upon the values of GPRs, for example by issuing a load whose address depends on a register value before the bounds check on that value has resolved. Since the backend has no way to know whether this speculation is safe, software must add barrier instructions telling it not to perform subsequent loads speculatively ahead of earlier bounds checks.
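
To make the problem concrete, here is a toy model in Python of a bounds-checked table lookup. It is a sketch under stated assumptions and not a simulation of any real core: the names ToyBackend, gadget, leaked_values and make_backend are hypothetical, the 64-byte line size is assumed, and the model simply treats every unbarriered bounds check as mis-speculated. The set of touched cache lines stands in for the state an attacker could later measure.

```python
# Toy model only, not a simulation of any real core.
# ToyBackend, gadget, leaked_values and make_backend are hypothetical names.
LINE = 64  # assumed cache line size in bytes


class ToyBackend:
    def __init__(self, memory, array_base, array_len, probe_base):
        self.memory = memory
        self.array_base = array_base
        self.array_len = array_len
        self.probe_base = probe_base
        self.cached_lines = set()  # stands in for cache state an attacker could time

    def _load(self, addr):
        # Every load, speculative or not, leaves a footprint in the cache
        self.cached_lines.add(addr // LINE)
        return self.memory[addr]

    def gadget(self, index, barrier):
        """Models: if index < array_len: probe[array[index] * LINE]"""
        in_bounds = index < self.array_len
        value = None
        # Without a barrier, the loads are issued before the check resolves.
        # With a barrier, they are issued only once the check has passed.
        if in_bounds or not barrier:
            value = self._load(self.array_base + index)
            self._load(self.probe_base + value * LINE)
        # Architecturally, an out-of-bounds request returns nothing either way
        return value if in_bounds else None

    def leaked_values(self):
        # Which probe lines were touched? Each one encodes a loaded value.
        first = self.probe_base // LINE
        return sorted(line - first for line in self.cached_lines if line >= first)


def make_backend():
    memory = [0] * (4096 + 256 * LINE)
    memory[100] = 7  # a "secret" byte beyond the 16-entry array at address 0
    return ToyBackend(memory, array_base=0, array_len=16, probe_base=4096)
```

Calling make_backend().gadget(100, barrier=False) returns nothing architecturally, yet leaked_values() on that backend reports the secret. With barrier=True the probe region stays untouched, at the cost of never speculating past the check.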

Enter register tagging. Tagging the provenance of the registers allows for automatic restriction of speculation based upon those values.
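
As a rough illustration of what such tagging might look like, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than a description of a real design: the names Reg, TaggedRegFile and may_speculate_load are hypothetical, the tag is taken to be the exception level of the code that wrote the value, and derived values are assumed to inherit the least privileged tag of their sources. How and when a tag is cleared (for instance once a bounds check has resolved) is deliberately left out.

```python
from dataclasses import dataclass

EL0, EL1 = 0, 1  # exception levels, following the naming used in the text


@dataclass
class Reg:
    value: int = 0
    origin: int = EL1  # provenance tag (assumed reset state: privileged)


class TaggedRegFile:
    """Hypothetical register file carrying one provenance tag per GPR."""

    def __init__(self, num_regs=31):
        self.regs = [Reg() for _ in range(num_regs)]
        self.current_el = EL1

    def write_imm(self, rd, imm):
        # Assumption: a freshly written value is tagged with the level writing it
        self.regs[rd] = Reg(imm, self.current_el)

    def add(self, rd, rn, rm):
        a, b = self.regs[rn], self.regs[rm]
        # Assumption: the result is only as trusted as its least trusted source
        self.regs[rd] = Reg(a.value + b.value, min(a.origin, b.origin))

    def may_speculate_load(self, addr_reg):
        # Restrict speculation when the address came from a less privileged level
        return self.regs[addr_reg].origin >= self.current_el


rf = TaggedRegFile()
rf.current_el = EL0
rf.write_imm(0, 100)   # userspace places a system call argument in register 0
rf.current_el = EL1    # system call instruction: transition into the kernel
rf.write_imm(1, 4096)  # kernel sets up a table base of its own
rf.add(2, 1, 0)        # address = kernel base + user-supplied index

print(rf.may_speculate_load(1))  # address built only from kernel values
print(rf.may_speculate_load(2))  # address influenced by userspace
```

In this sketch no barrier instruction appears in the kernel code. The backend decides per load, from the tag on the address register, whether speculation is allowed.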

This concept can be extended trivially to the same privilege level by adding instructions or control registers to an architecture that allow a subset of registers to be tagged as being used by less privileged code (even within the same exception level), such as within a Java JIT or other runtime.
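
One way to picture the same-privilege-level extension is a control register holding one bit per GPR, which a runtime sets before entering generated code. The Python sketch below assumes exactly that. SandboxTagControl, tag, untag and may_speculate_load are hypothetical names and do not correspond to any existing architecture's instructions or registers.

```python
NUM_GPRS = 31  # assumed number of general purpose registers


class SandboxTagControl:
    """Hypothetical control register: bit r set means GPR r holds sandbox data."""

    def __init__(self):
        self.untrusted_mask = 0

    @staticmethod
    def _bit(reg):
        if not 0 <= reg < NUM_GPRS:
            raise ValueError(f"no such register: {reg}")
        return 1 << reg

    def tag(self, regs):
        # Models an instruction that marks registers as used by less privileged code
        for reg in regs:
            self.untrusted_mask |= self._bit(reg)

    def untag(self, regs):
        # Models the runtime reclaiming registers once their values are validated
        for reg in regs:
            self.untrusted_mask &= ~self._bit(reg)

    def may_speculate_load(self, addr_reg):
        # Loads addressed through a tagged register are not speculated
        return not (self.untrusted_mask >> addr_reg) & 1


ctl = SandboxTagControl()
ctl.tag(range(8))  # e.g. a JIT hands registers 0-7 to generated code
```

A single mask keeps the hardware cost small, but it puts the burden on the runtime to keep the mask accurate as values move between registers.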

There are some problems with the idea as it stands. Chief among them is that register spill and save/restore do not convey the tag information. There are a few potential ways to address this that I am currently working on. Thus, this is a work in progress, but I wanted to share this now to get people thinking. I will be writing this up in more detail and aim to turn it into a paper. If any of the architecture companies are interested in working with me on it, ping away.

Footnote: This idea occurred to me last year. Originally, it was going to be part of a patent filing (unrelated to security), the purpose of which (as with my other microarchitectural filings) would have been solely to keep it out of the hands of certain others who might misuse it. But rather than file this one, I want to explicitly release it into the public domain as a useful idea. It needs work and there are holes (such as the register spill/restore), but we can hopefully refine this into something actually usable that lets us keep speculating away.

Addendum: When you write to memory (spill registers), store the tags in the data cache (D$). Then, when you reload, pull them from there. If you don’t have tags for registers to identify their provenance, default to disabling speculation. You can get creative by using e.g. a Bloom filter to efficiently store the tag info.
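
The addendum can be sketched in Python as follows. This is only one possible reading, with hypothetical names throughout (TaggedDCache, spill, reload, drop_tag, UntrustedBloom). The Bloom filter variant makes an assumption the text does not state: it records the addresses of untrusted spills, so that a false positive merely disables speculation unnecessarily and never enables it wrongly.

```python
import hashlib


class TaggedDCache:
    """Hypothetical data cache that keeps a provenance tag beside spilled values."""

    def __init__(self):
        self.data = {}  # address -> value
        self.tags = {}  # address -> True if the spilled value was untrusted

    def spill(self, addr, value, untrusted):
        self.data[addr] = value
        self.tags[addr] = untrusted

    def drop_tag(self, addr):
        # Models the tag being lost, e.g. when the line leaves the cache
        self.tags.pop(addr, None)

    def reload(self, addr):
        # No tag available: treat the value as untrusted, so no speculation
        return self.data[addr], self.tags.get(addr, True)


class UntrustedBloom:
    """Assumed variant: a Bloom filter over addresses of untrusted spills."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.field = 0

    def _positions(self, addr):
        for i in range(self.hashes):
            digest = hashlib.blake2b(f"{i}:{addr}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.bits

    def add(self, addr):
        for pos in self._positions(addr):
            self.field |= 1 << pos

    def maybe_untrusted(self, addr):
        # May report True for an address never added (false positive),
        # but never reports False for one that was added
        return all((self.field >> pos) & 1 for pos in self._positions(addr))
```

Note that a plain Bloom filter cannot forget an address, so a spill slot later reused for a trusted value would stay marked in this variant. That is conservative rather than unsafe, but it would cost performance over time unless the filter is periodically reset at a point where doing so is known to be safe.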

Computer Architect