Combining Bound-T and SWEET to Analyse Dynamic Control Flow in Machine-Code Programs

2014 
The first step in the static analysis of a machine-code subprogram is to construct the control-flow graph. The typical method is to start from the known entry-point address of the subprogram, retrieve and decode the instruction at that point, insert it in the control-flow graph, determine the address(es) of the successor instruction(s) from the known semantics of the instruction set, and repeat the process for the successor instructions until all reachable instructions and control flows are discovered and entered in the control-flow graph. This procedure is straight-forward as long as the successors of each instruction are statically defined. However, most instruction sets allow for dynamically determined successors, usually by allowing the target address of a branch to be set by the run-time, dynamically computed value of a register. We call such instructions dynamic branches. To construct the control-flow graph, a static analyser must somehow discover the possible values of the target address, in other words, it must perform a value-analysis of the program. This is problematic for two reasons. Firstly, the value-analysis must be applied to an incomplete control-flow graph, which means that the value-analysis will also be incomplete, and may be an under-estimate of the value-set for the complete subprogram. Second, value-analyses typically over-estimate the value-set, which means that the set of possible target addresses of the dynamic branch may be over-estimated, which leads to an over-estimate of the control- flow graph. The over-estimated graph may include instructions and control flows that do not really belong to the subprogram under analysis. This report describes how we connected two analysis tools, Bound-T from Tidorum Ltd and SWEET from Malardalen University, so that the powerful "abstract execution" analysis in SWEET can be invoked from Bound-T to resolve dynamic branches that Bound-T finds in the machine-code program under analysis. The program-representation language ALF, defined by the SWEET group, is used as an interface language between Bound-T and SWEET. We evaluate the combined analysis on example programs, including both synthetic and real ones, and conclude that the approach is promising but not yet a great improvement. Bound-T contains several special-case analyses for dynamic branches, which currently perform slightly better than SWEET's more general analyses. However, planned improvements to SWEET may result in an analysis which is at least as powerful but more robust than the analyses in Bound-T alone.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    10
    References
    3
    Citations
    NaN
    KQI
    []