Automatic differentiation in ML: Where we are and where we should be going

2018 
We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), covering the main implementation approaches, operator overloading (OO) and source transformation (ST), as well as graph-based intermediate representations for programs and source languages. Based on these insights, we introduce a new graph-based intermediate representation (IR), closely related to A-normal form (ANF), which is specifically aimed at efficiently supporting fully-general AD for array programming. Unlike the dataflow representations used in existing ML frameworks, our IR naturally supports function calls, higher-order functions, and recursion, making ML models easier to implement. The ability to represent closures allows us to perform AD using ST without a tape, which makes the resulting derivative (adjoint) program amenable to ahead-of-time optimization using tools from functional language compilers, and enables higher-order derivatives. Finally, we introduce a proof-of-concept compiler toolchain called Myia, which uses a subset of Python as a front end.
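To make the tape-free ST idea concrete, the following is a minimal, hand-written Python sketch, not Myia's actual generated code, of the closure-based transformation the abstract describes: each transformed function returns its result together with a backpropagator closure, so no runtime tape is needed. All names here (`mul`, `add`, `f_transformed`) are illustrative.

```python
def mul(x, y):
    """Transformed primitive: returns the result and a backpropagator closure."""
    def backprop(dout):
        # d(x*y)/dx = y,  d(x*y)/dy = x
        return dout * y, dout * x
    return x * y, backprop

def add(x, y):
    """Transformed primitive for addition."""
    def backprop(dout):
        # d(x+y)/dx = 1,  d(x+y)/dy = 1
        return dout, dout
    return x + y, backprop

def f_transformed(x, y):
    """Hand-transformed version of f(x, y) = x * y + y.

    The forward pass captures intermediate backpropagators in closures
    instead of recording operations on a tape.
    """
    t1, bp1 = mul(x, y)
    out, bp2 = add(t1, y)

    def backprop(dout):
        dt1, dy2 = bp2(dout)
        dx, dy1 = bp1(dt1)
        return dx, dy1 + dy2  # y is used twice, so its gradients accumulate

    return out, backprop

value, backprop = f_transformed(3.0, 4.0)  # value = 16.0
dx, dy = backprop(1.0)                     # dx = 4.0, dy = 4.0
```

Because the adjoint program is itself ordinary closure-using code rather than a runtime data structure, it can be optimized ahead of time by a functional-language compiler, and the same transformation can be applied to it again to obtain higher-order derivatives.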