Pearlmutter, Barak A. (1994) Fast Exact Multiplication by the Hessian. Neural Computation, 6 (1). pp. 147-160. ISSN 0899-7667

Abstract
Just storing the Hessian $H$ (the matrix of second derivatives $\partial^2 E / \partial w_i \partial w_j$ of the error $E$ with respect to each pair of weights) of a large neural network is difficult. Since a common use of a large matrix like $H$ is to compute its product with various vectors, we derive a technique that directly calculates $Hv$, where $v$ is an arbitrary vector. To calculate $Hv$, we first define a differential operator $\mathcal{R}_v\{f(w)\} = (\partial/\partial r) f(w + rv)|_{r=0}$, note that $\mathcal{R}_v\{\nabla_w\} = Hv$ and $\mathcal{R}_v\{w\} = v$, and then apply $\mathcal{R}_v\{\cdot\}$ to the equations used to compute $\nabla_w$.
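
As an illustration of the operator described in the abstract, the following minimal sketch applies R_v{.} line by line to the gradient computation of a single tanh unit with squared error. The model, the function names (`forward`, `gradient`, `hessian_vector_product`, `finite_difference_hv`) and the finite-difference check are assumptions made for this example and are not taken from the paper; the sketch uses only the Python standard library.

```python
import math


def forward(w, x):
    """Single tanh unit (an assumed toy model): y = tanh(w . x)."""
    a = sum(wi * xi for wi, xi in zip(w, x))
    return math.tanh(a)


def gradient(w, x, t):
    """Gradient of E = 0.5 * (y - t)^2 with respect to the weights w."""
    y = forward(w, x)
    delta = (y - t) * (1.0 - y * y)  # dE/da, using tanh'(a) = 1 - y^2
    return [delta * xi for xi in x]


def hessian_vector_product(w, x, t, v):
    """Compute H v by applying R_v{.} to each step of gradient()."""
    y = forward(w, x)
    # R{a} = v . x, because R{w} = v and x does not depend on w
    r_a = sum(vi * xi for vi, xi in zip(v, x))
    # R{y} = tanh'(a) * R{a}
    r_y = (1.0 - y * y) * r_a
    # Product rule on delta = (y - t) * (1 - y^2)
    r_delta = r_y * (1.0 - y * y) + (y - t) * (-2.0 * y * r_y)
    # R{grad} = R{delta} * x, which equals H v
    return [r_delta * xi for xi in x]


def finite_difference_hv(w, x, t, v, r=1e-5):
    """Central-difference approximation of H v, used only as a check."""
    w_plus = [wi + r * vi for wi, vi in zip(w, v)]
    w_minus = [wi - r * vi for wi, vi in zip(w, v)]
    g_plus = gradient(w_plus, x, t)
    g_minus = gradient(w_minus, x, t)
    return [(gp - gm) / (2.0 * r) for gp, gm in zip(g_plus, g_minus)]
```

The R-operator version never forms $H$ and costs about as much as one extra gradient pass, whereas the finite-difference version depends on the choice of step size `r` and is included here only to sanity-check the result.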
Item Type:  Article 

Keywords:  Fast Exact Multiplication; Hessian
Academic Unit:  Faculty of Science and Engineering > Computer Science; Faculty of Science and Engineering > Research Institutes > Hamilton Institute 
Item ID:  5501 
Depositing User:  Barak Pearlmutter 
Date Deposited:  15 Oct 2014 10:21 
Journal or Publication Title:  Neural Computation 
Publisher:  MIT Press 
Refereed:  Yes 
URI:  
Use Licence:  This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).