The tremendous progress in high-performance Very-Large-Scale Integrated (VLSI) circuit technology now allows several million transistors to be incorporated onto a single silicon chip with on-chip clock rates as
26 to illustrate the placement of the corresponding weights into these nodes. To compute the different dot products of the matrix multiplication, the data inputs are provided in a sequence of read commands. To compute the output of a single layer, the pages of weights are then read sequentially ...
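The sequential-read scheme described above can be sketched in software. This is only an illustrative model, not the circuit itself: it assumes each page holds the weights feeding one output node and that one read of a page yields one dot product with the applied inputs. The function name `layer_output` and the data layout are hypothetical and do not come from the source.

```python
def layer_output(weight_pages, inputs):
    """Model one layer: read each weight page in turn, one dot product per page.

    weight_pages: list of pages; each page is a list of weights (assumed layout:
                  one page per output node).
    inputs:       list of input values applied during the read commands.
    """
    outputs = []
    for page in weight_pages:  # pages are read one after another
        if len(page) != len(inputs):
            raise ValueError("page width must match the input vector length")
        # A single page read accumulates the weight-input products.
        outputs.append(sum(w * x for w, x in zip(page, inputs)))
    return outputs


# Hypothetical 2-node layer with 3 inputs.
weights = [[1, 0, -1], [2, 1, 0]]
x = [3, 4, 5]
print(layer_output(weights, x))  # [-2, 10]
```

Each loop iteration stands in for one page read, so the number of reads per layer equals the number of pages under this assumed layout.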
Hauck, S., et al., “Totem: Domain-Specific Reconfigurable Logic,” IEEE Transactions on VLSI Systems, 2006 Month N/A, pp. 1-25. Kaviani, A., et al., “Computational Field Programmable Architecture,” Custom Integrated Circuits Conference, Proceedings of the IEEE 1998, May 11-14, 1998...