tive, either forward or backward. The overhead of keeping track of these directional derivatives works to the detriment of the greedy method. For ℓ1 regression, the overhead is relatively light, and greedy coordinate descent is substantially faster than cyclic coordinate descent. Although the ...
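
As a rough illustration of the bookkeeping described above, the following sketch computes forward and backward directional derivatives along every coordinate and updates the coordinate with the most negative one. It rests on assumptions not stated in the text: it uses a lasso-penalized least-squares objective, f(theta) = 0.5 * sum_i (y_i - x_i . theta)^2 + lam * sum_j |theta_j|, rather than the ℓ1 regression case, because the coordinate update then has a simple closed form. The names `directional_derivatives`, `soft_threshold`, and `greedy_cd` are hypothetical, and the derivatives are recomputed from scratch at every iteration, which makes the overhead visible rather than efficient.

```python
def directional_derivatives(X, y, theta, lam):
    """Forward/backward directional derivatives of the assumed objective
    f(theta) = 0.5 * sum_i (y_i - x_i . theta)^2 + lam * sum_j |theta_j|
    along each coordinate direction +e_j and -e_j."""
    n, p = len(X), len(theta)
    resid = [y[i] - sum(X[i][k] * theta[k] for k in range(p)) for i in range(n)]
    fwd, bwd = [], []
    for j in range(p):
        # Partial derivative of the smooth (squared-error) part.
        g = -sum(X[i][j] * resid[i] for i in range(n))
        if theta[j] > 0:
            fwd.append(g + lam)
            bwd.append(-g - lam)
        elif theta[j] < 0:
            fwd.append(g - lam)
            bwd.append(-g + lam)
        else:
            # |theta_j| increases at rate lam in both directions from 0.
            fwd.append(g + lam)
            bwd.append(-g + lam)
    return fwd, bwd


def soft_threshold(z, lam):
    """Shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def greedy_cd(X, y, lam, max_iter=1000, tol=1e-10):
    """Greedy coordinate descent sketch: update the coordinate whose
    forward or backward directional derivative is most negative."""
    n, p = len(X), len(X[0])
    theta = [0.0] * p
    for _ in range(max_iter):
        fwd, bwd = directional_derivatives(X, y, theta, lam)
        j = min(range(p), key=lambda k: min(fwd[k], bwd[k]))
        if min(fwd[j], bwd[j]) >= -tol:
            break  # no coordinate direction decreases the objective
        # Exact minimization along coordinate j (closed form for least squares).
        partial = [y[i] - sum(X[i][k] * theta[k] for k in range(p) if k != j)
                   for i in range(n)]
        num = sum(X[i][j] * partial[i] for i in range(n))
        den = sum(X[i][j] ** 2 for i in range(n))
        theta[j] = soft_threshold(num, lam) / den
    return theta


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [3.0, 0.5]
    print(greedy_cd(X, y, lam=1.0))
```

A cyclic variant would skip the call to `directional_derivatives` and simply sweep j = 0, 1, ..., p-1 in order, which is the trade-off the passage describes: greedy selection costs extra work per iteration in exchange for choosing a steeper coordinate.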