A neuron-centric programming model

According to Google's DistBelief paper, the user defines the computation that takes place at each node in each layer of the model, and the messages that should be passed during the upward and downward phases of computation. The paper doesn't describe the programming API in detail, but we can imagine something like this:
 class MyNeuron extends Neuron
   method upward(messages [m1, m2, ..., ])
       sum ← 0
       for each m ∈ [m1, m2, ..., ] do
          sum ← sum + m.input * m.weight
       // propagate squashed output value to neurons of next layer

   method downward(messages [m1, m2, ..., ])
       for each m ∈ [m1, m2, ..., ] do
         gradient ← this.output * (1 - this.output) * m.delta * m.weight

         // weight correction
         Δw ← α * this.output * m.delta
         w ← w + Δw

       // push updates to parameter server
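As a concrete sketch of the pseudocode above, here is a minimal Java version. The `Message` fields, the sigmoid squashing function, and the class names are assumptions of mine (the DistBelief paper specifies no concrete API), and the parameter-server push is omitted:

```java
import java.util.List;

// Hypothetical message type: one message per incoming (upward) or
// outgoing (downward) edge of this neuron.
class Message {
    final double input;   // activation from the sending neuron (upward phase)
    final double delta;   // error signal from the next layer (downward phase)
    final double weight;  // weight on the connecting edge
    Message(double input, double delta, double weight) {
        this.input = input; this.delta = delta; this.weight = weight;
    }
}

class MyNeuron {
    double output; // squashed activation, cached for the downward phase

    // Upward (forward) phase: weighted sum followed by a sigmoid squash.
    double upward(List<Message> messages) {
        double sum = 0;
        for (Message m : messages) {
            sum += m.input * m.weight;
        }
        output = 1.0 / (1.0 + Math.exp(-sum));
        return output; // propagated to the neurons of the next layer
    }

    // Downward (backward) phase: accumulate this neuron's local gradient
    // from each incoming delta. Weight updates (Δw) would be computed here
    // and pushed to the parameter server; that part is omitted in this sketch.
    double downward(List<Message> messages) {
        double gradient = 0;
        for (Message m : messages) {
            gradient += output * (1 - output) * m.delta * m.weight;
        }
        return gradient;
    }
}
```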
Unlike Google's Pregel, I separate the computation into two methods. The framework can determine internally, from the message type, whether it is in the upward or downward phase, which also reduces user-side code complexity. Another advantage is that this design fits multi-threaded programming well: the gradient calculations can run in parallel at the neuron level.
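Since each neuron's downward computation touches only its own cached output and its incoming messages, the neuron-level parallelism can be sketched with a parallel stream; a worked example under assumed names, not part of the paper:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelBackward {
    // Per-neuron local gradient: out * (1 - out) * delta * weight,
    // matching the downward pseudocode above.
    static double localGradient(double out, double delta, double weight) {
        return out * (1 - out) * delta * weight;
    }

    // Fan the downward phase of one layer out across worker threads.
    // No locking is needed because each neuron reads and writes only
    // its own state.
    public static List<Double> backward(double[] outputs, double delta, double weight) {
        return IntStream.range(0, outputs.length)
                .parallel() // each neuron handled independently
                .mapToObj(i -> localGradient(outputs[i], delta, weight))
                .collect(Collectors.toList());
    }
}
```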

Additionally, instead of allowing the user to write formulas for calculations directly within the neuron-centric programming model, we abstract the arithmetic operations and generate the code for GPU acceleration internally. That is, we compile the user's operations into GPU-oriented code that batches them for speed.

That way, the user doesn't have to think about GPU specifics and can focus on the algorithm instead.
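One way to picture this abstraction layer is a small expression tree: the user composes operations symbolically, and the framework can either interpret the tree or emit kernel source from it. The classes and the emitted syntax below are entirely hypothetical, a sketch of the idea rather than any real API (a plain CPU `eval` stands in for the generated GPU kernel):

```java
import java.util.function.DoubleBinaryOperator;

interface Op {
    double eval(double[] x); // reference CPU evaluation
    String emit();           // source text a backend could compile for the GPU
}

// A reference to the i-th input value.
class Var implements Op {
    final int idx;
    Var(int idx) { this.idx = idx; }
    public double eval(double[] x) { return x[idx]; }
    public String emit() { return "x[" + idx + "]"; }
}

// A binary arithmetic operation over two sub-expressions.
class Binary implements Op {
    final String sym; final DoubleBinaryOperator f; final Op l, r;
    Binary(String sym, DoubleBinaryOperator f, Op l, Op r) {
        this.sym = sym; this.f = f; this.l = l; this.r = r;
    }
    public double eval(double[] x) { return f.applyAsDouble(l.eval(x), r.eval(x)); }
    public String emit() { return "(" + l.emit() + " " + sym + " " + r.emit() + ")"; }
}
```

For example, the user would build `new Binary("*", (a, b) -> a * b, new Var(0), new Var(1))` instead of writing raw arithmetic, and the framework could batch many such trees into a single kernel launch.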
