Outrageously Large Neural Networks: The Sparsely-Gated Mixture-Of-Experts Layer (https://arxiv.org/abs/1701.06538) 200 points|aaronyy|8 years ago|81 comments
Outrageously Large Neural Nets: Sparsely-Gated Mixture-of-Experts Layer (2017) (https://arxiv.org/abs/1701.06538) 65 points|msoad|6 years ago|33 comments
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts (2017) (https://arxiv.org/abs/1701.06538) 60 points|georgehill|1 year ago|10 comments
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-Of-Experts Layer (https://arxiv.org/abs/1701.06538) 3 points|tonybeltramelli|8 years ago|0 comments
Outrageously Large Neural Networks: Up to 137B Parameters (https://arxiv.org/abs/1701.06538) 2 points|serialx|8 years ago|1 comment
Outrageously Large Neural Networks (https://arxiv.org/abs/1701.06538) 1 point|groar|8 years ago|0 comments
Regularizing Neural Networks by Penalizing Confident Output Distributions (https://arxiv.org/abs/1701.06548) 2 points|tonybeltramelli|8 years ago|1 comment