Optimization
Training Innovations
(Compiled from: https://github.com/robertsdionne/neural-network-papers)
- Adding Gradient Noise Improves Learning for Very Deep Networks
- Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
- Net2Net: Accelerating Learning via Knowledge Transfer
- Learning the Architecture of Deep Neural Networks
- GradNets: Dynamic Interpolation Between Neural Architectures
- Reducing the Training Time of Neural Networks by Partitioning
- The Effects of Hyperparameters on SGD Training of Neural Networks
- Gradient-based Hyperparameter Optimization through Reversible Learning (github.com/HIPS/hypergrad)
- Learning Ordered Representations with Nested Dropout
- Learning Compact Convolutional Neural Networks with Nested Dropout
- Reducing Overfitting in Deep Networks by Decorrelating Representations
- Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets
- Efficient Per-Example Gradient Computations
- Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Highway Networks
- Random Walk Initialization for Training Very Deep Feedforward Networks
- Deeply-Supervised Nets
- Improving neural networks by preventing co-adaptation of feature detectors
- Maxout Networks
- Regularization of Neural Networks using DropConnect
- Distilling the Knowledge in a Neural Network
- Domain-Adversarial Neural Networks
- Weight Uncertainty in Neural Networks
- Notes on Noise Contrastive Estimation and Negative Sampling
- Scale-invariant learning and convolutional networks
- Empirical Evaluation of Rectified Activations in Convolutional Network
- Deep Boosting (github.com/google/deepboost)
- No Regret Bound for Extreme Bandits
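- [Paper + code: invariant backpropagation, a training algorithm for neural networks that are invariant to feature transformations] "Invariant backpropagation: how to train a transformation-invariant neural network", S Demyanov, J Bailey, R Kotagiri, C Leckie (2015)
- Code: ConvNet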
- [Paper + code: faster asynchronous distributed stochastic gradient descent (FASGD)] "Faster Asynchronous SGD", A Odena (2016)
- Code: fred