dropout regularization
It works by randomly "dropping out" (setting to zero) a selection of unit
activations in a network for a single gradient step. The higher the dropout
rate, the stronger the regularization:
1-) 0.0 = No dropout regularization.
2-) 1.0 = Drop out every activation. The model learns nothing.
3-) Values between 0.0 and 1.0 = Partial dropout, which is what is useful in
practice; rates around 0.2 to 0.5 are common starting points.
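The behavior above can be sketched with a minimal "inverted dropout" function in NumPy (the function name and shapes are illustrative, not from any particular library). Survivors are scaled by 1/(1 - rate) so the expected activation magnitude is unchanged, which is why no rescaling is needed at inference time:

```python
import numpy as np

def dropout(activations, rate, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each unit with probability `rate`,
    then scale the survivors by 1 / (1 - rate)."""
    if rate == 0.0:
        return activations                  # no dropout regularization
    if rate == 1.0:
        return np.zeros_like(activations)   # everything dropped; nothing to learn
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

acts = np.ones(8)
print(dropout(acts, 0.0))  # unchanged
print(dropout(acts, 0.5))  # roughly half the units zeroed, survivors scaled to 2.0
```

At rate 0.5, each surviving unit is doubled, so on average the layer's output keeps the same expected value as without dropout.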