Batch Size vs. Epochs
Batch Size:
How many training samples are used each time the model parameters are updated?
For example, the batch size of Stochastic Gradient Descent is one, while the batch size of (full) Gradient Descent equals the entire training set size. Mini-Batch Gradient Descent falls in between.
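To make the difference concrete, here is a minimal sketch (the training set size of 1,000 and the mini-batch size of 32 are hypothetical choices) showing how many parameter updates each variant performs per epoch:

```python
import math

# Hypothetical example: a training set of 1,000 samples.
n_samples = 1000

# batch_size controls how many samples feed each parameter update.
for name, batch_size in [
    ("Stochastic Gradient Descent", 1),        # one sample per update
    ("Mini-Batch Gradient Descent", 32),       # a small group per update
    ("(Full) Gradient Descent", n_samples),    # the whole set per update
]:
    updates_per_epoch = math.ceil(n_samples / batch_size)
    print(f"{name}: batch_size={batch_size} -> {updates_per_epoch} updates per epoch")
```

So a smaller batch size means more (noisier) updates per epoch, while a larger batch size means fewer (smoother) updates per epoch.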
Epochs:
When the ENTIRE training data set is used exactly ONCE, that is one epoch. The number of epochs tells you how many times the whole training data set is passed through the model.
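The two ideas fit together in a training loop: the inner loop walks through the data one batch at a time (one parameter update per batch), and the outer loop repeats that full pass once per epoch. Below is a minimal mini-batch SGD sketch for a one-parameter linear model; the data, learning rate, batch size, and epoch count are all hypothetical values chosen just for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 3x + noise (all values hypothetical).
X = rng.normal(size=(1000, 1))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=1000)

w = 0.0          # single model parameter
lr = 0.1         # learning rate
batch_size = 32  # samples per parameter update
epochs = 5       # full passes over the training data

for epoch in range(epochs):
    # Shuffle so each epoch visits the batches in a new order.
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X[idx, 0], y[idx]
        # One parameter update per batch (mean-squared-error gradient).
        grad = 2 * np.mean((w * xb - yb) * xb)
        w -= lr * grad
    print(f"epoch {epoch + 1}: w = {w:.3f}")
```

With batch_size = 32 and 1,000 samples, each epoch performs 32 parameter updates; running 5 epochs means the model sees the entire training set 5 times.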