Let's use a simplified example with just 2 data points and walk through the process with actual numbers. This will help illustrate how gradients are calculated and accumulated for a batch.
Let's assume we have a very simple model with a single parameter w, currently set to 1.0. The model's prediction is w * x, the loss is the squared error, and we're using basic gradient descent with a learning rate of 0.1.
Data points:
- x1 = 2, y1 = 4
- x2 = 3, y2 = 5
Batch size = 2 (both data points in one batch)
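If you want to follow along in code, here is a minimal plain-Python sketch of this setup (the variable names xs, ys, and learning_rate are just illustrative choices, not from any particular library):

```python
# Single-parameter model: prediction = w * x
w = 1.0             # current weight
learning_rate = 0.1
xs = [2.0, 3.0]     # inputs  x1, x2
ys = [4.0, 5.0]     # targets y1, y2
```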
Step 1: Forward pass
- For x1: prediction = w * x1 = 1.0 * 2 = 2
- For x2: prediction = w * x2 = 1.0 * 3 = 3
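In code, the forward pass is just the model applied to each input (values are hard-coded so the snippet runs on its own):

```python
w = 1.0
xs = [2.0, 3.0]
predictions = [w * x for x in xs]
print(predictions)   # [2.0, 3.0]
```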
Step 2: Calculate losses
- Loss1 = (prediction1 - y1)^2 = (2 - 4)^2 = 4
- Loss2 = (prediction2 - y2)^2 = (3 - 5)^2 = 4
- Total batch loss = (Loss1 + Loss2) / 2 = (4 + 4) / 2 = 4
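The per-example losses and the mean batch loss, computed from the predictions above:

```python
predictions = [2.0, 3.0]
ys = [4.0, 5.0]
losses = [(p - y) ** 2 for p, y in zip(predictions, ys)]
batch_loss = sum(losses) / len(losses)
print(losses, batch_loss)   # [4.0, 4.0] 4.0
```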
Step 3: Backward pass (calculate gradients)
- Gradient1 = 2 * (prediction1 - y1) * x1 = 2 * (2 - 4) * 2 = -8
- Gradient2 = 2 * (prediction2 - y2) * x2 = 2 * (3 - 5) * 3 = -12
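The per-example gradients come from differentiating the squared error with respect to w:

```python
xs = [2.0, 3.0]
predictions = [2.0, 3.0]   # w * x with w = 1.0
ys = [4.0, 5.0]
# d/dw (w*x - y)^2 = 2 * (w*x - y) * x
gradients = [2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)]
print(gradients)   # [-8.0, -12.0]
```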
Step 4: Accumulate gradients
- Total gradient = (Gradient1 + Gradient2) / 2 = (-8 + -12) / 2 = -10
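Averaging the per-example gradients gives the single gradient used for the whole batch:

```python
gradients = [-8.0, -12.0]
total_gradient = sum(gradients) / len(gradients)
print(total_gradient)   # -10.0
```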
Step 5: Update weight (once for the batch)
- New w = old w - learning_rate * total gradient
- New w = 1.0 - 0.1 * (-10) = 2.0
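And the one weight update for the batch:

```python
w = 1.0
learning_rate = 0.1
total_gradient = -10.0
w = w - learning_rate * total_gradient
print(w)   # 2.0
```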
So, after processing this batch of 2 data points:
- We calculated 2 individual gradients (-8 and -12)
- We accumulated these into one total gradient (-10)
- We performed one weight update, changing w from 1.0 to 2.0
This process would then repeat for the next batch. In this case, we've processed all our data, so this completes one epoch.
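If you'd like to cross-check these numbers with an autograd library, here is a sketch using PyTorch (not part of the hand calculation above; it assumes torch is installed and uses the mean squared error over the batch, which matches the averaging we did by hand):

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
xs = torch.tensor([2.0, 3.0])
ys = torch.tensor([4.0, 5.0])

predictions = w * xs                         # forward pass
loss = torch.mean((predictions - ys) ** 2)   # batch loss = 4.0
loss.backward()                              # backward pass: w.grad holds the averaged gradient, -10.0

with torch.no_grad():
    w -= 0.1 * w.grad                        # one update for the batch: w becomes 2.0

print(loss.item(), w.grad.item(), w.item())  # 4.0 -10.0 2.0
```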