1. About Output Types (D):
No, the output D is not limited to fp32/int32. Looking at the table, D can be any of the following (see the sketch after this list):
- fp32
- fp16
- bf16
- fp8
- bf8
- int8
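To make this concrete, here is a minimal scalar sketch of D = A*B + C in which the output type is an independent template parameter. It is not tied to any particular GPU ISA or library; the name mma_ref and the use of the Clang/GCC _Float16 extension are assumptions for illustration only.
```
// Hypothetical scalar reference for D = A*B + C (not a real hardware API).
// TA: element type of A and B, TC: type of C, TD: type of D,
// TAcc: compute/accumulation type (fp32 or int32 in the table).
#include <cstdint>
#include <cstdio>

template <typename TA, typename TC, typename TD, typename TAcc = float>
TD mma_ref(TA a, TA b, TC c) {
    // The multiply-accumulate runs in TAcc regardless of the storage widths;
    // only the final store narrows (or keeps) the result as TD.
    TAcc acc = static_cast<TAcc>(a) * static_cast<TAcc>(b) + static_cast<TAcc>(c);
    return static_cast<TD>(acc);
}

int main() {
    _Float16 a = 1.5f, b = 2.0f, c = 0.25f;  // requires _Float16 support (recent Clang/GCC)

    _Float16 d_fp16 = mma_ref<_Float16, _Float16, _Float16>(a, b, c);    // fp16 out, fp32 compute
    float    d_fp32 = mma_ref<_Float16, float, float>(a, b, 0.25f);      // fp32 out, fp32 compute
    int8_t   d_i8   = mma_ref<int8_t, int8_t, int8_t, int32_t>(3, 4, 1); // int8 out, int32 compute

    printf("%f %f %d\n", (float)d_fp16, d_fp32, (int)d_i8);
    return 0;
}
```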
2. Input/Output Patterns:
When A is fp16, you have two options:
```
Option 1: A = fp16, B = fp16, C = fp16, D = fp16, compute = fp32
Option 2: A = fp16, B = fp16, C = fp16, D = fp32, compute = fp32
```
The compute (accumulation) precision is always higher (fp32 or int32) to maintain accuracy during the multiply-accumulate, even when the inputs and outputs are lower precision.
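The two options differ only in whether that fp32 result is narrowed back to fp16 or stored as-is; a minimal sketch, again assuming _Float16 support:
```
#include <cstdio>

int main() {
    _Float16 a = 0.1f, b = 3.0f, c = 10.0f;

    // Both options perform the multiply-accumulate in fp32.
    float acc = (float)a * (float)b + (float)c;

    _Float16 d_fp16 = static_cast<_Float16>(acc);  // Option 1: D stored as fp16
    float    d_fp32 = acc;                         // Option 2: D stored as fp32

    printf("D as fp16: %f, D as fp32: %f\n", (float)d_fp16, d_fp32);
    return 0;
}
```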
3. Key Patterns in the Table:
- Inputs A and B must always match in type
- C typically matches A and B, except with fp8/bf8 inputs
- When using fp8/bf8 inputs, C and D can be higher precision (fp32, fp16, or bf16)
- The compute precision is always fp32 for floating-point types
- For integer operations (int8), the compute precision is int32 (the A/B matching and compute-type rules are encoded in the sketch below)
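These patterns can be expressed as compile-time type rules. The sketch below is illustrative only: the names mma_config and accum_t are invented for this example, and it encodes just the A/B-matching and compute-type rules, not the full table.
```
#include <cstdint>
#include <cstdio>
#include <type_traits>

// Compute type follows the input class: int32 for int8 inputs, fp32 otherwise.
template <typename TIn>
using accum_t = std::conditional_t<std::is_same_v<TIn, int8_t>, int32_t, float>;

// A and B share one element type by construction (TAB), matching the rule above.
template <typename TAB, typename TC, typename TD>
struct mma_config {
    using a_type       = TAB;
    using b_type       = TAB;
    using c_type       = TC;
    using d_type       = TD;
    using compute_type = accum_t<TAB>;
};

int main() {
    using fp16_in_fp32_out = mma_config<_Float16, _Float16, float>;
    static_assert(std::is_same_v<fp16_in_fp32_out::compute_type, float>);

    using int8_in_int8_out = mma_config<int8_t, int32_t, int8_t>;
    static_assert(std::is_same_v<int8_in_int8_out::compute_type, int32_t>);

    puts("type patterns hold");
    return 0;
}
```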
4. Why Different Combinations?
- Performance: lower precision (fp16, fp8) means faster computation and less memory traffic
- Accuracy: higher precision (fp32) gives better accuracy but is slower (see the accumulation sketch after this list)
- Memory Usage: fp16/fp8 elements take half or a quarter of the space of fp32
- Mixed Precision: use lower-precision inputs with a higher-precision output to balance speed and accuracy
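The accuracy point is easy to see numerically: accumulating many small fp16 products in an fp16 accumulator drifts badly, while the same accumulation in fp32 stays close to the true sum. A minimal sketch, assuming _Float16 support:
```
#include <cstdio>

int main() {
    const int n = 4096;
    _Float16 acc_fp16 = 0.0f;  // accumulate in fp16 (what the hardware avoids)
    float    acc_fp32 = 0.0f;  // accumulate in fp32 (the table's compute type)

    _Float16 a = 0.1f, b = 0.1f;
    for (int i = 0; i < n; ++i) {
        acc_fp16 = static_cast<_Float16>((float)acc_fp16 + (float)a * (float)b);
        acc_fp32 = acc_fp32 + (float)a * (float)b;
    }

    // The ideal sum is 4096 * 0.01 = 40.96.  The fp16 accumulator stalls well
    // below that (its spacing near 32 exceeds the 0.01 increment), while the
    // fp32 accumulator stays close.
    printf("fp16 accumulate: %f\nfp32 accumulate: %f\n", (float)acc_fp16, acc_fp32);
    return 0;
}
```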
Example Use Cases:
```
High Accuracy Needs:   A = fp32, B = fp32, C = fp32, D = fp32, compute = fp32
Balanced Performance:  A = fp16, B = fp16, C = fp16, D = fp32, compute = fp32
Maximum Performance:   A = fp8,  B = fp8,  C = fp8,  D = fp8,  compute = fp32
```
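For a rough sense of the memory side of these trade-offs, the sketch below computes the operand footprint of a hypothetical 16x16x16 tile (M = N = K = 16) for each configuration. The tile shape is an assumption for illustration; real instruction shapes and data layouts vary by hardware.
```
#include <cstdio>

// Bytes for A (MxK), B (KxN), C (MxN) and D (MxN) of a 16x16x16 tile,
// given the element sizes of A/B, C, and D.  Element storage only;
// real layouts may add packing or padding.
static long tile_bytes(int ab_bytes, int c_bytes, int d_bytes) {
    const long M = 16, N = 16, K = 16;
    return M * K * ab_bytes + K * N * ab_bytes   // A and B share an element type
         + M * N * c_bytes  + M * N * d_bytes;   // C and D
}

int main() {
    printf("fp32 in, fp32 out: %ld bytes\n", tile_bytes(4, 4, 4));  // high accuracy
    printf("fp16 in, fp32 out: %ld bytes\n", tile_bytes(2, 2, 4));  // balanced
    printf("fp8  in, fp8  out: %ld bytes\n", tile_bytes(1, 1, 1));  // maximum performance
    return 0;
}
```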