Environment Information:
- Platform: Genio-720, D9300
- Toolkit Version: GAI-Deployment-Toolkit-v2.0.9_vicuna1.5-7b-tree-speculative-decoding-plus-v0.1.tar.gz
Issue Description:
Users are unsure how to find or set the modelOutputQuantScale value (the output quantization scale shown in Netron) for PTQ and Tree Speculative Decoding tasks.
Solution:
The toolkit automatically provides the correct modelOutputQuantScale in the config_<model_name>.yaml file. Manual extraction or adjustment is unnecessary.
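To confirm which value the toolkit wrote, the config file can be inspected directly. The following is a minimal Python sketch, not part of the toolkit. It assumes the key appears on a single line in the form "modelOutputQuantScale: <number>"; the exact layout of config_<model_name>.yaml is not documented here, so the function names and the matching pattern are illustrative only.

```python
import re
from pathlib import Path

# Assumption: the key is written on one line as "modelOutputQuantScale: <number>",
# optionally indented and optionally followed by a "# comment".
_KEY_PATTERN = re.compile(
    r"^\s*modelOutputQuantScale\s*:\s*([^\s#]+)", re.MULTILINE
)


def find_output_quant_scales(yaml_text: str) -> list[float]:
    """Return every modelOutputQuantScale value found in the given YAML text."""
    # Strip optional quotes before converting, in case the value is quoted.
    return [float(raw.strip("'\"")) for raw in _KEY_PATTERN.findall(yaml_text)]


def read_output_quant_scales(config_path: str) -> list[float]:
    """Hypothetical helper: read a config file and return the scales it contains."""
    text = Path(config_path).read_text(encoding="utf-8")
    return find_output_quant_scales(text)
```

Calling read_output_quant_scales with the path to the generated config file returns the scale values found in it, or an empty list if the key is absent. The sketch only reads the file; the value itself should be left as the toolkit generated it.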