[LLaMA3-8B] [PTQ] Common Issues in Calibration Dataset Generation

Environment Information

Issue Description:

  1. Tokenizer Class Mismatch:
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'PreTrainedTokenizerFast'. 
The class this function is called from is 'LlamaTokenizer'.
  2. Unsupported Argument padding_side:
TypeError: _batch_encode_plus() got an unexpected keyword argument 'padding_side'
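
Because the fix for the second error is a specific transformers release, it can help to confirm which version is installed before changing anything. The following is a minimal sketch using only the Python standard library. The helper names installed_version and matches_pin are hypothetical, and the default pin 4.41.2 is simply the version this post reports as working.

```python
from importlib import metadata


def installed_version(package="transformers"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(version, pinned="4.41.2"):
    """True only when the installed version equals the pinned one exactly."""
    return version == pinned


if __name__ == "__main__":
    found = installed_version()
    print("transformers version:", found)
    print("matches tested version:", matches_pin(found))
```

If the second line prints False, the installed release differs from the one this post was verified against.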

Solution:

  1. For the tokenizer class mismatch:
  • Add "tokenizer": "pretrained_fast" to config.json.
  2. For the padding_side argument error:
  • Install the transformers version this fix was tested with:
pip install transformers==4.41.2
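
The config.json edit from the first fix can also be scripted rather than done by hand. This is a rough sketch, assuming config.json is a plain JSON object; the helper name set_tokenizer_entry and the temporary demo file are hypothetical, and the real path depends on where your config.json lives.

```python
import json
import tempfile
from pathlib import Path


def set_tokenizer_entry(config_path, value="pretrained_fast"):
    """Add or overwrite the "tokenizer" key, keeping all other keys intact."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["tokenizer"] = value
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Demo on a throwaway file; point this at the real config.json instead.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "config.json"
        demo.write_text('{"example_key": "example_value"}', encoding="utf-8")
        print(set_tokenizer_entry(demo))
```

Back up the original config.json before running this against it, since the file is rewritten in place.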

The solution was tested and verified on July 7, 2025.