ERROR while generating calibration dataset for LLM

I am working on implementing a GenAI use case on MTK platforms.

I am using the openlm-research/open_llama_3b_v2 model from Hugging Face, and I am trying to run it following the 5.1.3 Tutorial for Large Language Models in the Neuropilot 8.0.9 documentation.

I am currently stuck at step 2 (Generating Calibration Dataset).

Below is a screenshot of the error I am getting:

I am using the GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1 toolkit: https://neuropilot.mediatek.com/resources/downloads/b850a7a4-fb11-4a7e-85de-bffb18c5123b

Python: 3.8

I would be grateful for any help resolving this error.

Regards,

Nimesh

Dear Nimesh,

Thank you for providing the detailed screenshots of the error you are encountering during Step 2 (Generating Calibration Dataset).

The error you are seeing may be related to a compatibility issue between that specific Open Llama variant and the toolkit’s current version.

Suggested Action: Try an Alternate Source Model

Since the name of your deployment toolkit (GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1) explicitly references llama3.2, this could indicate that the toolkit is optimized for, or expected to work with, the official Llama 3.2 models.

We recommend that you try switching to this model to see if it resolves the issue: meta-llama/Llama-3.2-3B (Hugging Face Link)

Please attempt the following steps:

  1. Replace the current model (openlm-research/open_llama_3b_v2) with meta-llama/Llama-3.2-3B.
  2. Restart the process from Step 2: Generating Calibration Dataset, as per the Neuropilot documentation.

If the problem persists after switching the model, please provide a new screenshot so we can continue troubleshooting.

Best regards, Jing


Hello,

Thanks for replying.

I followed your instruction to use Meta Llama 3.2 3B, but while generating the calibration dataset, I got the following error:

I think there is a tokenizer class incompatibility.

I will be really grateful if you can help me understand the error and resolve it.

Regards,

Nimesh

Thank you for sharing the error details. Based on the message:

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from

this issue may be related to a mismatch between the model checkpoint and the tokenizer configuration.

Suggested Action

Please try the following adjustment to your model configuration file:

  1. Open the config.json file in your model directory.
  2. Add the following line inside the configuration:
"tokenizer": "pretrained_fast"

Additionally, if you later encounter the following error:

TypeError: _batch_encode_plus() got an unexpected keyword argument 'padding_side'

this typically indicates a version compatibility issue with the Transformers library.

To resolve it, please install a compatible version using:

pip install transformers==4.41.2

If the problem persists after making these changes, please share the updated error message or a screenshot so we can continue troubleshooting further.

Best regards,
Jing

Hello Jing,

Thanks for your response. With your suggested change to config.json, I was able to resolve that error, but I then got the following error running the same command:

I am currently running the Llama 3.2-1B model.

I’ll be really grateful if you can help me resolve this error.

Regards,

Nimesh

Hi Nimesh,

I tested Llama-3.2-1B-Instruct on my side using the model from https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/tree/main and did not encounter the error you mentioned. Here are the module versions I am using; please check whether they match yours:

mtk-converter: 8.13.0
mtk_llm_sdk: 2.5.3
mtk-quantization: 8.2.0
sentencepiece: 0.2.0
torch: 2.4.1
torchvision: 0.19.1
transformers: 4.41.2
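A quick way to compare your environment against this list is a sketch like the following, using Python's standard importlib.metadata (available since Python 3.8). The EXPECTED dictionary simply transcribes the versions listed above:

```python
from importlib.metadata import version, PackageNotFoundError

# Reference versions from the working environment listed above.
EXPECTED = {
    "mtk-converter": "8.13.0",
    "mtk_llm_sdk": "2.5.3",
    "mtk-quantization": "8.2.0",
    "sentencepiece": "0.2.0",
    "torch": "2.4.1",
    "torchvision": "0.19.1",
    "transformers": "4.41.2",
}

def check_versions(expected):
    """Return {package: (installed_or_None, expected)} for every mismatch."""
    mismatches = {}
    for pkg, want in expected.items():
        try:
            have = version(pkg)
        except PackageNotFoundError:
            have = None  # package not installed at all
        if have != want:
            mismatches[pkg] = (have, want)
    return mismatches

print(check_versions(EXPECTED))
```

An empty dictionary means your environment already matches; otherwise the output shows exactly which packages to reinstall.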

If your versions differ, I recommend updating to the versions above and testing again to rule out any issues caused by version mismatch.

Thanks!

Regards,
Jing

Hi Jing,

I am currently using the 8.0.10 premium version of Neuropilot. In this and earlier versions, I only have mtk_llm_sdk versions 3.4.2 and 2.8.2; I am unable to fetch version 2.5.3.
I could find mtk_converter: 8.13.0 and mtk_quantization: 8.2.0 in the Neuropilot 8.0.7 version of the documentation.

If it is possible at your end, can you please share mtk_llm_sdk version 2.5.3 with me so I can continue with this task?

Regards,
Nimesh

Hi Nimesh,

The mtk_llm_sdk 2.5.3 installation package is included in the toolkit you’re using. You can find it at the following path:

GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1/mtk_llm_sdk/mtk_llm_sdk-2.5.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

Additionally, for the Llama 3.2-1B model, we recommend prioritizing the Instruct version, available here:

https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct

This is the version we have tested and confirmed to work without issues.

Regards,
Jing

Hi Jing,

Thank you so much for your help. I was able to create the calibration datasets, perform PTQ, and fix the shape.

The fourth step was to check whether the model runs correctly on the PC itself using “bash 4_optional_inference_tflite.sh”.

When I ran this command, I got the following response, which is gibberish:

Are such responses expected because the model has only 1B parameters, or have I made a mistake somewhere?

Just wanted to confirm this with you. Please let me know your thoughts.

Regards,
Nimesh

Hi Jing,

I continued with the further steps mentioned in the documentation. The model compilation steps are finished, and I am currently on 5.1.3.6.2, Pushing Dependencies to the Device.

In step 3b, while trying to create the SentencePiece tokenizer, I am getting the following error:

I am using sentencepiece version 0.2.0.

I am getting the same error whether I use tokenizer.json or tokenizer.model. Can you please help me debug it?

Also, I want to confirm something about step 3c (“After the tokenizer files have been prepared, push them to the device using Android Debug Bridge (adb).”): by tokenizer files, does that mean added_tokens.yaml, vocab.txt, and merges.txt need to be pushed to the device, or do some other files need to be pushed as well?

I'll be really grateful if you can help me debug steps 3b and 3c of the documentation.

Regards,
Nimesh

Hi Nimesh,

Regarding the tokenizer files, the required ones are the following three:

  • vocab.txt
  • merges.txt
  • added_tokens.yaml

Have you tried using the prepare_huggingface_tokenizer.py script? If so, are you encountering the same issue?

Example command:

python prepare_huggingface_tokenizer.py ../../post_training_quantize/models/Llama-3.2-1B-Instruct/tokenizer.json

Expected output:

Exported 'vocab.txt' from 'tokenizer.json'
Exported 'merges.txt' from 'tokenizer.json'
Exported 'added_tokens.yaml' from 'tokenizer.json'
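Before pushing the files in step 3c, you can sanity-check that all three exports are present with a small sketch like this. The directory path is a placeholder; point it at wherever the prepare script wrote its output:

```python
from pathlib import Path

# The three tokenizer files required on the device, per the list above.
REQUIRED = ["vocab.txt", "merges.txt", "added_tokens.yaml"]

def missing_tokenizer_files(export_dir):
    """Return the names of required tokenizer files absent from export_dir."""
    export_dir = Path(export_dir)
    return [name for name in REQUIRED if not (export_dir / name).exists()]

# Placeholder directory; replace with the script's actual output location.
print(missing_tokenizer_files("."))
```

An empty list means all three files are ready to be pushed with adb.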

Regards,
Jing