Error while generating calibration dataset for LLM

I am working on implementing a GenAI use case on MTK platforms.

I am using the openlm-research/open_llama_3b_v2 model from Hugging Face, and I am trying to run it by following the 5.1.3 Tutorial for Large Language Models in the NeuroPilot 8.0.9 documentation.

I am currently stuck at step 2 (Generating Calibration Dataset).

Below is a screenshot of the error I am getting:

I am using the GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1 toolkit: https://neuropilot.mediatek.com/resources/downloads/b850a7a4-fb11-4a7e-85de-bffb18c5123b

Python: 3.8

I would be grateful for any help in resolving this error.

Regards,

Nimesh

Dear Nimesh,

Thank you for providing the detailed screenshots of the error you are encountering during Step 2 (Generating Calibration Dataset).

The error you are seeing may be related to a compatibility issue between that specific Open Llama variant and the toolkit’s current version.

Suggested Action: Try an Alternate Source Model

Since the name of your deployment toolkit (GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1toolkit) explicitly references llama3.2, this could indicate that the toolkit is better optimized for or expected to work with the official Llama 3.2 models.

We recommend trying this model to see if it resolves the issue: meta-llama/Llama-3.2-3B on Hugging Face.

Please attempt the following steps:

  1. Replace the current model (openlm-research/open_llama_3b_v2) with meta-llama/Llama-3.2-3B.
  2. Restart the process from Step 2 (Generating Calibration Dataset), as per the NeuroPilot documentation.

If the problem persists after switching the model, please provide a new screenshot so we can continue troubleshooting.

Best regards, Jing


Hello,

Thanks for replying.

I followed your instructions and switched to Meta Llama 3.2 3B, but while generating the calibration dataset, I got the following error:

I think there is a tokenizer class incompatibility.

I would be really grateful if you could help me understand and resolve the error.

Regards,

Nimesh

Thank you for sharing the error details. Based on the message

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from

this issue may be related to a mismatch between the model checkpoint and the tokenizer configuration.

Suggested Action

Please try the following adjustment to your model configuration file:

  1. Open the config.json file in your model directory.
  2. Add the following line inside the configuration:
"tokenizer": "pretrained_fast"
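For reference, the edited config.json would then look something like this (the surrounding fields are illustrative values typical of a Llama-style config, not taken from your file; only the "tokenizer" line is the actual change):

```json
{
  "architectures": ["LlamaForCausalLM"],
  "model_type": "llama",
  "tokenizer": "pretrained_fast"
}
```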

Additionally, if you later encounter the following error:

TypeError: _batch_encode_plus() got an unexpected keyword argument 'padding_side'

this typically indicates a version compatibility issue with the Transformers library.

To resolve it, please install a compatible version using:

pip install transformers==4.41.2

If the problem persists after making these changes, please share the updated error message or a screenshot so we can continue troubleshooting further.

Best regards,
Jing

Hello Jing,

Thanks for your response. With your suggested change to config.json, I was able to resolve that error, but I then got the following error when running the same command:

I am currently running the Llama 3.2-1B model.

I’ll be really grateful if you can help me resolve this error.

Regards,

Nimesh

Hi Nimesh,

I tested Llama‑3.2‑1B‑Instruct on my side using the model from https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct/tree/main and did not encounter the error you mentioned. Here are the module versions I am using—please check if they match yours:

mtk-converter: 8.13.0
mtk_llm_sdk: 2.5.3
mtk-quantization: 8.2.0
sentencepiece: 0.2.0
torch: 2.4.1
torchvision: 0.19.1
transformers: 4.41.2

If your versions differ, I recommend updating to the versions above and testing again to rule out any issues caused by version mismatch.
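As a quick sanity check, a small script like the following can compare your locally installed versions against the list above (the package names and version strings here are copied from that list; the script itself is just a convenience sketch, not part of the toolkit):

```python
# Sketch: compare locally installed package versions against the
# versions reported to work. Package names follow the list above.
from importlib.metadata import version, PackageNotFoundError

EXPECTED = {
    "transformers": "4.41.2",
    "torch": "2.4.1",
    "torchvision": "0.19.1",
    "sentencepiece": "0.2.0",
}

def check_versions(expected):
    """Return {package: (installed, expected)} for every mismatch."""
    mismatches = {}
    for pkg, want in expected.items():
        try:
            have = version(pkg)
        except PackageNotFoundError:
            have = None  # package not installed at all
        if have != want:
            mismatches[pkg] = (have, want)
    return mismatches

if __name__ == "__main__":
    for pkg, (have, want) in check_versions(EXPECTED).items():
        print(f"{pkg}: installed={have}, expected={want}")
```

An empty result means every listed package matches the reference versions.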

Thanks!

Regards,
Jing

Hi Jing,

I am currently using the 8.0.10 premium version of NeuroPilot. In this and earlier versions, I have mtk_llm_sdk 3.4.2 and 2.8.2 only; I am unable to fetch version 2.5.3.
I could find mtk-converter 8.13.0 and mtk-quantization 8.2.0 in the NeuroPilot 8.0.7 version of the documentation.

If it is possible at your end, could you please share mtk_llm_sdk version 2.5.3 with me so I can continue this task?

Regards,
Nimesh

Hi Nimesh,

The mtk_llm_sdk 2.5.3 installation package is included in the toolkit you’re using. You can find it at the following path:

GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1/mtk_llm_sdk/mtk_llm_sdk-2.5.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

Additionally, for the Llama 3.2-1B model, we recommend prioritizing the Instruct version available here:

https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct

This is the version we have tested and confirmed to work without issues.

Regards,
Jing

Hi Jing,

Thank you so much for your help. I was able to create the calibration datasets, run PTQ, and fix the shapes.

The 4th step was to check whether the model runs correctly on the PC itself using "bash 4_optional_inference_tflite.sh".

When I ran this command, I got the following response, which is gibberish:

Are such responses expected because the model has only 1B parameters, or have I made a mistake somewhere?

Just wanted to confirm this with you. Please let me know your thoughts.

Regards,
Nimesh

Hi Jing,

I continued with the further steps mentioned in the documentation. The model compilation steps are finished, and I am currently on 5.1.3.6.2 (Pushing Dependencies to the Device).

In step 3b, while trying to create the sentencepiece tokenizer, I am getting the following error:

I am using sentencepiece version 0.2.0

I get the same error whether I use tokenizer.json or tokenizer.model. Can you please help me debug it?

Also, I want to confirm something about step 3c ("After the tokenizer files have been prepared, push them to the device using Android Debug Bridge (adb)."): by tokenizer files, does that mean added_tokens.yaml, vocab.txt, and merges.txt need to be pushed to the device, or do other files need to be pushed as well?

I'll be really grateful if you can help me debug steps 3b and 3c of the documentation.

Regards,
Nimesh

Hi Nimesh,

Regarding the tokenizer files, the required ones are the following three:

  • vocab.txt
  • merges.txt
  • added_tokens.yaml

Have you tried using the prepare_huggingface_tokenizer.py script? If so, are you encountering the same issue?

Example command:

python prepare_huggingface_tokenizer.py ../../post_training_quantize/models/Llama-3.2-1B-Instruct/tokenizer.json

Expected output:

Exported 'vocab.txt' from 'tokenizer.json'
Exported 'merges.txt' from 'tokenizer.json'
Exported 'added_tokens.yaml' from 'tokenizer.json'

Regards,
Jing

Hello Jing,

Thanks for your reply.

I am able to prepare vocab.txt, merges.txt, and added_tokens.yaml, and I have pushed them to my phone along with the other .dla files.

Now, coming to the inference step at 5.1.3.6.3,

I am facing the following issues:

For every .bat file, I am getting a "command not found" error.
Also, I do not have a rooted phone, so I am planning to run this use case via Neuron Adapter/USDK instead of Neuron Runtime.

Can you please help me with these errors?

Regards,
Nimesh

Hi Nimesh,

The reason the command can't be found is that, starting from step 5.1.3.6, you need to switch to a Windows PC to run the instructions. The .bat files are Windows batch scripts, and they require the following tools to already be available on Windows:

  • ndk-build
  • adb

Please open PowerShell (the built-in Windows terminal), navigate to the directory where you placed the GAI toolkit, and run the build script, for example:

cd /path/to/GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1/inference
./build_all_usdk.bat

Regards,
Jing

Hi Jing,

Thank you so much for the reply.
I have run all the build files successfully on Windows 11. I am currently stuck on the last step of inference:

$ ./run_<model_name>.bat

When I ran this command I got the following error:

CANNOT LINK EXECUTABLE "./main": library "libre2.so" not found: needed by /data/local/tmp/llm_sdk/libtokenizer.so in namespace (default)

I tried to copy the library "libre2.so" from "GAIToolkit…/inference/obj/local/arm64-v8a/libre2.so" to the /data/local/tmp/llm_sdk/llama3.2-1B folder on the Android phone, but I still got the same error:


Also, while running inference for this use case on Windows, I have copied only the GAI-Deployment-Toolkit/inference folder from my Linux machine.

Please let me know how to resolve this error, and whether any more folders or files need to be copied over from Linux, since a missing one might be causing it.

Regards,
Nimesh

Hi Nimesh,

Under normal circumstances, there should not be any missing libraries.

When you cd to the following directory:

\path\to\GAI-Deployment-Toolkit-v2.0.6_llama3.2-1b-3b-v0.1\inference

and run:

build_all.bat

the build process should generate all required libraries, including libre2.so.

For reference, after a successful build, the llm_sdk directory on the device contains the following files:

aiot8391p2_64_bsp:/data/local/tmp/llm_sdk # ls -la

total 10761
drwxrwxrwx 3 root  root     3452 2025-05-16 03:40 .
drwxrwx--x 3 shell shell    3452 2025-05-16 03:33 ..
-rw-rw-rw- 1 root  root  1794776 2025-03-24 04:51 libc++_shared.so
-rw-rw-rw- 1 root  root   111848 2025-12-08 04:11 libcommon.so
-rw-rw-rw- 1 root  root   268592 2025-12-08 04:11 libhf-tokenizer.so
-rw-rw-rw- 1 root  root   931592 2025-12-08 04:11 libmtk_llm.so
-rw-rw-rw- 1 root  root  2118336 2025-12-08 04:11 libre2.so
-rw-rw-rw- 1 root  root  2064800 2025-12-08 04:11 libsentencepiece.so
-rw-rw-rw- 1 root  root   741448 2025-12-08 04:11 libtokenizer.so
-rw-rw-rw- 1 root  root  1745440 2025-12-08 04:11 libyaml-cpp.so
-rwxrwxrwx 1 root  root   274312 2025-12-08 04:11 main
-rw-rw-rw- 1 root  root   280304 2025-12-08 04:11 main_batch_gen
-rw-rw-rw- 1 root  root   349312 2025-12-08 04:11 main_medusa
-rwxrwxrwx 1 root  root   298304 2025-12-08 04:11 main_spec_dec

Please verify that the build step completed successfully and that the libraries were generated in the expected output directory.

Hi Jing,

Thank you for your reply.

By running:

build_all_usdk.bat

instead of running all the .bat files manually, I was able to resolve the error. But while running the command:

$ ./run_llama3.2-1b.bat

I am facing the following error:


It appears the error is that no output tokens are generated.

Please help me resolve this error, I am really grateful to you for helping me out with all the issues I am facing.

Regards,
Nimesh

Hi Nimesh,

Could you please share the contents of your config_llama3.2-1b_instruct.yaml file?

Additionally, could you let me know which model files are present on your device? You can provide the output of one of the following commands:

  • ls -la /data/local/tmp/llm_sdk/assets
  • ls -la /data/local/tmp/llm_sdk

This will help me verify whether the issue might be caused by missing files.

Best regards,
Jing

Hi Jing,
Happy New Year.

Thank you for responding.
Please find attached the following:

Do let me know if any further information is required from my side.

Regards,
Nimesh

Hi Nimesh,

Happy New Year!

Please try removing the vocab.txt and merges.txt entries from config_llama3.2-1b_instruct.yaml under tokenizerPath, as shown below:

tokenizerPath:
  - /data/local/tmp/llm_sdk/assets/llama3.2-1b/llama3.2_1b_tokenizer.tiktoken
  - /data/local/tmp/llm_sdk/assets/llama3.2-1b/added_tokens.yaml
  - /data/local/tmp/llm_sdk/assets/llama3.2-1b/vocab.txt #remove
  - /data/local/tmp/llm_sdk/assets/llama3.2-1b/merges.txt #remove

In practice, this model only requires the following two files:

  • llama3.2_1b_tokenizer.tiktoken
  • added_tokens.yaml

After removing vocab.txt and merges.txt, please check whether the issue is resolved.

Best regards,
Jing