Running a custom PyTorch model on the HTP NPU

Hi there,

I am trying to run a PyTorch toy example model on Rubik Pi 3 (QCS6490) using the code found here.

The model compiled and downloaded successfully, and I now have a mobilenetv2.tflite file locally.

Now I want to use the model for inference, so I am using the following code:

import numpy as np
import tflite_runtime.interpreter as tflite

def run_inference(model_path, input_data):
    # Load the interpreter with the QNN delegate
    # (plain tflite.Interpreter(model_path=model_path) runs on the CPU)
    delegate = tflite.load_delegate('/usr/lib/libQnnTFLiteDelegate.so')
    interpreter = tflite.Interpreter(model_path=model_path,
                                     experimental_delegates=[delegate])

    interpreter.allocate_tensors()

    # Get I/O details
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # Set input
    interpreter.set_tensor(input_details[0]['index'], input_data)

    # Run inference
    interpreter.invoke()

    # Get output
    output = interpreter.get_tensor(output_details[0]['index'])

    return output

# Example usage
input_shape = (1, 3, 224, 224)
input_data = np.random.randn(*input_shape).astype(np.float32)

result = run_inference("mobilenetv2.tflite", input_data)

But I get the following error:

File "/home/ubuntu/.pyenv/versions/3.8.18/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 513, in __init__
    self._interpreter.ModifyGraphWithDelegate(
RuntimeError: Restored original execution plan after delegate application failure.

If I don’t specify a delegate and instead create the interpreter with interpreter = tflite.Interpreter(model_path=model_path), it falls back to the CPU and gives the following output:

INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Output shape: (1, 1000)

How can I get the network accelerated on the HTP NPU? Is there anything wrong with my inference script?
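In the meantime, a workaround I'm considering is passing explicit options to the delegate and catching the failure so the script degrades gracefully to CPU instead of crashing. This is only a sketch: the "backend_type" option key and "htp" value are assumptions about the QNN TFLite delegate's option names, so please check the delegate documentation for the SDK version on the board.

```python
def make_interpreter(model_path,
                     delegate_path="/usr/lib/libQnnTFLiteDelegate.so"):
    """Try the QNN delegate first; fall back to plain CPU execution."""
    import tflite_runtime.interpreter as tflite
    try:
        # "backend_type": "htp" is an assumption -- verify the supported
        # option names/values in the QNN delegate docs for your SDK.
        delegate = tflite.load_delegate(delegate_path,
                                        options={"backend_type": "htp"})
        return tflite.Interpreter(model_path=model_path,
                                  experimental_delegates=[delegate])
    except (ValueError, RuntimeError) as err:
        # load_delegate raises ValueError; delegate application failures
        # surface as RuntimeError from Interpreter.__init__
        print(f"QNN delegate unavailable ({err}); falling back to CPU")
        return tflite.Interpreter(model_path=model_path)
```

With this, `run_inference` can call `make_interpreter(model_path)` and the rest of the script stays unchanged whether or not the NPU is reachable.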

Thank you in advance!

P.S. I will post the system info in the next comment.

  • cat /etc/os-release:
PRETTY_NAME="Ubuntu 24.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.3 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
  • qnn-platform-validator --backend all --testBackend:
PF_VALIDATOR: DEBUG: Calling PlatformValidator->setBackend
PF_VALIDATOR: DEBUG: Calling PlatformValidator->isBackendHardwarePresent
PF_VALIDATOR: DEBUG: Calling PlatformValidator->isBackendAvailable
Backend GPU Prerequisites: Present.
PF_VALIDATOR: DEBUG: Calling PlatformValidator->backendCheck
     5.3ms [  INFO ] Found /usr/lib/aarch64-linux-gnu/libOpenCL.so.1
PF_VALIDATOR: DEBUG: Building and running a simple Vector addition gpu program.
Unit Test on the backend GPU: Passed.
QNN is supported for backend GPU on the device.
PF_VALIDATOR: DEBUG: Calling PlatformValidator->setBackend
PF_VALIDATOR: DEBUG: Calling PlatformValidator->isBackendHardwarePresent
PF_VALIDATOR: DEBUG: Calling PlatformValidator->isBackendAvailable
PF_VALIDATOR: DEBUG: Should be able to access atleast one of libraries from : libc.so.6
PF_VALIDATOR: DEBUG: dlOpen successfull for library : libc.so.6
PF_VALIDATOR: DEBUG: Should be able to access atleast one of libraries from : libcdsprpc.so
PF_VALIDATOR: DEBUG: dlOpen successfull for library : libcdsprpc.so
Backend DSP Prerequisites: Present.
PF_VALIDATOR: DEBUG: Calling PlatformValidator->backendCheck
PF_VALIDATOR: DEBUG: Should be able to access atleast one of libraries from : libc.so.6
PF_VALIDATOR: DEBUG: dlOpen successfull for library : libc.so.6
PF_VALIDATOR: DEBUG: Should be able to access atleast one of libraries from : libcdsprpc.so
PF_VALIDATOR: DEBUG: dlOpen successfull for library : libcdsprpc.so
PF_VALIDATOR: DEBUG: Starting calculator test
PF_VALIDATOR: DEBUG: Loading sample stub: libQnnHtpV68CalculatorStub.so
PF_VALIDATOR: DEBUG: Successfully loaded DSP library - 'libQnnHtpV68CalculatorStub.so'.  Setting up pointers.
PF_VALIDATOR: DEBUG: Success in executing the sum function
Unit Test on the backend DSP: Passed.
QNN is supported for backend DSP on the device.
*********** Results Summary ***********
Backend = GPU
{
  Backend Hardware  : Supported
  Backend Libraries : Found
  Library Version   : Not Queried
  Core Version      : Not Queried
  Unit Test         : Passed
}
Backend = DSP
{
  Backend Hardware  : Supported
  Backend Libraries : Found
  Library Version   : Not Queried
  Core Version      : Not Queried
  Unit Test         : Passed
}
Error in saving the results
  • ls -la /usr/lib/libQnn*.so:
-rw-r--r-- 1 root root   198664 Nov 27 04:14 /usr/lib/libQnnChrometraceProfilingReader.so
-rw-r--r-- 1 root root  7091216 Nov 27 04:14 /usr/lib/libQnnCpu.so
-rw-r--r-- 1 root root   235504 Nov 27 04:14 /usr/lib/libQnnCpuNetRunExtensions.so
-rw-r--r-- 1 root root  1107792 Nov 27 04:14 /usr/lib/libQnnDsp.so
-rw-r--r-- 1 root root   256048 Nov 27 04:14 /usr/lib/libQnnDspNetRunExtensions.so
-rw-r--r-- 1 root root    10216 Nov 27 04:14 /usr/lib/libQnnDspV66CalculatorStub.so
-rw-r--r-- 1 root root    42992 Nov 27 04:14 /usr/lib/libQnnDspV66Stub.so
-rw-r--r-- 1 root root   555640 Nov 27 04:14 /usr/lib/libQnnGenAiTransformer.so
-rw-r--r-- 1 root root  2001088 Nov 27 04:14 /usr/lib/libQnnGenAiTransformerCpuOpPkg.so
-rw-r--r-- 1 root root    38888 Nov 27 04:14 /usr/lib/libQnnGenAiTransformerModel.so
-rw-r--r-- 1 root root  4836592 Nov 27 04:14 /usr/lib/libQnnGpu.so
-rw-r--r-- 1 root root   243688 Nov 27 04:14 /usr/lib/libQnnGpuNetRunExtensions.so
-rw-r--r-- 1 root root   210952 Nov 27 04:14 /usr/lib/libQnnGpuProfilingReader.so
-rw-r--r-- 1 root root   597184 Nov 27 04:14 /usr/lib/libQnnHta.so
-rw-r--r-- 1 root root   231400 Nov 27 04:14 /usr/lib/libQnnHtaNetRunExtensions.so
-rw-r--r-- 1 root root  2251408 Nov 27 04:14 /usr/lib/libQnnHtp.so
-rw-r--r-- 1 root root   649304 Nov 27 04:14 /usr/lib/libQnnHtpNetRunExtensions.so
-rw-r--r-- 1 root root  3780216 Nov 27 04:14 /usr/lib/libQnnHtpOptraceProfilingReader.so
-rw-r--r-- 1 root root 77395648 Nov 27 04:14 /usr/lib/libQnnHtpPrepare.so
-rw-r--r-- 1 root root   182280 Nov 27 04:14 /usr/lib/libQnnHtpProfilingReader.so
-rw-r--r-- 1 root root    10216 Nov 27 04:14 /usr/lib/libQnnHtpV68CalculatorStub.so
-rw-r--r-- 1 root root   305152 Nov 27 04:14 /usr/lib/libQnnHtpV68Stub.so
-rw-r--r-- 1 root root    10216 Nov 27 04:14 /usr/lib/libQnnHtpV69CalculatorStub.so
-rw-r--r-- 1 root root   305152 Nov 27 04:14 /usr/lib/libQnnHtpV69Stub.so
-rw-r--r-- 1 root root    10216 Nov 27 04:14 /usr/lib/libQnnHtpV73CalculatorStub.so
-rw-r--r-- 1 root root   313344 Nov 27 04:14 /usr/lib/libQnnHtpV73Stub.so
-rw-r--r-- 1 root root    10216 Nov 27 04:14 /usr/lib/libQnnHtpV75CalculatorStub.so
-rw-r--r-- 1 root root   313344 Nov 27 04:14 /usr/lib/libQnnHtpV75Stub.so
-rw-r--r-- 1 root root    10216 Nov 27 04:14 /usr/lib/libQnnHtpV79CalculatorStub.so
-rw-r--r-- 1 root root   309248 Nov 27 04:14 /usr/lib/libQnnHtpV79Stub.so
-rw-r--r-- 1 root root  2407000 Nov 27 04:14 /usr/lib/libQnnIr.so
-rw-r--r-- 1 root root   235528 Nov 27 04:14 /usr/lib/libQnnJsonProfilingReader.so
-rw-r--r-- 1 root root   556240 Nov 27 04:14 /usr/lib/libQnnLpai.so
-rw-r--r-- 1 root root   329776 Nov 27 04:14 /usr/lib/libQnnLpaiNetRunExtensions.so
-rw-r--r-- 1 root root   232136 Nov 27 04:14 /usr/lib/libQnnLpaiProfilingReader.so
-rw-r--r-- 1 root root  3373648 Nov 27 04:14 /usr/lib/libQnnModelDlc.so
-rw-r--r-- 1 root root   460792 Nov 27 04:14 /usr/lib/libQnnSaver.so
-rw-r--r-- 1 root root  3504888 Nov 27 04:14 /usr/lib/libQnnSystem.so
-rw-r--r-- 1 root root   984072 Nov 27 04:14 /usr/lib/libQnnTFLiteDelegate.so
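Since libQnnTFLiteDelegate.so is present, the load itself succeeds and the failure happens when the graph is handed to the backend. One thing I plan to check is whether the DSP-side skel library is visible to the CDSP; the commands below are a diagnostic sketch (the skel name libQnnHtpV68Skel.so for QCS6490/V68 and the ADSP_LIBRARY_PATH convention are my assumptions, adjust for your image):

```shell
# Any "not found" lines here mean the delegate's own deps don't resolve.
ldd /usr/lib/libQnnTFLiteDelegate.so | grep "not found" || echo "deps ok"

# The HTP backend also needs a DSP-side skel library; check that it
# exists and that the FastRPC search path can see it.
find /usr/lib -name "libQnnHtpV68Skel.so" 2>/dev/null
echo "ADSP_LIBRARY_PATH=${ADSP_LIBRARY_PATH}"
```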

We are currently syncing this issue internally.


@kinkin
Thank you for your reply. I'd appreciate an update on this!