Integrating advanced AI capabilities into Raspberry Pi projects can be both exhilarating and challenging. Recently, I decided to push the boundaries by running an example model on a newly installed Hailo AI HAT+ on my Raspberry Pi 5. Despite following the repository documentation meticulously, my initial attempts didn’t yield the desired results. Undeterred, I rolled up my sleeves, worked through the hurdles, and got the model running. Here’s a detailed account of my experience: the steps I took, the challenges I faced, and some insights to help you on your own AI endeavors.
The Initial Hurdle: Documentation Discrepancies
Starting any AI project requires a clear set of instructions, and I was ready to dive in. However, following the repository documentation led me down a path of frustration as the expected results remained elusive. It became evident that the official guidelines might not cover all the nuances required for a seamless setup, especially when dealing with the intricate interplay between the Raspberry Pi 5 and the Hailo AI HAT+.
Crafting My Own Path: Step-by-Step Setup
Realizing the need to deviate from the standard instructions, I embarked on a self-guided setup process. Here’s a comprehensive breakdown of the steps that ultimately led to a successful model run:
1. Installing the Hailo Runtime
The first crucial step was to install the Hailo runtime, which forms the backbone of the AI HAT+ operations.
wget https://github.com/hailo-ai/hailort/archive/refs/tags/v4.19.0.zip -O hailort-v4.19.0.zip
unzip hailort-v4.19.0.zip
cd hailort-4.19.0
mkdir build
cd build
cmake ..
make
sudo make install
2. Downloading Code Examples
Next, I downloaded the repository containing the application code examples essential for understanding and running the model.
wget https://github.com/hailo-ai/Hailo-Application-Code-Examples/archive/refs/heads/main.zip
unzip main.zip
cd Hailo-Application-Code-Examples-main/runtime/cpp/onnxruntime/
3. Acquiring Required Libraries and Artifacts
To ensure the example application had all necessary dependencies, I fetched the required libraries and artifacts.
wget -q "https://hailo-csdata.s3.eu-west-2.amazonaws.com/resources/hefs/h8/yolov5m_wo_spp.hef" -O "yolov5m_wo_spp.hef"
wget -q "https://hailo-csdata.s3.eu-west-2.amazonaws.com/resources/onnxs/yolov5m_wo_spp_postprocess_v1.onnx" -O "yolov5m_wo_spp_postprocess_v1.onnx"
wget -q "https://github.com/microsoft/onnxruntime/releases/download/v1.18.1/onnxruntime-linux-aarch64-1.18.1.tgz" -O "onnxruntime-linux-aarch64-1.18.1.tgz"
sudo tar -xzf "onnxruntime-linux-aarch64-1.18.1.tgz" -C /opt
rm "onnxruntime-linux-aarch64-1.18.1.tgz"
4. Building the Example Application
With all dependencies in place, I proceeded to build the example application tailored for the Raspberry Pi 5.
mkdir -p build/aarch64
cmake -H. -Bbuild/aarch64
cmake --build build/aarch64
5. Running the Example Code
Finally, I executed the example code to verify the setup.
./build/aarch64/hailo_ort_example -hef=yolov5m_wo_spp.hef -onnx=yolov5m_wo_spp_postprocess_v1.onnx -image=image.jpg
The execution yielded the following output, indicating a successful inference:
-I- num_of_frames: 100
-I-----------------------------------------------
-I- ONNX Name
-I-----------------------------------------------
-I- yolov5m_wo_spp_postprocess_v1
-I-----------------------------------------------
-I- Input shape NCHW: (1, 255, 80, 80)
-I- Input shape NCHW: (1, 255, 20, 20)
-I- Input shape NCHW: (1, 255, 40, 40)
-I-----------------------------------------------
-I- Output shape NCHW: (1, 3, 20, 20)
-I- Output shape NCHW: (1, 3, 40, 40)
-I- Output shape NCHW: (1, 3, 80, 80)
-I-----------------------------------------------
-I-----------------------------------------------
-I- Hailo Network Name
-I-----------------------------------------------
-I- yolov5m_wo_spp
-I-----------------------------------------------
-I- Input shape NHWC: (1, 640, 640, 3)
-I-----------------------------------------------
-I- Output shape NHWC: (1, 80, 80, 0)
-I-----------------------------------------------
[ INFO:0@0.404] global registry_parallel.impl.hpp:96 ParallelBackendRegistry core(parallel): Enabled backends(3, sorted by priority): ONETBB(1000); TBB(990); OPENMP(980)
-I- Started write thread: yolov5m_wo_spp/input_layer1 (640, 640, 3)
-I- Started read thread: yolov5m_wo_spp/yolov5_nms_postprocess (80, 80, 0)
-I- Recv 100/100
-I- Inference finished successfully
-I-----------------------------------------------
-I- Total ONNXRuntime Time: 0.473099 sec
-I- Total Hailo Time: 0.458674 sec
-I- Total Time: 0.931772 sec
-I-----------------------------------------------
-I- Average ONNXRuntime FPS: 211.372
-I- Average Hailo FPS: 218.02
-I- Average FPS: 107.322
-I-----------------------------------------------
-I- ONNXRuntime Latency: 4.73099 ms
-I- Hailo Latency: 4.58674 ms
-I- Total Latency: 9.31772 ms
-I-----------------------------------------------
-I- Total inference run time: 1.41268 sec
The performance metrics were impressive, showcasing high Frames Per Second (FPS) and low latency.
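As a sanity check, the reported averages follow directly from the frame count and the totals in the log; a quick back-of-the-envelope calculation (values copied from the output above):

```python
num_frames = 100
onnx_time = 0.473099   # "Total ONNXRuntime Time", sec
hailo_time = 0.458674  # "Total Hailo Time", sec

print(num_frames / onnx_time)                 # ~211.37 (Average ONNXRuntime FPS)
print(num_frames / hailo_time)                # ~218.02 (Average Hailo FPS)
# The combined figure treats the two stages as running sequentially:
print(num_frames / (onnx_time + hailo_time))  # ~107.32 (Average FPS)
# Per-frame latency is just the inverse, in milliseconds:
print(onnx_time / num_frames * 1000)          # ~4.73 ms
```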
Navigating the Challenges of Integrating Your Own ONNX Model
Successfully running an example model on the Hailo AI HAT+ was a significant achievement. But what about deploying your own ONNX models on this platform? Below is an in-depth look at the primary challenges you are likely to face:
1. ONNX Compatibility Limitations
Issue:
The example provided by Hailo utilized ONNX exclusively for the Non-Maximum Suppression (NMS) task. NMS is a post-processing step commonly used in object detection models like YOLO to eliminate redundant bounding boxes.
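For readers unfamiliar with the step, greedy NMS is simple to express. Here is a minimal NumPy sketch of the standard algorithm (not Hailo's implementation): keep the highest-scoring box, drop every remaining box that overlaps it beyond an IoU threshold, and repeat.

```python
import numpy as np

def iou(box, boxes):
    # Boxes are [x1, y1, x2, y2]; compute IoU of one box against many.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.45):
    order = scores.argsort()[::-1]  # highest score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # Suppress boxes that overlap the kept box too much.
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```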
Implications:
This limitation indicates that directly converting comprehensive models like YOLO, which integrate both inference and post-processing within the ONNX framework, into Hailo’s HEF format is not straightforward. Users attempting to deploy these models may find that essential functionalities are unsupported.
Solution:
To address this, one must separate the inference and post-processing components. While Hailo can handle the inference part effectively, the NMS and other post-processing tasks need to be managed externally or reimplemented within a supported framework. Alternatively, exploring custom implementations or leveraging Hailo’s proprietary post-processing capabilities, if available, could bridge this compatibility gap.
2. Model Conversion Complexities
Issue:
Merely converting a well-trained ONNX model to Hailo’s HEF format often results in unforeseen issues. The conversion process is not a simple one-to-one translation; it requires meticulous adjustments to ensure that the model aligns with Hailo’s internal architecture and operational paradigms.
Implications:
Without proper conversion, models may suffer from performance degradation, incorrect inference results, or complete failure to run. This complexity is compounded by the need to maintain the integrity and accuracy of the original model during the transition.
Solution:
To convert models effectively, one must finetune the model within Hailo’s proprietary framework and perform intricate modifications to adapt the model structure to Hailo’s requirements. This might involve altering layer configurations, modifying activation functions, or restructuring the network to fit within the supported operational parameters of the Hailo AI acceleration chip.
Additionally, leveraging Hailo’s conversion tools and adhering closely to their guidelines can mitigate some of these complexities. Engaging with Hailo’s support channels or community forums for specific conversion issues can also provide valuable insights and solutions.
3. Calibration Data Necessity
Issue:
The conversion process from ONNX to HEF necessitates the use of comprehensive calibration data derived from the original training dataset. Calibration is essential for quantizing the model, which involves mapping floating-point operations to fixed-point representations that Hailo’s hardware can efficiently process.
Implications:
Without access to the complete set of training data, the calibration process cannot be effectively performed, leading to failed conversions. This requirement underscores the critical importance of data management and availability during the deployment phase.
Solution:
Ensuring that the entire training dataset is retained and accessible during model conversion is imperative. In cases where the original data is unavailable, generating a representative calibration dataset that accurately reflects the distribution and characteristics of the training data can serve as a viable alternative. This approach helps maintain model accuracy and performance post-conversion.
Moreover, integrating data handling best practices into the model development lifecycle can prevent such issues, ensuring that calibration data is always available when needed for deployment.
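As a concrete illustration of the “representative calibration dataset” idea, here is a hedged sketch. The function and array layout are my own (NHWC float32, matching the 640×640×3 input reported earlier), not Hailo’s actual API; the Dataflow Compiler documentation specifies the exact format it expects.

```python
import numpy as np

def build_calib_set(frames, n=64, seed=0):
    """Sample n representative frames and preprocess them the way the
    network input expects -- NHWC float32 here; match your model's real
    preprocessing (resize, normalization, channel order)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(frames), size=min(n, len(frames)), replace=False)
    return np.stack([frames[i] for i in idx]).astype(np.float32) / 255.0

# Stand-in frames; in practice, load real images from your training set.
frames = [np.random.randint(0, 256, (640, 640, 3), dtype=np.uint8)
          for _ in range(16)]
calib = build_calib_set(frames, n=8)
print(calib.shape)  # (8, 640, 640, 3), ready to hand to the conversion tool
```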
4. Framework Constraints
Issue:
Hailo’s framework imposes certain constraints that limit the direct execution of ONNX models on its hardware. Specifically, the available ONNX Runtime Preview is pinned to ONNX Runtime version 1.11.1, a release that is over two years old. This restricts the use of newer features and optimizations available in more recent ONNX Runtime versions.
Implications:
The reliance on an archaic ONNX Runtime version hampers flexibility, making it challenging to incorporate modern advancements and optimizations in model development. It can lead to compatibility issues with newer ONNX models and limit the overall performance and capabilities of the deployed models.
Solution:
To navigate these constraints, users might need to align their model development processes with the supported ONNX Runtime version. This alignment ensures compatibility and smooth deployment on Hailo’s hardware. Additionally, staying informed about updates from Hailo regarding framework enhancements and runtime support can provide opportunities to leverage newer features as they become available.
For projects that require the latest ONNX features, considering alternative deployment strategies or hardware that supports newer ONNX Runtime versions might be necessary. Engaging with Hailo’s development team or contributing to their community can also influence future updates to better support modern ONNX versions.
5. Additional Considerations
Beyond the primary challenges outlined above, several other factors can influence the success of deploying custom ONNX models on Hailo’s AI HAT+:
- Resource Optimization: Ensuring that the model is optimized for Hailo’s hardware resources, such as memory and compute units, is crucial for achieving desired performance levels.
- Toolchain Familiarity: Gaining proficiency with Hailo’s toolchain, including the Dataflow Compiler and HailoRT, is essential for efficient model conversion and deployment.
- Community and Support: Leveraging community forums, support channels, and official documentation can provide invaluable assistance in troubleshooting and optimizing deployments.
Recommendations
Given these challenges, here are some recommendations based on my experience:
- Finetuning with Hailo’s Framework: To ensure seamless integration, it might be more efficient to train or finetune models directly within Hailo’s ecosystem rather than attempting to convert existing models.
- Model Surgery: For those determined to use existing models, performing detailed modifications to align with Hailo’s internal formats is essential, albeit time-consuming.
- Stay Updated: Keep an eye on updates from Hailo regarding ONNX support and runtime compatibility to leverage improvements and expanded functionalities.
Frequently Asked Questions (FAQ)
To further assist those venturing into similar setups, here’s a concise FAQ based on common queries and experiences:
1. Can I generate a .hef file from an existing ONNX model without additional training?
Answer: Direct conversion of a fully trained ONNX model to HEF is not straightforward. Hailo’s framework typically requires finetuning or model modifications to align with their internal formats. Additionally, calibration data from your training dataset is essential for a successful conversion.
2. Is the Dataflow Compiler (DFC) necessary for converting ONNX models to HEF?
Answer: Yes, the Hailo Dataflow Compiler is required for converting ONNX or TFLite models to HEF. It includes tutorials and step-by-step guidance to facilitate this conversion process.
3. What quantization levels are supported on Hailo’s H8L? Can I achieve FP16 precision?
Answer: Hailo’s quantization primarily operates in 8-bit precision. While mixed precision techniques are employed during model optimization, achieving FP16 on H8L may require specific configurations and is subject to the framework’s constraints.
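To make the 8-bit point concrete, here is a generic sketch of affine quantization (the textbook scheme, not Hailo’s exact implementation): calibration supplies the observed value range, from which a scale and zero point map floats onto uint8.

```python
import numpy as np

def quant_params(calib_min, calib_max, bits=8):
    # Affine (asymmetric) quantization: real = scale * (q - zero_point)
    qmax = 2 ** bits - 1
    scale = (calib_max - calib_min) / qmax
    zero_point = round(-calib_min / scale)
    return scale, zero_point

def quantize(x, scale, zero_point):
    return np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

scale, zp = quant_params(-1.0, 1.0)  # range observed during calibration
x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
q = quantize(x, scale, zp)
print(dequantize(q, scale, zp))  # close to x, within one quantization step
```

This is why calibration data matters: a range that misrepresents the real activations yields a poor scale and degraded accuracy.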
4. How can I measure the inference FPS of my model? Do I need HailoRT for this?
Answer: Measuring the model’s FPS can be effectively done using HailoRT (Hailo Runtime). HailoRT provides tools and metrics to evaluate the performance of your model, including FPS and latency.
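HailoRT’s own tooling aside, a generic timing harness is easy to write. This sketch wraps any per-frame inference callable; the dummy workload here just stands in for a real HailoRT or ONNX Runtime call.

```python
import time

def measure_fps(infer, frames, warmup=5):
    """Time a batch of inference calls; `infer` is any callable that
    processes one frame (e.g. a HailoRT or ONNX Runtime wrapper)."""
    for f in frames[:warmup]:  # warm-up runs are excluded from timing
        infer(f)
    start = time.perf_counter()
    for f in frames:
        infer(f)
    elapsed = time.perf_counter() - start
    # Return throughput (FPS) and average per-frame latency (ms).
    return len(frames) / elapsed, elapsed / len(frames) * 1000

# Dummy workload standing in for a real inference call:
fps, latency_ms = measure_fps(lambda f: sum(f), [list(range(1000))] * 50)
print(f"{fps:.1f} FPS, {latency_ms:.3f} ms/frame")
```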
5. Why do I need to retrain or use images again when converting my model to HEF?
Answer: Calibration is a critical step in the conversion process to HEF, ensuring that the quantization parameters accurately reflect the data distribution. This step typically requires access to your training data to maintain model accuracy post-conversion.
6. Are there any limitations when using ONNX Runtime with Hailo devices for image processing tasks like background removal?
Answer: Yes, the current ONNX Runtime support for Hailo devices is limited. While it may handle specific tasks, comprehensive support for various image processing operations is constrained. For non-real-time tasks, exploring alternative execution providers or leveraging Hailo’s native capabilities is advisable.
7. What should I do if the converted HEF model produces incorrect outputs despite having the same input data as the ONNX model?
Answer: Discrepancies in outputs often indicate issues during the conversion process. Ensure that all steps—parsing, optimization, and compilation—are executed correctly. Pay attention to warnings during conversion, such as normalization requirements, and adjust your model accordingly to align with Hailo’s specifications.
Final Thoughts
Integrating AI into Raspberry Pi projects using Hailo’s AI HAT+ presents a promising avenue for enhancing computational capabilities. While the journey involves navigating through technical challenges and meticulous setups, the rewards in terms of performance and efficiency are substantial. By understanding the intricacies of model conversion, calibration, and framework limitations, you can harness the full potential of Hailo’s hardware to bring your AI projects to life.
Whether you’re a seasoned developer or a passionate hobbyist, the key lies in persistence, continuous learning, and leveraging community support to overcome obstacles. Happy tinkering!
Responses
It was all going so well, then…
[HailoRT] [error] CHECK failed – Driver version (4.20.0) is different from library version (4.19.0)
[HailoRT] [error] Driver version mismatch, status HAILO_INVALID_DRIVER_VERSION(76)
This is all at/beyond the limit of my IT knowledge, so I follow steps verbatim until something doesn’t work, then I am floundering.
It seems that I am not alone.
https://community.hailo.ai/t/issue-hailort-error-check-failed-driver-version-4-20-0-is-different-from-library-version-4-19-0/9617/2
Hey Will, thanks for your comment.
The issue is that the driver and library versions don’t match; it happens when one was updated and the other was not.
Did you try
sudo apt update
sudo apt full-upgrade
sudo apt install hailo-all