The Risk Object Visual Analysis System (ROVAS) is based on convolutional neural networks (CNNs) for its visual object detection task. CNN-based AI/ML models are commonly used in modern computer vision systems, object detection included. Deploying such a model as a functional part of ROVAS requires gathering potentially suitable data, manually labelling the risk objects in the data set, installing suitable tools and frameworks, and training the model and evaluating its maturity. The maturity of the model will be evaluated by the relative development of the achieved accuracy and loss indicators, because comparable systems are mostly non-existent and there are no common, publicly shared datasets for rating CNN models in this specific application domain. Once the visual object detection model is mature, the capabilities of ROVAS will be exposed through an Internet-accessible service interface.
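To make the idea of judging maturity from the development of accuracy and loss more concrete, here is a minimal sketch in Python. It is not the project's actual evaluation procedure: the function names, the three-epoch window and the 1 % tolerance are all assumptions chosen for illustration. The sketch treats a model as "mature" when neither indicator has changed noticeably between the two most recent windows of training epochs.

```python
from typing import Sequence


def relative_change(history: Sequence[float], window: int = 3) -> float:
    """Relative change of the mean of the last `window` values
    compared with the mean of the `window` values before them."""
    if len(history) < 2 * window:
        raise ValueError("need at least 2 * window values")
    recent = sum(history[-window:]) / window
    previous = sum(history[-2 * window:-window]) / window
    if previous == 0:
        return 0.0 if recent == 0 else float("inf")
    return (recent - previous) / abs(previous)


def looks_mature(accuracy: Sequence[float], loss: Sequence[float],
                 window: int = 3, tolerance: float = 0.01) -> bool:
    """Hypothetical maturity rule: both indicators have plateaued,
    i.e. each changed by less than `tolerance` (as a fraction)."""
    acc_change = relative_change(accuracy, window)
    loss_change = relative_change(loss, window)
    return abs(acc_change) < tolerance and abs(loss_change) < tolerance


# Illustrative per-epoch values only, not measured ROVAS results.
accuracy = [0.50, 0.60, 0.70, 0.80, 0.801, 0.802, 0.802, 0.803, 0.803]
loss = [1.0, 0.8, 0.6, 0.5, 0.499, 0.498, 0.498, 0.497, 0.497]
print(looks_mature(accuracy, loss))
```

A rule like this only compares the model against its own earlier state, which matches the situation described above where no external benchmark is available. The window and tolerance would need tuning for a real training run.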

The CNN model type applied here is You Only Look Once (YOLO), as described in [1]. Specifically, we applied the further improved open-source implementation of version 5, available from https://github.com/ultralytics/yolov5. This implementation also includes useful scripting and service provisioning examples, all licensed under GPL-3.0. For the general principles, intended operational details and limitations, see Deliverable 2.4, Detailed Description of the Safety Functionalities and Worker Notifications.
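For readers curious about what manual labelling produces for this kind of model, the YOLOv5 implementation reads one text file per image, with one line per object in the form `class x_center y_center width height`, where the coordinates are normalised to the range 0 to 1. The sketch below converts a pixel bounding box into such a line. The helper name `to_yolo_label` and the example class id 0 for a risk object are assumptions made for illustration; they are not taken from the ROVAS code.

```python
def to_yolo_label(class_id: int,
                  x_min: float, y_min: float,
                  x_max: float, y_max: float,
                  img_w: int, img_h: int) -> str:
    """Convert a pixel bounding box to a YOLO-format label line."""
    if not (0 <= x_min < x_max <= img_w and 0 <= y_min < y_max <= img_h):
        raise ValueError("box must be non-empty and inside the image")
    # Box centre and size, normalised by the image dimensions.
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: class 0, box (100, 200)-(300, 400) in a 1000x800 image.
print(to_yolo_label(0, 100, 200, 300, 400, 1000, 800))
# prints "0 0.200000 0.375000 0.200000 0.250000"
```

Labelling tools usually export this format directly, so a helper like this is mainly useful when annotations arrive as pixel coordinates from another source.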

However, to improve the spatial detection and position determination of the required safety structures, object detection or segmentation could be executed directly on the laser-scanned point clouds. Because processing point clouds requires a vast amount of computing power and storage space, and comes with some processing limitations of its own, this has not been considered at this point of the project, but it is worth investigating later.


————————————————————————————————————

[1] Redmon, J., Divvala, S., Girshick, R., and Farhadi, A., “You Only Look Once: Unified, Real-Time Object Detection”, http://arxiv.org/abs/1506.02640, 2015.