Enable machine learning inference on an Azure IoT Edge device

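AI on the edge is one of the most popular edge scenarios. Implementations of this scenario include image classification, object detection, body, face, and gesture analysis, and image manipulation. This architecture guide describes how to use Azure IoT Edge to support these scenarios.

You can improve AI accuracy by updating the AI model, but in some scenarios the network environment of the edge device is poor. For example, in the wind power and oil industries, equipment might be located in the desert or at sea.

IoT Edge module twins are used to implement dynamically loaded AI models. IoT Edge modules are based on Docker. An IoT Edge module image in an AI environment is typically at least 1 GB, so incrementally updating the AI model matters on narrow-bandwidth networks. That consideration is the main focus of this article. The idea is to create an IoT Edge AI module that can load LiteRT (formerly TensorFlow Lite) or Open Neural Network Exchange (ONNX) object detection models. You can also expose the module as a web API so that other applications or modules can use it.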

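The solution described in this article can help you in these ways:

- Enable AI inference on edge devices.
- Minimize the network cost of deploying and updating AI models on the edge. The solution can save money for you or your customers, especially in narrow-bandwidth network environments.
- Create and manage an AI model repository in the local storage of an IoT Edge device.
- Achieve almost zero downtime when the edge device switches AI models.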

TensorFlow and LiteRT are trademarks of Google Inc. No endorsement is implied by the use of these marks.

Architecture

Download a Visio file of this architecture.

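Dataflow

1. The AI model is uploaded to Azure Blob Storage or a web service. The model can be a pretrained LiteRT or ONNX model, or a model created in Azure Machine Learning. The IoT Edge module can access this model and download it to the edge device later. If you need better security, consider using private endpoint connections between Blob Storage and the edge device.
2. Azure IoT Hub automatically syncs device module twins with AI model information. The sync occurs even if IoT Edge has been offline. (In some cases, IoT devices connect to networks at scheduled hourly, daily, or weekly times to save power or reduce network traffic.)
3. The loader module monitors module twin updates through an API. When it detects an update, it gets the SAS token for the machine learning model and then downloads the AI model.
   - For more information, see Create SAS token for a container or blob.
   - You can use the ExpiresOn property to set the expiration date of resources. If your device will be offline for a long time, you can extend the expiration time.
4. The loader module saves the AI model in the shared local storage of the IoT Edge module. You need to configure the shared local storage in the IoT Edge deployment JSON file.
5. The loader module loads the AI model from local storage through the LiteRT or ONNX API.
6. The loader module starts a web API that receives a binary photo through a POST request and returns the results as JSON.

To update the AI model, upload the new version to Blob Storage and sync the device module twins again for an incremental update. You don't need to update the whole IoT Edge module image.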
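On the cloud side, announcing a new model comes down to setting a few desired properties on the module twin. The following is a minimal sketch, not the article's own tooling. It assumes the property names FileName, DownloadUrl, and ContentMD5 that the twin patch handler later in this article reads, and it assumes the hash is sent as a hex digest. The function name and the URL in the usage note are hypothetical.

```python
import hashlib


def build_model_twin_patch(zip_bytes, file_name, download_url):
    """Build the desired-properties patch that announces a new model ZIP file.

    The key names mirror the ones the twin patch handler reads.
    Sending the MD5 as a hex digest is an assumption of this sketch.
    """
    return {
        "FileName": file_name,          # can carry version information, for example 1.0
        "DownloadUrl": download_url,    # blob URL, typically with a SAS token appended
        "ContentMD5": hashlib.md5(zip_bytes).hexdigest(),
    }
```

For example, `build_model_twin_patch(data, "tiny-yolov3-11-1.0.zip", "https://example.invalid/models/tiny-yolov3-11-1.0.zip")` returns a dictionary that can be written under `properties.desired` of the module twin.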

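Scenario details

In this solution, an IoT Edge module downloads an AI model and then enables machine learning inference. You can use pretrained LiteRT or ONNX models in this solution.

LiteRT

- A .tflite file is a pretrained AI model. You can download one from TensorFlow.org. It's a generic AI model that you can use in cross-platform applications, such as iOS and Android. LiteRT supports models from TensorFlow, PyTorch, JAX, and Keras. For more information about metadata and associated fields (for example, labels.txt), see Read the metadata from models.
- An object detection model is trained to detect the presence and location of multiple classes of objects. For example, a model might be trained with images that contain various pieces of fruit, a label that specifies the class of fruit each one represents (for example, apple), and data that specifies where each object appears in the image.

  When an image is provided to the model, it outputs a list of the objects it detects, the location of a bounding box for each object, and a score that indicates the confidence of the detection.
- If you want to build or custom-tune an AI model, see LiteRT Model Maker.
- You can get more free pretrained detection models, with various latency and precision characteristics, at Detection Zoo. Each model uses the input and output signatures shown in the following code samples.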
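To make the output description concrete, here is a small sketch of how the three outputs (boxes, classes, scores) can be turned into a result list. It assumes boxes arrive as [top, left, bottom, right] values normalized to the range 0 to 1, which matches the comment in the LiteRT sample later in this article. The function name and the 0.5 default threshold are illustrative choices.

```python
def select_detections(boxes, classes, scores, labels,
                      image_width, image_height, min_score=0.5):
    """Keep detections whose score is above min_score and scale boxes to pixels.

    boxes   : list of [top, left, bottom, right], each value in 0..1 (assumed)
    classes : list of class indices into `labels`
    scores  : list of confidence scores in 0..1
    """
    selected = []
    for box, class_id, score in zip(boxes, classes, scores):
        if score <= min_score:
            continue  # low-confidence detection, skip it
        top, left, bottom, right = box
        selected.append({
            "Class": labels[int(class_id)],
            "Score": float(score),
            "Location": {
                "top": round(top * image_height),
                "left": round(left * image_width),
                "bottom": round(bottom * image_height),
                "right": round(right * image_width),
            },
        })
    return selected
```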

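ONNX

ONNX is an open-standard format for representing machine learning models. It's supported by a community of partners who have implemented it in many frameworks and tools.

- ONNX supports tools for building and deploying models and for accomplishing other tasks. For more information, see Supported ONNX tools.
- You can use ONNX Runtime to run ONNX pretrained models. For information about pretrained models, see ONNX Model Zoo.
- For this scenario, you can use an object detection and image segmentation model: Tiny YOLOv3.

The ONNX community provides tools to help you create and deploy your deep learning model.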

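Download trained AI models

To download trained AI models, we recommend that you use device twins to receive a notification when a new model is ready. Even if the device is offline, the message can be cached in IoT Hub until the edge device comes back online. The message is synchronized automatically.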
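Registering for twin notifications could look like the following sketch. It assumes the azure-iot-device Python SDK (the async `IoTHubModuleClient`) and a coroutine handler such as the `twin_patch_handler` shown later in this article. Reading the full twin once at startup is an assumption of this sketch for picking up changes made while the module was offline. `start_twin_listener` and `main` are hypothetical names.

```python
import asyncio


async def start_twin_listener(client, handler):
    """Connect, register `handler` for desired-property patches, and
    replay the current desired properties once."""
    await client.connect()
    # Called by the SDK each time the desired properties change.
    client.on_twin_desired_properties_patch_received = handler
    # Catch up on model information that changed while the module was offline.
    twin = await client.get_twin()
    desired = twin.get("desired", {})
    if "DownloadUrl" in desired:
        await handler(desired)


async def main(handler):
    # Imported here so the helper above stays usable without the SDK installed.
    from azure.iot.device.aio import IoTHubModuleClient

    client = IoTHubModuleClient.create_from_edge_environment()
    await start_twin_listener(client, handler)
    try:
        await asyncio.Event().wait()  # keep the module running
    finally:
        await client.disconnect()
```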

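The following Python code sample handles device twin notifications and then downloads the AI model in a ZIP file. It also performs further operations on the downloaded file.

The code performs these tasks:

- Receive the device twin notification. The notification includes the file name, the file download address, and the MD5 hash of the file. (In the file name, you can include version information, like 1.0.)
- Download the AI model as a ZIP file to local storage.
- Optionally, perform an MD5 checksum. MD5 verification helps detect ZIP files that were corrupted or altered during network transmission.
- Unzip the ZIP file and save it locally.
- Send a notification to IoT Hub or a routing message to report that the new AI model is ready.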

import json

import requests

# Assumed to be defined elsewhere in the module: `client` (the connected IoT Edge
# module client) and the helpers content_encoding, get_filename_and_ext, and unzip_file.

STORAGE_DIR = "/iotedge/storage/"

# define behavior for receiving a twin patch
async def twin_patch_handler(patch):
    try:
        print("######## The data in the desired properties patch was: %s" % patch)
        # a patch that lacks any of these properties doesn't describe a new model
        if not all(key in patch for key in ("FileName", "DownloadUrl", "ContentMD5")):
            print("######## Patch has no complete AI model information, ignoring it")
            return
        FileName = patch["FileName"]
        DownloadUrl = patch["DownloadUrl"]
        ContentMD5 = patch["ContentMD5"]
        FilePath = STORAGE_DIR + FileName

        # download AI model
        r = requests.get(DownloadUrl, timeout=60)
        r.raise_for_status()  # don't save an HTTP error page as the model
        with open(FilePath, 'wb') as ffw:
            ffw.write(r.content)
        print("######## download AI Model Succeeded.")
        print("######## AI Model File: " + FilePath)

        # MD5 checksum
        md5str = content_encoding(FilePath)
        if md5str == ContentMD5:
            print("######## New AI Model MD5 checksum succeeded")
            # decompressing the ZIP file into a folder named after the ZIP file
            unZipSrc = FilePath
            filenamenoext = get_filename_and_ext(unZipSrc)[0]
            targetDir = STORAGE_DIR + filenamenoext
            unzip_file(unZipSrc, targetDir)

            # ONNX
            model_file = "tiny-yolov3-11.onnx"
            labelmap_file = "coco_classes.txt"

            # LiteRT
            # model_file = "ssd_mobilenet_v1_1_metadata_1.tflite"
            # labelmap_file = "labelmap.txt"

            # message to module
            if client is not None:
                print("######## Send AI Model Info AS Routing Message")
                # paths are relative to the shared storage folder
                data = json.dumps({
                    "local_model_path": filenamenoext + "/" + model_file,
                    "local_labelmap_path": filenamenoext + "/" + labelmap_file,
                })
                await client.send_message_to_output(data, "DLModelOutput")
                # update the reported properties
                reported_properties = {"LatestAIModelFileName": FileName}
                print("######## Setting reported LatestAIModelName to {}".format(reported_properties["LatestAIModelFileName"]))
                await client.patch_twin_reported_properties(reported_properties)
        else:
            print("######## New AI Model MD5 checksum failed")

    except Exception as ex:
        print("Unexpected error in twin_patch_handler: %s" % ex)
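The handler calls three helpers that the article doesn't show: `content_encoding`, `get_filename_and_ext`, and `unzip_file`. The following is one possible sketch of them using only the standard library. Their behavior is inferred from how the handler uses them, so treat the details as assumptions. In particular, the hash is returned as a hex digest by default; the optional `as_base64` flag is an addition of this sketch for cases where the twin carries a Base64 value such as the Content-MD5 property of a blob.

```python
import base64
import hashlib
import os
import zipfile


def content_encoding(file_path, as_base64=False, chunk_size=1024 * 1024):
    """Return the MD5 of a file, read in chunks so large models fit in memory."""
    md5 = hashlib.md5()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    if as_base64:
        return base64.b64encode(md5.digest()).decode("ascii")
    return md5.hexdigest()


def get_filename_and_ext(file_path):
    """Return (name, extension) of the file, without its directory."""
    return os.path.splitext(os.path.basename(file_path))


def unzip_file(zip_src, target_dir):
    """Extract the ZIP archive into target_dir, creating the folder if needed."""
    os.makedirs(target_dir, exist_ok=True)
    with zipfile.ZipFile(zip_src) as archive:
        archive.extractall(target_dir)
```

With these definitions, a file named `tiny-yolov3-11-1.0.zip` is extracted into a folder named `tiny-yolov3-11-1.0` under the storage directory, which is the layout the handler's routing message assumes.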

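Inference

After you download the AI model, the next step is to use the model on the edge device. You can dynamically load the model and perform object detection on edge devices. The following code example shows how to use the LiteRT AI model to detect objects on edge devices.

The code performs these tasks:

- Dynamically load the LiteRT AI model.
- Perform image standardization.
- Detect objects.
- Compute detection scores.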

import json

import cv2
import numpy as np
import tensorflow as tf

# AI_Model_Path is assumed to be a helper defined elsewhere in the module that
# returns the paths of the current model and label map.

class InferenceProcedure():

    def detect_object(self, imgBytes):

        results = []
        try:
            model_full_path = AI_Model_Path.Get_Model_Path()
            if model_full_path == "":
                raise Exception("PLEASE SET AI MODEL FIRST")
            if model_full_path.endswith('.tflite'):
                # load whichever model is currently configured
                interpreter = tf.lite.Interpreter(model_path=model_full_path)
                interpreter.allocate_tensors()
                input_details = interpreter.get_input_details()
                output_details = interpreter.get_output_details()
                input_shape = input_details[0]['shape']  # [1, height, width, 3]

                # bytes to numpy.ndarray
                im_arr = np.frombuffer(imgBytes, dtype=np.uint8)
                img = cv2.imdecode(im_arr, flags=cv2.IMREAD_COLOR)
                if img is None:
                    raise ValueError("image bytes could not be decoded")
                im_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
                # cv2.resize expects (width, height)
                im_rgb = cv2.resize(im_rgb, (int(input_shape[2]), int(input_shape[1])))
                input_data = np.expand_dims(im_rgb, axis=0)

                interpreter.set_tensor(input_details[0]['index'], input_data)
                interpreter.invoke()
                detection_boxes = interpreter.get_tensor(output_details[0]['index'])
                detection_classes = interpreter.get_tensor(output_details[1]['index'])
                detection_scores = interpreter.get_tensor(output_details[2]['index'])
                num_boxes = interpreter.get_tensor(output_details[3]['index'])

                # read the label map and drop placeholder entries
                with open(AI_Model_Path.Get_Labelmap_Path()) as labels_file:
                    label_names = [line.rstrip('\n') for line in labels_file]
                new_label_names = [name for name in label_names if name != '???']

                for i in range(int(num_boxes[0])):
                    # keep detections with a confidence above 50 percent
                    if detection_scores[0, i] > .5:
                        class_id = int(detection_classes[0, i])
                        class_name = new_label_names[class_id]
                        # Location is top, left, bottom, right
                        result = {
                            'Class': class_name,
                            'Score': str(detection_scores[0, i]),
                            'Location': str(detection_boxes[0, i]),
                        }
                        results.append(result)
                        print(result)
        except Exception as e:
            print("detect_object unexpected error %s " % e)
            raise

        # return results as a JSON array
        return json.dumps(results)

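The following is the ONNX version of the preceding code. The steps are mostly the same. The only difference is how the detection score is handled, because the label map and the model output parameters are different.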

import io
import json
import time

import numpy as np
import onnxruntime as rt
from PIL import Image

# AI_Model_Path is assumed to be a helper defined elsewhere in the module that
# returns the paths of the current model and label map.

class InferenceProcedure():

    def letterbox_image(self, image, size):
        '''resize image with unchanged aspect ratio using padding'''
        iw, ih = image.size
        w, h = size
        scale = min(w/iw, h/ih)
        nw = int(iw*scale)
        nh = int(ih*scale)

        image = image.resize((nw,nh), Image.BICUBIC)
        new_image = Image.new('RGB', size, (128,128,128))
        new_image.paste(image, ((w-nw)//2, (h-nh)//2))
        return new_image

    def preprocess(self, img):
        '''convert a PIL image to a normalized NCHW float32 array'''
        model_image_size = (416, 416)
        boxed_image = self.letterbox_image(img, tuple(reversed(model_image_size)))
        image_data = np.array(boxed_image, dtype='float32')
        image_data /= 255.
        image_data = np.transpose(image_data, [2, 0, 1])
        image_data = np.expand_dims(image_data, 0)
        return image_data

    def detect_object(self, imgBytes):
        results = []
        try:
            model_full_path = AI_Model_Path.Get_Model_Path()
            if model_full_path == "":
                raise Exception("PLEASE SET AI MODEL FIRST")
            if model_full_path.endswith('.onnx'):

                # accept raw image bytes as well as an already decoded PIL image
                if isinstance(imgBytes, (bytes, bytearray)):
                    img = Image.open(io.BytesIO(imgBytes))
                else:
                    img = imgBytes

                # input: the image tensor and the original (height, width)
                image_data = self.preprocess(img)
                image_size = np.array([img.size[1], img.size[0]], dtype=np.float32).reshape(1, 2)

                with open(AI_Model_Path.Get_Labelmap_Path()) as labels_file:
                    labels = labels_file.read().split("\n")

                # Loading ONNX model
                print("loading Tiny YOLO...")
                start_time = time.time()
                sess = rt.InferenceSession(model_full_path)
                print("loaded after", time.time() - start_time, "s")

                input_name00 = sess.get_inputs()[0].name
                input_name01 = sess.get_inputs()[1].name
                pred = sess.run(None, {input_name00: image_data, input_name01: image_size})

                boxes = pred[0]
                scores = pred[1]
                indices = pred[2]

                # each index is (batch, class, box) for one selected detection
                for idx_ in indices[0]:
                    idx_1 = (idx_[0], idx_[2])
                    result = {
                        'Class': labels[idx_[1]],
                        'Score': str(scores[tuple(idx_)]),
                        'Location': str(boxes[idx_1]),
                    }
                    results.append(result)
                    print(result)

        except Exception as e:
            print("detect_object unexpected error %s " % e)
            raise

        # return results as a JSON array
        return json.dumps(results)
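Both inference samples read their paths from an `AI_Model_Path` helper that isn't shown. A minimal sketch of such a helper follows, assuming it keeps the current paths in class-level state and that the paths in the DLModelOutput routing message are relative to /iotedge/storage/. The `Set_Paths` method is a hypothetical name introduced for this sketch. Only `Get_Model_Path` and `Get_Labelmap_Path` appear in the samples.

```python
import posixpath
import threading


class AI_Model_Path:
    """Holds the paths of the model and label map currently in use."""

    _lock = threading.Lock()
    _storage_root = "/iotedge/storage/"  # shared local storage of the module
    _model_path = ""
    _labelmap_path = ""

    @classmethod
    def Set_Paths(cls, local_model_path, local_labelmap_path):
        # Update both paths under one lock so they always change together.
        with cls._lock:
            cls._model_path = posixpath.join(cls._storage_root, local_model_path)
            cls._labelmap_path = posixpath.join(cls._storage_root, local_labelmap_path)

    @classmethod
    def Get_Model_Path(cls):
        with cls._lock:
            return cls._model_path

    @classmethod
    def Get_Labelmap_Path(cls):
        with cls._lock:
            return cls._labelmap_path
```

Because the samples load the model inside `detect_object`, a call to `Set_Paths` takes effect on the next inference request without restarting the module.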

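If your IoT Edge device incorporates the preceding code and features, it has AI image object detection and supports dynamic updates of AI models. If you want the edge module to provide AI functionality to other applications or modules through a web API, you can create a web API in your module.

The Flask framework is one example of a tool that you can use to quickly create an API. You can receive images as binary data, use an AI model for detection, and then return the results in JSON format. For more information, see Flask: Flask Tutorial in Visual Studio Code.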
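A Flask endpoint of that kind might look like the following sketch. The route path `/api/v1/detect` and the `create_app` factory are hypothetical. The `detect` argument stands in for a callable such as `InferenceProcedure().detect_object`, which takes the image bytes and returns a JSON string.

```python
from flask import Flask, jsonify, request


def create_app(detect):
    """Create a Flask app that exposes `detect` over HTTP.

    detect: callable that takes image bytes and returns a JSON string.
    """
    app = Flask(__name__)

    @app.route("/api/v1/detect", methods=["POST"])
    def detect_route():
        img_bytes = request.get_data()  # raw request body, the binary image
        if not img_bytes:
            return jsonify({"error": "empty request body"}), 400
        try:
            payload = detect(img_bytes)
        except Exception as ex:
            return jsonify({"error": str(ex)}), 500
        # detect already returns serialized JSON, so pass it through unchanged
        return app.response_class(response=payload, mimetype="application/json")

    return app
```

Inside the module, the app could be started with something like `create_app(InferenceProcedure().detect_object).run(host="0.0.0.0", port=5000)`. The port is an arbitrary choice here.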

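Contributors

This article is maintained by Microsoft. It was originally written by the following contributors.

Principal author:

Bo Wang | Senior Software Engineer

Other contributor:

Freddy Ayala | Cloud Solution Architect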
