实战：基于深度学习的道路损坏检测-锐单电子商城

点击上方“小白学视觉”，选择加"星标"或“置顶”

重磅干货，第一时间送达

1.简介

由于有利于经济发展和增长，同时带来重要的社会效益，道路基础设施是一项重要的公共资产。路面检查主要基于人类的视觉观察和昂贵机器的定量分析。智能探测器是用记录的图像或视频来检测损坏的最佳替代方案。除了道路INFR在一个结构上，道路损坏检测器也会独立驾驶汽车，以检测一些坑洞或其他干扰，尽量避免它们有用。

2.数据集

这里收集了本项目中使用的数据集。该数据集包括日本、印度和捷克共和国不同国家的道路图像。对于图像，标签的注释是 xml 在文件中，标签是 PASCAL VOC 格式。由于数据集包含来自日本的大部分图像(在之前的版本中，它只包含来自日本的图像)，标签是根据数据来源和日本道路指南确定的。

但是最新的数据集现在包含了其他国家的图像，所以我们只考虑以下标签的损坏来概括。D00：垂直裂缝，D10:水平裂缝，D20:鳄鱼裂缝，D40：坑洼

3.基于深度学习的目标检测

CNN 或者卷积神经网络是所有计算机视觉任务的基石。即使在物体检测的情况下，也使用了从图像中提取物体的模式到特征图（基本上是一个小于图像尺寸的矩阵）的卷积操作。现在，从过去几年开始，我们对物体检测任务进行了大量的研究，我们得到了大量最先进的算法或方法，其中一些简而言之，我们在下面解释。

4.EDA

数据集中图像总数：26620

标签分布

计数每个班 D00 : 6592  D10 : 4446  D20 : 8381  D40:5627

各国标签分布(全数据分析)

捷克数据分析 0 图像数量 2829  1 D00 988  2 D10 399  3 D20 161  4 D40 197  5 标签数量 1745  ************************ ********************************************** 印度数据分析      类别计数 6 图像数量 7706  7 D00 1555  8 D10 68  9 D20 2021  10 D40 3187  11 标签数量 6831  **************************** ****************************************** 日本数据分析 12 图像数量10506 13 D00 4049 14 D10 3979  15 D20 6199  16 D40 2243  17 标签数量 16470  ************************************ ************************************

图像中标签大小的分布

最小标签尺寸：0x1 标签 最大尺寸：704x492

5.关键技术

对象检测现在是一个巨大的主题，相当于一个学期的主题。它由许多算法组成。因此，目标检测算法，目标检测算法分为基于区域的算法等各种类别（RCNN、Fast-RCNN、Faster-RCNN）、基于区域的算法本身是两级检测器的一部分，但我们将在下面简要解释它们，所以我们清楚地提到了它们。让我们从RCNN(基于区域的卷积神经网络)开始。

目标检测算法的基本架构由两部分组成。这部分由一个组成 CNN 它将原始图像信息转换为特征图，在下一部分，不同的算法有不同的技术。因此，在 RCNN 在这种情况下，它使用选择性搜索来获得 ROI(感兴趣区)，也就是说，那个地方可能有不同的对象。大约从每个图像中提取 2000 个区域。它使用这些 ROI 对标签进行分类，并使用两种不同的模型来预测对象的位置。所以这些模型被称为两级检测器。

RCNN 为了克服这些限制，他们提出了一些限制 Fast RCNN。RCNN 计算时间高，因为每个区域分别传递给 CNN，它使用三种不同的模型进行预测。因此，在 Fast RCNN 在中间，每个图像只传输一次 CNN 并提取特征图。选择性搜索用于在这些地图上生成预测。将 RCNN 所有三个模型都组合在一起。

但是 Fast RCNN 选择性搜索仍然使用缓慢，所以计算时间仍然很长。猜猜他们想出了另一个有意义的名字版本，也就是说，更快 RCNN。Faster RCNN 选择性搜索方法取代了区域提议网络，使算法更快。现在让我们转向一些一次性检测器。YOLO 和 SSD 它是一种非常著名的物体检测模型，因为它们在速度和准确性之间提供了很好的权衡

YOLO：在一次评估中，单个神经网络直接从完整的图像中预测边界框和类别概率。由于整个检测管是一个单一的网络，可以直接优化检测性能

SSD（Single Shot Detector）：SSD 方法将边界框的输出空间离散为一组不同纵横比的默认框。离散化后，该方法按特征图位置进行缩放。Single Shot Detector 该网络将具有不同分辨率的多个特征图的预测结合起来，自然地处理各种大小的对象。

6.型号

为了学习基础知识，我们尝试了一些基本而快速的算法来实现以下数据集：

Efficientdet_d0 SSD_mobilenet_v2 YOLOv3

我们使用了第一个和第二个模型tensorflow 模型 zoo训练 yolov3 引用了this。用于评估 mAP(平均精度)，使用 Effectivedet_d0 和 ssd_mobilenet_v2 得到的 mAP 这可能是因为默认配置没有改变学习率、优化器和数据增强。

7.结果

使用 efficicentdet_d0 进行推导

import tensorflow as tf from object_detection.utils import label_map_util from object_detection.utils import config_util from object_detection.utils import visualization_utils as viz_utils from object_detection.builders import model_builder   # Load pipeline config and build a detection model configs = config_util.get_configs_from_pipeline_file('/content/efficientdet_d0_coco17_tpu-32/pipeline.config') model_config = configs['model'] detection_model = model_builder.build(model_config=model_config, is_training=False)   # Restore checkpoint ckpt = tf.compat.v2.train.Checkpoint(model=detection_model) ckpt.restore('/content/drive/MyDrive/efficientdet/checkpoints/ckpt-104').expect_partial()   @tf.function def detect_fn(image):     """Detect objects in image."""       image, shapes = detection_model.preprocess(image)     prediction_dict = detection_model.predict(image, shapes)     detections = detection_model.postprocess(prediction_dict, shapes)       return detections   category_index = label_map_util.create_category_index_from_labelmap('/content/data/label_map.pbtxt',                           use_display_name=True)
                                                                 
for image_path in IMAGE_PATHS:


    print('Running inference for {}... '.format(image_path), end='')


    image_np = load_image_into_numpy_array(image_path)


    input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)


    detections = detect_fn(input_tensor)
    num_detections = int(detections.pop('num_detections'))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections['num_detections'] = num_detections


    # detection_classes should be ints.
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)


    label_id_offset = 1
    image_np_with_detections = image_np.copy()


    viz_utils.visualize_boxes_and_labels_on_image_array(
            image_np_with_detections,
            detections['detection_boxes'],
            detections['detection_classes']+label_id_offset,
            detections['detection_scores'],
            category_index,
            use_normalized_coordinates=True,
            max_boxes_to_draw=200,
            min_score_thresh=.30,
            agnostic_mode=False)


    %matplotlib inline
    fig = plt.figure(figsize = (10,10))
    plt.imshow(image_np_with_detections)
    print('Done')
    plt.show()

使用 SSD_mobilenet_v2 进行推导

(与efficientdet 相同的代码)

YOLOv3 的推导

def func(input_file):
  classes = ['D00', 'D10', 'D20', 'D40']
  alt_names = {'D00': 'lateral_crack', 'D10': 'linear_cracks', 'D20': 'aligator_crakcs', 'D40': 'potholes'}
  # initialize a list of colors to represent each possible class label
  np.random.seed(42)
  COLORS = np.random.randint(0, 255, size=(len(classes), 3),
    dtype="uint8")
  # derive the paths to the YOLO weights and model configuration
  weightsPath = "/content/drive/MyDrive/yolo/yolo-obj_final.weights"
  configPath = "/content/yolov3.cfg"
  # load our YOLO object detector trained on COCO dataset (80 classes)
  # and determine only the *output* layer names that we need from YOLO
  #print("[INFO] loading YOLO from disk...")
  net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)
  ln = net.getLayerNames()
  ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]


  
  # read the next frame from the file
  frame = cv2.imread(input_file)
  (H, W) = frame.shape[:2]


  # construct a blob from the input frame and then perform a forward
  # pass of the YOLO object detector, giving us our bounding boxes
  # and associated probabilities
  blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
    swapRB=True, crop=False)
  net.setInput(blob)
  start = time.time()
  layerOutputs = net.forward(ln)
  end = time.time()
  # initialize our lists of detected bounding boxes, confidences,
  # and class IDs, respectively
  boxes = []
  confidences = []
  classIDs = []


  # loop over each of the layer outputs
  for output in layerOutputs:
    # loop over each of the detections
    for detection in output:
      # extract the class ID and confidence (i.e., probability)
      # of the current object detection
      scores = detection[5:]
      classID = np.argmax(scores)
      confidence = scores[classID]
      # filter out weak predictions by ensuring the detected
      # probability is greater than the minimum probability
      if confidence > 0.3:
        # scale the bounding box coordinates back relative to
        # the size of the image, keeping in mind that YOLO
        # actually returns the center (x, y)-coordinates of
        # the bounding box followed by the boxes' width and
        # height
        box = detection[0:4] * np.array([W, H, W, H])
        (centerX, centerY, width, height) = box.astype("int")
        # use the center (x, y)-coordinates to derive the top
        # and and left corner of the bounding box
        x = int(centerX - (width / 2))
        y = int(centerY - (height / 2))
        # update our list of bounding box coordinates,
        # confidences, and class IDs
        boxes.append([x, y, int(width), int(height)])
        confidences.append(float(confidence))
        classIDs.append(classID)


  # apply non-maxima suppression to suppress weak, overlapping
  # bounding boxes
  idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.3,
    0.25)
  # ensure at least one detection exists
  if len(idxs) > 0:
    # loop over the indexes we are keeping
    for i in idxs.flatten():
      # extract the bounding box coordinates
      (x, y) = (boxes[i][0], boxes[i][1])
      (w, h) = (boxes[i][2], boxes[i][3])
      # draw a bounding box rectangle and label on the frame
      color = [int(c) for c in COLORS[classIDs[i]]]
      cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
      label = classes[classIDs[i]]
      text = "{}: {:.4f}".format(alt_names[label],
        confidences[i])
      cv2.putText(frame, text, (x, y - 5),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)


  cv2_imshow(frame)

下载1：OpenCV-Contrib扩展模块中文版教程

在「小白学视觉」公众号后台回复：扩展模块中文教程，即可下载全网第一份OpenCV扩展模块教程中文版，涵盖扩展模块安装、SFM算法、立体视觉、目标跟踪、生物视觉、超分辨率处理等二十多章内容。

下载2：Python视觉实战项目52讲

在「小白学视觉」公众号后台回复：Python视觉实战项目，即可下载包括图像分割、口罩检测、车道线检测、车辆计数、添加眼线、车牌识别、字符识别、情绪检测、文本内容提取、面部识别等31个视觉实战项目，助力快速学校计算机视觉。

下载3：OpenCV实战项目20讲

在「小白学视觉」公众号后台回复：OpenCV实战项目20讲，即可下载含有20个基于OpenCV实现20个实战项目，实现OpenCV学习进阶。

交流群

欢迎加入公众号读者群一起和同行交流，目前有SLAM、三维视觉、传感器、自动驾驶、计算摄影、检测、分割、识别、医学影像、GAN、算法竞赛等微信群（以后会逐渐细分），请扫描下面微信号加群，备注：”昵称+学校/公司+研究方向“，例如：”张三 + 上海交大 + 视觉SLAM“。请按照格式备注，否则不予通过。添加成功后会根据研究方向邀请进入相关微信群。请勿在群内发送广告，否则会请出群，谢谢理解~

资讯详情

实战：基于深度学习的道路损坏检测

动力学技术KTU1121 USB Type-C 端口保护器的介绍、特性、及应用

实战：基于深度学习的道路损坏检测

动力学技术KTU1121 USB Type-C 端口保护器的介绍、特性、及应用

最近热搜

历史搜索 清除历史记录

历史搜索清除历史记录