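Contents
- I. References
- II. Download the dataset
- III. Convert the dataset
  - 1. Create the folders
  - 2. Convert the txt annotations to xml
  - 3. Split into training and validation sets
- IV. Training
  - 1. Prepare mask.yml
  - 2. Modify models/yolov5l.yaml
  - 3. Create train_mask.py
  - 4. Train
  - 5. The generated model
- V. Inference and validation
  - 1. Create detect_mask.py
  - 2. Run inference on images
  - 3. Run inference on video
    - Download a pedestrian video
    - Run inference
- VI. Convert to Rockchip
  - 1. Export the intermediate model
    - Modify models/yolo.py
    - Export
  - 2. Convert the intermediate model to a Rockchip model
    - Copy the intermediate model
    - Create convert_mask.py
  - 3. Run the conversion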
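I. References
- Mask-wearing detection with YOLOv5 (very detailed)
- Object detection: dataset format conversion and train/validation split
II. Download the dataset
Here is the mask dataset the author used: https://pan.baidu.com/s/1Gud8jemSCdjG00TYA74WpQ Extraction code:
The download is mask.zip. Unzipping it gives two folders, one with the images and one with the labels, about 8,000 images in total. The labels are already in txt format (YOLO trains on txt labels), whereas labels are more commonly distributed as xml. The classes are 0: no-mask, 1: mask.
III. Convert the dataset
For this step I recommend reading the blog post "Object detection: dataset format conversion and train/validation split". The dataset used there already has txt labels, but its author, Pao Ge (炮哥), first converts the txt labels to xml and then uses code to convert the xml back to YOLO txt while splitting the training and validation sets. He does not split the txt files directly; by his account, training on directly split txt files produces errors. My approach here differs from both references. I first use Pao Ge's code to convert the YOLO txt labels to xml, then put all the images and labels into a single img folder, and finally split the training and validation sets with my own code.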
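1. Create the folders
Under data/, create a voc_data folder, and inside voc_data create the folders Annotations, JPEGImages and YOLO:
- Annotations: stores the converted xml annotations
- JPEGImages: copy all the images from the unzipped images folder here
- YOLO: copy all the unzipped txt files here
2. Convert the txt annotations to xml
Create yolo_to_voc.py in the data directory. Pay attention to the paths in the main block.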
from xml.dom.minidom import Document
import os
import cv2


# Reference: https://blog.csdn.net/didiaopao/article/details/120022845
def makexml(picPath, txtPath, xmlPath):  # image folder, txt label folder, xml output folder
    """Convert YOLO-format txt annotation files into VOC-format xml annotation files.

    Expects a folder of images, a folder of YOLO txt labels, and an output folder for the xml files.
    """
    dic = {
        '0': "no-mask",  # maps class ids to class names
        '1': "mask",     # must match the classes in your classes.txt, in the same order
    }
    files = os.listdir(txtPath)
    for i, name in enumerate(files):
        xmlBuilder = Document()
        annotation = xmlBuilder.createElement("annotation")  # <annotation> root element
        xmlBuilder.appendChild(annotation)
        print("file:", txtPath + name)  # the last path printed identifies a failing label file
        with open(txtPath + name) as txtFile:
            txtList = txtFile.readlines()
        img = cv2.imread(picPath + name[0:-4] + ".jpg")
        Pheight, Pwidth, Pdepth = img.shape

        folder = xmlBuilder.createElement("folder")  # <folder>
        foldercontent = xmlBuilder.createTextNode("driving_annotation_dataset")
        folder.appendChild(foldercontent)
        annotation.appendChild(folder)

        filename = xmlBuilder.createElement("filename")  # <filename>
        filenamecontent = xmlBuilder.createTextNode(name[0:-4] + ".jpg")
        filename.appendChild(filenamecontent)
        annotation.appendChild(filename)

        size = xmlBuilder.createElement("size")  # <size>
        width = xmlBuilder.createElement("width")  # <size><width>
        widthcontent = xmlBuilder.createTextNode(str(Pwidth))
        width.appendChild(widthcontent)
        size.appendChild(width)
        height = xmlBuilder.createElement("height")  # <size><height>
        heightcontent = xmlBuilder.createTextNode(str(Pheight))
        height.appendChild(heightcontent)
        size.appendChild(height)
        depth = xmlBuilder.createElement("depth")  # <size><depth>
        depthcontent = xmlBuilder.createTextNode(str(Pdepth))
        depth.appendChild(depthcontent)
        size.appendChild(depth)
        annotation.appendChild(size)

        for j in txtList:
            oneline = j.strip().split(" ")  # class cx cy w h (normalised)
            obj = xmlBuilder.createElement("object")  # <object>

            picname = xmlBuilder.createElement("name")  # <name>
            namecontent = xmlBuilder.createTextNode(dic[oneline[0]])
            picname.appendChild(namecontent)
            obj.appendChild(picname)

            pose = xmlBuilder.createElement("pose")  # <pose>
            posecontent = xmlBuilder.createTextNode("Unspecified")
            pose.appendChild(posecontent)
            obj.appendChild(pose)

            truncated = xmlBuilder.createElement("truncated")  # <truncated>
            truncatedContent = xmlBuilder.createTextNode("0")
            truncated.appendChild(truncatedContent)
            obj.appendChild(truncated)

            difficult = xmlBuilder.createElement("difficult")  # <difficult>
            difficultcontent = xmlBuilder.createTextNode("0")
            difficult.appendChild(difficultcontent)
            obj.appendChild(difficult)

            bndbox = xmlBuilder.createElement("bndbox")  # <bndbox>
            xmin = xmlBuilder.createElement("xmin")
            mathData = int(((float(oneline[1])) * Pwidth + 1) - (float(oneline[3])) * 0.5 * Pwidth)
            xminContent = xmlBuilder.createTextNode(str(mathData))
            xmin.appendChild(xminContent)
            bndbox.appendChild(xmin)
            ymin = xmlBuilder.createElement("ymin")
            mathData = int(((float(oneline[2])) * Pheight + 1) - (float(oneline[4])) * 0.5 * Pheight)
            yminContent = xmlBuilder.createTextNode(str(mathData))
            ymin.appendChild(yminContent)
            bndbox.appendChild(ymin)
            xmax = xmlBuilder.createElement("xmax")
            mathData = int(((float(oneline[1])) * Pwidth + 1) + (float(oneline[3])) * 0.5 * Pwidth)
            xmaxContent = xmlBuilder.createTextNode(str(mathData))
            xmax.appendChild(xmaxContent)
            bndbox.appendChild(xmax)
            ymax = xmlBuilder.createElement("ymax")
            mathData = int(((float(oneline[2])) * Pheight + 1) + (float(oneline[4])) * 0.5 * Pheight)
            ymaxContent = xmlBuilder.createTextNode(str(mathData))
            ymax.appendChild(ymaxContent)
            bndbox.appendChild(ymax)
            obj.appendChild(bndbox)

            annotation.appendChild(obj)

        with open(xmlPath + name[0:-4] + ".xml", 'w', encoding='utf-8') as f:
            xmlBuilder.writexml(f, indent='\t', newl='\n', addindent='\t', encoding='utf-8')


if __name__ == "__main__":
    picPath = "voc_data/JPEGImages/"   # image folder; the trailing / is required
    txtPath = "voc_data/YOLO/"         # txt label folder; the trailing / is required
    xmlPath = "voc_data/Annotations/"  # xml output folder; the trailing / is required
    makexml(picPath, txtPath, xmlPath)
Run the conversion:
python yolo_to_voc.py
After the conversion, data/voc_data/Annotations contains the xml files.
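To make the coordinate arithmetic in yolo_to_voc.py easier to follow, here is a minimal sketch of the same box conversion pulled out into a standalone function. The function name `yolo_to_voc_box` and the sample numbers are illustrative only; the formula, including the +1 offset on the centre, is copied from the script.

```python
def yolo_to_voc_box(cx, cy, w, h, img_w, img_h):
    """Convert a normalised YOLO box (centre x/y, width, height) into
    integer VOC pixel corners (xmin, ymin, xmax, ymax).

    Same arithmetic as yolo_to_voc.py: scale the centre to pixels, add 1,
    then subtract or add half of the box size in pixels.
    """
    xmin = int(cx * img_w + 1 - w * 0.5 * img_w)
    ymin = int(cy * img_h + 1 - h * 0.5 * img_h)
    xmax = int(cx * img_w + 1 + w * 0.5 * img_w)
    ymax = int(cy * img_h + 1 + h * 0.5 * img_h)
    return xmin, ymin, xmax, ymax


# A box centred in a 640x480 image, a quarter of the width and half the height.
box = yolo_to_voc_box(0.5, 0.5, 0.25, 0.5, img_w=640, img_h=480)
print(box)  # (241, 121, 401, 361)
```

Checking one or two labels by hand like this is a quick way to confirm that the xml boxes land where the txt labels say they should.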
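Note: one label file in the dataset is wrong (its class is none) and must be deleted, otherwise the script raises an error. The print call at the top of the loop in makexml outputs each label file's path as it is processed, so the last path printed before the error identifies the bad file.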
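Instead of waiting for the conversion to crash, the labels can be checked up front. The following is a rough sketch, not part of the original workflow: `find_bad_labels` is a hypothetical helper that reports every line whose class id is not one of the ids in the script's dictionary or that does not have exactly five fields.

```python
import os

VALID_CLASS_IDS = {"0", "1"}  # 0: no-mask, 1: mask, as in yolo_to_voc.py


def find_bad_labels(txt_dir, valid_ids=VALID_CLASS_IDS):
    """Return (file name, line number, line text) for every suspicious label line."""
    bad = []
    for name in sorted(os.listdir(txt_dir)):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(txt_dir, name), encoding="utf-8") as f:
            for line_no, line in enumerate(f, start=1):
                fields = line.split()
                if not fields:
                    continue  # ignore blank lines
                # A YOLO label line is: class cx cy w h
                if fields[0] not in valid_ids or len(fields) != 5:
                    bad.append((name, line_no, line.strip()))
    return bad


if __name__ == "__main__":
    label_dir = "voc_data/YOLO"  # the label folder used by yolo_to_voc.py
    if os.path.isdir(label_dir):
        for name, line_no, text in find_bad_labels(label_dir):
            print(f"{name}:{line_no}: {text}")
```

Run it from the data directory before yolo_to_voc.py and delete or fix whatever it reports.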
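3. Split into training and validation sets
Copy the contents of voc_data/Annotations and voc_data/JPEGImages into data/img:
cp -rp data/voc_data/Annotations/* data/img/
cp -rp data/voc_data/JPEGImages/* data/img/
Copy the splitting program folder process-date into the data directory. Update the classes in create_all.py, then run:
python create_all.py
This code splits the data into training and validation sets automatically and filters out bad images and annotation files.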
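The contents of create_all.py are not shown here, so the following is only a sketch of what a reproducible split could look like, not the author's code. The function name `split_train_val`, the 20% validation ratio and the fixed seed are all assumptions made for illustration.

```python
import random


def split_train_val(stems, val_ratio=0.2, seed=0):
    """Shuffle file stems deterministically and split them into (train, val)."""
    stems = sorted(stems)          # sort first so the result does not depend on input order
    rng = random.Random(seed)      # private RNG, leaves the global random state untouched
    rng.shuffle(stems)
    n_val = int(len(stems) * val_ratio)
    return stems[n_val:], stems[:n_val]


# Hypothetical file stems; in practice these would come from the names in data/img.
train, val = split_train_val([f"img_{i:04d}" for i in range(10)])
print(len(train), len(val))  # 8 2
```

Splitting on file stems rather than full names keeps each image together with its annotation file.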
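IV. Training
We train from the yolov5l.pt pretrained weights.
1. Prepare mask.yml
Create mask.yml in the data directory. It specifies the dataset paths and the classes.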
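The file itself is not reproduced here. As a rough guide, a YOLOv5 dataset file normally carries the keys `train`, `val`, `nc` and `names`; the sketch below prints one. The two directory paths are placeholders I made up, so substitute the folders your split actually produced. The class order follows the dictionary in yolo_to_voc.py.

```python
def build_data_yaml(train_dir, val_dir, names):
    """Build the text of a YOLOv5 dataset yaml with train/val paths and class names."""
    lines = [
        f"train: {train_dir}",
        f"val: {val_dir}",
        f"nc: {len(names)}",  # number of classes
        "names: [" + ", ".join(f"'{n}'" for n in names) + "]",
    ]
    return "\n".join(lines) + "\n"


# Placeholder paths (assumptions), class order 0: no-mask, 1: mask.
text = build_data_yaml(
    "data/mask/images/train",
    "data/mask/images/val",
    ["no-mask", "mask"],
)
print(text)
```

Save the printed text as data/mask.yml once the paths are correct.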
2. Modify models/yolov5l.yaml
Change the detection classes.
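The exact edit is not shown, but in the stock YOLOv5 model files the class count is the `nc:` line near the top, which would need to become 2 for no-mask and mask. The sketch below is one way to make that change on the file's text; `set_num_classes` is a hypothetical helper and the sample text is a shortened stand-in for the real file.

```python
import re


def set_num_classes(yaml_text, nc):
    """Replace the value on the first 'nc:' line, keeping any trailing comment."""
    new_text, count = re.subn(
        r"^nc:\s*\d+", f"nc: {nc}", yaml_text, count=1, flags=re.MULTILINE
    )
    if count == 0:
        raise ValueError("no 'nc:' line found")
    return new_text


sample = "# parameters\nnc: 80  # number of classes\ndepth_multiple: 1.0\n"
print(set_num_classes(sample, 2))
```

Editing the line by hand works just as well; the point is only that `nc` must agree with the number of names in mask.yml.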
3. Create train_mask.py
cp train.py train_mask.py
The key changes are:
- epochs: train for 300 epochs
- batch-size: set to 10; adjust it to your GPU's capability (the default is 16)
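The same settings can also be passed on the command line instead of being edited into the script defaults. The sketch below assembles such a command using the standard YOLOv5 train.py flags (`--weights`, `--cfg`, `--data`, `--epochs`, `--batch-size`); the default values mirror the files named in this article, and `build_train_command` is just an illustrative helper.

```python
import shlex


def build_train_command(script="train_mask.py", weights="yolov5l.pt",
                        cfg="models/yolov5l.yaml", data="data/mask.yml",
                        epochs=300, batch_size=10):
    """Return the training command as an argument list."""
    return [
        "python", script,
        "--weights", weights,
        "--cfg", cfg,
        "--data", data,
        "--epochs", str(epochs),
        "--batch-size", str(batch_size),
    ]


cmd = build_train_command()
print(shlex.join(cmd))
# python train_mask.py --weights yolov5l.pt --cfg models/yolov5l.yaml --data data/mask.yml --epochs 300 --batch-size 10
```

Lower `batch_size` if the GPU runs out of memory.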
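4. Train
python train_mask.py
5. The generated model
V. Inference and validation
1. Create detect_mask.py
cp detect.py detect_mask.py
Modify the following settings.
2. Run inference on images
python detect_mask.py --source data/test/
3. Run inference on video
Download a pedestrian video
lux https://www.bilibili.com/video/BV1Q54y1L74D
Run inference
python detect_mask.py --source D:\ai\mask.mp4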
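VI. Convert to Rockchip
1. Export the intermediate model
Modify models/yolo.py
Edit the commented lines.
Export
cp export.py export_mask.py
Modify it as follows, then run:
python export_mask.py
The output is generated in the same directory as best.pt.
2. Convert the intermediate model to a Rockchip model
RK tool path:
/cnn/rknn/rknn-toolkit-master/examples/pytorch/yolov5
Copy the intermediate model
Move the exported intermediate model into the RK tool path:
mv best.torchscript.pt /cnn/rknn/rknn-toolkit-master/examples/pytorch/yolov5/mask.torchscript.pt
Create convert_mask.py
The code is below. The key settings are:
- PT_MODEL = 'mask.torchscript.pt'
- RKNN_MODEL = 'mask.rknn'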
import os
import numpy as np
import cv2
from rknn.api import RKNN

PT_MODEL = 'mask.torchscript.pt'
RKNN_MODEL = 'mask.rknn'
IMG_PATH = 'bus.jpg'
DATASET = './dataset.txt'

# QUANTIZE_ON = False
QUANTIZE_ON = True

BOX_THRESH = 0.5
NMS_THRESH = 0.6
IMG_SIZE = 640

# Class names in training order (0: no-mask, 1: mask).
# The listing originally had ("person"), which is a plain string rather than a tuple.
CLASSES = ("no-mask", "mask")


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def xywh2xyxy(x):
    # Convert [x, y, w, h] to [x1, y1, x2, y2]
    y = np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
    y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
    y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
    y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
    return y


def process(input, mask, anchors):
    anchors = [anchors[i] for i in mask]
    grid_h, grid_w = map(int, input.shape[0:2])

    box_confidence = sigmoid(input[..., 4])
    box_confidence = np.expand_dims(box_confidence, axis=-1)

    box_class_probs = sigmoid(input[..., 5:])

    box_xy = sigmoid(input[..., :2])*2 - 0.5

    col = np.tile(np.arange(0, grid_w), grid_w).reshape(-1, grid_w)
    row = np.tile(np.arange(0, grid_h).reshape(-1, 1), grid_h)
    col = col.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
    row = row.reshape(grid_h, grid_w, 1, 1).repeat(3, axis=-2)
    grid = np.concatenate((col, row), axis=-1)
    box_xy += grid
    box_xy *= int(IMG_SIZE/grid_h)

    box_wh = pow(sigmoid(input[..., 2:4])*2, 2)
    box_wh = box_wh * anchors

    box = np.concatenate((box_xy, box_wh), axis=-1)

    return box, box_confidence, box_class_probs


def filter_boxes(boxes, box_confidences, box_class_probs):
    """Filter boxes with box threshold. It's a bit different with origin yolov5 post process!

    # Arguments
        boxes: ndarray, boxes of objects.
        box_confidences: ndarray, confidences of objects.
        box_class_probs: ndarray, class_probs of objects.

    # Returns
        boxes: ndarray, filtered boxes.
        classes: ndarray, classes for boxes.
        scores: ndarray, scores for boxes.
    """
    box_classes = np.argmax(box_class_probs, axis=-1)
    box_class_scores = np.max(box_class_probs, axis=-1)
    pos = np.where(box_confidences[..., 0] >= BOX_THRESH)

    boxes = boxes[pos]
    classes = box_classes[pos]
    scores = box_class_scores[pos]

    return boxes, classes, scores


def nms_boxes(boxes, scores):
    """Suppress non-maximal boxes.

    # Arguments
        boxes: ndarray, boxes of objects.
        scores: ndarray, scores of objects.

    # Returns
        keep: ndarray, index of effective boxes.
    """
    x = boxes[:, 0]
    y = boxes[:, 1]
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]

    areas = w * h
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)

        xx1 = np.maximum(x[i], x[order[1:]])
        yy1 = np.maximum(y[i], y[order[1:]])
        xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
        yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])

        w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
        h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
        inter = w1 * h1

        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= NMS_THRESH)[0]
        order = order[inds + 1]
    keep = np.array(keep)
    return keep


def yolov5_post_process(input_data):
    masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    anchors = [[10, 13], [16, 30], [33, 23], [30, 61], [62, 45],
               [59, 119], [116, 90], [156, 198], [373, 326]]

    boxes, classes, scores = [], [], []
    # for ... (the listing is cut off at this point in the source)
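The overlap test inside nms_boxes is an intersection-over-union (IoU) comparison against NMS_THRESH. As a small worked example, the sketch below computes IoU for a single pair of boxes in plain Python. It is a simplified stand-in for illustration, not the vectorised code from the script, and it leaves out the small 0.00001 term the script adds to the intersection width and height.

```python
NMS_THRESH = 0.6  # same threshold value as in convert_mask.py


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


a = (0, 0, 10, 10)
b = (5, 0, 15, 10)   # overlaps the right half of a
overlap = iou(a, b)  # intersection 50, union 150
print(round(overlap, 3))     # 0.333
print(overlap > NMS_THRESH)  # False, so both boxes would be kept
```

A lower-scoring box is dropped only when its IoU with an already kept box exceeds the threshold, so raising NMS_THRESH keeps more overlapping detections.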