
Building Classic Neural Networks Step by Step (2): VGGNet

1. A Brief Introduction to VGGNet

In 2014, a new deep convolutional neural network was jointly developed by the Visual Geometry Group at the University of Oxford and Google DeepMind: VGGNet. It took second place in the classification task of that year's ILSVRC competition and first place in the localization task (GoogLeNet, which won the classification task, was proposed the same year).

VGGNet explored the direct relationship between a convolutional network's depth and its performance: by repeatedly stacking small 3×3 convolution kernels and 2×2 max-pooling layers, it successfully built deep convolutional neural networks.

VGGNet can be seen as a deepened AlexNet: both consist of two parts, a convolutional part and a fully connected part. VGGNet investigated the relationship between the depth of a convolutional neural network and its performance, building 16- to 19-layer deep networks that greatly reduced error rates (although, judging from how neural networks developed afterwards, blindly deepening a network does not necessarily give the best results and may bring problems such as vanishing gradients and degradation).

  • Paper: 1409.1556.pdf (arxiv.org)
  • Dataset link: https://pan.baidu.com/s/1iARC6yFSt94rU34xlUkH1Q (extraction code: wr2t)

We strongly recommend reading the original paper yourself when you have time; it may bring you unexpected gains! If you need the dataset, you can find it in earlier posts of this series.

In fact, the authors designed six network configurations in the paper (A through E, including A-LRN) and compared them. The conclusion was that the LRN layer inherited from AlexNet brings no benefit; what keeps improving performance in VGGNet is the steadily increasing depth.

The figure below shows the six network configurations (the input during training is a 224×224 RGB image):

The authors tested these six configurations. They share the same overall structure; the difference lies in the number of convolutional sub-layers inside each convolutional block, which increases from A to E (from 1 to 4 sub-layers), giving total depths from 11 to 19 layers (counting convolutional and fully connected layers). In the table, a convolutional layer is written as conv(receptive-field size)-(number of channels); for example, conv3-128 means 3×3 kernels with 128 output channels.

Although each network from A to E grows steadily deeper, the number of parameters does not grow by much, because the parameters are mainly consumed by the last three fully connected layers. The convolutional part in front, deep as it is, consumes relatively few parameters; yet it is still where most of the training time goes, because its computation is expensive. Configurations D and E are the networks we usually call VGG16 and VGG19.
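To see why the fully connected layers dominate the parameter budget, the counts can be tallied directly from the layer shapes in the table. The sketch below does this for configuration D (VGG16), biases included:

```python
# Parameter count for VGG16 (configuration D), derived from the layer shapes.
cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
       512, 512, 512, 'M', 512, 512, 512, 'M']

conv_params, in_ch = 0, 3
for v in cfg:
    if v == 'M':          # pooling layers have no parameters
        continue
    conv_params += 3 * 3 * in_ch * v + v   # 3x3 kernel weights + bias
    in_ch = v

fc_params = (512 * 7 * 7) * 4096 + 4096    # fc1: flattened 7x7x512 -> 4096
fc_params += 4096 * 4096 + 4096            # fc2
fc_params += 4096 * 1000 + 1000            # fc3: 1000-way classifier

print(f"conv: {conv_params / 1e6:.1f}M, fc: {fc_params / 1e6:.1f}M")
# the 13 conv layers hold ~14.7M parameters; the 3 FC layers hold ~123.6M
```

The total comes to roughly 138M parameters, the commonly quoted figure for VGG16, with about 90% of them sitting in the three fully connected layers.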

Its structure, taking VGG16 as the example, is as follows:

  • 1. The input image (224×224×3 RGB) passes through two convolutions with 64 3×3 kernels plus ReLU; the output becomes 224×224×64
  • 2. Max pooling with 2×2 pooling windows (stride 2); the size becomes 112×112×64
  • 3. Two convolutions with 128 3×3 kernels plus ReLU; the size becomes 112×112×128
  • 4. 2×2 pooling; the size becomes 56×56×128
  • 5. Three convolutions with 256 3×3 kernels plus ReLU; the size becomes 56×56×256
  • 6. 2×2 pooling; the size becomes 28×28×256
  • 7. Three convolutions with 512 3×3 kernels plus ReLU; the size becomes 28×28×512
  • 8. 2×2 pooling; the size becomes 14×14×512
  • 9. Three convolutions with 512 3×3 kernels plus ReLU; the size becomes 14×14×512
  • 10. 2×2 pooling; the size becomes 7×7×512
  • 11. Fully connected layers: two 1×1×4096 layers and one 1×1×1000 layer, each followed by ReLU (three layers in total)
  • 12. A softmax layer outputs the 1000 prediction scores
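The twelve steps above can be traced with a few lines of arithmetic: each 3×3 convolution uses padding 1 and so preserves height and width, while each 2×2 max pool halves them.

```python
# Trace the feature-map sizes through VGG16's five convolutional blocks.
size, channels = 224, 3
blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]  # (convs, out channels)
for n_convs, out_ch in blocks:
    channels = out_ch   # the convolutions change only the channel count
    size //= 2          # the 2x2 max pool at the end of each block halves H and W
    print(f"after block: {size}x{size}x{channels}")
# the last block yields 7x7x512, which is flattened into the 512*7*7 -> 4096 FC layer
```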

  • First, two 3×3 convolutions can replace one 5×5 convolution, and three 3×3 convolutions can replace one 7×7 convolution, while covering the same receptive field. (This is a major difference from AlexNet; anyone familiar with AlexNet knows its network is full of large kernels such as 11×11 and 5×5.) Stacking small kernels:
    • reduces the number of parameters and the amount of computation;
    • inserts more activation functions, increasing the network's non-linearity.
  • The 1×1 convolutions (used in configuration C):
    • fuse information across channels;
    • add an activation function, increasing the network's non-linear expressive power.

  • 2×2 pooling kernels replace AlexNet's 3×3 ones, and compared with average pooling, max pooling better preserves the salient features (such as textures and edges) in the image.
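A toy illustration of that last point, using plain Python rather than any framework: given a pooling window containing one strong edge response, max pooling keeps it while average pooling dilutes it.

```python
# One 2x2 pooling window containing a single strong activation.
patch = [[0, 0],
         [0, 9]]              # e.g. an edge detector firing at one position
flat = [v for row in patch for v in row]

max_pooled = max(flat)              # max pooling keeps the strong response
avg_pooled = sum(flat) / len(flat)  # average pooling smears it out
print(max_pooled, avg_pooled)       # 9 versus 2.25
```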

        VGG consists of five convolutional blocks followed by three fully connected layers; the blocks are separated by max-pooling layers, and every hidden unit uses the ReLU activation function.

        VGG uses several stacked small-kernel (3×3) convolutional layers in place of a single large-kernel layer. This reduces the parameter count (while keeping the same receptive field) and amounts to performing more non-linear mappings, which increases the network's fitting/expressive power.
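The saving is easy to quantify. For a layer with C input and C output channels, counting weights only (a back-of-the-envelope sketch; biases are ignored and the ratio is independent of C):

```python
C = 512                              # channel count; any value gives the same ratios
one_5x5   = 5 * 5 * C * C            # a single 5x5 conv
two_3x3   = 2 * (3 * 3 * C * C)      # two stacked 3x3 convs, same receptive field
one_7x7   = 7 * 7 * C * C
three_3x3 = 3 * (3 * 3 * C * C)      # three stacked 3x3 convs, same receptive field

print(two_3x3 / one_5x5)     # 18/25 = 0.72
print(three_3x3 / one_7x7)   # 27/49 ~= 0.55
```

So the three-layer stack needs barely half the weights of one 7×7 layer, while also passing the signal through three ReLUs instead of one.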

        Small convolution kernels are a key feature of VGG. Although VGG imitates AlexNet's overall framework, it does not adopt AlexNet's large kernels (11×11, 5×5); instead it shrinks the kernel size to 3×3 and increases the number of convolutional sub-layers per block (VGG: from 1 to 4 sub-layers; AlexNet: 1 sub-layer).

Compared with AlexNet's 3×3 pooling kernels, VGG uses smaller 2×2 pooling kernels throughout.

The first layer of the VGG network has 64 channels, and each subsequent block doubles the count, up to a maximum of 512 channels. This growth in channel count allows more information to be extracted.

Because the convolutions concentrate on widening the channel dimension while the pooling layers concentrate on shrinking width and height, the architecture becomes deeper and wider while the growth in computation is kept under control.
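This balance can be made concrete: the multiply-accumulate count of a 3×3 convolution is H·W·C_in·C_out·9, so halving H and W divides the work by 4 while doubling both channel counts multiplies it by 4. A quick check on the first convolution of three consecutive VGG16 stages:

```python
# (spatial size, in channels, out channels) for the first conv of stages 2-4
stages = [(112, 64, 128), (56, 128, 256), (28, 256, 512)]

costs = []
for hw, cin, cout in stages:
    macs = hw * hw * cin * cout * 3 * 3   # multiply-accumulates for one 3x3 conv
    costs.append(macs)
    print(f"{hw}x{hw}, {cin}->{cout}: {macs / 1e6:.0f}M MACs")
# all three layers cost the same ~925M multiply-accumulates
```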

All the code below is implemented in PyTorch. If you need it for another framework, feel free to message me and I will provide it free of charge.

import torch.nn as nn
import torch

# official pretrained weights
model_urls = {
    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth'
}


class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=False):
        super(VGG, self).__init__()
        self.features = features
        self.classifier = nn.Sequential(
            nn.Linear(512*7*7, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes)
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        # N x 3 x 224 x 224
        x = self.features(x)
        # N x 512 x 7 x 7
        x = torch.flatten(x, start_dim=1)
        # N x 512*7*7
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                # nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)


def make_features(cfg: list):
    layers = []
    in_channels = 3
    for v in cfg:
        if v == "M":
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    return nn.Sequential(*layers)


cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}


def vgg(model_name="vgg16", **kwargs):
    assert model_name in cfgs, "Warning: model name {} not in cfgs dict!".format(model_name)
    cfg = cfgs[model_name]

    model = VGG(make_features(cfg), **kwargs)
    return model

        The model.py file above builds the VGGNet network structure. The code's meaning is explained in the comments; if you have questions, feel free to leave a comment!
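As an aside, the numbers in the model names can be recovered straight from the cfg lists: counting the conv entries (everything that is not 'M') and adding the three fully connected layers gives the advertised depth. A quick standalone check (the cfg lists are copied here so the snippet runs on its own):

```python
cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
              512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M',
              512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}
# conv layers (non-'M' entries) plus the three FC layers = named depth
depths = {name: sum(v != 'M' for v in cfg) + 3 for name, cfg in cfgs.items()}
print(depths)   # {'vgg11': 11, 'vgg13': 13, 'vgg16': 16, 'vgg19': 19}
```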

import os
import sys
import json

import torch
import torch.nn as nn
from torchvision import transforms, datasets
import torch.optim as optim
from tqdm import tqdm

from model import vgg


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([transforms.Resize((224, 224)),
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

    data_root = os.path.abspath(os.path.join(os.getcwd(), "./"))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)

    # {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # write dict into json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 15             # set this according to your own machine; mine is underpowered
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workers
    print('Using {} dataloader workers per process'.format(nw))

    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)

    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=batch_size, shuffle=False,
                                                  num_workers=nw)
    print("using {} images for training, {} images for validation.".format(train_num,
                                                                           val_num))

    # test_data_iter = iter(validate_loader)
    # test_image, test_label = test_data_iter.next()

    model_name = "vgg16"
    net = vgg(model_name=model_name, num_classes=5, init_weights=True)
    net.to(device)
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(net.parameters(), lr=0.0001)

    epochs = 25
    best_acc = 0.0
    save_path = './{}Net.pth'.format(model_name)
    train_steps = len(train_loader)
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        for step, data in enumerate(train_bar):
            images, labels = data
            optimizer.zero_grad()
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.item()

            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                     epochs,
                                                                     loss)

        # validate
        net.eval()
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f  val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')


if __name__ == '__main__':
    main()

        The training script. This experiment uses the classic, beginner-friendly flower classification task, which also makes it convenient to compare recognition results with the AlexNet network from the previous installment.

predict.py

import os
import json

import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

from model import vgg


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    data_transform = transforms.Compose(
        [transforms.Resize((224, 224)),
         transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    # load image
    img_path = "./1.png"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path)
    plt.imshow(img)
    # [N, C, H, W]
    img = data_transform(img)
    # expand batch dimension
    img = torch.unsqueeze(img, dim=0)

    # read class_indict
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)

    with open(json_path, "r") as f:
        class_indict = json.load(f)
    
    # create model
    model = vgg(model_name="vgg16", num_classes=5).to(device)
    # load model weights
    weights_path = "./vgg16Net.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    model.load_state_dict(torch.load(weights_path, map_location=device))

    model.eval()
    with torch.no_grad():
        # predict class
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).numpy()

    print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                 predict[predict_cla].numpy())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                  predict[i].numpy()))
    plt.show()


if __name__ == '__main__':
    main()

        The prediction script, used to classify images.

The introduction of VGGNet pushed image recognition technology further forward; by demonstrating the power of depth and of stacking small kernels, it provided many insights for the networks that came after it.

        At the same time, VGG's minimalist design helps it run very fast on GPUs. Later, RepVGG revived this style by introducing structural re-parameterization; advantages such as its single-path plain architecture have made it shine in the deployment world. None of this would have happened without VGGNet!
