cs231n Study Notes: CNN Object Detection, Localization, and Segmentation

Original title: ECCV 2018 | YOLO Meets OpenPose: High-Frame-Rate Multi-Person Pose Detection at Nearly 200 FPS

Original url:

By Yi Piao, reporting from Aofeisi

cite from:

How can human pose detection be achieved at high frame rates?

1. Basic Concepts

The viral Twitter video below gives the answer.

1)CNN:Convolutional Neural Networks

Object Detection

 Published: 09
Oct 2015  Category: deep_learning

Jump to...

  1. Leaderboard
  2. Papers
    1. R-CNN
    2. MultiBox
    3. SPP-Net
    4. DeepID-Net
    5. NoC
    6. Fast R-CNN
    7. DeepBox
    8. MR-CNN
    9. Faster R-CNN
    10. YOLO
    11. AttentionNet
    12. DenseBox
    13. SSD
    14. Inside-Outside Net (ION)
    15. G-CNN
    16. HyperNet
    17. MultiPathNet
    18. CRAFT
    19. OHEM
    20. R-FCN
    21. MS-CNN
    22. PVANET
    23. GBD-Net
    24. StuffNet
  3. Detection From Video
    1. T-CNN
    2. Datasets
  4. Object Detection in 3D
  5. Salient Object Detection
  6. Specific Object Detection
    1. Face Detection
      1. UnitBox
      2. MTCNN
      3. Datasets / Benchmarks
    2. Facial Point / Landmark Detection
    3. People Detection
    4. Person Head Detection
    5. Pedestrian Detection
    6. Vehicle Detection
    7. Traffic-Sign Detection
    8. Boundary / Edge / Contour Detection
    9. Skeleton Detection
    10. Fruit Detection
    11. Others
  7. Object Proposal
  8. Localization
  9. Tutorials
  10. Projects
  11. Blogs

| Method | VOC2007 | VOC2010 | VOC2012 | ILSVRC 2013 | MSCOCO 2015 | Speed |
|---|---|---|---|---|---|---|
| OverFeat | | | | 24.3% | | |
| R-CNN (AlexNet) | 58.5% | 53.7% | 53.3% | 31.4% | | |
| R-CNN (VGG16) | 66.0% | | | | | |
| SPP_net (ZF-5) | 54.2% (1-model), 60.9% (2-model) | | | 31.84% (1-model), 35.11% (6-model) | | |
| DeepID-Net | 64.1% | | | 50.3% | | |
| NoC | 73.3% | | 68.8% | | | |
| Fast-RCNN (VGG16) | 70.0% | 68.8% | 68.4% | | 19.7% (@[0.5-0.95]), 35.9% (@0.5) | |
| MR-CNN | 78.2% | | 73.9% | | | |
| Faster-RCNN (VGG16) | 78.8% | | 75.9% | | 21.9% (@[0.5-0.95]), 42.7% (@0.5) | 198ms |
| Faster-RCNN (ResNet-101) | 85.6% | | 83.8% | | 37.4% (@[0.5-0.95]), 59.0% (@0.5) | |
| SSD300 (VGG16) | 72.1% | | | | | 58 fps |
| SSD500 (VGG16) | 75.1% | | | | | 23 fps |
| ION | 79.2% | | 76.4% | | | |
| AZ-Net | 70.4% | | | | 22.3% (@[0.5-0.95]), 41.0% (@0.5) | |
| CRAFT | 75.7% | | 71.3% | 48.5% | | |
| OHEM | 78.9% | | 76.3% | | 25.5% (@[0.5-0.95]), 45.9% (@0.5) | |
| R-FCN (ResNet-50) | 77.4% | | | | | 0.12sec (K40), 0.09sec (TitanX) |
| R-FCN (ResNet-101) | 79.5% | | | | | 0.17sec (K40), 0.12sec (TitanX) |
| R-FCN (ResNet-101), multi sc train | 83.6% | | 82.0% | | 31.5% (@[0.5-0.95]), 53.2% (@0.5) | |
| PVANet 9.0 | 81.8% | | 82.5% | | | 750ms (CPU), 46ms (TitanX) |

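This is a paper from this year's ECCV titled "Pose Proposal Networks". Its author is 関井大気 (Taiki SEKII) of Konica Minolta in Japan. It combines YOLO, published at CVPR, with CMU's OpenPose into a new method that can detect multi-person poses in high-frame-rate video.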

2)FC:Fully Connected

Leaderboard

Detection Results: VOC2012

  • intro: Competition “comp4” (train on own data)
  • homepage: 

High frame rates, no sweat

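3)IoU:Intersection over Union (defined as the ratio of the intersection to the union of the Region Proposal window and the Ground Truth window; if the IoU is below 0.5, the object is treated as not detected)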

Papers

Deep Neural Networks for Object
Detection

  • paper: 

OverFeat: Integrated Recognition,
Localization and Detection using Convolutional Networks

  • intro: A deep version of the sliding window method, predicts
    bounding box directly from each location of the topmost feature map
    after knowing the confidences of the underlying object categories.
  • intro: training a convolutional network to simultaneously classify,
    locate and detect objects in images can boost the classification
    accuracy and the detection and localization accuracy of all tasks
  • arxiv: 
  • github: 
  • code: 

4)ICCV:International Conference on Computer Vision

R-CNN

Rich feature hierarchies for
accurate object detection and semantic segmentation

  • intro: R-CNN
  • arxiv: 
  • supp: 
  • slides: 
  • slides: 
  • github: 
  • notes: 
  • caffe-pr(“Make R-CNN the Caffe detection
    example”): 

5)R-CNN:Region-based Convolutional Neural Networks

MultiBox

Scalable Object Detection using
Deep Neural Networks

  • intro: MultiBox. Train a CNN to predict Region of Interest.
  • arxiv: 
  • github: 
  • blog: 

Scalable, High-Quality Object
Detection

  • intro: MultiBox
  • arxiv: 
  • github: 

Other methods, such as AE (Associative Embedding, NIPS 2017), RMPE (Regional Multi-person Pose Estimation, ICCV 2017), and PAF (Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, CVPR 2017), cannot perform pose detection on high-frame-rate video, especially video above 100 FPS.

6)AR:Average Recall

SPP-Net

Spatial Pyramid Pooling in Deep
Convolutional Networks for Visual Recognition

  • intro: ECCV 2014 / TPAMI 2015
  • arxiv: 
  • github: 
  • notes: 

Learning Rich Features from RGB-D
Images for Object Detection and Segmentation

  • arxiv: 

7)mAP:mean Average Precision

DeepID-Net

DeepID-Net: Deformable Deep
Convolutional Neural Networks for Object Detection

  • intro: PAMI 2016
  • intro: an extension of R-CNN. box pre-training, cascade on region
    proposals, deformation layers and context representations
  • project
    page: 
  • arxiv: 

Object Detectors Emerge in Deep
Scene CNNs

  • arxiv: 
  • paper: 
  • paper: 
  • slides: 

segDeepM: Exploiting Segmentation
and Context in Deep Neural Networks for Object Detection

  • intro: CVPR 2015
  • project(code+data): 
  • arxiv: 
  • github: 

It also holds up on the COCO dataset, running at higher frame rates than Google's PersonLab.

8)RPN:Region Proposal Networks

NoC

Object Detection Networks on
Convolutional Feature Maps

  • intro: TPAMI 2015
  • arxiv: 

Improving Object Detection with
Deep Convolutional Networks via Bayesian Optimization and Structured
Prediction

  • arxiv: 
  • slides: 
  • github: 

9)FAIR:Facebook AI Research

Fast R-CNN

Fast R-CNN

  • arxiv: 
  • slides: 
  • github: 
  • webcam demo: 
  • notes: 
  • notes: 
  • github(“Fast R-CNN in
    MXNet”): 
  • github: 
  • github: 
  • github(Tensorflow): 

Looking at the specific numbers, it outperforms the other methods on the head, shoulders, elbows, and the upper body as a whole, and its overall score is also competitive.

10)w.r.t.:with respect to

DeepBox

DeepBox: Learning Objectness with
Convolutional Networks

  • arxiv: 
  • github: 

An adventure in unusual poses

11)Image Classification (what?)

MR-CNN

Object detection via a
multi-region & semantic segmentation-aware CNN model

  • intro: ICCV 2015. MR-CNN
  • arxiv: 
  • github: 
  • notes: 
  • notes: 
  • my notes: Who can tell me why there are a bunch of duplicated
    sentences in section 7.2 “Detection error analysis”? 😀

In addition, the method also avoids mistakes on the poses where conventional pose detection is especially error-prone.

12)Object Detection (what + where?), Localization, Segmentation
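
The IoU criterion from item 3) of the list above can be made concrete with a short sketch. It assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function names `iou` and `is_detected` are illustrative and do not come from the notes.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_detected(proposal, ground_truth, threshold=0.5):
    """Apply the rule from the notes: IoU below the threshold means 'not detected'."""
    return iou(proposal, ground_truth) >= threshold
```

For two 2x2 boxes offset by one unit in each direction, the overlap is 1 and the union is 7, so the IoU is 1/7 and the object counts as not detected.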

Faster R-CNN

Faster R-CNN: Towards Real-Time
Object Detection with Region Proposal Networks

  • intro: NIPS 2015
  • arxiv: 
  • gitxiv: 
  • slides: 
  • github: 
  • github: 
  • github: 
  • github(Torch): 
  • github(Torch): 
  • github(Tensorflow): 
  • github(tensorflow): 

Faster R-CNN in MXNet with
distributed implementation and data parallelization

  • github: 

For example, unusual poses such as skydiving:

2. CNN Basics

YOLO

You Only Look Once: Unified,
Real-Time Object Detection

  • intro: YOLO uses the whole topmost feature map to predict both
    confidences for multiple categories and bounding boxes (which are
    shared for these categories).
  • arxiv: 
  • code: 
  • github: 
  • reddit: 
  • github: 
  • github: 
  • github: 
  • github: 
  • github: 
  • github: 
  • github:

Start Training YOLO with Our Own
Data

  • intro: train with customized data and class numbers/labels. Linux /
    Windows version for darknet.
  • blog: 
  • github: 

R-CNN minus R

  • arxiv: 

2.1 The CNN Convolution Process

AttentionNet

AttentionNet: Aggregating Weak
Directions for Accurate Object Detection

  • intro: ICCV 2015
  • intro: state-of-the-art performance of 65% (AP) on PASCAL VOC
    2007/2012 human detection task
  • arxiv: 
  • slides: 
  • slides: 

Crowded scenes with many people:

The convolution computation is illustrated in the figure below:

DenseBox

DenseBox: Unifying Landmark
Localization with End to End Object Detection

  • arxiv: 
  • demo: 
  • KITTI result: 

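What we have just described is convolution. You can think of convolution as a special kind of multiplication used in signal processing. You can also think of the two matrices that produce the dot product as two functions: the image is the underlying function, and the filter is the function that is "rolled over" it.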
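
The "filter rolled over the image" description can be sketched in plain Python. This is a minimal single-channel illustration with no padding; the name `conv2d_single` is hypothetical. Like most CNN libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d_single(image, kernel, stride=1):
    """Slide `kernel` over `image` (lists of lists) and collect the dot products."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for top in range(0, ih - kh + 1, stride):
        row = []
        for left in range(0, iw - kw + 1, stride):
            # Dot product between the kernel and the image patch under it.
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[top + i][left + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
result = conv2d_single(image, kernel)  # [[6, 8], [12, 14]]
```

A larger `stride` visits fewer positions and therefore produces a smaller output map.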

SSD

SSD: Single Shot MultiBox
Detector

  • arxiv: 
  • paper: 
  • github: 
  • video: 
  • github(MXNet): 
  • github: 
  • github(Keras): 

Why does SSD (Single Shot MultiBox Detector) perform poorly on small objects?

  • zhihu: 

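And images in which several people overlap.

The main problem with images is their high dimensionality, which makes them expensive to process in both time and compute. Convolutional networks are designed to reduce the dimensionality of images in several ways. Filter stride is one way to reduce dimensionality; downsampling is another.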

Inside-Outside Net (ION)

Inside-Outside Net: Detecting
Objects in Context with Skip Pooling and Recurrent Neural
Networks

  • intro: “0.8s per image on a Titan X GPU (excluding proposal
    generation) without two-stage bounding-box regression and 1.15s per
    image with it”.
  • arxiv: 
  • slides: 
  • coco-leaderboard: 

Adaptive Object Detection Using
Adjacency and Zoom Prediction

  • intro: CVPR 2016. AZ-Net
  • arxiv: 
  • github: 
  • youtube: 

2.2 The number of activation maps equals the number of filters

G-CNN

G-CNN: an Iterative Grid Based
Object Detector

  • arxiv: 

Factors in Finetuning Deep Model for Object
Detection with Long-tail Distribution

  • intro: CVPR 2016.rank 3rd for provided data and 2nd for external
    data on ILSVRC 2015 object detection
  • project
    page: 
  • arxiv: 

We don’t need no bounding-boxes:
Training object class detectors using only human verification

  • arxiv: 

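Note that the woman standing on the left and the person on the yoga mat next to her are separated cleanly, which avoids the arm-and-leg mix-ups seen in the example below.

2.3 Relationship Between the Input Layer, Filter, Padding, Stride, Parameters, and Output Layer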

HyperNet

HyperNet: Towards Accurate Region
Proposal Generation and Joint Object Detection

  • arxiv: 

1) The number of parameters is determined by the filter size and the number of filters, according to the formula:

MultiPathNet

A MultiPath Network for Object
Detection

  • intro: BMVC 2016. Facebook AI Research (FAIR)
  • arxiv: 
  • github: 

How It Works

The number of parameters = (FxFxD + 1) * K

CRAFT

CRAFT Objects from Images

  • intro: CVPR 2016. Cascade Region-proposal-network And FasT-rcnn. an
    extension of Faster R-CNN
  • project page: 
  • arxiv: 
  • paper: 
  • github: 

2) Each activation map shares a single filter, together with its weights and bias

OHEM

Training Region-based Object
Detectors with Online Hard Example Mining

  • intro: CVPR 2016 Oral. Online hard example mining (OHEM)
  • arxiv: 
  • paper: 

Track and Transfer: Watching
Videos to Simulate Strong Human Supervision for Weakly-Supervised Object
Detection

  • intro: CVPR 2016
  • arxiv: 

Exploit All the Layers: Fast and
Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded
Rejection Classifiers

This is how a ResNet-18-based PPN detects multi-person poses:

3) The number of activation maps equals the number of filters
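
The parameter formula `(FxFxD + 1) * K` from item 1) can be checked with a small worked example, where F is the filter size, D the input depth, and K the number of filters. The output-size formula in the second function is the standard one for a convolution layer; it is not stated in these notes and is included here as an assumption. Both function names are illustrative.

```python
def conv_param_count(filter_size, input_depth, num_filters):
    """(F*F*D + 1) * K: each filter has F*F*D weights plus one bias."""
    return (filter_size * filter_size * input_depth + 1) * num_filters


def conv_output_size(input_size, filter_size, padding, stride):
    """Standard spatial output size (W - F + 2P) / S + 1 (assumed, not from the notes)."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# Ten 5x5 filters over a depth-3 input: (5*5*3 + 1) * 10 = 760 parameters.
params = conv_param_count(filter_size=5, input_depth=3, num_filters=10)

# A 32-wide input with 5x5 filters, padding 2, stride 1 keeps width 32.
width = conv_output_size(input_size=32, filter_size=5, padding=2, stride=1)
```

Since each filter yields one activation map, the output depth in this example is 10.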

R-FCN

R-FCN: Object Detection via
Region-based Fully Convolutional Networks

  • arxiv: 
  • github: 
  • github: 

Weakly supervised object detection
using pseudo-strong labels

  • arxiv: 

Recycle deep features for better
object detection

  • arxiv: 

a) the input image;

2.4 The Pooling (Downsampling) Process

MS-CNN

A Unified Multi-scale Deep
Convolutional Neural Network for Fast Object Detection

  • intro: ECCV 2016
  • intro: 640×480: 15 fps, 960×720: 8 fps
  • arxiv: 
  • github: 
  • poster: 

Multi-stage Object Detection with
Group Recursive Learning

  • intro: VOC2007: 78.6%, VOC2012: 74.9%
  • arxiv: 

Subcategory-aware Convolutional
Neural Networks for Object Proposals and Detection

  • intro: SubCNN
  • arxiv: 
  • github: 

b) part bounding boxes are detected from the input image;

1) Pooling is performed independently on each activation map; after pooling, the number of activation maps is unchanged

PVANET

PVANET: Deep but Lightweight
Neural Networks for Real-time Object Detection

  • intro: “less channels with more layers”, concatenated ReLU,
    Inception, and HyperNet, batch normalization, residual connections
  • arxiv: 
  • github: 
  • leaderboard(PVANet
    9.0): 

PVANet: Lightweight Deep Neural
Networks for Real-time Object Detection

  • intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep
    Neural Networks (EMDNN). Continuation
    of arXiv:1608.08021
  • arxiv: 

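c) limbs are detected;

A pooling layer is generally used for dimensionality reduction: it takes the average or the maximum over a k x k region and passes it to the next layer as the feature of that small region. Traditional pooling layers do not overlap; letting the pooling regions overlap can lower the error rate and also helps somewhat against overfitting.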

GBD-Net

Gated Bi-directional CNN for
Object Detection

  • intro: The Chinese University of Hong Kong & Sensetime Group Limited
  • paper: 
  • mirror: 

Crafting GBD-Net for Object
Detection

  • intro: winner of the ImageNet object detection challenge of 2016.
    CUImage and CUVideo
  • intro: gated bi-directional CNN (GBD-Net)
  • arxiv: 
  • github: 

d) each person in the image is distinguished.

2) Description of the pooling process (pooling requires no parameters)
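
The pooling behaviour described above (max or average over a k x k region, optionally overlapping) can be sketched as follows. The function name `pool2d` and its arguments are illustrative; a stride smaller than `k` gives the overlapping variant, and the function has no learned parameters.

```python
def pool2d(activation_map, k, stride=None, mode="max"):
    """Pool one activation map with a k x k window; stride defaults to k (no overlap)."""
    stride = k if stride is None else stride
    h, w = len(activation_map), len(activation_map[0])
    out = []
    for top in range(0, h - k + 1, stride):
        row = []
        for left in range(0, w - k + 1, stride):
            window = [activation_map[top + i][left + j]
                      for i in range(k) for j in range(k)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(row)
    return out


amap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
non_overlapping = pool2d(amap, k=2)        # [[6, 8], [14, 16]]
overlapping = pool2d(amap, k=3, stride=1)  # [[11, 12], [15, 16]]
```

Applying `pool2d` to each activation map separately keeps the number of maps unchanged, as item 1) states.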

StuffNet

StuffNet: Using ‘Stuff’ to Improve
Object Detection

  • arxiv: 

Generalized Haar Filter based Deep
Networks for Real-Time Object Detection in Traffic Scene

  • arxiv: 

Hierarchical Object Detection with
Deep Reinforcement Learning

  • intro: Deep Reinforcement Learning Workshop (NIPS 2016)
  • project page: 
  • arxiv: 
  • github: 

Learning to detect and localize
many objects from few examples

  • arxiv: 

2.5 The Depth Revolution, 2014

Detection From Video

Learning Object Class Detectors
from Weakly Annotated Video

  • intro: CVPR 2012
  • paper: 

Analysing domain shift factors
between videos and images for object detection

  • arxiv: 

Video Object Recognition

  • slides: 

Deep Learning for Saliency
Prediction in Natural Video

  • intro: Submitted on 12 Jan 2016
  • keywords: Deep learning, saliency map, optical flow, convolution
    network, contrast features
  • paper: 

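The paper's approach first divides the image into a grid of small cells and applies the single-shot object detection paradigm to each cell with a small network. It then uses a region proposal framework to recast pose detection as an object detection problem.

1) Problems encountered in the depth revolution: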

T-CNN

T-CNN: Tubelets with Convolutional
Neural Networks for Object Detection from Videos

  • intro: Winning solution in ILSVRC2015 Object Detection from
    Video(VID) Task
  • arxiv: 
  • github: 

Object Detection from Video
Tubelets with Convolutional Neural Networks

  • intro: CVPR 2016 Spotlight paper
  • arxiv: 
  • paper: 
  • github:

Object Detection in Videos with
Tubelets and Multi-context Cues

  • intro: SenseTime Group
  • slides: 
  • slides: 

Context Matters: Refining Object
Detection in Video with Recurrent Neural Networks

  • intro: BMVC 2016
  • keywords: pseudo-labeler
  • arxiv: 
  • paper: 

CNN Based Object Detection in
Large Video Images

  • intro: WangTao @ 爱奇艺
  • keywords: object retrieval, object detection, scene classification
  • slides: 

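Next, a single-shot CNN detects limbs directly, and a novel probabilistic greedy parsing step generates the pose proposals.

As CNNs developed, and especially after the VGG network was proposed, it became clear that the number of layers is a key factor: deeper networks seem to perform better. But as the number of layers grows, problems follow.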

Datasets

YouTube-Objects dataset
v2.2

  • homepage: 

ILSVRC2015: Object detection from
video (VID)

  • homepage: 

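The region proposals for body parts are defined as bounding-box detections whose size is proportional to the detected person, and they can be supervised using only common keypoint annotations.

(1) First problem: vanishing/exploding gradients, which make training hard to converge. With the introduction of normalized initialization and BN (Batch Normalization), the vanishing/exploding gradient problem has been addressed.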

Object Detection in 3D

Vote3Deep: Fast Object Detection
in 3D Point Clouds Using Efficient Convolutional Neural Networks

  • arxiv: 

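The whole architecture consists of a single fully convolutional network with relatively low-resolution feature maps, and it is optimized end to end with a loss function designed for pose detection performance. This architecture is called the Pose Proposal Network (PPN). PPN borrows the strengths of YOLO.

(2) Second problem: the deeper the network, the larger both the training error and the test error. Once the convergence problem was solved, another problem surfaced: as network depth increases, accuracy saturates and then degrades rapidly. Surprisingly, this degradation is not caused by overfitting: adding more layers to a suitably deep model leads to higher training error. As shown in the figure below, the Deep Residual Learning framework addresses this loss of accuracy caused by increasing depth.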
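
The notes name Deep Residual Learning as the remedy without showing its form. The sketch below assumes the standard residual formulation `y = F(x) + x` from the ResNet paper. Small fully connected layers stand in for the convolutional layers, which is a simplification for illustration only, and all function names are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def linear(v, weights, bias):
    """Plain matrix-vector product plus bias; weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """y = relu(F(x) + x), where F is linear -> relu -> linear."""
    f = linear(relu(linear(x, w1, b1)), w2, b2)
    # The shortcut adds the input back, so F only has to learn a correction.
    return relu([fi + xi for fi, xi in zip(f, x)])


zeros_w = [[0.0, 0.0], [0.0, 0.0]]
zeros_b = [0.0, 0.0]
# With all-zero weights the block passes a non-negative input through unchanged.
y = residual_block([1.0, 2.0], zeros_w, zeros_b, zeros_w, zeros_b)
```

The usage lines show why extra layers need not hurt: a block can fall back to the identity by driving its weights toward zero, so a deeper residual network can in principle match a shallower one.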

Salient Object Detection

This task involves predicting the salient regions of an image given by
human eye fixations.

Large-scale optimization of
hierarchical features for saliency prediction in natural images

  • paper: 

Predicting Eye Fixations using
Convolutional Neural Networks

  • paper: 

Saliency Detection by
Multi-Context Deep Learning

  • paper: 

DeepSaliency: Multi-Task Deep
Neural Network Model for Salient Object Detection

  • arxiv: 

SuperCNN: A Superpixelwise
Convolutional Neural Network for Salient Object Detection

  • paper: www.shengfenghe.com/supercnn-a-superpixelwise-convolutional-neural-network-for-salient-object-detection.html

Shallow and Deep Convolutional
Networks for Saliency Prediction

  • arxiv: 
  • github: 

Recurrent Attentional Networks for
Saliency Detection

  • intro: CVPR 2016. recurrent attentional convolutional-deconvolution
    network (RACDNN)
  • arxiv: 

Two-Stream Convolutional Networks
for Dynamic Saliency Prediction

  • arxiv: 

Unconstrained Salient Object
Detection

Unconstrained Salient Object
Detection via Proposal Subset Optimization

  • intro: CVPR 2016
  • project page: 
  • paper: 
  • github: 
  • caffe model
    zoo: 

Salient Object Subitizing

  • intro: CVPR 2015
  • intro: predicting the existence and the number of salient objects in
    an image using holistic cues
  • project page: 
  • arxiv: 
  • paper: 
  • caffe model
    zoo: 

Deeply-Supervised Recurrent
Convolutional Neural Network for Saliency Detection

  • intro: ACMMM 2016. deeply-supervised recurrent convolutional neural
    network (DSRCNN)
  • arxiv: 

Saliency Detection via Combining
Region-Level and Pixel-Level Predictions with CNNs

  • intro: ECCV 2016
  • arxiv: 

Edge Preserving and Multi-Scale
Contextual Neural Network for Salient Object Detection

  • arxiv: 

A Deep Multi-Level Network for
Saliency Prediction

  • arxiv: 

Visual Saliency Detection Based on
Multiscale Deep CNN Features

  • intro: IEEE Transactions on Image Processing
  • arxiv: 

A Deep Spatial Contextual
Long-term Recurrent Convolutional Network for Saliency Detection

  • intro: DSCLRCN
  • arxiv: 

Deeply supervised salient object
detection with short connections

  • arxiv: 

Weakly Supervised Top-down Salient
Object Detection

  • intro: Nanyang Technological University
  • arxiv: 

Links

3. Spatial Localization and Detection

Specific Object Detection

Paper:

Reference: "Research Progress in Deep-Learning-Based Object Detection"

Face Detection

Multi-view Face Detection Using
Deep Convolutional Neural Networks

  • intro: Yahoo
  • arxiv: 

From Facial Parts Responses to
Face Detection: A Deep Learning Approach

  • project
    page: 

Compact Convolutional Neural
Network Cascade for Face Detection

  • arxiv: 
  • github: 

Face Detection with End-to-End
Integration of a ConvNet and a 3D Model

  • intro: ECCV 2016
  • arxiv: 
  • github(MXNet): 

Supervised Transformer Network for
Efficient Face Detection

  • arxiv: 

3.1 Computer Vision Tasks

UnitBox

UnitBox: An Advanced Object
Detection Network

  • intro: ACM MM 2016
  • arxiv: 

Bootstrapping Face Detection with
Hard Negative Examples

  • author: 万韶华 @ 小米.
  • intro: Faster R-CNN, hard negative mining. state-of-the-art on the
    FDDB dataset
  • arxiv: 

A Multi-Scale Cascade Fully
Convolutional Network Face Detector

  • intro: ICPR 2016
  • arxiv: 

Poster:

3.2 Traditional Object Detection Methods

MTCNN

Joint Face Detection and Alignment
using Multi-task Cascaded Convolutional Networks

Joint Face Detection and Alignment
using Multi-task Cascaded Convolutional Neural Networks

  • project
    page: 
  • arxiv: 
  • github(Matlab): 
  • github(MXNet): 
  • github: 

The traditional object detection pipeline:

Datasets / Benchmarks

FDDB: Face Detection Data Set and
Benchmark

  • homepage: 
  • results: 

WIDER FACE: A Face Detection
Benchmark

  • homepage: 
  • arxiv: 

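As for code, none is available yet.

1) Region selection (an exhaustive strategy: sliding windows of different sizes and aspect ratios traverse the whole image, which has high time complexity)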

Facial Point / Landmark Detection

Deep Convolutional Network Cascade
for Facial Point Detection

  • homepage: 
  • paper: 
  • github: 

A Recurrent Encoder-Decoder
Network for Sequential Face Alignment

  • intro: ECCV 2016
  • arxiv: 

Detecting facial landmarks in the
video based on a hybrid framework

  • arxiv: 

Deep Constrained Local Models for
Facial Landmark Detection

  • arxiv: 

2) Feature extraction (SIFT, HOG, etc.; variation in object appearance, illumination, and background makes the features poorly robust)

People Detection

End-to-end people detection in
crowded scenes

  • arxiv: 
  • github: 
  • ipn: 

Detecting People in Artwork with
CNNs

  • intro: ECCV 2016 Workshops
  • arxiv: 

3) Classifier (mainly SVM, Adaboost, etc.)
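
The exhaustive region selection in step 1) of the pipeline above can be sketched to show where the cost comes from. The parameterization here is an assumption for illustration: each `size` is the side of a square of equal area and each ratio is width over height. The name `sliding_windows` is hypothetical.

```python
def sliding_windows(img_w, img_h, sizes, aspect_ratios, stride):
    """Yield (x1, y1, x2, y2) for every window size, aspect ratio, and position."""
    for size in sizes:
        for ratio in aspect_ratios:  # ratio = width / height
            w = int(round(size * ratio ** 0.5))
            h = int(round(size / ratio ** 0.5))
            for y in range(0, img_h - h + 1, stride):
                for x in range(0, img_w - w + 1, stride):
                    yield (x, y, x + w, y + h)


# A tiny 8x8 image with one 4x4 window and stride 2 already gives 9 windows.
windows = list(sliding_windows(8, 8, sizes=[4], aspect_ratios=[1.0], stride=2))
```

The window count grows with the number of sizes, the number of aspect ratios, and the number of positions, and every window must then go through feature extraction and classification.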

Person Head Detection

Context-aware CNNs for person head
detection

  • arxiv: 
  • github: 

The main problems of traditional object detection:

Pedestrian Detection

Pedestrian Detection aided by Deep
Learning Semantic Tasks

  • intro: CVPR 2015
  • project page: 
  • paper: 

Deep Learning Strong Parts for
Pedestrian Detection

  • intro: ICCV 2015. CUHK. DeepParts
  • intro: Achieving 11.89% average miss rate on Caltech Pedestrian
    Dataset
  • paper: 

Deep convolutional neural networks
for pedestrian detection

  • arxiv: 
  • github: 

New algorithm improves speed and
accuracy of pedestrian detection

  • blog: http://www.eurekalert.org/pub\_releases/2016-02/uoc–nai020516.php

Pushing the Limits of Deep CNNs
for Pedestrian Detection

  • intro: “set a new record on the Caltech pedestrian dataset, lowering
    the log-average miss rate from 11.7% to 8.9%”
  • arxiv: 

A Real-Time Deep Learning
Pedestrian Detector for Robot Navigation

  • arxiv: 

A Real-Time Pedestrian Detector
using Deep Learning for Human-Aware Navigation

  • arxiv: 

Is Faster R-CNN Doing Well for
Pedestrian Detection?

  • arxiv: 
  • github: 

Reduced Memory Region Based Deep
Convolutional Neural Network Detection

  • intro: IEEE 2016 ICCE-Berlin
  • arxiv: 

Fused DNN: A deep neural network
fusion approach to fast and robust pedestrian detection

  • arxiv: 

Multispectral Deep Neural Networks
for Pedestrian Detection

  • intro: BMVC 2016 oral
  • arxiv: 

1) The sliding-window region selection strategy is untargeted: it has high time complexity and produces redundant windows

Vehicle Detection

DAVE: A Unified Framework for Fast
Vehicle Detection and Annotation

  • intro: ECCV 2016
  • arxiv: 

2) Hand-crafted features are not robust to the many kinds of variation

Traffic-Sign Detection

Traffic-Sign Detection and
Classification in the Wild

  • project
    page(code+dataset): 
  • paper: 
  • code &
    model: 

3.3 Deep Learning Object Detection Based on Region Proposals

Boundary / Edge / Contour Detection

Holistically-Nested Edge
Detection

  • intro: ICCV 2015, Marr Prize
  • paper: 
  • arxiv: 
  • github: 

Unsupervised Learning of
Edges

  • intro: CVPR 2016. Facebook AI Research
  • arxiv: 
  • zn-blog: 

Pushing the Boundaries of Boundary
Detection using Deep Learning

  • arxiv: 

Convolutional Oriented
Boundaries

  • intro: ECCV 2016
  • arxiv: 

3.3.1 R-CNN (CVPR 2014, TPAMI 2015)

Skeleton Detection

Object Skeleton Extraction in
Natural Images by Fusing Scale-associated Deep Side Outputs

  • arxiv: 
  • github: 

DeepSkeleton: Learning Multi-task
Scale-associated Deep Side Outputs for Object Skeleton Extraction in
Natural Images

  • arxiv: 

1) Region Proposal: addresses the sliding-window problem

Fruit Detection

Deep Fruit Detection in
Orchards

  • arxiv: 

Image Segmentation for Fruit
Detection and Yield Estimation in Apple Orchards

  • intro: The Journal of Field Robotics in May 2016
  • project page: 
  • arxiv: 

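Region proposals find, in advance, the positions in the image where objects are likely to appear. They exploit information such as texture, edges, and color, and can maintain a high recall while selecting relatively few windows (a few thousand or even a few hundred).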

Others

Deep Deformation Network for
Object Landmark Localization

  • arxiv: 

Fashion Landmark Detection in the
Wild

  • arxiv: 

Deep Learning for Fast and
Accurate Fashion Item Detection

  • intro: Kuznech Inc.
  • intro: MultiBox and Fast R-CNN
  • paper: 

Visual Relationship Detection with
Language Priors

  • intro: ECCV 2016 oral
  • paper: 
  • github: 

OSMDeepOD - OSM and Deep Learning
based Object Detection from Aerial Imagery (formerly known as
“OSM-Crosswalk-Detection”)

  • github: 

Selfie Detection by
Synergy-Constraint Based Convolutional Neural Network

  • intro: IEEE SITIS 2016
  • arxiv: 

Associative Embedding:End-to-End
Learning for Joint Detection and Grouping

  • arxiv: 

Commonly used region proposal methods include (for details, see "What

Object Proposal

DeepProposal: Hunting Objects by
Cascading Deep Convolutional Layers

  • arxiv: 
  • github: 

Scale-aware Pixel-wise Object
Proposal Networks

  • intro: IEEE Transactions on Image Processing
  • arxiv: 

Attend Refine Repeat: Active Box
Proposal Generation via In-Out Localization

  • intro: AttractioNet
  • arxiv: 
  • github: 

makes for effective detection proposals?"):

Localization

Beyond Bounding Boxes: Precise
Localization of Objects in Images

  • intro: PhD Thesis
  • homepage: 
  • phd-thesis: 
  • github(“SDS using
    hypercolumns”): 

Weakly Supervised Object
Localization with Multi-fold Multiple Instance Learning

  • arxiv: 

Weakly Supervised Object
Localization Using Size Estimates

  • arxiv: 

Localizing objects using referring
expressions

  • intro: ECCV 2016
  • keywords: LSTM, multiple instance learning (MIL)
  • paper: 
  • github: 

LocNet: Improving Localization
Accuracy for Object Detection

  • arxiv: 
  • github: 

Learning Deep Features for
Discriminative Localization

  • homepage: 
  • arxiv: 
  • github(Tensorflow): 
  • github: 
  • github: 

ContextLocNet: Context-Aware Deep
Network Models for Weakly Supervised Localization

  • intro: ECCV 2016
  • project page: 
  • arxiv: 
  • github: 

-Selective Search

Tutorials

Convolutional Feature Maps:
Elements of efficient (and accurate) CNN-based object detection

  • slides: 

-Edge Boxes

Projects

TensorBox: a simple framework for
training neural networks to detect objects in images

  • intro: “The basic model implements the simple and robust
    GoogLeNet-OverFeat algorithm. We additionally provide an
    implementation of
    the ReInspect algorithm”
  • github: 

Object detection in torch:
Implementation of some object detection frameworks in torch

  • github: 

Using DIGITS to train an Object
Detection network

  • github: 

FCN-MultiBox Detector

  • intro: Full convolution MultiBox Detector ( like SSD) implemented in
    Torch.
  • github: 

2) R-CNN: addresses the feature robustness problem

Blogs

Convolutional Neural Networks for
Object Detection

Introducing automatic object
detection to visual search (Pinterest)

  • keywords: Faster R-CNN
  • blog: 
  • demo: 
  • review: 

Deep Learning for Object Detection
with DIGITS

  • blog: 

Analyzing The Papers Behind
Facebook’s Computer Vision Approach

  • keywords: DeepMask, SharpMask, MultiPathNet
  • blog: https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook’s-Computer-Vision-Approach/

**Easily Create High Quality Object Detectors with Deep Learning **

  • intro: dlib v19.2
  • blog: 

How to Train a Deep-Learned Object
Detection Model in the Microsoft Cognitive Toolkit

  • blog: 
  • github: 

Object Detection in Satellite
Imagery, a Low Overhead Approach

  • part
    1: 
  • part
    2: 

You Only Look Twice: Multi-Scale
Object Detection in Satellite Imagery With Convolutional Neural
Networks

  • part
    1: 

Faster R-CNN Pedestrian and Car
Detection

  • blog: 
  • ipn: 
  • github: 

参照音信

(壹) 输入测试图像

(2) 利用selective
search算法在图像中从下到上提取两千个左右的Region
Proposal

(三) 将每种Region
Proposal缩放(warp)成22柒x2二柒的分寸并输入到CNN,将CNN的fc7层的出口作为特色

(四) 将每个Region Proposal提取到的CNN特征输入到SVM举行归类

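Notes: 1) Each region proposal is scaled to the same size because the input to the CNN's fully connected layers must have a fixed dimension.

2) The figure above omits one step: bounding-box regression on the region proposals that the SVM has classified. Bounding-box regression is a linear regression algorithm that corrects a region proposal so that its window better matches the object's ground-truth window. A window produced by region proposal cannot be as accurate as a manual annotation. If the proposal is offset substantially from the object's position, then even when the classification is correct, the IoU (the ratio of the intersection to the union of the region proposal window and the Ground Truth window) falls below 0.5 and the object is still treated as not detected.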

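3) Drawbacks of R-CNN:

(1) Training is split into several stages and is cumbersome: fine-tuning the network + training the SVMs + training the bounding-box regressors

(2) Training is time-consuming and takes a lot of disk space: 5,000 images produce several hundred GB of feature files

(3) Slow: on a GPU, the VGG16 model needs 47 s to process one image.

(4) Slow at test time: every region proposal requires a full forward pass through the CNN

(5) The SVMs and the regressors are post-hoc operations: the CNN features are not updated while the SVMs and regressors are trained

SPP-NET offers a good solution to the speed problem.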
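
The bounding-box regression step mentioned in note 2) can be illustrated by the targets the regressor is trained to predict. The notes do not spell out the parameterization, so this sketch assumes the center/size offsets commonly attributed to the R-CNN paper. The learned linear regressor itself is not shown, and the function names are hypothetical.

```python
import math


def to_center(box):
    """Convert (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + w / 2.0, y1 + h / 2.0, w, h


def regression_targets(proposal, ground_truth):
    """Offsets (tx, ty, tw, th) that move the proposal onto the ground truth."""
    px, py, pw, ph = to_center(proposal)
    gx, gy, gw, gh = to_center(ground_truth)
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))


def apply_deltas(proposal, deltas):
    """Inverse of regression_targets: correct a proposal with predicted offsets."""
    px, py, pw, ph = to_center(proposal)
    tx, ty, tw, th = deltas
    cx, cy = px + tx * pw, py + ty * ph
    w, h = pw * math.exp(tw), ph * math.exp(th)
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
```

Applying the exact targets to a proposal recovers the ground-truth box. In practice the regressor predicts approximate offsets from CNN features, which nudges a roughly placed proposal toward a higher IoU.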

3.3.2 SPP-NET (ECCV 2014, TPAMI 2015)

SPP-Net: Spatial Pyramid Pooling in Deep Convolutional Networks for
Visual Recognition

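First, consider why R-CNN detection is so slow, needing 47 s for a single image. A close look at the R-CNN framework shows that, after region proposals (about 2,000) are extracted from the image, each proposal is treated as a separate image for the subsequent processing (CNN feature extraction + SVM classification). In effect, feature extraction and classification are run 2,000 times on one image. But these 2,000 region proposals are all parts of the same image, so we can extract the convolutional features of the image once and then simply map each region proposal's position in the original image onto the convolutional feature map. That way the convolutional features are computed only once per image, and the convolutional features of each region proposal are then fed into the fully connected layers for the remaining steps. (For a CNN, most of the computation is spent on convolution, so this saves a great deal of time.)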

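The remaining problem is that region proposals differ in size, so they cannot be fed directly into the fully connected layers, whose input must have a fixed length. SPP-NET solves exactly this problem.

Because a traditional CNN requires a fixed input size (for example, 224x224 for AlexNet), in practice the original image often has to be cropped or warped:

- crop: cut a fixed-size patch out of the original image

- warp: scale the ROI of the original image to a fixed-size patch

Neither crop nor warp can guarantee that the image reaches the CNN without distortion:

- crop: the object may be truncated, especially in images with a large aspect ratio.

- warp: the object is stretched and loses its original shape, especially in images with a large aspect ratio

SPP is designed to solve these problems: whatever the size of the input image, it can be passed into the network correctly.

The idea is as follows: a CNN's convolutional layers can handle input of any size, and only the fully connected layers impose a size constraint. In other words, if we can find a way to bring the input to a fixed length before the fully connected layers, the problem is solved.

The concrete scheme is shown in the figure below: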

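Suppose the input image is 224x224. The output of conv5 is then 13x13x256, which can be understood as 256 filters, each corresponding to one 13x13 activation map. If, as in the figure above, each activation map is max-pooled into three sub-maps of 4x4, 2x2, and 1x1, the resulting feature has a fixed length of (16+4+1)x256 dimensions. If the input image is not 224x224, the resulting feature is still (16+4+1)x256. Intuitively, the pool5 layer with its fixed (3x3) window is replaced by a window whose size adapts in proportion to the activation map, which guarantees that the feature length after pooling is always the same.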
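
The fixed-length property described above can be checked with a small sketch that pools one activation map into 4x4, 2x2, and 1x1 bins. The bin boundaries use a simple floor/ceil split, which is an assumption for illustration and not necessarily the exact window and stride rule of the SPP-Net paper. The name `spp_max_pool` is hypothetical.

```python
def spp_max_pool(activation_map, levels=(4, 2, 1)):
    """Max-pool one activation map into n x n bins for each n in `levels`."""
    h, w = len(activation_map), len(activation_map[0])
    features = []
    for n in levels:
        for i in range(n):
            # Bin i covers rows [floor(i*h/n), ceil((i+1)*h/n)).
            top, bottom = (i * h) // n, ((i + 1) * h + n - 1) // n
            for j in range(n):
                left, right = (j * w) // n, ((j + 1) * w + n - 1) // n
                features.append(max(activation_map[r][c]
                                    for r in range(top, bottom)
                                    for c in range(left, right)))
    return features


map_13x13 = [[r * 13 + c for c in range(13)] for r in range(13)]
map_6x9 = [[r * 9 + c for c in range(9)] for r in range(6)]
len_a = len(spp_max_pool(map_13x13))  # 16 + 4 + 1 = 21
len_b = len(spp_max_pool(map_6x9))    # also 21, despite the different input size
```

With 256 activation maps, concatenating the 21 values from each map gives the (16+4+1)x256 vector that the fully connected layers expect.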

Using SPP-NET greatly speeds up object detection compared with R-CNN, but many problems remain:
