April 1, 2021. This column's running collection of computer vision papers: paper digest
1, TITLE:Fast and Accurate Emulation of The SDO/HMI Stokes Inversion with Uncertainty Quantification AUTHORS: RICHARD E. L. HIGGINS et. al. CATEGORY: astro-ph.SR [astro-ph.SR, astro-ph.IM, cs.CV] HIGHLIGHT: In this paper, we introduce a deep learning-based approach that can emulate the existing HMI pipeline results two orders of magnitude faster than the current pipeline algorithms. 2, TITLE:A Study of Latent Monotonic Attention Variants AUTHORS: Albert Zeyer ; Ralf Schlüter ; Hermann Ney CATEGORY: cs.CL [cs.CL, cs.AI, cs.CV] HIGHLIGHT: In this paper, we present a mathematically clean solution to introduce monotonicity, by introducing a new latent variable which represents the audio position or segment boundaries. 3, TITLE:Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans AUTHORS: Bindita Chaudhuri ; Nikolaos Sarafianos ; Linda Shapiro ; Tony Tung CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We introduce a novel approach to generate diverse high fidelity texture maps for 3D human meshes in a semi-supervised setup. 4, TITLE:Efficient Large-Scale Face Clustering Using An Online Mixture of Gaussians AUTHORS: David Montero ; Naiara Aginako ; Basilio Sierra ; Marcos Nieto CATEGORY: cs.CV [cs.CV, cs.LG, I.5.3] HIGHLIGHT: In this work, we address the problem of large-scale online face clustering: given a continuous stream of unknown faces, create a database grouping the incoming faces by their identity. 5, TITLE:DCVNet: Dilated Cost Volume Networks for Fast Optical Flow AUTHORS: Huaizu Jiang ; Erik Learned-Miller CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we propose dilated cost volumes to capture small and large displacements simultaneously, allowing optical flow estimation without the need for the sequential estimation strategy.
6, TITLE:SRA-LSTM: Social Relationship Attention LSTM for Human Trajectory Prediction AUTHORS: Yusheng Peng ; Gaofeng Zhang ; Jun Shi ; Benzhu Xu ; Liping Zheng CATEGORY: cs.CV [cs.CV, cs.AI] HIGHLIGHT: Motivated by this idea, we propose a Social Relationship Attention LSTM (SRA-LSTM) model to predict future trajectories. 7, TITLE:An Effective and Friendly Tool for Seed Image Analysis AUTHORS: ANDREA LODDO et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: This work aims to present a software that performs an image analysis by feature extraction and classification starting from images containing seeds through a brand new and unique framework. 8, TITLE:Scale-aware Automatic Augmentation for Object Detection AUTHORS: YUKANG CHEN et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We propose Scale-aware AutoAug to learn data augmentation policies for object detection. 9, TITLE:Long-Term Temporally Consistent Unpaired Video Translation from Simulated Surgical 3D Data AUTHORS: DOMINIK RIVOIR et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We propose a novel approach which combines unpaired image translation with neural rendering to transfer simulated to photorealistic surgical abdominal scenes. 10, TITLE:GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection AUTHORS: Abhinav Kumar ; Garrick Brazil ; Xiaoming Liu CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: In this paper, we present and integrate GrooMeD-NMS -- a novel Grouped Mathematically Differentiable NMS for monocular 3D object detection, such that the network is trained end-to-end with a loss on the boxes after NMS. 11, TITLE:Rectification-based Knowledge Retention for Continual Learning AUTHORS: Pravendra Singh ; Pratik Mazumder ; Piyush Rai ; Vinay P. Namboodiri CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner. 
12, TITLE:Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation AUTHORS: XIANGYU YUE et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we propose an end-to-end Prototypical Cross-domain Self-Supervised Learning (PCS) framework for Few-shot Unsupervised Domain Adaptation (FUDA). 13, TITLE:Rainbow Memory: Continual Learning with A Memory of Diverse Samples AUTHORS: Jihwan Bang ; Heesu Kim ; YoungJoon Yoo ; Jung-Woo Ha ; Jonghyun Choi CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: To enhance the sample diversity in the memory, we propose a novel memory management strategy based on per-sample classification uncertainty and data augmentation, named Rainbow Memory (RM). 14, TITLE: Going Deeper with Image Transformers AUTHORS: Hugo Touvron ; Matthieu Cord ; Alexandre Sablayrolles ; Gabriel Synnaeve ; Hervé Jégou CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we build and optimize deeper transformer networks for image classification. 15, TITLE: Weakly-Supervised Image Semantic Segmentation Using Graph Convolutional Networks AUTHORS: Shun-Yi Pan ; Cheng-You Lu ; Shih-Po Lee ; Wen-Hsiao Peng CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To overcome this issue, we propose a Graph Convolutional Network (GCN)-based feature propagation framework. 16, TITLE: Facial Masks and Soft-Biometrics: Leveraging Face Recognition CNNs for Age and Gender Prediction on Mobile Ocular Images AUTHORS: Fernando Alonso-Fernandez ; Kevin Hernandez Diaz ; Silvia Ramis ; Francisco J. Perales ; Josef Bigun CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We address the use of selfie ocular images captured with smartphones to estimate age and gender. 17, TITLE: Evaluation of Multimodal Semantic Segmentation Using RGB-D Data AUTHORS: Jiesi Hu ; Ganning Zhao ; Suya You ; C. C.
Jay Kuo CATEGORY: cs.CV [cs.CV, cs.AI] HIGHLIGHT: Our goal is to develop stable, accurate, and robust semantic scene understanding methods for wide-area scene perception and understanding, especially in challenging outdoor environments. 18, TITLE: FANet: A Feedback Attention Network for Improved Biomedical Image Segmentation AUTHORS: NIKHIL KUMAR TOMAR et. al. CATEGORY: cs.CV [cs.CV, eess.IV] HIGHLIGHT: In this work, we leverage the information of each training epoch to prune the prediction maps of the subsequent epochs. 19, TITLE: Joint Deep Multi-Graph Matching and 3D Geometry Learning from Inhomogeneous 2D Image Collections AUTHORS: Zhenzhang Ye ; Tarun Yenamandra ; Florian Bernard ; Daniel Cremers CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: We fill this gap by proposing a trainable framework that takes advantage of graph neural networks for learning a deformable 3D geometry model from inhomogeneous image collections, i.e. a set of images that depict different instances of objects from the same category. 20, TITLE: Dogfight: Detecting Drones from Drones Videos AUTHORS: Muhammad Waseem Ashraf ; Waqas Sultani ; Mubarak Shah CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To handle this, instead of using region-proposal based methods, we propose to use a two-stage segmentation-based approach employing spatio-temporal attention cues. 21, TITLE: StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery AUTHORS: Or Patashnik ; Zongze Wu ; Eli Shechtman ; Daniel Cohen-Or ; Dani Lischinski CATEGORY: cs.CV [cs.CV, cs.CL, cs.GR, cs.LG] HIGHLIGHT: In this work, we explore leveraging the power of recently introduced Contrastive Language-Image Pre-training (CLIP) models in order to develop a text-based interface for StyleGAN image manipulation that does not require such manual effort. 
22, TITLE: PAUL: Procrustean Autoencoder for Unsupervised Lifting AUTHORS: Chaoyang Wang ; Simon Lucey CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper we advocate for a 3D deep auto-encoder framework to be used explicitly as the NRSfM prior. 23, TITLE: Camouflaged Instance Segmentation: Dataset and Benchmark Suite AUTHORS: TRUNG-NGHIA LE et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To promote the new task of camouflaged instance segmentation, we introduce a new large-scale dataset, namely CAMO++, by extending our preliminary CAMO dataset (camouflaged object segmentation) in terms of quantity and diversity. 24, TITLE: Dual Contrastive Loss and Attention for GANs AUTHORS: NING YU et. al. CATEGORY: cs.CV [cs.CV, cs.GR] HIGHLIGHT: In this paper, we propose various improvements to further push the boundaries in image generation. 25, TITLE: Topology-Preserving 3D Image Segmentation Based On Hyperelastic Regularization AUTHORS: Daoping Zhang ; Lok Ming Lui CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we propose a novel 3D topology-preserving registration-based segmentation model with the hyperelastic regularization, which can handle both 2D and 3D images. 26, TITLE: Topo-boundary: A Benchmark Dataset on Topological Road-boundary Detection Using Aerial Images for Autonomous Driving AUTHORS: Zhenhua Xu ; Yuxiang Sun ; Ming Liu CATEGORY: cs.CV [cs.CV, cs.RO] HIGHLIGHT: So in this paper, we propose a new benchmark dataset, named Topo-boundary, for off-line topological road-boundary detection. 27, TITLE: Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark AUTHORS: XIAO WANG et. al. CATEGORY: cs.CV [cs.CV, cs.AI] HIGHLIGHT: In this work, we propose a new benchmark specifically dedicated to the tracking-by-language, including a large scale dataset, label and diverse baseline methods. We also introduce two new challenges into TNL2K for the object tracking task, i.e., adversarial samples and modality switch.
28, TITLE: DER: Dynamically Expandable Representation for Class Incremental Learning AUTHORS: Shipeng Yan ; Jiangwei Xie ; Xuming He CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: To this end, we propose a novel two-stage learning approach that utilizes a dynamically expandable representation for more effective incremental concept modeling. 29, TITLE: Near-field Sensing Architecture for Low-Speed Vehicle Automation Using A Surround-view Fisheye Camera System AUTHORS: Ciarán Eising ; Jonathan Horgan ; Senthil Yogamani CATEGORY: cs.CV [cs.CV, cs.RO] HIGHLIGHT: In this work, we describe our visual perception architecture on surround view cameras designed for a system deployed in commercial vehicles, provide a functional review of the different stages of such a computer vision system, and discuss some of the current technological challenges. 30, TITLE: Generating Multi-scale Maps from Remote Sensing Images Via Series Generative Adversarial Networks AUTHORS: Xu Chen ; Bangguo Yin ; Songqiang Chen ; Haifeng Li ; Tian Xu CATEGORY: cs.CV [cs.CV, eess.IV] HIGHLIGHT: By extending their method, multi-scale RSIs can be trivially translated to multi-scale maps (multi-scale rs2map translation) through scale-wise rs2map models trained for certain scales (parallel strategy). 31, TITLE: Few-Data Guided Learning Upon End-to-End Point Cloud Network for 3D Face Recognition AUTHORS: Yi Yu ; Feipeng Da ; Ziyu Zhang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, an end-to-end deep learning network entitled Sur3dNet-Face for point-cloud-based 3D face recognition is proposed. 32, TITLE: Multi-Class Multi-Instance Count Conditioned Adversarial Image Generation AUTHORS: Amrutha Saseendran ; Kathrin Skubch ; Margret Keuper CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: In this paper, we take one further step in this direction and propose a conditional generative adversarial network (GAN) that generates images with a defined number of objects from given classes.
In particular, we propose a new dataset, CityCount, which is derived from the Cityscapes street scenes dataset, to evaluate our approach in a challenging and practically relevant scenario. 33, TITLE: Learning Camera Localization Via Dense Scene Matching AUTHORS: Shitao Tang ; Chengzhou Tang ; Rui Huang ; Siyu Zhu ; Ping Tan CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We present a new method for scene agnostic camera localization using dense scene matching (DSM), where a cost volume is constructed between a query image and a scene. 34, TITLE: Using Depth Information and Colour Space Variations for Improving Outdoor Robustness for Instance Segmentation of Cabbage AUTHORS: Nils Lüling ; David Reiser ; Alexander Stana ; H. W. Griepentrog CATEGORY: cs.CV [cs.CV, cs.LG, cs.RO] HIGHLIGHT: Following this goal, this research focuses on improving instance segmentation of field crops under varying environmental conditions. 35, TITLE: DAP: Detection-Aware Pre-training with Weak Supervision AUTHORS: YUANYI ZHONG et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks. 36, TITLE: Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes AUTHORS: Dmytro Kotovenko ; Matthias Wright ; Arthur Heimbrecht ; Björn Ommer CATEGORY: cs.CV [cs.CV, cs.AI, cs.GR] HIGHLIGHT: We propose a method to stylize images by optimizing parameterized brushstrokes instead of pixels and further introduce a simple differentiable rendering mechanism. 37, TITLE: Neural Surface Maps AUTHORS: Luca Morreale ; Noam Aigerman ; Vladimir Kim ; Niloy J. Mitra CATEGORY: cs.CV [cs.CV, cs.GR] HIGHLIGHT: In this paper, we advocate considering neural networks as encoding surface maps.
38, TITLE: Learning with Memory-based Virtual Classes for Deep Metric Learning AUTHORS: Byungsoo Ko ; Geonmo Gu ; Han-Gyu Kim CATEGORY: cs.CV [cs.CV, cs.IR, cs.LG] HIGHLIGHT: In this work, we present a novel training strategy for DML called MemVir. 39, TITLE: Unpaired Single-Image Depth Synthesis with Cycle-consistent Wasserstein GANs AUTHORS: Christoph Angermann ; Adéla Moravová ; Markus Haltmeier ; Steinbjörn Jónsson ; Christian Laubichler CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV] HIGHLIGHT: Therefore, in this study, latest advancements in the field of generative neural networks are leveraged to fully unsupervised single-image depth synthesis. 40, TITLE: A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection AUTHORS: Keshigeyan Chandrasegaran ; Ngoc-Trung Tran ; Ngai-Man Cheung CATEGORY: cs.CV [cs.CV, eess.IV] HIGHLIGHT: In this work, we investigate the validity of assertions claiming that CNN-generated images are unable to achieve high frequency spectral decay consistency. 41, TITLE: Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition AUTHORS: Guangrun Wang ; Liang Lin ; Rongcong Chen ; Guangcong Wang ; Jiqi Zhang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we prove that dynamically adapting network architectures tailored for each domain task along with weight finetuning benefits in both efficiency and effectiveness, compared to the existing image recognition pipeline that only tunes the weights regardless of the architecture. 42, TITLE: Knowledge Distillation By Sparse Representation Matching AUTHORS: Dat Thanh Tran ; Moncef Gabbouj ; Alexandros Iosifidis CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we propose Sparse Representation Matching (SRM), a method to transfer intermediate knowledge obtained from one Convolutional Neural Network (CNN) to another by utilizing sparse representation learning.
43, TITLE: Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective AUTHORS: Jiarui Xu ; Xiaolong Wang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: Instead of following the previous literature, we propose to learn correspondence using Video Frame-level Similarity (VFS) learning, i.e., simply learning from comparing video frames. 44, TITLE: Layout-Guided Novel View Synthesis from A Single Indoor Panorama AUTHORS: Jiale Xu ; Jia Zheng ; Yanyu Xu ; Rui Tang ; Shenghua Gao CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we make the first attempt to generate novel views from a single indoor panorama and take the large camera translations into consideration. To validate the effectiveness of our method, we further build a large-scale photo-realistic dataset containing both small and large camera translations. 45, TITLE: VITON-HD: High-Resolution Virtual Try-On Via Misalignment-Aware Normalization AUTHORS: Seunghwan Choi ; Sunghyun Park ; Minsoo Lee ; Jaegul Choo CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To address the challenges, we propose a novel virtual try-on method called VITON-HD that successfully synthesizes 1024x768 virtual try-on images. 46, TITLE: Video Exploration Via Video-Specific Autoencoders AUTHORS: Kevin Wang ; Deva Ramanan ; Aayush Bansal CATEGORY: cs.CV [cs.CV, cs.GR, cs.HC, cs.LG] HIGHLIGHT: In this work, we observe that a simple autoencoder trained (from scratch) on multiple frames of a specific video enables one to perform a large variety of video processing and editing tasks. 47, TITLE: Robust Registration of Multimodal Remote Sensing Images Based on Structural Similarity AUTHORS: Yuanxin Ye ; Jie Shan ; Lorenzo Bruzzone ; Li Shen CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To address this problem, this paper proposes a novel feature descriptor named the Histogram of Orientated Phase Congruency (HOPC), which is based on the structural properties of images.
48, TITLE: Human Perception Modeling for Automatic Natural Image Matting AUTHORS: Yuhongze Zhou ; Liguang Zhou ; Tin Lun Lam ; Yangsheng Xu CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we argue that how to handle trade-off of additional information input is a major issue in automatic matting, which we decompose into two subtasks: trimap and alpha estimation. 49, TITLE: Learning By Aligning Videos in Time AUTHORS: SANJAY HARESH et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We present a self-supervised approach for learning video representations using temporal video alignment as a pretext task, while exploiting both frame-level and video-level information. 50, TITLE: Contrastive Learning of Single-Cell Phenotypic Representations for Treatment Classification AUTHORS: ALEXIS PERAKIS et. al. CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV] HIGHLIGHT: Therefore, subsequent works propose unsupervised approaches based on generative models to learn these representations. 51, TITLE: CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields AUTHORS: Michael Niemeyer ; Andreas Geiger CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: Several recent works therefore propose generative models which are 3D-aware, i.e., scenes are modeled in 3D and then rendered differentiably to the image plane. 52, TITLE: DynOcc: Learning Single-View Depth from Dynamic Occlusion Cues AUTHORS: Yifan Wang ; Linjie Luo ; Xiaohui Shen ; Xing Mei CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we introduce the first depth dataset DynOcc consisting of dynamic in-the-wild scenes. 53, TITLE: Denoise and Contrast for Category Agnostic Shape Completion AUTHORS: Antonio Alliegro ; Diego Valsesia ; Giulia Fracastoro ; Enrico Magli ; Tatiana Tommasi CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we present a deep learning model that exploits the power of self-supervision to perform 3D point cloud completion, estimating the missing part and a context region around it. 
54, TITLE: Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors AUTHORS: Vladimir Guzov ; Aymen Mir ; Torsten Sattler ; Gerard Pons-Moll CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We introduce (HPS) Human POSEitioning System, a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment using wearable sensors. 55, TITLE: Self-Regression Learning for Blind Hyperspectral Image Fusion Without Label AUTHORS: Wu Wang ; Yue Huang ; Xinhao Ding CATEGORY: cs.CV [cs.CV, eess.IV] HIGHLIGHT: To address these issues, we propose a self-regression learning method that alternately reconstructs the hyperspectral image (HSI) and estimates the observation model. 56, TITLE: SOON: Scenario Oriented Object Navigation with Graph-based Exploration AUTHORS: Fengda Zhu ; Xiwen Liang ; Yi Zhu ; Xiaojun Chang ; Xiaodan Liang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: Accordingly, in this paper, we introduce a Scenario Oriented Object Navigation (SOON) task. We also propose a new large-scale benchmark named From Anywhere to Object (FAO) dataset. 57, TITLE: Smart Scribbles for Image Matting AUTHORS: XIN YANG et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this article, we explore the intrinsic relationship between user input and alpha mattes and strike a balance between user effort and the quality of alpha mattes. 58, TITLE: SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification AUTHORS: Zijian Hu ; Zhengyu Yang ; Xuefeng Hu ; Ram Nevatia CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: Following this path, we propose a novel unsupervised objective that focuses on the less studied relationship between the high confidence unlabeled data that are similar to each other.
59, TITLE: Learning Spatio-Temporal Transformer for Visual Tracking AUTHORS: Bin Yan ; Houwen Peng ; Jianlong Fu ; Dong Wang ; Huchuan Lu CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we present a new tracking architecture with an encoder-decoder transformer as the key component. 60, TITLE: Deep Adaptive Fuzzy Clustering for Evolutionary Unsupervised Representation Learning AUTHORS: Dayu Tan ; Zheng Huang ; Xin Peng ; Weimin Zhong ; Vladimir Mahalec CATEGORY: cs.CV [cs.CV, cs.AI] HIGHLIGHT: In this study, we explore the possibility of employing fuzzy clustering in a deep neural network framework. 61, TITLE: DA-DETR: Domain Adaptive Detection Transformer By Hybrid Attention AUTHORS: Jingyi Zhang ; Jiaxing Huang ; Zhipeng Luo ; Gongjie Zhang ; Shijian Lu CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we adopt a one-stage detector and design DA-DETR, a simple yet effective domain adaptive object detection network that performs inter-domain alignment with a single discriminator. 62, TITLE: Convolutional Hough Matching Networks AUTHORS: Juhong Min ; Minsu Cho CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work we introduce a Hough transform perspective on convolutional matching and propose an effective geometric matching algorithm, dubbed Convolutional Hough Matching (CHM). 63, TITLE: Online Learning of A Probabilistic and Adaptive Scene Representation AUTHORS: Zike Yan ; Xin Wang ; Hongbin Zha CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this paper, we represent the scene with a Bayesian nonparametric mixture model, seamlessly describing per-point occupancy status with a continuous probability density function. 
64, TITLE: Spatial Content Alignment For Pose Transfer AUTHORS: Wing-Yin Yu ; Lai-Man Po ; Yuzhi Zhao ; Jingjing Xiong ; Kin-Wai Lau CATEGORY: cs.CV [cs.CV, cs.AI, cs.MM] HIGHLIGHT: In this paper, we propose a novel framework Spatial Content Alignment GAN (SCAGAN) which aims to enhance the content consistency of garment textures and the details of human characteristics. 65, TITLE: Deep Simultaneous Optimisation of Sampling and Reconstruction for Multi-contrast MRI AUTHORS: XINWEN LIU et. al. CATEGORY: cs.CV [cs.CV, eess.IV] HIGHLIGHT: We propose an algorithm that generates the optimised sampling pattern and reconstruction scheme of one contrast (e.g. T2-weighted image) when images with different contrast (e.g. T1-weighted image) have been acquired. 66, TITLE: ReMix: Towards Image-to-Image Translation with Limited Data AUTHORS: Jie Cao ; Luanxuan Hou ; Ming-Hsuan Yang ; Ran He ; Zhenan Sun CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we propose a data augmentation method (ReMix) to tackle this issue. 67, TITLE: Facial Expression and Attributes Recognition Based on Multi-task Learning of Lightweight Neural Networks AUTHORS: Andrey V. Savchenko CATEGORY: cs.CV [cs.CV, 68T10] HIGHLIGHT: In this paper, we examine the multi-task training of lightweight convolutional neural networks for face identification and classification of facial attributes (age, gender, ethnicity) trained on cropped faces without margins. 68, TITLE: Deep Image Harmonization By Bridging The Reality Gap AUTHORS: WENYAN CONG et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To leverage both real-world images and rendered images, we propose a cross-domain harmonization network CharmNet to bridge the domain gap between two domains. 69, TITLE: The GIST and RIST of Iterative Self-Training for Semi-Supervised Segmentation AUTHORS: EU WERN TEH et. al. 
CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We consider the task of semi-supervised semantic segmentation, where we aim to produce pixel-wise semantic object masks given only a small number of human-labeled training examples. 70, TITLE: Channel-Based Attention for LCC Using Sentinel-2 Time Series AUTHORS: Hermann Courteille ; A. Benoît ; N Méger ; A Atto ; D. Ienco CATEGORY: cs.CV [cs.CV, cs.LG, cs.NE, eess.IV] HIGHLIGHT: An architecture expressing predictions with respect to input channels is thus proposed in this paper. 71, TITLE: Robust Facial Expression Recognition with Convolutional Visual Transformers AUTHORS: Fuyan Ma ; Bin Sun ; Shutao Li CATEGORY: cs.CV [cs.CV] HIGHLIGHT: Therefore, we propose Convolutional Visual Transformers to tackle FER in the wild by two main steps. 72, TITLE: Attention Map-guided Two-stage Anomaly Detection Using Hard Augmentation AUTHORS: Jou Won Song ; Kyeongbo Kong ; Ye In Park ; Suk-Ju Kang CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To alleviate this problem, this paper proposes a novel two-stage network consisting of an attention network and an anomaly detection GAN (ADGAN). 73, TITLE: Sparse Auxiliary Networks for Unified Monocular Depth Prediction and Completion AUTHORS: Vitor Guizilini ; Rares Ambrus ; Wolfram Burgard ; Adrien Gaidon CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: In this paper, we study the problem of predicting dense depth from a single RGB image (monodepth) with optional sparse measurements from low-cost active depth sensors. 74, TITLE: ICurb: Imitation Learning-based Detection of Road Curbs Using Aerial Images for Autonomous Driving AUTHORS: Zhenhua Xu ; Yuxiang Sun ; Ming Liu CATEGORY: cs.CV [cs.CV, cs.RO] HIGHLIGHT: We find that the visual appearances between road areas and off-road areas are usually different in aerial images, so we propose a novel solution to detect road curbs off-line using aerial images.
75, TITLE: Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding AUTHORS: Hao Zhou ; Chongyang Zhang ; Yan Luo ; Yanjun Chen ; Chuanping Hu CATEGORY: cs.CV [cs.CV, cs.AI] HIGHLIGHT: In this work, we propose a novel DeNet (Decoupling and De-bias) to embrace human uncertainty: Decoupling - We explicitly disentangle each query into a relation feature and a modified feature. 76, TITLE: Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data AUTHORS: Oscar Mañas ; Alexandre Lacoste ; Xavier Giro-i-Nieto ; David Vazquez ; Pau Rodriguez CATEGORY: cs.CV [cs.CV] HIGHLIGHT: In this work, we propose Seasonal Contrast (SeCo), an effective pipeline to leverage unlabeled data for in-domain pre-training of remote sensing representations. 77, TITLE: Unsupervised Disentanglement of Linear-Encoded Facial Semantics AUTHORS: Yutong Zheng ; Yu-Kai Huang ; Ran Tao ; Zhiqiang Shen ; Marios Savvides CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We propose a method to disentangle linear-encoded facial semantics from StyleGAN without external supervision. 78, TITLE: Geometric Unsupervised Domain Adaptation for Semantic Segmentation AUTHORS: Vitor Guizilini ; Jie Li ; Rares Ambrus ; Adrien Gaidon CATEGORY: cs.CV [cs.CV] HIGHLIGHT: We propose to use self-supervised monocular depth estimation as a proxy task to bridge this gap and improve sim-to-real unsupervised domain adaptation (UDA). 79, TITLE: Neural Response Interpretation Through The Lens of Critical Pathways AUTHORS: ASHKAN KHAKZAR et. al. CATEGORY: cs.CV [cs.CV, cs.LG] HIGHLIGHT: In this work, we discuss the problem of identifying these critical pathways and subsequently leverage them for interpreting the network's response to an input.
80, TITLE: Rank-One Prior: Toward Real-Time Scene Recovery AUTHORS: Jun Liu ; Ryan Wen Liu ; Jianing Sun ; Tieyong Zeng CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To improve visual quality under different weather/imaging conditions, we propose a real-time light correction method to recover the degraded scenes in the cases of sandstorms, underwater, and haze. 81, TITLE: Fixing The Teacher-Student Knowledge Discrepancy in Distillation AUTHORS: JIANGFAN HAN et. al. CATEGORY: cs.CV [cs.CV] HIGHLIGHT: To solve this problem, in this paper, we propose a novel student-dependent distillation method, knowledge consistent distillation, which makes teacher's knowledge more consistent with the student and provides the best suitable knowledge to different student networks for distillation. 82, TITLE: ArtFlow: Unbiased Image Style Transfer Via Reversible Neural Flows AUTHORS: JIE AN et. al. CATEGORY: cs.CV [cs.CV, eess.IV] HIGHLIGHT: In this paper, we propose ArtFlow to prevent content leak during universal style transfer. 83, TITLE: Exploiting Invariance in Training Deep Neural Networks AUTHORS: CHENGXI YE et. al. CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG] HIGHLIGHT: Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks. 84, TITLE: Attention, Please! A Survey of Neural Attention Models in Deep Learning AUTHORS: Alana de Santana Correia ; Esther Luna Colombini CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV, cs.RO] HIGHLIGHT: By critically analyzing 650 works, we describe the primary uses of attention in convolutional, recurrent networks and generative models, identifying common subgroups of uses and applications. 85, TITLE: Robustness Certification for Point Cloud Models AUTHORS: Tobias Lorenz ; Anian Ruoss ; Mislav Balunović
; Gagandeep Singh ; Martin Vechev CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV] HIGHLIGHT: In this work, we address this challenge and introduce 3DCertify, the first verifier able to certify robustness of point cloud models. 86, TITLE: Bit-Mixer: Mixed-precision Networks with Runtime Bit-width Selection AUTHORS: Adrian Bulat ; Georgios Tzimiropoulos CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV] HIGHLIGHT: In this work, we propose Bit-Mixer, the very first method to train a meta-quantized network where during test time any layer can change its bit-width without affecting at all the overall network's ability for highly accurate inference. 87, TITLE: Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos AUTHORS: Annie S. Chen ; Suraj Nair ; Chelsea Finn CATEGORY: cs.RO [cs.RO, cs.AI, cs.CV, cs.LG] HIGHLIGHT: In this work, we propose a simple approach, Domain-agnostic Video Discriminator (DVD), that learns multitask reward functions by training a discriminator to classify whether two videos are performing the same task, and can generalize by virtue of learning from a small amount of robot data with a broad dataset of human videos. 88, TITLE: Training Robust Deep Learning Models for Medical Imaging Tasks with Spectral Decoupling AUTHORS: Joona Pohjonen ; Carolin Stürenberg ; Antti Rannikko ; Tuomas Mirtti ; Esa Pitkänen CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: To address these challenges, we evaluate the utility of spectral decoupling in the context of medical image analysis. 89, TITLE: Classification of Hematoma: Joint Learning of Semantic Segmentation and Classification AUTHORS: Hokuto Hirano ; Tsuyoshi Okita CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG] HIGHLIGHT: This paper proposes the joint learning of semantic segmentation and classification and evaluates its performance.
90, TITLE: Learning Scalable $\ell_\infty$-constrained Near-lossless Image Compression Via Joint Lossy Image and Residual Compression AUTHORS: Yuanchao Bai ; Xianming Liu ; Wangmeng Zuo ; Yaowei Wang ; Xiangyang Ji CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: We propose a novel joint lossy image and residual compression framework for learning $\ell_\infty$-constrained near-lossless image compression. 91, TITLE: CNN-based Cardiac Motion Extraction to Generate Deformable Geometric Left Ventricle Myocardial Models from Cine MRI AUTHORS: Roshan Reddy Upendra ; Brian Jamison Wentz ; Richard Simon ; Suzanne M. Shontz ; Cristian A. Linte CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: Here, we propose a deep learning-based framework for the development of patient-specific geometric models of LV myocardium from cine cardiac MR images, using the Automated Cardiac Diagnosis Challenge (ACDC) dataset. 92, TITLE: Differentiable Deconvolution for Improved Stroke Perfusion Analysis AUTHORS: Ezequiel de la Rosa ; David Robben ; Diana M. Sima ; Jan S. Kirschke ; Bjoern Menze CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: In this work we propose an AIF selection approach that is optimized for maximal core lesion segmentation performance. 93, TITLE: Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging AUTHORS: Ilya Chugunov ; Seung-Hwan Baek ; Qiang Fu ; Wolfgang Heidrich ; Felix Heide CATEGORY: eess.IV [eess.IV, cs.CV] HIGHLIGHT: We introduce Mask-ToF, a method to reduce flying pixels (FP) in time-of-flight (ToF) depth captures. We develop a differentiable ToF simulator to jointly train a convolutional neural network to decode this information and produce high-fidelity, low-FP depth reconstructions.
94, TITLE: A Novel Deep ML Architecture By Integrating Visual Simultaneous Localization and Mapping (vSLAM) Into Mask R-CNN for Real-time Surgical Video Analysis AUTHORS: Ella Selina Lan CATEGORY: eess.IV [eess.IV, cs.CV, I.4.0] HIGHLIGHT: In this research, a novel machine learning architecture, RPM-CNN, is created to perform real-time surgical video analysis. 95, TITLE: HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images AUTHORS: SAVERIO VADACCHINO et. al. CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG] HIGHLIGHT: In this work, we present HAD-Net, a novel offline adversarial knowledge distillation (KD) technique, whereby a pre-trained teacher segmentation network, with access to all MRI sequences, teaches a student network, via hierarchical adversarial training, to better overcome the large domain shift presented when crucial images are absent during inference.