https://github.com/gbstack/cvpr-2022-papers
- Confidence Propagation Cluster: Unleash Full Potential of Object Detectors. Links: paper
- Semantic-aligned Fusion Transformer for One-shot Object Detection. Links: paper
- A Dual Weighting Label Assignment Scheme for Object Detection. Links: paper | code
- MUM: Mix Image Tiles and UnMix Feature Tiles for Semi-Supervised Object Detection. Links: paper | code
- SIGMA: Semantic-complete Graph Matching for Domain Adaptive Object Detection. Links: paper | code
- Accelerating DETR Convergence via Semantic-Aligned Matching. Links: paper | code
- Focal and Global Knowledge Distillation for Detectors. Keywords: Object Detection, Knowledge Distillation. Links: paper | code
- Unknown-Aware Object Detection: Learning What You Don’t Know from Videos in the Wild. Links: paper | code
- Localization Distillation for Dense Object Detection. Keywords: Bounding Box Regression, Localization Quality Estimation, Knowledge Distillation. Links: paper | code

- Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering. Links: paper

- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers. Links: paper | code
- Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds. Links: paper | code
- Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion. Links: paper
- MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer. Links: paper | code
- Voxel Set Transformer: A Set-to-Set Approach to 3D Object Detection from Point Clouds. Links: paper | code
- VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention. Links: paper | code
- MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection. Links: paper | code
- DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection. Links: paper | code
- Point Density-Aware Voxels for LiDAR 3D Object Detection. Links: paper | code
- Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement. Links: paper | code
- Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes. Links: paper | code
- A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation. Keywords: 3D Object Detection with Point-based Methods, 3D Object Detection with Grid-based Methods, Cluster-free 3D Panoptic Segmentation, CenterPoint 3D Object Detection. Links: paper
- Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving. Keywords: Autonomous Driving, Monocular 3D Object Detection. Links: paper | code

- Implicit Motion Handling for Video Camouflaged Object Detection. Links: paper
- Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection. Links: paper | code

- Bi-directional Object-context Prioritization Learning for Saliency Ranking. Links: paper | code
- Democracy Does Matter: Comprehensive Feature Mining for Co-Salient Object Detection. Links: paper

- UKPGAN: A General Self-Supervised Keypoint Detector. Links: paper | code

- CLRNet: Cross Layer Refinement Network for Lane Detection. Links: paper
- Rethinking Efficient Lane Detection via Curve Modeling. Keywords: Segmentation-based Lane Detection, Point Detection-based Lane Detection, Curve-based Lane Detection, Autonomous Driving. Links: paper | code

- EDTER: Edge Detection with Transformer. Links: paper | code

- Deep vanishing point detection: Geometric priors make dataset variations vanish. Links: paper | code

- Learning What Not to Segment: A New Perspective on Few-Shot Segmentation. Links: paper | code
- CRIS: CLIP-Driven Referring Image Segmentation. Links: paper
- Hyperbolic Image Segmentation. Links: paper

- Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers. Links: paper | code
- Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation. Keywords: Semantic and panoramic segmentation, Unsupervised domain adaptation, Transformer. Links: paper | code

- Class-Balanced Pixel-Level Self-Labeling for Domain Adaptive Semantic Segmentation. Links: paper | code
- Regional Semantic Contrast and Aggregation for Weakly Supervised Semantic Segmentation. Links: paper | code
- Tree Energy Loss: Towards Sparsely Annotated Semantic Segmentation. Links: paper | code
- Scribble-Supervised LiDAR Semantic Segmentation. Links: paper | code
- ADAS: A Direct Adaptation Strategy for Multi-Target Domain Adaptive Semantic Segmentation. Links: paper
- Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast. Links: paper
- Representation Compensation Networks for Continual Semantic Segmentation. Links: paper | code
- Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels. Links: paper | code
- Weakly Supervised Semantic Segmentation using Out-of-Distribution Data. Links: paper | code
- Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation. Links: paper | code
- Multi-class Token Transformer for Weakly Supervised Semantic Segmentation. Links: paper | code
- Cross Language Image Matching for Weakly Supervised Semantic Segmentation. Links: paper
- Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers. Links: paper | code
- ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation. Keywords: Semi-supervised learning, Semantic segmentation, Uncertainty estimation. Links: paper | code
- Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation. Links: paper | code

- ContrastMask: Contrastive Learning to Segment Every Thing. Links: paper
- Discovering Objects that Can Move. Links: paper | code
- E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation. Links: paper | code
- Efficient Video Instance Segmentation via Tracklet Query and Proposal. Links: paper
- SoftGroup for 3D Instance Segmentation on Point Clouds. Keywords: 3D Vision, Point Clouds, Instance Segmentation. Links: paper | code

- Language as Queries for Referring Video Object Segmentation. Links: paper | code

- DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting. Links: paper | code

- Neural Compression-Based Feature Learning for Video Restoration. Links: paper

- M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers. Links: paper

- Depth-Aware Generative Adversarial Network for Talking Head Video Generation. Links: paper | code
- Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning. Links: paper | code

- Global Matching with Overlapping Attention for Optical Flow Estimation. Links: paper | code
- CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation. Links: paper

- Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation. Links: paper
- Depth Estimation by Combining Binocular Stereo and Monocular Structured-Light. Links: paper | code
- RGB-Depth Fusion GAN for Indoor Depth Completion. Links: paper
- Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective. Links: paper
- Deep Depth from Focus with Differential Focus Volume. Links: paper
- ChiTransformer: Towards Reliable Stereo from Cues. Links: paper
- Rethinking Depth Estimation for Multi-View Stereo: A Unified Representation and Focal Loss. Links: paper | code
- ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks. Keywords: Learning-based Stereo Matching Networks, Single Domain Generalization, Shortcut Learning. Links: paper
- Attention Concatenation Volume for Accurate and Efficient Stereo Matching. Keywords: Stereo Matching, Cost Volume Construction, Cost Aggregation. Links: paper | code
- Occlusion-Aware Cost Constructor for Light Field Depth Estimation. Links: paper | [code](https://github.com/YingqianWang/OACC-Net)
- NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation. Keywords: Neural CRFs for Monocular Depth. Links: paper
- OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion. Keywords: Monocular Depth Estimation, Transformer. Links: paper

- Ray3D: ray-based 3D human pose estimation for monocular absolute 3D localization. Links: paper | code
- Capturing Humans in Motion: Temporal-Attentive 3D Human Pose and Shape Estimation from Monocular Video. Links: paper
- Physical Inertial Poser (PIP): Physics-aware Real-time Human Motion Tracking from Sparse Inertial Sensors. Links: paper
- Distribution-Aware Single-Stage Models for Multi-Person 3D Pose Estimation. Links: paper
- MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation. Links: paper | code
- CDGNet: Class Distribution Guided Network for Human Parsing. Links: paper
- Forecasting Characteristic 3D Poses of Human Actions. Links: paper
- Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation. Keywords: Top-Down Pose Estimation, Limb-based Grouping, Direct Regression. Links: paper
- MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video. Links: paper