Paper Search
Search for papers in the selected category.
Category
Search Results
- (CVPR 2024) Style Aligned Image Generation via Shared Attention
- (CVPR 2024) One-step Diffusion with Distribution Matching Distillation
- (CVPR 2024) VideoBooth: Diffusion-based Video Generation with Image Prompts
- (CVPR 2024) FreeU: Free Lunch in Diffusion U-Net
- (CVPR 2024) StyLitGAN: Prompting StyleGAN to Produce New Illumination Conditions
- (CVPR 2024) Joint-task Regularization for Partially Labeled Multitask Learning
- (CVPR 2024) Shadows Don’t Lie and Lines Can’t Bend! Generative Models don’t know Projective Geometry...for now
- (CVPR 2024) Time-Efficient Light-Field Acquisition Using Coded Aperture and Events
- (CVPR 2024) FINER: Flexible spectral-bias tuning in Implicit Neural Representation by Variable-periodic Activation Functions
- (CVPR 2024) DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes
- (CVPR 2024) Long-Tailed Anomaly Detection with Learnable Class Names
- (CVPR 2024) Towards Backward-Compatible Continual Learning of Image Compression
- (CVPR 2024) SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
- (CVPR 2024) YOLO-World: Real-Time Open-Vocabulary Object Detection
- (CVPR 2024) InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models
- (CVPR 2024) TextCraftor: Your Text Encoder Can be Image Quality Controller
- (CVPR 2024) Beyond Textual Constraints: Learning Novel Diffusion Conditions with Fewer Examples
- (CVPR 2024) SUGAR: Pre-training 3D Visual Representations for Robotics
- (CVPR 2024) Mip-Splatting: Alias-free 3D Gaussian Splatting
- (CVPR 2024) DETRs Beat YOLOs on Real-time Object Detection
- (CVPR 2024) Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis
- (CVPR 2024) FedAS: Bridging Inconsistency in Personalized Federated Learning
- (CVPR 2024) PNeRV: Enhancing Spatial Consistency via Pyramidal Neural Representation for Videos
- (CVPR 2024) Multi-Scale 3D Gaussian Splatting for Anti-Aliased Rendering
- (CVPR 2024) C3: High-performance and low-complexity neural compression from a single image or video
- (CVPR 2023) MOSO: Decomposing MOtion, Scene and Object for Video Prediction
- (CVPR 2023) Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
- (CVPR 2023) Learned Image Compression with Mixed Transformer-CNN Architectures
- (CVPR 2021) End-to-End Object Detection with Fully Convolutional Network
- (CVPR 2021) Checkerboard Context Model for Efficient Learned Image Compression
- (ICLR 2024) Language Model Beats Diffusion - Tokenizer is Key to Visual Generation
- (ICLR 2024) VDT: General-Purpose Video Diffusion Transformers via Mask Modeling
- (ICLR 2024) Vision Transformers Need Registers
- (ICLR 2023) DINO: DETR with Improved Denoising Anchor Boxes for End-to-End Object Detection
- (ICLR 2023) Prompt-to-Prompt Image Editing with Cross Attention Control
- (ICLR 2019) DARTS: Differentiable Architecture Search
- (ICLR 2018) Variational image compression with a scale hyperprior
- (ICLR 2017) Neural Architecture Search with Reinforcement Learning
- (ICLR 2021 workshop) COIN: COmpression with Implicit Neural representations
- (ICCV 2023) Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
- (ICCV 2023) Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
- (ICCV 2023) Adding Conditional Control to Text-to-Image Diffusion Models
- (ICCV 2023) Video Object Segmentation-aware Video Frame Interpolation
- (ICCV 2023) COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec (cool-chic v1.0)
- (NeurIPS 2024) YOLOv10: Real-Time End-to-End Object Detection
- (NeurIPS 2023) Towards Efficient Image Compression Without Autoregressive Models
- (NeurIPS 2022) Flexible Diffusion Modeling of Long Videos
- (NeurIPS 2022) Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
- (NeurIPS 2018) Joint autoregressive and hierarchical priors for learned image compression
- (ICASSP 2023) Hybrid Neural Network With Cross- and Self-Module Attention Pooling For Text-Independent Speaker Verification
- (ICASSP 2023) Improving Music Genre Classification from Multi-Modal Properties of Music and Genre Correlations Perspective
- (ICASSP 2023) HiSSNet: Sound Event Detection and Speaker Identification via Hierarchical Prototypical Networks for Low-Resource Headphones
- (ICASSP 2021) Image Coding for Machines: an End-To-End Learned Approach
- (ECCV 2024) Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
- (ECCV 2024) GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting
- (ICIP 2022) Deep Feature Compression Using Rate-Distortion Optimization Guided Autoencoder
- (ICIP 2021) An Efficient Image Compression Method Based on Neural Network: An Overfitting Approach
- (ICIP 2018) Video Error Concealment Using Deep Neural Networks
- (ICML 2024) Fast Timing-Conditioned Latent Audio Diffusion
- (ICML 2023) AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
- (MMSP 2023) Region of Interest Enabled Learned Image Coding for Machines
- (MMSP 2023) Low-complexity Overfitted Neural Image Codec (cool-chic v2.0)
- (WACV 2024) Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model
- (SIGGRAPH 2023) 3D Gaussian Splatting for Real-Time Radiance Field Rendering
- (MM 2023) ICMH-Net: Neural Image Compression Towards both Machine Vision and Human Vision
- (EUSIPCO 2024) Overfitted image coding at reduced complexity (cool-chic v3.2)
- (ISSC 2023) A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality
- (ICDCS 2023) Edge-Cloud Collaborated Object Detection via Difficult-Case Discriminator
- (TMLR 2022) COIN++: Neural Compression Across Modalities
- (ISM 2021) Learned Enhancement Filters for Image Coding for Machines
- (ACSSC 2014) Weighted boundary matching error concealment for HEVC using block partition decisions
- (arXiv 2024) Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
- (arXiv 2024) Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like
- (arXiv 2024) ControlNeXt: Powerful and Efficient Control for Image and Video Generation
- (arXiv 2023) SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
