Paper Search

Search for papers in the selected category.

Search Results

(CVPR 2024) Style Aligned Image Generation via Shared Attention
(CVPR 2024) One-step Diffusion with Distribution Matching Distillation
(CVPR 2024) VideoBooth: Diffusion-based Video Generation with Image Prompts
(CVPR 2024) FreeU: Free Lunch in Diffusion U-Net
(CVPR 2024) StyLitGAN: Prompting StyleGAN to Produce New Illumination Conditions
(CVPR 2024) Joint-task Regularization for Partially Labeled Multitask Learning
(CVPR 2024) Shadows Don’t Lie and Lines Can’t Bend! Generative Models don’t know Projective Geometry...for now
(CVPR 2024) Time-Efficient Light-Field Acquisition Using Coded Aperture and Events
(CVPR 2024) FINER: Flexible spectral-bias tuning in Implicit Neural Representation by Variable-periodic Activation Functions
(CVPR 2024) DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes
(CVPR 2024) Long-Tailed Anomaly Detection with Learnable Class Names
(CVPR 2024) Towards Backward-Compatible Continual Learning of Image Compression
(CVPR 2024) SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
(CVPR 2024) YOLO-World: Real-Time Open-Vocabulary Object Detection
(CVPR 2024) InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models
(CVPR 2024) TextCraftor: Your Text Encoder Can be Image Quality Controller
(CVPR 2024) Beyond Textual Constraints: Learning Novel Diffusion Conditions with Fewer Examples
(CVPR 2024) SUGAR: Pre-training 3D Visual Representations for Robotics
(CVPR 2024) Mip-Splatting: Alias-free 3D Gaussian Splatting
(CVPR 2024) DETRs Beat YOLOs on Real-time Object Detection
(CVPR 2024) Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis
(CVPR 2024) FedAS: Bridging Inconsistency in Personalized Federated Learning
(CVPR 2024) PNeRV: Enhancing Spatial Consistency via Pyramidal Neural Representation for Videos
(CVPR 2024) Multi-Scale 3D Gaussian Splatting for Anti-Aliased Rendering
(CVPR 2024) C3: High-performance and low-complexity neural compression from a single image or video
(CVPR 2023) MOSO: Decomposing MOtion, Scene and Object for Video Prediction
(CVPR 2023) Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
(CVPR 2023) Learned Image Compression with Mixed Transformer-CNN Architectures
(CVPR 2021) End-to-End Object Detection with Fully Convolutional Network
(CVPR 2021) Checkerboard Context Model for Efficient Learned Image Compression
(ICLR 2024) Language Model Beats Diffusion - Tokenizer is Key to Visual Generation
(ICLR 2024) VDT: General-Purpose Video Diffusion Transformers via Mask Modeling
(ICLR 2024) Vision Transformers Need Registers
(ICLR 2023) DINO: DETR with Improved Denoising Anchor Boxes for End-to-End Object Detection
(ICLR 2022) Prompt-to-Prompt Image Editing with Cross Attention Control
(ICLR 2019) DARTS: Differentiable Architecture Search
(ICLR 2018) Variational image compression with a scale hyperprior
(ICLR 2017) Neural Architecture Search with Reinforcement Learning
(ICLR 2021 workshop) COIN: COmpression with Implicit Neural representations
(ICCV 2023) Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
(ICCV 2023) Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
(ICCV 2023) Adding Conditional Control to Text-to-Image Diffusion Models
(ICCV 2023) Video Object Segmentation-aware Video Frame Interpolation
(ICCV 2023) COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec (cool-chic v1.0)
(NeurIPS 2024) YOLOv10: Real-Time End-to-End Object Detection
(NeurIPS 2023) Towards Efficient Image Compression Without Autoregressive Models
(NeurIPS 2022) Flexible Diffusion Modeling of Long Videos
(NeurIPS 2022) Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
(NeurIPS 2018) Joint autoregressive and hierarchical priors for learned image compression
(ICASSP 2023) Hybrid Neural Network With Cross- and Self-Module Attention Pooling For Text-Independent Speaker Verification
(ICASSP 2023) Improving Music Genre Classification from Multi-Modal Properties of Music and Genre Correlations Perspective
(ICASSP 2023) HiSSNet: Sound Event Detection and Speaker Identification via Hierarchical Prototypical Networks for Low-Resource Headphones
(ICASSP 2021) Image Coding for Machines: an End-To-End Learned Approach
(ECCV 2024) Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
(ECCV 2024) GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting
(ICIP 2022) Deep Feature Compression Using Rate-Distortion Optimization Guided Autoencoder
(ICIP 2021) An Efficient Image Compression Method Based on Neural Network: An Overfitting Approach
(ICIP 2018) Video Error Concealment Using Deep Neural Networks
(ICML 2024) Fast Timing-Conditioned Latent Audio Diffusion
(ICML 2023) AudioLDM: Text-to-Audio Generation with Latent Diffusion Models
(MMSP 2023) Region of Interest Enabled Learned Image Coding for Machines
(MMSP 2023) Low-complexity Overfitted Neural Image Codec (cool-chic v2.0)
(WACV 2024) Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model
(SIGGRAPH 2023) 3D Gaussian Splatting for Real-Time Radiance Field Rendering
(MM 2023) ICMH-Net: Neural Image Compression Towards both Machine Vision and Human Vision
(EUSIPCO 2024) Overfitted image coding at reduced complexity (cool-chic v3.2)
(ISSC 2023) A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality
(ICDCS 2023) Edge-Cloud Collaborated Object Detection via Difficult-Case Discriminator
(TMLR 2022) COIN++: Neural Compression Across Modalities
(ISM 2021) Learned Enhancement Filters for Image Coding for Machines
(ACSSC 2014) Weighted boundary matching error concealment for HEVC using block partition decisions
(arXiv 2024) Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
(arXiv 2024) Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like
(arXiv 2024) ControlNeXt: Powerful and Efficient Control for Image and Video Generation
(arXiv 2023) SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models
