Publications

On the Dangers of Bootstrapping Generation for Continual Learning and Beyond

Zverev D, Koepke AS & Henriques JF (2026), Lecture Notes in Computer Science, 16125 LNCS, 237-250

BibTeX

@inproceedings{onthedangersofb-2026/1,
  title={On the Dangers of Bootstrapping Generation for Continual Learning and Beyond},
  author={Zverev, D. and Koepke, A. S. and Henriques, J. F.},
  series={Lecture Notes in Computer Science},
  volume={16125},
  pages={237-250},
  year = "2026"
}

Rapidvol: Rapid Reconstruction of 3D Ultrasound Volumes from Sensorless 2D Scans

Eid MC, Yeung P-H, Wyburd MK, Henriques JF & Namburete AIL (2025), 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI), 1-5

BibTeX

@inproceedings{rapidvolrapidre-2025/4,
  title={Rapidvol: Rapid Reconstruction of 3D Ultrasound Volumes from Sensorless 2D Scans},
  author={Eid, M. C. and Yeung, P.-H. and Wyburd, M. K. and Henriques, J. F. and Namburete, A. I. L.},
  booktitle={2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI)},
  pages={1-5},
  year = "2025"
}

VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning

Liang Y, Kumar N, Tang H, Weller A, Tenenbaum JB et al. (2025), 13th International Conference on Learning Representations (ICLR 2025), 71952-71980

BibTeX

@inproceedings{visualpredicato-2025/1,
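  title={VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning},
  author={Liang, Y. and Kumar, N. and Tang, H. and Weller, A. and Tenenbaum, J. B. and others},
  booktitle={13th International Conference on Learning Representations (ICLR 2025)},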
  pages={71952-71980},
  year = "2025"
}

GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers

Prospero L, Hamdi A, Henriques JF & Rupprecht C (2025), IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 5997-6007

BibTeX

@inproceedings{gstprecisedhuma-2025/1,
  title={GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers},
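  author={Prospero, L. and Hamdi, A. and Henriques, J. F. and Rupprecht, C.},
  booktitle={IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops},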
  pages={5997-6007},
  year = "2025"
}

TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs

Nuti F, Franzmeyer T & Henriques J (2025), Proceedings of Machine Learning Research, 267, 46837-46876

BibTeX

@inproceedings{tucomeasuringth-2025/1,
  title={TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs},
  author={Nuti, F. and Franzmeyer, T. and Henriques, J.},
  series={Proceedings of Machine Learning Research},
  volume={267},
  pages={46837-46876},
  year = "2025"
}

3D-aware instance segmentation and tracking in egocentric videos

Bhalgat Y, Tschernezki V, Laina I, Henriques J, Vedaldi A et al. (2024), 347-364

BibTeX

@misc{dawareinstances-2024/12,
  title={3D-aware instance segmentation and tracking in egocentric videos},
  author={Bhalgat, Y. and Tschernezki, V. and Laina, I. and Henriques, J. and Vedaldi, A. and others},
  year = "2024"
}

3D-aware instance segmentation and tracking in egocentric videos

Bhalgat Y, Tschernezki V, Laina I, Henriques J, Vedaldi A et al. (2024), Computer Vision – ACCV 2024, 347-364

BibTeX

@inproceedings{dawareinstances-2024/12,
  title={3D-aware instance segmentation and tracking in egocentric videos},
  author={Bhalgat, Y. and Tschernezki, V. and Laina, I. and Henriques, J. and Vedaldi, A. and others},
  booktitle={17th Asian Conference on Computer Vision (ACCV 2024)},
  pages={347-364},
  year = "2024"
}

N2F2: hierarchical scene understanding with nested neural feature fields

Bhalgat Y, Laina I, Henriques J, Zisserman A & Vedaldi A (2024), Computer Vision – ECCV 2024 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LIX, 197-214

BibTeX

@inproceedings{nfhierarchicals-2024/11,
  title={N2F2: hierarchical scene understanding with nested neural feature fields},
  author={Bhalgat, Y. and Laina, I. and Henriques, J. and Zisserman, A. and Vedaldi, A.},
  booktitle={18th European Conference on Computer Vision (ECCV 2024)},
  pages={197-214},
  year = "2024"
}

Contrastive lift: 3D object instance segmentation by slow-fast contrastive fusion

Bhalgat Y, Laina I, Henriques J, Zisserman A & Vedaldi A (2024), Advances in Neural Information Processing Systems 36, 9092

BibTeX

@inproceedings{contrastivelift-2024/10,
  title={Contrastive lift: 3D object instance segmentation by slow-fast contrastive fusion},
  author={Bhalgat, Y. and Laina, I. and Henriques, J. and Zisserman, A. and Vedaldi, A.},
  booktitle={37th Conference on Neural Information Processing Systems (NeurIPS 2023)},
  pages={9092},
  year = "2024"
}

Dissecting Temporal Understanding in Text-to-Audio Retrieval

Oncescu A-M, Henriques JF & Koepke AS (2024), 9535-9543

BibTeX

@inproceedings{dissectingtempo-2024/10,
  title={Dissecting Temporal Understanding in Text-to-Audio Retrieval},
  author={Oncescu, A.-M. and Henriques, J. F. and Koepke, A. S.},
  booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
  pages={9535-9543},
  year = "2024"
}

HelloFresh: LLM evaluations on streams of real-world human editorial actions across X community notes and Wikipedia edits

Franzmeyer T, Shtedritski A, Albanie S, Torr P, Henriques JF et al. (2024), Findings of the Association for Computational Linguistics: ACL 2024, 12702-12716

BibTeX

@inproceedings{hellofreshllmev-2024/9,
  title={HelloFresh: LLM evaluations on streams of real-world human editorial actions across X community notes and Wikipedia edits},
  author={Franzmeyer, T. and Shtedritski, A. and Albanie, S. and Torr, P. and Henriques, J. F. and others},
  booktitle={62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)},
  pages={12702-12716},
  year = "2024"
}

3D-aware instance segmentation and tracking in egocentric videos

Bhalgat Y, Tschernezki V, Laina I, Henriques J, Vedaldi A et al. (2024)

BibTeX

@misc{dawareinstances-2024/8,
  title={3D-aware instance segmentation and tracking in egocentric videos},
  author={Bhalgat, Y. and Tschernezki, V. and Laina, I. and Henriques, J. and Vedaldi, A. and others},
  year = "2024"
}

Neural fields for co-reconstructing 3D objects from incidental 2D data

Campbell D, Insafutdinov E, Henriques JF & Vedaldi A (2024), Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2024), 2883-2893

BibTeX

@inproceedings{neuralfieldsfor-2024/6,
  title={Neural fields for co-reconstructing 3D objects from incidental 2D data},
  author={Campbell, D. and Insafutdinov, E. and Henriques, J. F. and Vedaldi, A.},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2024)},
  pages={2883-2893},
  year = "2024"
}

Rapid Motor Adaptation for Robotic Manipulator Arms

Liang Y, Ellis K & Henriques J (2024), 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 16404-16413

BibTeX

@inproceedings{rapidmotoradapt-2024/6,
  title={Rapid Motor Adaptation for Robotic Manipulator Arms},
  author={Liang, Y. and Ellis, K. and Henriques, J.},
  booktitle={2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={16404-16413},
  year = "2024"
}

Flash3D: feed-forward generalisable 3D scene reconstruction from a single image

Szymanowicz S, Insafutdinov E, Zheng C, Campbell D, Henriques JF et al. (2024)

BibTeX

@misc{flashdfeedforwa-2024/6,
  title={Flash3D: feed-forward generalisable 3D scene reconstruction from a single image},
  author={Szymanowicz, S. and Insafutdinov, E. and Zheng, C. and Campbell, D. and Henriques, J. F. and others},
  year = "2024"
}

Text2Loc: 3D Point Cloud Localization from Natural Language

Xia Y, Shi L, Ding Z, Henriques JF & Cremers D (2024), 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14958-14967

BibTeX

@inproceedings{textlocdpointcl-2024/6,
  title={Text2Loc: 3D Point Cloud Localization from Natural Language},
  author={Xia, Y. and Shi, L. and Ding, Z. and Henriques, J. F. and Cremers, D.},
  booktitle={2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={14958-14967},
  year = "2024"
}

Select to Perfect: Imitating desired behavior from large multi-agent data

Franzmeyer T, Elkind E, Torr P, Foerster J & Henriques J (2024)

BibTeX

@misc{selecttoperfect-2024/5,
  title={Select to Perfect: Imitating desired behavior from large multi-agent data},
  author={Franzmeyer, T. and Elkind, E. and Torr, P. and Foerster, J. and Henriques, J.},
  year = "2024"
}

A sound approach: using large language models to generate audio descriptions for egocentric text-audio retrieval

Oncescu A-M, Henriques JF, Zisserman A, Albanie S & Koepke AS (2024), Proceedings of the 49th IEEE International Conference on Acoustics, Speech, & Signal Processing (ICASSP 2024), 7300-7304

BibTeX

@inproceedings{asoundapproachu-2024/4,
  title={A sound approach: using large language models to generate audio descriptions for egocentric text-audio retrieval},
  author={Oncescu, A.-M. and Henriques, J. F. and Zisserman, A. and Albanie, S. and Koepke, A. S.},
  booktitle={49th IEEE International Conference on Acoustics, Speech, \& Signal Processing (ICASSP 2024)},
  pages={7300-7304},
  year = "2024"
}

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

Bhalgat Y, Laina I, Henriques JF, Zisserman A & Vedaldi A (2024)

BibTeX

@misc{nfhierarchicals-2024/3,
  title={N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields},
  author={Bhalgat, Y. and Laina, I. and Henriques, J. F. and Zisserman, A. and Vedaldi, A.},
  year = "2024"
}

SCENES: Subpixel Correspondence Estimation With Epipolar Supervision

Kloepfer DA, Henriques JF & Campbell D (2024), 2024 International Conference on 3D Vision (3DV), 21-30

BibTeX

@inproceedings{scenessubpixelc-2024/3,
  title={SCENES: Subpixel Correspondence Estimation With Epipolar Supervision},
  author={Kloepfer, D. A. and Henriques, J. F. and Campbell, D.},
  booktitle={2024 International Conference on 3D Vision (3DV)},
  pages={21-30},
  year = "2024"
}

Select to perfect: imitating desired behavior from large multi-agent data

Franzmeyer T, Elkind E, Torr P, Foerster J & Henriques JF (2024), Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)

BibTeX

@inproceedings{selecttoperfect-2024/1,
  title={Select to perfect: imitating desired behavior from large multi-agent data},
  author={Franzmeyer, T. and Elkind, E. and Torr, P. and Foerster, J. and Henriques, J. F.},
  booktitle={12th International Conference on Learning Representations (ICLR 2024)},
  year = "2024"
}

Illusory attacks: information-theoretic detectability matters in adversarial attacks

Franzmeyer T, McAleer S, Henriques JF, Foerster J, Torr P et al. (2024), Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)

BibTeX

@inproceedings{illusoryattacks-2024/1,
  title={Illusory attacks: information-theoretic detectability matters in adversarial attacks},
  author={Franzmeyer, T. and McAleer, S. and Henriques, J. F. and Foerster, J. and Torr, P. and others},
  booktitle={12th International Conference on Learning Representations (ICLR 2024)},
  year = "2024"
}

Unsupervised Object Detection with Theoretical Guarantees

Longa M & Henriques JF (2024), Advances in Neural Information Processing Systems, 37

BibTeX

@inproceedings{unsupervisedobj-2024/1,
  title={Unsupervised Object Detection with Theoretical Guarantees},
  author={Longa, M. and Henriques, J. F.},
  booktitle={Advances in Neural Information Processing Systems},
  volume={37},
  year = "2024"
}

LoCo: Learning 3D Location-Consistent Image Features with a Memory-Efficient Ranking Loss

Kloepfer DA, Henriques J & Campbell D (2024), Advances in Neural Information Processing Systems, 37

BibTeX

@inproceedings{locolearningdlo-2024/1,
  title={LoCo: Learning 3D Location-Consistent Image Features with a Memory-Efficient Ranking Loss},
  author={Kloepfer, D. A. and Henriques, J. and Campbell, D.},
  booktitle={Advances in Neural Information Processing Systems},
  volume={37},
  year = "2024"
}

Interpretable Representation Learning from Videos using Nonlinear Priors

Longa M & Henriques JF (2024), 35th British Machine Vision Conference (BMVC 2024)

BibTeX

@inproceedings{interpretablere-2024/1,
  title={Interpretable Representation Learning from Videos using Nonlinear Priors},
  author={Longa, M. and Henriques, J. F.},
  booktitle={35th British Machine Vision Conference (BMVC 2024)},
  year = "2024"
}

LoCUS: Learning Multiscale 3D-consistent Features from Posed Images

Kloepfer DA, Campbell D & Henriques JF (2023), 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 16588-16598

BibTeX

@inproceedings{locuslearningmu-2023/10,
  title={LoCUS: Learning Multiscale 3D-consistent Features from Posed Images},
  author={Kloepfer, D. A. and Campbell, D. and Henriques, J. F.},
  booktitle={2023 IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={16588-16598},
  year = "2023"
}

CASSPR: Cross Attention Single Scan Place Recognition

Xia Y, Gladkova M, Wang R, Li Q, Stilla U et al. (2023), 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 8427-8438

BibTeX

@inproceedings{cassprcrossatte-2023/10,
  title={CASSPR: Cross Attention Single Scan Place Recognition},
  author={Xia, Y. and Gladkova, M. and Wang, R. and Li, Q. and Stilla, U. and others},
  booktitle={2023 IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={8427-8438},
  year = "2023"
}

RbA: Segmenting Unknown Regions Rejected by All

Nayal N, Yavuz M, Henriques JF & Güney F (2023), 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 711-722

BibTeX

@inproceedings{rbasegmentingun-2023/10,
  title={RbA: Segmenting Unknown Regions Rejected by All},
  author={Nayal, N. and Yavuz, M. and Henriques, J. F. and Güney, F.},
  booktitle={2023 IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={711-722},
  year = "2023"
}

A light touch approach to teaching transformers multi-view geometry

Bhalgat Y, Henriques J & Zisserman A (2023), Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR 2023), 4958-4969

BibTeX

@inproceedings{alighttouchappr-2023/8,
  title={A light touch approach to teaching transformers multi-view geometry},
  author={Bhalgat, Y. and Henriques, J. and Zisserman, A.},
  booktitle={Conference on Computer Vision and Pattern Recognition (CVPR 2023)},
  pages={4958-4969},
  year = "2023"
}

Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion

Bhalgat Y, Laina I, Henriques JF, Zisserman A & Vedaldi A (2023)

BibTeX

@misc{contrastivelift-2023/6,
  title={Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion},
  author={Bhalgat, Y. and Laina, I. and Henriques, J. F. and Zisserman, A. and Vedaldi, A.},
  year = "2023"
}

Learn what matters: cross-domain imitation learning with task-relevant embeddings

Franzmeyer T, Torr P & Henriques J (2023), Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 35

BibTeX

@inproceedings{learnwhatmatter-2023/4,
  title={Learn what matters: cross-domain imitation learning with task-relevant embeddings},
  author={Franzmeyer, T. and Torr, P. and Henriques, J.},
  booktitle={36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022)},
  year = "2023"
}

Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition

Milani S, Kanervisto A, Ramanauskas K, Schulhoff S, Houghton B et al. (2023), Proceedings of Machine Learning Research, 220, 171-188

BibTeX

@inproceedings{towardssolvingf-2023/1,
  title={Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition},
  author={Milani, S. and Kanervisto, A. and Ramanauskas, K. and Schulhoff, S. and Houghton, B. and others},
  series={Proceedings of Machine Learning Research},
  volume={220},
  pages={171-188},
  year = "2023"
}

Extracting Reward Functions from Diffusion Models

Nuti F, Franzmeyer T & Henriques JF (2023), Advances in Neural Information Processing Systems, 36

BibTeX

@inproceedings{extractingrewar-2023/1,
  title={Extracting Reward Functions from Diffusion Models},
  author={Nuti, F. and Franzmeyer, T. and Henriques, J. F.},
  booktitle={Advances in Neural Information Processing Systems},
  volume={36},
  year = "2023"
}

Generalised Lookahead Optimiser

Oncescu CA, Henriques JF & Valmadre J (2023), 1st Tiny Papers Track at ICLR 2023 (Tiny Papers @ ICLR 2023)

BibTeX

@inproceedings{generalisedlook-2023/1,
  title={Generalised Lookahead Optimiser},
  author={Oncescu, C. A. and Henriques, J. F. and Valmadre, J.},
  booktitle={1st Tiny Papers Track at ICLR 2023 (Tiny Papers @ ICLR 2023)},
  year = "2023"
}

SNeS: learning probably symmetric neural surfaces from incomplete data

Insafutdinov E, Campbell D & Vedaldi A (2022)

BibTeX

@inproceedings{sneslearningpro-2022/11,
  title={SNeS: learning probably symmetric neural surfaces from incomplete data},
  author={Insafutdinov, E. and Campbell, D. and Vedaldi, A.},
  booktitle={European Conference on Computer Vision 2022},
  year = "2022"
}

RbA: Segmenting Unknown Regions Rejected by All

Nayal N, Yavuz M, Henriques JF & Güney F (2022)

BibTeX

@misc{rbasegmentingun-2022/11,
  title={RbA: Segmenting Unknown Regions Rejected by All},
  author={Nayal, N. and Yavuz, M. and Henriques, J. F. and Güney, F.},
  year = "2022"
}

Towards real-world navigation with deep differentiable planners

Ishida S & Henriques JF (2022), Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 17306-17315

BibTeX

@inproceedings{towardsrealworl-2022/9,
  title={Towards real-world navigation with deep differentiable planners},
  author={Ishida, S. and Henriques, J. F.},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)},
  pages={17306-17315},
  year = "2022"
}

SNeS: learning probably symmetric neural surfaces from incomplete data

Insafutdinov E, Campbell D, Henriques JF & Vedaldi A (2022)

BibTeX

@misc{sneslearningpro-2022/6,
  title={SNeS: learning probably symmetric neural surfaces from incomplete data},
  author={Insafutdinov, E. and Campbell, D. and Henriques, J. F. and Vedaldi, A.},
  year = "2022"
}

Learning altruistic behaviours in reinforcement learning without external rewards

Franzmeyer T, Malinowski M & Henriques JF (2022)

BibTeX

@inproceedings{learningaltruis-2022/4,
  title={Learning altruistic behaviours in reinforcement learning without external rewards},
  author={Franzmeyer, T. and Malinowski, M. and Henriques, J. F.},
  booktitle={10th International Conference on Learning Representations (ICLR 2022)},
  year = "2022"
}

Space-Time Crop & Attend: improving cross-modal video representation learning

Patrick M, Huang P-Y, Misra I, Metze F, Vedaldi A et al. (2022), 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 10540-10552

BibTeX

@inproceedings{spacetimecropat-2022/2,
  title={Space-Time Crop \& Attend: improving cross-modal video representation learning},
  author={Patrick, M. and Huang, P.-Y. and Misra, I. and Metze, F. and Vedaldi, A. and others},
  booktitle={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  pages={10540-10552},
  year = "2022"
}

On compositions of transformations in contrastive self-supervised learning

Asano YM, Patrick M, Kuznetsova P, Fong R, Henriques J et al. (2022), Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 9557-9567

BibTeX

@inproceedings{oncompositionso-2022/2,
  title={On compositions of transformations in contrastive self-supervised learning},
  author={Asano, Y. M. and Patrick, M. and Kuznetsova, P. and Fong, R. and Henriques, J. and others},
  booktitle={2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021)},
  pages={9557-9567},
  year = "2022"
}

Preface

Albanie S, Henriques JF, Bertinetto L, Hernández-García A, Doughty H et al. (2022), Proceedings of Machine Learning Research, 181

BibTeX

@article{preface-2022/1,
  title={Preface},
  author={Albanie, S. and Henriques, J. F. and Bertinetto, L. and Hernández-García, A. and Doughty, H. and others},
  journal={Proceedings of Machine Learning Research},
  volume={181},
  year = "2022"
}

Keeping your eye on the ball: Trajectory attention in video transformers

Patrick M, Campbell D, Asano Y, Misra I, Metze F et al. (2021), Advances in Neural Information Processing Systems 34, 34, 12493-12506

BibTeX

@inproceedings{keepingyoureyeo-2021/12,
  title={Keeping your eye on the ball: Trajectory attention in video transformers},
  author={Patrick, M. and Campbell, D. and Asano, Y. and Misra, I. and Metze, F. and others},
  booktitle={35th Conference on Neural Information Processing Systems (NeurIPS 2021)},
  pages={12493-12506},
  year = "2021"
}

Moving SLAM: fully unsupervised deep learning in non-rigid scenes

Xu D, Vedaldi A & Henriques JF (2021), Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021), 4611-4617

BibTeX

@inproceedings{movingslamfully-2021/9,
  title={Moving SLAM: fully unsupervised deep learning in non-rigid scenes},
  author={Xu, D. and Vedaldi, A. and Henriques, J. F.},
  booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)},
  pages={4611-4617},
  year = "2021"
}

Audio retrieval with natural language queries

Oncescu A-M, Koepke AS, Henriques J, Akata Z & Albanie S (2021), Proceedings of Interspeech 2021, 2411-2415

BibTeX

@inproceedings{audioretrievalw-2021/8,
  title={Audio retrieval with natural language queries},
  author={Oncescu, A.-M. and Koepke, A. S. and Henriques, J. and Akata, Z. and Albanie, S.},
  booktitle={Interspeech 2021},
  pages={2411-2415},
  year = "2021"
}

Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers

Patrick M, Campbell D, Asano YM, Misra I, Metze F et al. (2021)

BibTeX

@misc{keepingyoureyeo-2021/6,
  title={Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers},
  author={Patrick, M. and Campbell, D. and Asano, Y. M. and Misra, I. and Metze, F. and others},
  year = "2021"
}

QUERYD: a video dataset with high-quality text and audio narrations

Oncescu A-M, Henriques J, Liu Y, Zisserman A & Albanie S (2021), ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2265-2269

BibTeX

@inproceedings{querydavideodat-2021/5,
  title={QUERYD: a video dataset with high-quality text and audio narrations},
  author={Oncescu, A.-M. and Henriques, J. and Liu, Y. and Zisserman, A. and Albanie, S.},
  booktitle={2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021)},
  pages={2265-2269},
  year = "2021"
}

Support-set bottlenecks for video-text representation learning

Patrick M, Huang P, Asano Y, Metze F, Hauptmann A et al. (2021)

BibTeX

@inproceedings{supportsetbottl-2021/5,
  title={Support-set bottlenecks for video-text representation learning},
  author={Patrick, M. and Huang, P. and Asano, Y. and Metze, F. and Hauptmann, A. and others},
  booktitle={9th International Conference on Learning Representations (ICLR 2021)},
  year = "2021"
}

Showing 50 publications by João Henriques
