We propose a dense image prediction out-of-distribution detection algorithm, called PixOOD, which requires neither training on samples of anomalous data nor a design tailored to a specific application, thus avoiding traditional training biases. To model the complex intra-class variability of the in-distribution data at the pixel level, we propose an online data condensation algorithm which is more robust than standard K-means and is easily trainable through SGD. We evaluate PixOOD on a wide range of problems; it achieves state-of-the-art results on four out of seven datasets and is competitive on the rest.
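The condensation algorithm itself is specified in the paper; purely as an illustration of the general idea of SGD-trained prototype condensation, the minimal Python sketch below implements a generic online K-means-style update. The function name, prototype count and learning rate are assumptions, not PixOOD's.

import numpy as np

def sgd_condense(features, k=8, lr=0.05, epochs=3, seed=0):
    """Illustrative online condensation: K prototypes updated by SGD
    towards their nearest feature vectors (an online K-means variant).
    This is NOT the PixOOD condensation algorithm, only a generic stand-in."""
    rng = np.random.default_rng(seed)
    protos = features[rng.choice(len(features), k, replace=False)].copy()
    for _ in range(epochs):
        for x in features[rng.permutation(len(features))]:
            d = np.linalg.norm(protos - x, axis=1)   # distance to each prototype
            j = int(np.argmin(d))                    # hard assignment
            protos[j] += lr * (x - protos[j])        # SGD step on 0.5*||x - p_j||^2
    return protos

# toy usage: condense 2-D "pixel embeddings" into 4 prototypes
feats = np.random.default_rng(1).normal(size=(500, 2))
print(sgd_condense(feats, k=4).shape)   # (4, 2)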
@misc{vojir2024pixood,title={PixOOD: Pixel-Level Out-of-Distribution Detection},author={Voj{\'\i}\v{r}, Tom\'a\v{s} and \v{S}ochman, Jan and Matas, Ji\v{r}{\'\i}},booktitle={Computer Vision – ECCV 2024},year={2024},eprint={2405.19882},archiveprefix={arXiv},primaryclass={cs.CV},}
2023
Calibrated Out-of-Distribution Detection with a Generic Representation
Tomáš Vojíř, Jan Šochman, Rahaf Aljundi, and Jiří Matas
In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, Oct 2023
Out-of-distribution detection is a common issue when deploying vision models in practice and solving it is an essential building block in safety-critical applications. Existing OOD detection solutions focus on improving the OOD robustness of a classification model trained exclusively on in-distribution (ID) data. In this work, we take a different approach and propose to leverage generic pre-trained representations. We first investigate the behaviour of simple classifiers built on top of such representations and show striking performance gains compared to ID-trained representations. We propose a novel OOD method, called GROOD, that achieves excellent performance, predicated on the use of a good generic representation. Only a trivial training process is required for adapting GROOD to a particular problem. The method is simple, general, efficient, calibrated, and has only a few hyper-parameters. It achieves state-of-the-art performance on a number of OOD benchmarks, reaching near-perfect performance on several of them. The source code is publicly available.
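As a rough, non-authoritative illustration of the recipe the abstract describes (a simple classifier on frozen generic features with a distance-based, calibrated OOD score), the sketch below uses a nearest-class-mean classifier; the scoring rule, the threshold calibration and all names are assumptions, not the actual GROOD method.

import numpy as np

class NearestMeanOOD:
    """Nearest-class-mean classifier over frozen generic features.
    OOD score = distance to the closest class mean. This mirrors the
    'simple classifier on a generic representation' idea in spirit only."""
    def fit(self, feats, labels):
        self.means = np.stack([feats[labels == c].mean(0) for c in np.unique(labels)])
        return self
    def score(self, feats):
        d = np.linalg.norm(feats[:, None, :] - self.means[None], axis=2)
        return d.min(axis=1)                       # higher = more likely OOD
    def calibrate(self, id_val_feats, q=0.95):
        self.thr = np.quantile(self.score(id_val_feats), q)  # keep 95% of ID below threshold
        return self.thr

# toy usage with random 'features' standing in for a frozen backbone output
rng = np.random.default_rng(0)
f_tr, y_tr = rng.normal(size=(200, 16)), rng.integers(0, 3, 200)
det = NearestMeanOOD().fit(f_tr, y_tr)
det.calibrate(rng.normal(size=(100, 16)))
print(det.score(rng.normal(3.0, 1.0, size=(5, 16))) > det.thr)  # far-away samples are flagged OOD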
@inproceedings{vojir2023calibrated,author={Voj{\'\i}\v{r}, Tom\'a\v{s} and \v{S}ochman, Jan and Aljundi, Rahaf and Matas, Ji\v{r}{\'\i}},title={Calibrated Out-of-Distribution Detection with a Generic Representation},booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},month=oct,year={2023},pages={4507-4516},doi={10.1109/ICCVW60793.2023.00485},}
2021
Monocular Arbitrary Moving Object Discovery and Segmentation
Michal Neoral, Jan Šochman, and Jiří Matas
In The 32nd British Machine Vision Conference – BMVC 2021, Oct 2021
We propose a method for discovery and segmentation of objects that are, or whose parts are, independently moving in the scene. Given three monocular video frames, the method outputs semantically meaningful regions, i.e. regions corresponding to the whole object, even when only a part of it moves.
The architecture of the CNN-based end-to-end method, called Raptor, combines semantic and motion backbones, which pass their outputs to a final region segmentation network. The semantic backbone is trained in a class-agnostic manner in order to generalise to object classes beyond the training data. The core of the motion branch is a geometrical cost volume computed from optical flow, optical expansion, mono-depth and the estimated camera motion.
Evaluation of the proposed architecture on the instance motion segmentation and binary moving-static segmentation problems on KITTI, DAVIS-Moving and YTVOS-Moving datasets shows that the proposed method achieves state-of-the-art results on all the datasets and is able to generalise well to various environments. For the KITTI dataset, we provide an upgraded instance motion segmentation annotation which covers all moving objects. Dataset, code and models are available on the github project page github.com/michalneoral/Raptor.
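The actual Raptor network is available on the project page; the skeleton below only illustrates, in simplified form, the two-branch design described above (semantic features and a motion cost volume fused by a segmentation head). Channel sizes and layers are placeholders, not Raptor's.

import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    """Skeletal illustration: a semantic feature map and a geometric/motion
    cost volume are concatenated and decoded into per-pixel moving-object
    logits. The decoder here is a placeholder, not the Raptor architecture."""
    def __init__(self, sem_ch=32, motion_ch=16):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Conv2d(sem_ch + motion_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1))                     # 1-channel moving/static logits
    def forward(self, sem_feats, motion_cost):
        x = torch.cat([sem_feats, motion_cost], dim=1)   # fuse the two branches
        return self.decode(x)

# toy usage on an 8x8 feature grid
net = TwoBranchFusion()
print(net(torch.randn(1, 32, 8, 8), torch.randn(1, 16, 8, 8)).shape)  # torch.Size([1, 1, 8, 8])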
@inproceedings{Neoral2021,author={Neoral, Michal and {\v{S}}ochman, Jan and Matas, Ji{\v{r}}{\'i}},title={Monocular Arbitrary Moving Object Discovery and Segmentation},booktitle={The 32nd British Machine Vision Conference -- BMVC 2021},year={2021},}
2018
Continual Occlusions and Optical Flow Estimation
Michal Neoral, Jan Šochman, and Jiří Matas
In Asian Conference on Computer Vision (ACCV), 2018
Two optical flow estimation problems are addressed: (i) occlusion estimation and handling, and (ii) estimation from image sequences longer than two frames. The proposed ContinualFlow method estimates occlusions before flow, avoiding the use of flow corrupted by occlusions for their estimation. We show that providing occlusion masks as an additional input to flow estimation improves the standard performance metric by more than 25% on both KITTI and Sintel. As a second contribution, a novel method for incorporating information from past frames into flow estimation is introduced. The previous frame flow serves as an input to occlusion estimation and as a prior in occluded regions, i.e. those without visual correspondences. By continually using the previous frame flow, ContinualFlow performance improves further by 18% on KITTI and 7% on Sintel, achieving top performance on KITTI and Sintel.
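A schematic Python skeleton of the data flow described above (occlusions estimated before flow, with the previous-frame flow fed to both stages); the functions are placeholders, not the ContinualFlow network.

def estimate_occlusions(frame_prev, frame_cur, flow_prev):
    """Placeholder: occlusions are estimated BEFORE flow, with the
    previous-frame flow as an additional input."""
    ...

def estimate_flow(frame_prev, frame_cur, occlusion_mask, flow_prev):
    """Placeholder: flow estimation receives the occlusion mask as input and
    uses the previous flow as a prior in occluded (correspondence-free) regions."""
    ...

def continual_flow(frames):
    flow_prev, flows = None, []
    for f_prev, f_cur in zip(frames, frames[1:]):
        occ = estimate_occlusions(f_prev, f_cur, flow_prev)   # occlusions first
        flow_prev = estimate_flow(f_prev, f_cur, occ, flow_prev)
        flows.append(flow_prev)                               # carried to the next frame pair
    return flows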
@inproceedings{Neoral2018,author={Neoral, Michal and {\v S}ochman, Jan and Matas, Jiri},title={Continual Occlusions and Optical Flow Estimation},booktitle={Asian Conference on Computer Vision},pages={159--174},year={2018},doi={10.1007/978-3-030-20870-7_10},organization={Springer},}
2013
Robust abandoned object detection integrating wide area visual surveillance and social context
James Ferryman, David Hogg, Jan Sochman, Ardhendu Behera, and 9 more authors
This paper presents a video surveillance framework that robustly and efficiently detects abandoned objects in surveillance scenes. The framework is based on a novel threat assessment algorithm which combines the concept of ownership with automatic understanding of social relations in order to infer abandonment of objects. Implementation is achieved through the development of a logic-based inference engine based on Prolog. Threat detection performance is evaluated by testing against a range of datasets describing realistic situations and demonstrates a reduction in the number of false alarms generated. The proposed system represents the approach employed in the EU SUBITO project (Surveillance of Unattended Baggage and the Identification and Tracking of the Owner).
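The system itself encodes such reasoning as Prolog rules; the toy Python function below is only meant to convey the flavour of an ownership/social-context abandonment rule. The predicates and thresholds are invented for illustration.

def is_abandoned(obj, people, social_link, max_dist=3.0, min_time=30.0):
    """Toy illustration of an ownership + social-context abandonment rule
    (the real system expresses such rules in a Prolog inference engine).
    obj: dict with 'owner', 'position', 'unattended_time';
    people: dict name -> position; social_link(a, b) -> bool."""
    def near(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5 <= max_dist
    owner_near = obj['owner'] in people and near(people[obj['owner']], obj['position'])
    # someone socially related to the owner counts as attending the object
    delegate_near = any(social_link(obj['owner'], n) and near(p, obj['position'])
                        for n, p in people.items())
    return (not owner_near) and (not delegate_near) and obj['unattended_time'] > min_time

# toy usage
obj = {'owner': 'alice', 'position': (0.0, 0.0), 'unattended_time': 45.0}
people = {'bob': (1.0, 0.5)}
print(is_abandoned(obj, people, social_link=lambda a, b: False))   # True -> raise alert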
@article{Ferryman2013789,title={Robust abandoned object detection integrating wide area visual surveillance and social context},journal={Pattern Recognition Letters},volume={34},number={7},pages={789--798},year={2013},note={Scene Understanding and Behaviour Analysis},issn={0167-8655},doi={10.1016/j.patrec.2013.01.018},url={http://www.sciencedirect.com/science/article/pii/S0167865513000226},author={Ferryman, James and Hogg, David and Sochman, Jan and Behera, Ardhendu and Rodriguez-Serrano, José A. and Worgan, Simon and Li, Longzhen and Leung, Valerie and Evans, Murray and Cornic, Philippe and Herbin, Stéphane and Schlenger, Stefan and Dose, Michael},keywords={Abandoned objects},}
2010
Interpreting Structures in Man-made Scenes - Combining Low-Level and High-Level Structure Sources
Kasim Terzic, Lothar Hotz, and Jan Sochman
In Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART, Oct 2010
@inproceedings{Terzic2010,author={Terzic, Kasim and Hotz, Lothar and Sochman, Jan},title={Interpreting Structures in Man-made Scenes - Combining Low-Level and High-Level Structure Sources},year={2010},pages={357--364},booktitle={Proceedings of the 2nd International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},publisher={SciTePress},organization={INSTICC},doi={10.5220/0002735303570364},isbn={978-989-674-021-4},issn={2184-433X},}
2009
Learning Fast Emulators of Binary Decision Processes
Jan Šochman, and Jiří Matas
International Journal of Computer Vision, Jun 2009
Computation time is an important performance characteristic of computer vision algorithms. The paper shows how existing (slow) binary decision algorithms can be approximated by a (fast) trained WaldBoost classifier. WaldBoost learning minimises the decision time of the classifier while guaranteeing predefined precision. We show that the WaldBoost algorithm together with bootstrapping is able to efficiently handle an effectively unlimited number of training examples provided by the implementation of the approximated algorithm.
Two interest point detectors, the Hessian-Laplace and the Kadir-Brady saliency detectors, are emulated to demonstrate the approach. Experiments show that while the repeatability and matching scores are similar for the original and emulated algorithms, a 9-fold speed-up for the Hessian-Laplace detector and a 142-fold speed-up for the Kadir-Brady detector is achieved. For the Hessian-Laplace detector, the achieved speed is similar to SURF, a popular and very fast handcrafted modification of Hessian-Laplace; the WaldBoost emulator approximates the output of the Hessian-Laplace detector more precisely.
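Schematically, and with stand-in components rather than the actual WaldBoost trainer, the emulation-with-bootstrapping loop described above can be sketched as follows; the black-box detector, the trainer stub and all parameters are assumptions made for illustration.

import random

def fit_waldboost(pairs):
    """Stand-in for WaldBoost training: a trivial majority-vote stub,
    NOT a real boosting implementation."""
    majority = sum(y for _, y in pairs) * 2 >= len(pairs)
    return lambda x: int(majority)

def train_emulator(black_box, pool, rounds=5, batch=100):
    """Schematic emulation + bootstrapping loop: the slow black-box detector
    provides labels on demand, and each round adds the samples the current
    (fast) emulator still misclassifies before retraining it."""
    train_set, emulator = [], None
    for _ in range(rounds):
        labeled = [(x, black_box(x)) for x in random.sample(pool, batch)]
        if emulator is not None:                              # bootstrap on hard samples
            labeled = [(x, y) for x, y in labeled if emulator(x) != y]
        train_set.extend(labeled)
        emulator = fit_waldboost(train_set)                   # retrain the fast classifier
    return emulator

# toy usage: emulate a 'slow' threshold detector on scalar inputs
pool = [random.uniform(-1, 1) for _ in range(1000)]
fast = train_emulator(lambda x: int(x > 0), pool)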
@article{Sochman2009,author={{\v S}ochman, Jan and Matas, Ji{\v r}{\'\i}},title={Learning Fast Emulators of Binary Decision Processes},journal={International Journal of Computer Vision},year={2009},volume={83},pages={149--163},number={2},month=jun,doi={10.1007/s11263-009-0229-x},keywords={Boosting, AdaBoost, Sequential probability ratio test, Sequential decision making, WaldBoost, Interest point detectors, Machine learning},}
2008
Training Sequential On-line Boosting Classifier for Visual Tracking
H. Grabner, J. Šochman, H. Bischof, and J. Matas
In 19th International Conference on Pattern Recognition, Jun 2008
On-line boosting makes it possible to adapt a trained classifier to changing environmental conditions or to use sequentially available training data. Yet, two important problems in on-line boosting training remain unsolved: (i) classifier evaluation speed optimization and (ii) automatic classifier complexity estimation. In this paper we show how on-line boosting can be combined with Wald's sequential decision theory to solve both problems.
The properties of the proposed on-line WaldBoost algorithm are demonstrated on a visual tracking problem. The complexity of the classifier changes dynamically depending on the difficulty of the problem. On average, a speed-up by a factor of 5-10 is achieved compared to non-sequential on-line boosting.
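A minimal sketch of the sequential, early-terminating evaluation that the combination with Wald's decision theory provides; the per-stage thresholds and weak-classifier responses are placeholders, not values from the paper.

def sequential_evaluate(weak_responses, theta_a, theta_b):
    """Evaluate boosted weak classifiers one by one and stop as soon as the
    accumulated response crosses a per-stage threshold. Returns the decision
    (+1/-1) and how many weak classifiers were actually used; on easy samples
    only a few stages are evaluated, which is where the reported 5-10x
    speed-up over non-sequential evaluation comes from."""
    H = 0.0
    for t, h in enumerate(weak_responses, start=1):
        H += h                                 # running strong-classifier response
        if H >= theta_a[t - 1]:
            return +1, t                       # confident positive: stop early
        if H <= theta_b[t - 1]:
            return -1, t                       # confident negative: stop early
    return (+1 if H > 0 else -1), len(weak_responses)   # forced decision at the end

# toy usage: decided positive after two of the three weak classifiers
print(sequential_evaluate([0.4, 0.5, 0.3], theta_a=[1.0, 0.8, 0.6], theta_b=[-1.0, -0.8, -0.6]))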
@inproceedings{Grabner2008,author={Grabner, H. and {\v S}ochman, J. and Bischof, H. and Matas, J.},title={Training Sequential On-line Boosting Classifier for Visual Tracking},booktitle={19th International Conference on Pattern Recognition},year={2008},doi={10.1109/ICPR.2008.4761678},}
2007
Learning A Fast Emulator of a Binary Decision Process
Jan Šochman, and Jiří Matas
In Asian Conference on Computer Vision (ACCV), 2007
Computation time is an important performance characteristic of computer vision algorithms. This paper shows how existing (slow) binary-valued decision algorithms can be approximated by a trained WaldBoost classifier, which minimises the decision time while guaranteeing a predefined approximation precision. The core idea is to take an existing algorithm as a black box performing some useful binary decision task and to train the WaldBoost classifier as its emulator.
Two interest point detectors, the Hessian-Laplace and the Kadir-Brady saliency detectors, are emulated to demonstrate the approach. The experiments show similar repeatability and matching scores for the original and emulated algorithms while achieving a 70-fold speed-up for the Kadir-Brady detector.
@inproceedings{Sochman-accv2007,author={{\v S}ochman, Jan and Matas, Ji{\v r}{\' \i}},title={Learning A Fast Emulator of a Binary Decision Process},booktitle={ACCV},year={2007},editor={Yagi, Yasushi and Kang, Sing Bing and Kweon, In So and Zha, Hongbin},volume={II},pages={236--245},address={Berlin Heidelberg},publisher={Springer},series={LNCS},isbn={978-3-540-76389-5},doi={10.1007/978-3-540-76390-1_24},}
2005
WaldBoost - Learning for Time Constrained Sequential Detection
Jan Šochman, and Jiří Matas
In Proc. of Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2005
@inproceedings{sochman-waldboost-cvpr05,author={{\v S}ochman, Jan and Matas, Ji{\v r}{\' \i}},title={WaldBoost - Learning for Time Constrained Sequential Detection},booktitle={Proc. of Conference on Computer Vision and Pattern Recognition (CVPR)},address={Los Alamitos, USA},year={2005},month=jun,day={20--25},isbn={0-7695-2372-2},publisher={IEEE Computer Society},book_pages={1219},pages={150--157},doi={10.1109/CVPR.2005.373},annote={In many computer vision classification problems, both the error and the time characterize the quality of a decision. We show that such problems can be formalized in the framework of sequential decision-making. If the false positive and false negative error rates are given, the optimal strategy in terms of the shortest average time to decision (number of measurements used) is Wald's sequential probability ratio test (SPRT). We build on the optimal SPRT test and enlarge its capabilities to problems with dependent measurements. We show how the limitations of SPRT to a priori ordered measurements and known joint probability density functions can be overcome. We propose an algorithm with a near-optimal time vs. error rate trade-off, called WaldBoost, which integrates the AdaBoost algorithm for measurement selection and ordering and the joint probability density estimation with the optimal SPRT decision strategy. The WaldBoost algorithm is tested on the face detection problem. The results are superior to the state-of-the-art methods in average evaluation time and comparable in detection rates.},keywords={Adaboost, cascade, Wald's SPRT, sequential analysis, face detection},editor={Schmid, Cordelia and Soatto, Stefano and Tomasi, Carlo},venue={San Diego, California, USA},volume={2},}
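For reference, the classical Wald SPRT that the method builds on can be written down directly; the sketch below implements the standard test for a target false positive rate alpha and false negative rate beta. How WaldBoost estimates the likelihood ratio from the boosted classifier response is the paper's contribution and is not reproduced here.

import math

def sprt(loglr_stream, alpha=0.01, beta=0.05):
    """Classical Wald SPRT: accumulate the log-likelihood ratio
    log p(x_t | H1) - log p(x_t | H0) measurement by measurement and stop as
    soon as it leaves the interval (log B, log A), with the Wald thresholds
    A ~= (1 - beta) / alpha and B ~= beta / (1 - alpha)."""
    a, b = math.log((1 - beta) / alpha), math.log(beta / (1 - alpha))
    s, t = 0.0, 0
    for t, llr in enumerate(loglr_stream, start=1):
        s += llr
        if s >= a:
            return 'accept H1', t      # e.g. "face" in the detection setting
        if s <= b:
            return 'accept H0', t      # e.g. "background"
    return 'undecided', t              # ran out of measurements without a decision

# toy usage: with the default error rates the threshold is log(95) ~= 4.55,
# so a stream of 1.5-valued log-likelihood ratios is accepted after 4 measurements
print(sprt([1.5, 1.5, 1.5, 1.5]))      # ('accept H1', 4)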