Peter Kocsis

PhD student in Inverse Rendering

Supervisor: Prof. Dr. Matthias Niessner
Visual Computing & Artificial Intelligence Lab, Technical University of Munich
peter.kocsis(at)tum.de

About

I am currently doing my PhD in the Visual Computing & Artificial Intelligence Lab at the Technical University of Munich under the supervision of Prof. Dr. Matthias Niessner. I finished my bachelor's degree in Mechatronics Engineering, then completed my master's in Robotics, Cognition, Intelligence. Previously, I worked on Reinforcement Learning for control and planning, and later moved into Active Learning for image classification. During my PhD, I am focusing on photorealistic 3D reconstruction, specifically on lighting and material decomposition.

Publications

LightIt: Illumination Modeling and Control for Diffusion Models

CVPR 2024
Peter Kocsis, Julien Philip, Kalyan Sunkavalli, Matthias Niessner, Yannick Hold-Geoffroy
Recent generative methods lack lighting control, which is crucial to numerous artistic aspects of image generation, such as setting the overall mood or cinematic appearance. To overcome these limitations, we propose to condition the generation on shading and normal maps. We model the lighting with single-bounce shading, which includes cast shadows. We first train a shading estimation module to generate a dataset of paired real-world images and shading maps. Then, we train a control network using the estimated shading and normals as input. Our method demonstrates high-quality image generation and lighting control in numerous scenes.

Intrinsic Image Diffusion for Single-view Material Estimation

CVPR 2024
Peter Kocsis, Vincent Sitzmann, Matthias Niessner
Intrinsic image decomposition is a highly ambiguous task. Deep-learning-based methods often fail due to the lack of large-scale real-world data. We propose to formulate the problem probabilistically and generate possible decompositions using a generative model. This way, we can also utilize the strong image prior of diffusion models for the task of material estimation, which largely helps generalization.

The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

NeurIPS 2022
Peter Kocsis, Peter Súkeník, Guillem Brasó, Matthias Niessner, Laura Leal-Taixé, Ismail Elezi
Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformer- or MLP-based architectures started to show competitive performance. These architectures typically have a vast number of weights and need to be trained on massive datasets; hence, they are not suitable for low-data regimes. In this work, we propose a simple yet effective framework to improve generalization from small amounts of data. We augment modern CNNs with fully-connected (FC) layers and show the massive impact this architectural change has in low-data regimes.

Projects

Active Learning with Transformers

2021
Technical University of Munich
During my master's thesis, I worked on using inter-sample message passing for active learning. Active learning requires uncertainty estimation over the unlabeled pool. Providing inter-sample information to the network helps it identify out-of-domain samples more reliably.
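The selection step in active learning can be sketched with a simple entropy-based acquisition function. This is a generic uncertainty baseline for illustration, not the message-passing model from the thesis:

```python
import numpy as np

def entropy_acquisition(probs: np.ndarray, k: int) -> np.ndarray:
    """Select the k unlabeled samples with the highest predictive entropy.

    probs: (N, C) array of class probabilities for the unlabeled pool.
    Returns the indices of the k most uncertain samples.
    """
    eps = 1e-12  # avoid log(0)
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)  # shape (N,)
    return np.argsort(entropy)[::-1][:k]

# Toy pool of three samples with increasingly uncertain predictions.
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident -> low entropy
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],  # nearly uniform -> highest entropy
])
query = entropy_acquisition(pool_probs, k=1)  # picks the last sample
```

The selected samples would then be sent for labeling and added to the training set in the next round.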

Reinforcement Learning for Motion Planning

2020
Technical University of Munich
CommonRoad is a generic framework for developing and testing motion planning algorithms for autonomous vehicles. Besides working on the platform as a working student, I also participated in research on reinforcement-learning-based motion planning with dense and sparse rewards.

Neural Ball-balancing Table

2019
Budapest University of Technology and Economics
During my bachelor's thesis, I constructed a ball-balancing table and implemented various control algorithms. I implemented a virtual twin in Unity, trained a neural-network-based controller in simulation, and then transferred it to the real-world device.
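A classical baseline for such a system is a PID position controller. The sketch below runs a discrete PID loop on a heavily simplified 1-D ball-on-table model; the gains and dynamics are illustrative assumptions, not the actual hardware parameters:

```python
# Discrete PID controller driving a toy 1-D ball-on-table model,
# where the control signal (table tilt) directly sets the ball's
# acceleration. Gains are illustrative, not tuned for real hardware.

def pid_step(error, prev_error, integral, dt, kp=8.0, ki=0.5, kd=4.0):
    """One PID update; returns (control output, updated integral)."""
    integral += error * dt
    derivative = (error - prev_error) / dt
    return kp * error + ki * integral + kd * derivative, integral

dt = 0.01
pos, vel = 0.2, 0.0        # ball starts 0.2 m off-center, at rest
integral, prev_err = 0.0, 0.2
target = 0.0               # balance the ball at the table center
for _ in range(2000):      # simulate 20 s
    err = target - pos
    u, integral = pid_step(err, prev_err, integral, dt)
    prev_err = err
    vel += u * dt          # simplified: tilt ~ acceleration
    pos += vel * dt        # integrate ball position
```

With these gains the loop is well damped, so the simulated ball settles at the center; the neural controller mentioned above replaces this hand-tuned law with a learned policy.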

Monocular Localization

2017
Machine Perception Research Laboratory
The goal of the project was to reimplement and potentially improve the paper "Visual localization within LIDAR maps for automated urban driving" (Wolcott and Eustice, 2014). Given a pre-scanned map, we render synthetic views around an estimated pose. Then, we match the synthetic views to the camera feed.
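The matching step can be sketched as scoring each candidate render against the camera frame and keeping the best one. The snippet below uses zero-mean normalized cross-correlation as a simplified stand-in for the matching metric (the original paper uses normalized mutual information):

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-mean normalized cross-correlation between two images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_pose(camera: np.ndarray, renders: list) -> int:
    """Index of the synthetic view that best matches the camera frame."""
    scores = [ncc(camera, r) for r in renders]
    return int(np.argmax(scores))

# Toy example: the second candidate equals the camera frame up to a
# brightness offset, so the mean-normalized score should prefer it.
rng = np.random.default_rng(0)
cam = rng.random((8, 8))
candidates = [rng.random((8, 8)), cam + 0.3, rng.random((8, 8))]
```

In the full pipeline, the pose of the winning render (refined over a grid of candidate poses) gives the vehicle's localization estimate.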