MVRackLay: Monocular Multi-View Layout Estimation for Warehouse Rack and Shelves

Pranjali Pathre, Anurag Sahu, Ashwin Rao, Avinash Prabhu, Meher Shashwat Nigam, Tanvi Karandikar, Harit Pandya, and K. Madhava Krishna

Code

Link to code: https://github.com/pranjali-pathre/MVRackLay

Link to download Dataset: https://tinyurl.com/yxmu5t64

Dataset Generation Pipeline

Waresynth

Architecture

Architecture comprises of a context encoder, a Convolutional LSTM for encoding temporal information and multi-channel decoders and adversarial discriminators.

Results

MVRackLay-Disc-4 Results

Here, we present the results of our network tested on domain randomized data. The bottommost shelf layout is shown in the left-most column, followed by the middle and top shelf (if visible). Observe the diversity of warehouse scenes captured and the top-view and front-view layouts predicted for the same.

RackLay vs. MVRackLay-Disc-4

Above, we compare qualitatively the results of RackLay and our MVRackLay-Disc-4. The shelf in focus is highlighted with a red border. Observe that our model removes the false positive and noise in row 1, fixes the false negative in row 2, removes noise in row 3 and 4, and increases sharpness of both box boundaries (all rows) and shelf edges.

MVRackLay-Disc-4 vs. MVRackLay-Disc-8

The shelf in focus is highlighted with a red border. Better demarcations between adjoining boxes and less joining of abreast layouts are observed in the output of MVRackLay-Disc-4 compared to its counterpart.

Contact information

pranjali2000pathre@gmail.com