Mission
The Machine and Neuromorphic Perception Laboratory (a.k.a. kLab) in the Chester F. Carlson Center for Imaging Science at the Rochester Institute of Technology (RIT) uses machine learning to solve problems in computer vision. The lab's primary interests are goal-driven scene understanding and lifelong learning. Almost all of our current research uses deep learning. The lab also studies learning and vision in animals as a source of principles that can be used to create brain-inspired algorithms. kLab is part of RIT's Multidisciplinary Vision Research Laboratory (MVRL). kLab is directed by Dr. Christopher Kanan.
Recent projects include visual question answering algorithms, incremental learning in neural networks, low-shot deep learning, new methods for eye movement analysis, top-down and bottom-up saliency algorithms, perception systems for autonomous ships, neural-network-based tracking in video, active vision algorithms, and feature learning for hyperspectral imagery.
Research Topics & Selected Publications
Visual Question Answering (VQA) - VQA algorithms answer natural-language questions about images.
- Acharya, M., Kafle, K., Kanan, C. (2019) TallyQA: Answering Complex Counting Questions. In: AAAI-2019.
- Kafle, K., Cohen, S., Price, B., Kanan, C. (2018) DVQA: Understanding Data Visualizations via Question Answering. In: CVPR-2018.
- Kafle, K., Kanan, C. (2017) An Analysis of Visual Question Answering Algorithms. In: ICCV-2017.
- Kafle, K., Kanan, C. (2017) Visual Question Answering: Datasets, Algorithms, and Future Challenges. Computer Vision and Image Understanding (CVIU). doi:10.1016/j.cviu.2017.06.005
- Kafle, K., Yousefhussien, M., Kanan, C. (2017) Data Augmentation for Visual Question Answering. In: International Conference on Natural Language Generation (INLG-2017).
- Kanan, C. and Kafle, K. (2016) Answer-Type Prediction for Visual Question Answering. In: CVPR-2016.
Deep & Self-Taught Learning - We were among the pioneers of self-taught feature learning, and deep learning underpins most of our current work.
- Kumra, S., Kanan, C. (2017) Robotic Grasp Detection using Deep Convolutional Neural Networks. In: IROS-2017.
- Kemker, R., Kanan, C. (2017) Self-Taught Feature Learning for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing (TGRS), 55(5): 2693-2705.
- Wang, P., Cottrell, G., and Kanan, C. (2015) Modeling the Object Recognition Pathway: A Deep Hierarchical Model Using Gnostic Fields. In: CogSci-2015.
- Kanan, C. (2013) Active Object Recognition with a Space-Variant Retina. ISRN Machine Vision, 2013: 138057. doi:10.1155/2013/138057
- Kanan, C. and Cottrell, G.W. (2010) Robust classification of objects, faces, and flowers using natural image statistics. In: CVPR-2010.
- Kanan, C., Tong, M.H., Zhang, L. and Cottrell, G.W. (2009) SUN: Top-down saliency using natural statistics. Visual Cognition, 17:979-1003.
Brain-Inspired Computer Vision - We study learning and vision in animals as a source of principles for building recognition algorithms.
- Yousefhussien, M., Browning, N.A., and Kanan, C. (2016) Online Tracking using Saliency. In: WACV-2016.
- Wang, P., Cottrell, G., and Kanan, C. (2015) Modeling the Object Recognition Pathway: A Deep Hierarchical Model Using Gnostic Fields. In: CogSci-2015.
- Kanan, C. (2014) Fine-Grained Object Recognition with Gnostic Fields. WACV-2014. doi:10.1109/WACV.2014.6836122
- Khosla, D., Huber, D.J., and Kanan, C. (2014) A Neuromorphic System for Visual Object Recognition. Biologically Inspired Cognitive Architectures, 8: 33-45.
- Kanan, C. (2013) Recognizing Sights, Smells, and Sounds With Gnostic Fields. PLoS ONE, 8(1): e54088.
Human Eye Movements - People make roughly 180,000 eye movements per day. We have developed algorithms that predict what a person is doing from their eye movements, as well as saliency models that predict where a person will look in an image.
- Kanan, C., Bseiso, D., Ray, N., Hsiao, J., and Cottrell, G. (2015) Humans Have Idiosyncratic and Task-specific Scanpaths for Judging Faces. Vision Research. doi:10.1016/j.visres.2015.01.013
- Kanan, C., Ray, N.A., Bseiso, D.N.F., Hsiao, J.H., Cottrell, G.W. (2014) Predicting an observer's task using Multi-Fixation Pattern Analysis. In Proceedings of the Annual Eye Tracking Research & Applications Symposium (ETRA 2014).
- Kanan, C., Tong, M.H., Zhang, L. and Cottrell, G.W. (2009) SUN: Top-down saliency using natural statistics. Visual Cognition, 17:979-1003.
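The core idea behind the SUN saliency model above is that saliency can be treated as self-information: feature values that are rare under the learned statistics receive high saliency, s(x) = -log p(x). The toy sketch below illustrates that principle on a synthetic image using a single-image intensity histogram; it is only illustrative and is not the published SUN implementation, which estimates feature statistics from a corpus of natural images rather than from one image. All variable names and parameter choices here are hypothetical.

```python
import numpy as np

# Illustrative sketch of "saliency as self-information": rare feature
# values get high saliency, s(x) = -log p(x). This is the core idea of
# SUN, but NOT the published model (SUN learns statistics over many
# natural images, not a single image's intensity histogram).

rng = np.random.default_rng(1)

# Synthetic "image": mostly mid-gray, with one bright (rare) patch.
image = rng.normal(loc=0.5, scale=0.02, size=(64, 64))
image[20:24, 30:34] = 0.95

# Estimate p(intensity) with a coarse histogram over this image.
counts, edges = np.histogram(image, bins=8, range=(0.0, 1.0))
p = (counts + 1e-9) / counts.sum()            # avoid log(0)

# Map each pixel to -log p of its intensity bin.
bins = np.clip(np.digitize(image, edges[1:-1]), 0, 7)
saliency = -np.log(p[bins])

# The rare bright patch is the most salient region.
peak = np.unravel_index(saliency.argmax(), saliency.shape)
print("most salient pixel:", peak)
```

In this sketch the bright patch occupies a rarely populated histogram bin, so its self-information dominates the saliency map.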
Active Computer Vision - Motivated by human eye movements, we build computer vision algorithms that sequentially sample an image to recognize objects.
- Kanan, C. (2013) Active Object Recognition with a Space-Variant Retina. ISRN Machine Vision. doi:10.1155/2013/138057
- Kanan, C. and Cottrell, G.W. (2010) Robust classification of objects, faces, and flowers using natural image statistics. In: CVPR-2010.
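The idea of sequential sampling can be sketched as follows: rather than processing the whole image at once, the system takes a series of small "glimpses" at the most salient unvisited locations and stops once it has enough evidence. This toy example is an assumption-laden illustration, not the space-variant-retina model from the publications above; the scene, the intensity-based saliency heuristic, and the confidence threshold are all invented for the sketch.

```python
import numpy as np

# Toy sketch of active object recognition: fixate the most salient
# unvisited location, extract a small glimpse, and stop early once a
# glimpse provides strong evidence. Illustrative only; this is not the
# space-variant-retina model from the paper.

rng = np.random.default_rng(2)

# Scene: noisy background with one bright 8x8 "object".
scene = rng.normal(0.2, 0.05, size=(96, 96))
scene[40:48, 60:68] += 0.7

def glimpse(img, cy, cx, r=6):
    """Crop a small window centred at (cy, cx)."""
    return img[max(cy - r, 0):cy + r, max(cx - r, 0):cx + r]

def detect(scene, n_glimpses=10, threshold=0.4):
    """Sequentially fixate bright points; report the object when a
    glimpse's mean intensity exceeds the (made-up) threshold."""
    visited = np.zeros_like(scene, dtype=bool)
    fixations = []
    for _ in range(n_glimpses):
        # Crude bottom-up saliency: raw intensity, masking visited areas.
        priority = np.where(visited, -np.inf, scene)
        cy, cx = np.unravel_index(np.argmax(priority), scene.shape)
        fixations.append((cy, cx))
        visited[max(cy - 6, 0):cy + 6, max(cx - 6, 0):cx + 6] = True
        if glimpse(scene, cy, cx).mean() > threshold:
            return True, fixations        # confident: stop sampling early
    return False, fixations

found, fixations = detect(scene)
print("object found:", found, "after", len(fixations), "glimpses")
```

Because the bright object dominates the saliency heuristic, the very first fixation lands on it and the search terminates after one glimpse.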
Lifelong Learning - Lifelong learning deals with algorithms that incrementally learn from data streams, which poses unique challenges such as catastrophic forgetting of previously learned tasks.
- Kemker, R., Kanan, C. (2018) FearNet: Brain-Inspired Model for Incremental Learning. In: ICLR-2018.
- Kemker, R., McClure, M., Abitino, A., Hayes, T., Kanan, C. (2018) Measuring Catastrophic Forgetting in Neural Networks. In: AAAI-2018.
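The catastrophic forgetting phenomenon studied in these papers can be reproduced in miniature: a model trained sequentially on two tasks loses its competence on the first once it is trained only on the second. The sketch below is a deliberately simple toy, a logistic regression on synthetic Gaussian data, and is not the FearNet model or the AAAI-2018 benchmark; the task construction and hyperparameters are invented for illustration.

```python
import numpy as np

# Toy demonstration of catastrophic forgetting (NOT the FearNet model):
# a logistic regression trained sequentially on two tasks forgets the
# first task once it is trained only on the second.

rng = np.random.default_rng(0)

def make_task(n, dim, informative):
    """Gaussian inputs; the label depends on a single coordinate."""
    X = rng.standard_normal((n, dim))
    y = (X[:, informative] > 0).astype(float)
    return X, y

def train(w, X, y, lr=0.5, steps=300):
    """Full-batch gradient descent on the logistic loss."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))     # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)     # gradient step
    return w

def accuracy(w, X, y):
    return float(((X @ w > 0).astype(float) == y).mean())

dim = 20
X_a, y_a = make_task(1000, dim, informative=0)  # task A: label = sign of x[0]
X_b, y_b = make_task(1000, dim, informative=1)  # task B: label = sign of x[1]

w = np.zeros(dim)
w = train(w, X_a, y_a)
acc_a_before = accuracy(w, X_a, y_a)

w = train(w, X_b, y_b)                           # continue on task B only
acc_a_after = accuracy(w, X_a, y_a)

print(f"task A accuracy before: {acc_a_before:.2f}")
print(f"task A accuracy after training on task B: {acc_a_after:.2f}")
```

Because task B carries no information about task A's discriminative feature, gradient descent drives that weight toward zero, and accuracy on task A collapses toward chance. Lifelong learning methods such as FearNet aim to prevent exactly this collapse.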
Dr. Christopher Kanan
Lab Director & Principal Investigator
Machine Learning, Computer Vision, Theoretical Neuroscience/Psychology
Aneesh Rangnekar
Imaging Science PhD Student
Main Advisor: Matt Hoffman
Deep Reinforcement Learning, Tracking
Lab Alumni
PhD
MS
Karan Jariwala (Capstone)
Tania Kleynhans (Thesis)
Utkarsh Deshmukh
Arjun Raj Rajanna
Deepak Sharma (Thesis)
Navneet Sinha (Capstone)