Superpatching for Image Analysis Using Transformers and Superpixels

Author/creator McCutcheon, Brannon, author.
Other author Hart, David, degree supervisor.
Other author East Carolina University
Format Theses and dissertations
Publication [Greenville, N.C.] : [East Carolina University], 2025.
Description 56 pages
Supplemental Content Access via ScholarShip

Summary Transformers have revolutionized Computer Vision, offering robust performance across diverse tasks. However, their reliance on uniform pixel patching presents limitations, including computational inefficiency for larger images, suboptimal handling of local features, and an inability to process non-uniform patches. Addressing these constraints allows for new opportunities to expand their utility in demanding fields, such as medical imaging. This work proposes a novel architecture combining Convolutional Neural Networks (CNNs) and Transformers to leverage superpixels, clusters of pixels with shared characteristics that capture local feature boundaries effectively. We propose an architecture that segments images into a collection of superpixels, vectorizes these superpixels using a CNN, and passes the resulting tokenized vector representations to a standard Transformer. By removing the uniformity constraint in patching, our approach aims to enhance Transformer performance on tasks requiring large-scale image analysis and fine-grained local feature understanding, potentially opening a way for broader Transformer applications in Computer Vision.
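Illustration The pipeline described in the summary (segment into superpixels, vectorize each superpixel, pass the tokens to a Transformer) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation. The function name superpixel_tokens and the input layout are hypothetical, the superpixel label map is assumed to be precomputed by some segmentation algorithm, and simple mean pooling plus a normalized centroid stands in for the CNN vectorizer that the thesis proposes. The point is only that irregular, differently sized pixel groups each become one fixed-length token.

```python
from collections import defaultdict


def superpixel_tokens(image, labels):
    """Turn a labelled image into one fixed-length token per superpixel.

    image  : H x W nested list of per-pixel channel tuples, e.g. (r, g, b)
    labels : H x W nested list of integer superpixel ids (assumed precomputed)

    Each token is [mean of each channel..., centroid_row, centroid_col],
    with the centroid normalized by image height and width. Mean pooling is
    an illustrative stand-in for a learned CNN encoder.
    """
    height, width = len(image), len(image[0])
    channels = len(image[0][0])

    # Group pixel coordinates by superpixel id; groups may differ in size
    # and shape, which is what uniform patching cannot express.
    members = defaultdict(list)
    for r in range(height):
        for c in range(width):
            members[labels[r][c]].append((r, c))

    tokens = []
    for sp_id in sorted(members):
        pixels = members[sp_id]
        n = len(pixels)
        mean_color = [
            sum(image[r][c][k] for r, c in pixels) / n for k in range(channels)
        ]
        centroid_row = sum(r for r, _ in pixels) / n / height
        centroid_col = sum(c for _, c in pixels) / n / width
        tokens.append(mean_color + [centroid_row, centroid_col])
    return tokens


# A 2 x 4 toy image split into two irregular superpixels (3 and 5 pixels).
black, white = (0, 0, 0), (255, 255, 255)
toy_image = [
    [black, black, white, white],
    [black, white, white, white],
]
toy_labels = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
toy_tokens = superpixel_tokens(toy_image, toy_labels)
```

Here toy_tokens holds two tokens of equal length even though the superpixels cover different numbers of pixels. In a full model, a sequence like this would be fed to a standard Transformer encoder in place of uniform patch embeddings.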
Dissertation note East Carolina University 2025.
Bibliography note Includes bibliographical references.
Technical details System requirements: Adobe Reader.
Technical details Mode of access: World Wide Web.