DESTR: Object Detection with Split Transformer

Posted Jan 14, 2024 Updated Feb 12, 2024

By Geonu-Lee 2 min read

DESTR: Object Detection with Split Transformer

CVPR 2022 , 2024-01-14 기준 21회 인용

Task

Object Detection
DETR

Contribution

Cross Attention 을 classification 과 regression task를 나누었다.
Mini-detector 를 도입하여 object queries initialize
Pair self-attention 구조 제안

(a) - DETR
(b) - C-DETR (Conditional DETR)
(c) - DESTR’s classification -> class-characteristic regions
(d) - DESTR’s regression -> horizontal and vertical edges

Proposed Method

The Mini-detector

입력으로 Encoder의 output을 사용
Decoder 의 object queries 를 만드는데 사용

FCOS 와 같이 feature cell 마다 prediction
bipartite matching 으로 supervised loss 부여

classification scores 으로 가장 높은 $K$ 개를 선택

Overfitting을 막기 위해서 Decoder의 object queries 로 사용하기전에 stop gradient

Pair Self-attention

왼쪽은 기존의 Conditional DETR Decoder layer
오른쪽은 제안하는 Decoder lalyer
self-attention -> pair self-attention
cross-attention -> split cross-attention

일반적인 self-attention 연산 과정

our observation that adjacent pairs of object queries may provide more important cues

left remote - a 와 left cat - b 의 self-attention $A_1(a,b)$ 가 informative features 갖고 있을 것이다.

같은 class 인 “remote” 일지라도 아래와 같은 경우를 보여준다.
$A_1(d, a) ≫ A_1(a, d)$
이는 부분적으로 물체가 가려진 부분이 있기 때문

서로 인접한 것 끼리 묶어서 연산
$A_2((a, b), (d, c))$

IoU를 기반으로 가장 가까운 query 끼리만 pair로 한다
여기서 $a$ 는 previous decoder 에서 예측된 box

pair 를 하는 과정에서 order를 추가해주는 과정
box center 와 이미지의 top-left 의 L1 distance 를 기준

각 pair 끼리 concat 하고 연산

pair self-attention 수식

최종적으로는 일반적인 self-attention 과 pair attention 결과를 모두 사용

Split Cross-attention

단순하게 classification 과 regression 나누어서 cross-attention 진행
그리고 각각의 head로 통과

Experimental Results

같은 backbone 에서 제안하는 방법이 다른 DETR 방법들에 비해 성능 향상
하지만 GFLOPs 가 크게 올라가는 것으로 보임

Paper Reading

Object Detection DETR

This post is licensed under CC BY 4.0 by the author.

DESTR: Object Detection with Split Transformer

Task

Contribution

Proposed Method

The Mini-detector

Pair Self-attention

Split Cross-attention

Experimental Results

Trending Tags