ReConPatch: Contrastive Patch Representation Learning for Industrial Anomaly Detection

Posted Aug 2, 2024 Updated Aug 7, 2024

By Geonu-Lee 4 min read

WACV 2024; cited 27 times as of 2024-08-02

Task

Anomaly Detection

Contributions

Construct discriminative features for anomaly detection by training a linear modulation
Proposes a contrastive patch representation learning method (ReConPatch)
Reports state-of-the-art (SOTA) performance on the MVTec AD and BTAD datasets (at the time of publication, with the ensemble method)

Proposed Method

representation biased to the natural image data, which has a gap with the target data

The paper points out that the representations of existing pre-trained models have a gap with the target data used for anomaly detection.

The main concept of our proposed approach is to train the target-oriented features

The stated goal of the paper is to learn target-oriented features.

Overall structure

$\mathcal{P}(x,h,w) \in \mathbb{R}^{C'}$

Patch-level features are extracted in the same way as in PatchCore.
Two networks are used:

A network for patch-level feature representation learning
A network that computes the pairwise and contextual similarities used in the contrastive loss

As in PatchCore, the features are subsampled with coreset sampling, stored in a memory bank, and used at inference.
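The coreset step and the memory-bank lookup can be sketched roughly as follows. This is a dependency-free illustration of greedy k-center (farthest-point) selection followed by nearest-neighbor scoring. The function names `greedy_coreset` and `anomaly_score` and the toy features are hypothetical, and details of the real PatchCore pipeline such as random projection and score re-weighting are left out.

```python
import math

def greedy_coreset(features, n_select):
    """Greedy k-center selection: repeatedly add the feature farthest from the coreset."""
    selected = [0]  # assumption: start from the first feature (often picked at random)
    # Distance from every feature to its nearest already-selected feature.
    min_dist = [math.dist(f, features[0]) for f in features]
    while len(selected) < min(n_select, len(features)):
        idx = max(range(len(features)), key=min_dist.__getitem__)
        selected.append(idx)
        min_dist = [min(d, math.dist(f, features[idx]))
                    for d, f in zip(min_dist, features)]
    return [features[i] for i in selected]

def anomaly_score(patch, memory_bank):
    """Patch-level score: distance to the nearest memory-bank entry."""
    return min(math.dist(patch, m) for m in memory_bank)

# Toy 2-D "patch features" forming two groups of normal samples.
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.0, 0.1)]
memory_bank = greedy_coreset(features, n_select=2)
print(memory_bank)
print(anomaly_score((0.05, 0.05), memory_bank))  # close to a normal group
print(anomaly_score((3.0, 3.0), memory_bank))    # far from every stored feature
```

A patch far from every stored feature receives a larger score than one near a stored feature, which is the property the memory bank relies on at inference.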

Patch-level feature representation learning

aggregate highly similar features while repelling those with low similarity

Features with high similarity are pulled closer together, while features with low similarity are pushed further apart.

Pairwise similarity

$\bar{z}_i = \bar{g}(\bar{f}(p_i)), \quad \bar{z}_j = \bar{g}(\bar{f}(p_j))$

Similarity is computed at the patch level on the projected representations.
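As a rough illustration, the pairwise similarity can be computed with a Gaussian kernel on the squared Euclidean distance between projected features. The kernel form, the bandwidth name `sigma`, and the helper `pairwise_similarity` are assumptions made for this sketch and should be checked against the paper.

```python
import math

def pairwise_similarity(z_bar, sigma=1.0):
    """Return the matrix w[i][j] = exp(-||z_i - z_j||^2 / sigma)."""
    n = len(z_bar)
    return [[math.exp(-math.dist(z_bar[i], z_bar[j]) ** 2 / sigma)
             for j in range(n)]
            for i in range(n)]

# Three toy projected patch features: the first two are close, the third is far.
z_bar = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0)]
w = pairwise_similarity(z_bar)
print(w[0][1], w[0][2])
```

The similarity is 1 for identical features and decays toward 0 as the distance grows, so it can serve as a soft label in place of a hard positive/negative assignment.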

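Even when the pairwise similarity is identical, the right treatment can differ: in the two illustrated cases (a) and (b), one pair should be pulled together while the other should be pushed apart.
Because pairwise similarity alone is not sufficient, contextual similarity is used as well.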

Contextual similarity

Contextual similarity is computed from each feature's k-nearest neighbors.

expanding the query to the neighbors of neighbors

redefined by averaging the similarities over the set of k-nearest reciprocal neighbors.

Because contextual similarity is asymmetric, it is computed bi-directionally.
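The three steps above (neighbor overlap, query expansion over reciprocal neighbors, bi-directional averaging) might be sketched as follows. The overlap ratio used as the base similarity, the use of k/2 for the reciprocal-neighbor set, and the helper names `knn_sets` and `contextual_similarity` are assumptions of this sketch, not details confirmed by the notes above.

```python
import math

def knn_sets(z, k):
    """N_k(i): indices of the k nearest features to i (i itself always included)."""
    n = len(z)
    return [
        set(sorted(range(n), key=lambda j: (j != i, math.dist(z[i], z[j])))[:k])
        for i in range(n)
    ]

def contextual_similarity(z, k):
    n = len(z)
    nk = knn_sets(z, k)
    # Step 1: share of neighbors that i and j have in common, non-zero only if j is in N_k(i).
    base = [[len(nk[i] & nk[j]) / len(nk[i]) if j in nk[i] else 0.0
             for j in range(n)] for i in range(n)]
    # Step 2: query expansion, averaging over reciprocal neighbors in a smaller
    # neighborhood (assumption: size k // 2, at least 1).
    half = knn_sets(z, max(1, k // 2))
    recip = [{j for j in half[i] if i in half[j]} for i in range(n)]
    expanded = [[sum(base[l][j] for l in recip[i]) / len(recip[i])
                 for j in range(n)] for i in range(n)]
    # Step 3: the result is asymmetric, so average both directions.
    return [[0.5 * (expanded[i][j] + expanded[j][i]) for j in range(n)]
            for i in range(n)]

# Two well-separated toy clusters of three features each.
z = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
w = contextual_similarity(z, k=3)
print(w[0][1], w[0][3])
```

Features that share the same neighborhood get a high contextual similarity, while features from different neighborhoods get zero even if their raw distance is moderate.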

The pairwise similarity and the contextual similarity are merged by a linear combination.
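Written out, the combination might look like this, where the mixing weight $\alpha \in [0, 1]$ and the symbol $\omega$ are notation assumed here for illustration:

$$\omega_{ij} = \alpha \, \omega^{\text{Pairwise}}_{ij} + (1 - \alpha) \, \omega^{\text{Contextual}}_{ij}$$

With $\alpha = 1$ only the pairwise term remains, and with $\alpha = 0$ only the contextual term remains.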

Using the computed similarity, training pulls close features closer together and pushes distant features further apart.
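One way to turn a soft similarity into this pull/push behaviour is a relaxed contrastive loss, sketched below. The exact form (squared relative distance for attraction, squared hinge with margin for repulsion, normalization by the mean distance) is an assumption of this sketch, as are the names `relaxed_contrastive_loss`, `margin`, and the toy inputs.

```python
import math

def relaxed_contrastive_loss(z, w, margin=1.0):
    """Soft-label contrastive loss: w[i][j] in [0, 1] weights attraction vs. repulsion."""
    n = len(z)
    loss = 0.0
    for i in range(n):
        # Assumption: distances are normalized by the mean distance from z[i] to all features.
        mean_d = sum(math.dist(z[i], z[k]) for k in range(n)) / n + 1e-12
        for j in range(n):
            delta = math.dist(z[i], z[j]) / mean_d
            attract = w[i][j] * delta ** 2                          # similar pairs: shrink distance
            repel = (1.0 - w[i][j]) * max(margin - delta, 0.0) ** 2  # dissimilar pairs: enforce margin
            loss += attract + repel
    return loss / n

z = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0)]
w_consistent = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]    # matches the geometry of z
w_inconsistent = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]  # contradicts the geometry of z
loss_good = relaxed_contrastive_loss(z, w_consistent)
loss_bad = relaxed_contrastive_loss(z, w_inconsistent)
print(loss_good, loss_bad)
```

The loss is lower when the embedding geometry agrees with the similarity weights, which is what gradient descent on the representation network would move toward.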

The network that computes the similarities is updated slowly, via an exponential moving average (EMA).

Fast training of the similarity calculation network reduces the consistency of the relationships between the patch-level features, leading to unstable training.
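A minimal sketch of such an EMA update is shown below. Parameters are modeled as plain dictionaries of float lists purely for illustration; the helper `ema_update`, the parameter names, and the momentum value `gamma` are hypothetical.

```python
def ema_update(target, online, gamma=0.99):
    """Update `target` in place: target <- gamma * target + (1 - gamma) * online."""
    for name, value in online.items():
        target[name] = [gamma * t + (1.0 - gamma) * o
                        for t, o in zip(target[name], value)]
    return target

# `online` stands in for the trained representation network,
# `target` for the slowly updated similarity-calculation network.
online = {"f.weight": [1.0, 2.0], "g.weight": [0.5]}
target = {"f.weight": [0.0, 0.0], "g.weight": [0.0]}
for _ in range(3):
    ema_update(target, online, gamma=0.9)
print(target)
```

A `gamma` close to 1 makes the similarity network lag well behind the trained network, which keeps the similarity targets consistent between steps at the cost of slower adaptation.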