부스트캠프 AI Tech 2기 Level2 CV Data augmentation

6 Sep 2021

Data augmentation

1.1 Learning representation of data set

데이터 셋은 대부분 편향적이다.

train data (=camera image) 들은 실제와 다른 image 들을 표현한다.

편향된 이미지로부터 모델은 Robust하게 학습할 수 없다. (=근본적으로 dataset이 real data를 충분히 표현하지 못했기 때문에)

몇몇 데이터들이 주어졌을 때, 이 데이터에 옵션을 추가해서 모델이 더욱 다양성 있게 학습할 수 있도록 해주는 것이 Augmentation 이다.

1.2 Data augmentation

dataset 으로 부터의 다양한 이미지 변환을 적용시키기
- Crop, Shear, Brightness, Perspective, Rotate 등
OpenCV나 NumPy 라이브러리에 사용하기 쉽게 잘 구현되어 있다.

목적 : training dataset과 real dataset의 분포를 유사하게 만드는 것!

1.3 Various data augmentation methods

Various brightness in dataset

def brightness_augmentation(img):
    #numpy array img has RGB value(0~255) for each pixel
    img[:,:,0] = img[:,:,0] + 100 # add 100 to R value
    img[:,:,1] = img[:,:,1] + 100 # add 100 to R value
    img[:,:,2] = img[:,:,2] + 100 # add 100 to R value

    img[:,:,0][img[:,:,0]>255] = 255 # clip R values over 255
    img[:,:,1][img[:,:,1]>255] = 255 # clip G values over 255
    img[:,:,2][img[:,:,2]>255] = 255 # clip B values over 255

Rotate, flip

img_rotated = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)
img_flipped = cv2.rotate(image, cv2.ROTATE_180)

Crop

y_start = 500
crop_y_size = 400
x_start = 300
crop_x_size = 800
img_cropped = image[y_start : y_start + crop_y_size, x_start : x_start + crop_x_size, :]

Affine transformation
- Preserve ‘line’, ‘length ratio’, and ‘parallelism’ in image

rows, cols, ch = image.shape
pts1 = np.float32([[50,50],[200,50],,[50,200]])
pts2 = np.float32([[100,100],[200,50],,[100,250]])
M = cv2.getAffineTransform(pts1,pts2)
shear_img = cv2.warpAffine(image, M, (cols,rows))

Modern augmentation techniques

Cutmix
- Mixing both images and labels

RandAugment
- 다양한 augmentation 방법이 존재한다. 적용시킬 최적의 augmentation of sequence를 자동으로 찾아준다.

어떠한 augmentation 을 사용할 지? 강도(magnitude)는 얼마나 적용시킬 것 인지? 를 parameter로 받는다.

2. Leverageing pre-trained information

2.1 Trainsfer learning

기존의 미리 학습시켜놓은 사전 지식을 활용하여 연관된 새로운 Task 의 적은 노력으로도 더 높은 성능으로 도달이 가능한 방법이다.
A Dataset 에서 배운 지식을 B Dataset 에서 활용 가능하다.

Approach 1 : pre-trained 된 모델의 마지막 Layer 를 새로운 FC Layer 로 추가하여 학습시키는 방법.

Approach 2 : pre-trained 된 모델의 마지막 Layer 를 새롭게 교체하고 전체 model set 에 대하여 learning rate를 다르게 주어 학습하는 방법.

2.2 Knowledge distillation

이미 학습된 Teacher Network 지식을 더 작은 모델인 Student Network 에 주입을 시켜 학습한다. (big model -> small model로 지식 전달)

최근엔 Teacher 에서 생성된 output 을 Unlableled 된 데이터에 가짜 label으로 자동 생성하는 매커니즘으로 사용한다. 그렇게해서 가짜 label로 더 큰 student network 를 사용할 때도, regularization 역할로 사용할 수 있다. ( 더 많은 데이터를 사용할 수 있기 때문에)

Distillation loss 는 teacher model과 유사한 예측 output을 뽑아내기 위한 loss이다.

Hard label (One-hot vector)
- Typically obtained from the dataset
- indicates whether a class is ‘true answer’ or not
soft label
- Typicall output of the model (=inference result)
- Regard it as ‘knowledge’. Useful to observe how the model thinks

softmax with temperature(T)
- Softmax with temperature: controls difference in output between small & larget input values
- A large T smoothens large input value difference
- Usefull to synchronize the student and teacher models’ outputs

Distillation Loss
- KLdiv(Soft label, Soft prediction)
- Loss = difference between the teacher and student network’s inference
- Learn what teacher network knows by mimicking
Student Loss
- CrossEntropy(Hard label, Soft prediction)
- Loss = difference between the student network’s inference and true label
- Learn the “right answer”

3. Leveraging unlabeled dataset for training

3.1 Semi-supervised learning

unlabeled data 가 많을 경우.

Semi-supervised learning : Unsupervised(No label) + Fully Supervised(fully labeled)

labeled dataset 을 Model에 넣어 Unlabeled dataset 을 Pseudo-labeling 을 실시해 Pseudo-labeled datset으로 만든 뒤 학습에 이용한다.

3.2 Self-training

Recap : Data efficient learning methods so far

부스트캠프 AI Tech 2기 Level2 CV Data augmentation

Data augmentation

1.1 Learning representation of data set

train data (=camera image) 들은 실제와 다른 image 들을 표현한다.

편향된 이미지로부터 모델은 Robust하게 학습할 수 없다. (=근본적으로 dataset이 real data를 충분히 표현하지 못했기 때문에)

몇몇 데이터들이 주어졌을 때, 이 데이터에 옵션을 추가해서 모델이 더욱 다양성 있게 학습할 수 있도록 해주는 것이 Augmentation 이다.

1.2 Data augmentation

목적 : training dataset과 real dataset의 분포를 유사하게 만드는 것!

1.3 Various data augmentation methods

Modern augmentation techniques

어떠한 augmentation 을 사용할 지? 강도(magnitude)는 얼마나 적용시킬 것 인지? 를 parameter로 받는다.

2. Leverageing pre-trained information

2.1 Trainsfer learning

Approach 1 : pre-trained 된 모델의 마지막 Layer 를 새로운 FC Layer 로 추가하여 학습시키는 방법.

Approach 2 : pre-trained 된 모델의 마지막 Layer 를 새롭게 교체하고 전체 model set 에 대하여 learning rate를 다르게 주어 학습하는 방법.

2.2 Knowledge distillation

이미 학습된 Teacher Network 지식을 더 작은 모델인 Student Network 에 주입을 시켜 학습한다. (big model -> small model로 지식 전달)

3. Leveraging unlabeled dataset for training

3.1 Semi-supervised learning

Semi-supervised learning : Unsupervised(No label) + Fully Supervised(fully labeled)

labeled dataset 을 Model에 넣어 Unlabeled dataset 을 Pseudo-labeling 을 실시해 Pseudo-labeled datset으로 만든 뒤 학습에 이용한다.

3.2 Self-training

self-training -> Augmentation + Teacher-Student networks + semi-supervised learning

self-training with noisy student