Deep Learning: Preprocessing an Image Folder

ha2yong 2025. 8. 5. 08:46

[When the images are separated only by class, with no train/val/test split]

0. Create a subfolder per class under each of train, val, and test (e.g. train/negative, train/positive)

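import os

# Every split/class folder that has to exist before copying files.
dataset_structure = ['train/negative', 'train/positive', 'val/negative', 'val/positive', 'test/negative', 'test/positive']

for folder in dataset_structure:
    os.makedirs(folder, exist_ok=True)  # exist_ok=True: no error if the folder already exists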
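
If there are more classes or splits, writing the list by hand gets tedious. As a minimal sketch, the same list can be generated from the split names and class names. The helper name `build_structure` is made up for this example and is not part of any library.

```python
from itertools import product


def build_structure(splits, classes):
    """Return 'split/class' folder names for every combination (hypothetical helper)."""
    return [f'{split}/{cls}' for split, cls in product(splits, classes)]


splits = ['train', 'val', 'test']
classes = ['negative', 'positive']

# Produces the same six entries as the hand-written list, in the same order.
dataset_structure = build_structure(splits, classes)
print(dataset_structure)
```

The resulting list can be passed to the same `os.makedirs(folder, exist_ok=True)` loop.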

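1. Get the list of files in each class folder
2. Shuffle the list
3. Count the files in each class
4. Slice the list into train, val, and test lists
5. For each file name in a list, build the source path and copy the file into the matching folder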
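
To make steps 3 and 4 concrete, here is a small sketch of the slicing arithmetic on its own, with no file access. It assumes a 60/20/20 split and uses made-up file names; `split_counts` is a hypothetical helper. Because `int()` truncates, any leftover files end up in the test slice.

```python
def split_counts(num_files, train_ratio=0.6, val_ratio=0.2):
    """Return (train, val, test) sizes; test takes whatever is left over."""
    train_len = int(num_files * train_ratio)
    val_len = int(num_files * val_ratio)
    test_len = num_files - train_len - val_len
    return train_len, val_len, test_len


# Made-up file names, only to show how the slices line up.
files = [f'img_{i:03d}.jpg' for i in range(103)]
train_len, val_len, test_len = split_counts(len(files))

train_files = files[:train_len]
val_files = files[train_len:train_len + val_len]
test_files = files[train_len + val_len:]

print(train_len, val_len, test_len)  # 61 20 22
```

The three slices never overlap and together cover every file, so no image is lost or duplicated across splits.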

import os
import random
import shutil

# Source folders, one per class (the destination folders use lowercase names).
categories = ['Negative', 'Positive']

for category in categories:
    # 1-3. List the files, shuffle them, and count them.
    files = os.listdir(category)
    random.shuffle(files)
    num_files = len(files)

    # 60% train, 20% val, and the remainder (about 20%) test.
    train_split_len = int(num_files * 0.6)
    val_split_len = int(num_files * 0.2)

    to_category = category.lower()

    # 4. Slice the shuffled list into the three splits.
    train_files = files[:train_split_len]
    val_files = files[train_split_len : train_split_len + val_split_len]
    test_files = files[train_split_len + val_split_len :]

    # 5. Copy each file into its split/class folder.
    for file in train_files:
        shutil.copy(os.path.join(category, file), f'train/{to_category}/')
    for file in val_files:
        shutil.copy(os.path.join(category, file), f'val/{to_category}/')
    for file in test_files:
        shutil.copy(os.path.join(category, file), f'test/{to_category}/')
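
Because `random.shuffle` gives a different order on every run, the split changes each time the script is executed. One possible variation, sketched below, wraps the same steps in a function and shuffles with a fixed seed so the split can be reproduced. The function name `split_class_folder`, its parameters, and the seed value are assumptions made for this example. The demo at the bottom runs in a temporary directory with empty dummy files, so it does not touch real data.

```python
import os
import random
import shutil
import tempfile


def split_class_folder(src_dir, dst_root, class_name,
                       train_ratio=0.6, val_ratio=0.2, seed=42):
    """Copy files from src_dir into dst_root/{train,val,test}/class_name."""
    # Sort first: os.listdir order is arbitrary, so sorting makes the
    # seeded shuffle give the same result on every run.
    files = sorted(f for f in os.listdir(src_dir)
                   if os.path.isfile(os.path.join(src_dir, f)))
    random.Random(seed).shuffle(files)

    train_len = int(len(files) * train_ratio)
    val_len = int(len(files) * val_ratio)
    splits = {
        'train': files[:train_len],
        'val': files[train_len:train_len + val_len],
        'test': files[train_len + val_len:],
    }

    for split, names in splits.items():
        target = os.path.join(dst_root, split, class_name)
        os.makedirs(target, exist_ok=True)
        for name in names:
            shutil.copy(os.path.join(src_dir, name), target)

    # Return how many files went into each split.
    return {split: len(names) for split, names in splits.items()}


# Demo in a throwaway directory with 10 empty dummy files.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'Negative')
    os.makedirs(src)
    for i in range(10):
        open(os.path.join(src, f'img_{i}.jpg'), 'w').close()

    counts = split_class_folder(src, tmp, 'negative')
    copied = len(os.listdir(os.path.join(tmp, 'train', 'negative')))
    print(counts)
```

With real data, the function would be called once per class, for example `split_class_folder('Negative', '.', 'negative')`.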