9/23/2020

PyTorch: infinite DataLoader using iter & next

 


# create dataloader-iterator
data_iter = iter(data_loader)

# iterate over dataset
# alternatively you could use while True
for i in range(NUM_ITERS_YOU_WANT):
    try:
        data = next(data_iter)
    except StopIteration:
        # StopIteration is raised when the dataset is exhausted
        # reinitialize the data loader iterator
        data_iter = iter(data_loader)
        data = next(data_iter)
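
The same restart-on-exhaustion idea can also be wrapped in a generator. This is only a sketch: `infinite_batches` and `toy_loader` are made-up names, and a plain list stands in for the DataLoader so the snippet runs without torch. Note that an empty loader would make it spin forever.

```python
def infinite_batches(loader):
    """Yield batches forever, restarting the loader whenever it is exhausted."""
    while True:
        # A fresh iterator is created on every pass; with a DataLoader
        # that uses shuffle=True this also reshuffles the data.
        for batch in loader:
            yield batch


# Stand-in for a DataLoader (assumption, for illustration only).
toy_loader = [[0, 1], [2, 3], [4]]

batches = infinite_batches(toy_loader)
first_five = [next(batches) for _ in range(5)]
print(first_five)  # [[0, 1], [2, 3], [4], [0, 1], [2, 3]]
```

With a real loader the training loop becomes `for i in range(NUM_ITERS_YOU_WANT): data = next(batches)`.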

python argparse example


import argparse

import torch

parser = argparse.ArgumentParser()
# an empty argument list skips sys.argv, which is useful in notebooks
args = parser.parse_args("")
args.cuda = False
args.show_summary = False
args.device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')


print(args.cuda)
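
Instead of attaching attributes by hand, the options can be declared on the parser so they get types, defaults and help text. A small sketch; the option names (`--cuda`, `--batch-size`, `--lr`) are hypothetical and not from the note above.

```python
import argparse

parser = argparse.ArgumentParser(description="hypothetical training options")
parser.add_argument("--cuda", action="store_true", help="use the GPU")
parser.add_argument("--batch-size", type=int, default=32)
parser.add_argument("--lr", type=float, default=1e-3)

# An explicit list replaces sys.argv, which is handy in notebooks.
defaults = parser.parse_args([])
custom = parser.parse_args(["--cuda", "--batch-size", "64"])

print(defaults.cuda, defaults.batch_size, defaults.lr)  # False 32 0.001
print(custom.cuda, custom.batch_size, custom.lr)        # True 64 0.001
```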


9/21/2020

find best (optimal) threshold using roc curve

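import numpy as np
from matplotlib import pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

def plot_roc_curve(fpr, tpr):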
    plt.plot(fpr, tpr, color='orange', label='ROC')
    plt.plot([0, 1], [0, 1], color='darkblue', linestyle='--')
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver Operating Characteristic (ROC) Curve')
    plt.legend()
    plt.show()

y_true = np.array([0, 0, 1, 1, 1])
y_scores = np.array([0.0, 0.09, 0.05, 0.75, 1.0])

fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print(tpr)
print(fpr)
print(thresholds)
print(roc_auc_score(y_true, y_scores))
# Youden's J statistic: pick the threshold that maximizes TPR - FPR
optimal_idx = np.argmax(tpr - fpr)
optimal_threshold = thresholds[optimal_idx]
print("Threshold value is:", optimal_threshold)
plot_roc_curve(fpr, tpr)
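
To see what `np.argmax(tpr - fpr)` is picking, here is a dependency-free sketch that evaluates J = TPR - FPR at every observed score. `youden_threshold` is a made-up helper, it assumes both classes are present, and its candidate thresholds may not match the array returned by `roc_curve` exactly.

```python
def youden_threshold(y_true, y_scores):
    """Return (threshold, J) maximizing J = TPR - FPR over the observed scores."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    best_t, best_j = None, float("-inf")
    for t in sorted(set(y_scores)):
        # A sample is predicted positive when its score is >= the threshold.
        tp = sum(1 for y, s in zip(y_true, y_scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_scores) if s >= t and y == 0)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


y_true = [0, 0, 1, 1, 1]
y_scores = [0.0, 0.09, 0.05, 0.75, 1.0]
best_t, best_j = youden_threshold(y_true, y_scores)
print(best_t)  # 0.75
```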

Which AUC (area under the curve) values count as good? (rough rule of thumb)


0.9 ~ 1.0 : excellent
0.8 ~ 0.9 : good
0.7 ~ 0.8 : fair
0.6 ~ 0.7 : poor
0.5 ~ 0.6 : fail (0.5 is no better than random guessing)


python measure processing time

 


from time import process_time
# Start the stopwatch / counter
t1_start = process_time()

###
#processing
###

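# Stop the stopwatch / counter
t1_stop = process_time()
# elapsed CPU time in seconds; process_time does not count time spent sleeping,
# use time.perf_counter for wall-clock time
sec = t1_stop - t1_start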
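
If both CPU time and wall-clock time are of interest, the start/stop pair can be wrapped in a context manager. A sketch only; `stopwatch` and the `timings` dict are made-up names.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(result):
    """Store wall-clock and CPU seconds of the with-block in `result`."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield result
    finally:
        result["wall"] = time.perf_counter() - wall_start
        result["cpu"] = time.process_time() - cpu_start


timings = {}
with stopwatch(timings):
    time.sleep(0.05)                 # counted in wall time, not in CPU time
    total = sum(range(100_000))      # counted in both

print(timings)
```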


9/20/2020

split train test dataset

 


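import random

from sklearn.model_selection import train_test_split

# pkl_list: the list of items to split
# train_test_split shuffles by default, so this extra shuffle is optional
random.shuffle(pkl_list)

# 80% train / 20% test
pkl_train, pkl_test = train_test_split(pkl_list, test_size=0.2)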
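
For a reproducible split, `train_test_split` accepts a `random_state` argument. The sketch below shows roughly the same shuffle-then-cut idea with the standard library only; `split_list` and the file names are made up, and it is not sklearn's implementation.

```python
import math
import random


def split_list(items, test_size=0.2, seed=0):
    """Shuffle a copy of `items` with a fixed seed and cut off the test part."""
    rng = random.Random(seed)
    shuffled = list(items)           # leave the caller's list untouched
    rng.shuffle(shuffled)
    n_test = math.ceil(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


files = [f"sample_{i}.pkl" for i in range(10)]
train, test = split_list(files, test_size=0.2, seed=42)
print(len(train), len(test))  # 8 2
```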


show image in jupyter notebook

 

from matplotlib import pyplot as plt
import numpy as np
import cv2

img = cv2.imread('xxx.png')  # or image_data; OpenCV loads images as BGR
img2 = img[:, :, ::-1]       # reverse the channel axis: BGR -> RGB
plt.imshow(img2)


fix Hangul jamo separation issue on macOS

 

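from unicodedata import normalize

def nfd2nfc(data):
    # file names coming from macOS are often in decomposed form (NFD);
    # recompose them to NFC
    return normalize('NFC', data)


nfd2nfc('\u1103\u1165')  # decomposed jamo ᄃ + ᅥ

-> 더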
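
A quick way to check whether a string is affected is to compare lengths before and after normalizing. A small sketch; the file name is just an example, written with escapes so the literal is guaranteed to be in NFC.

```python
from unicodedata import normalize

name_nfc = "\ud55c\uae00.txt"            # "한글.txt" as 2 precomposed syllables
name_nfd = normalize("NFD", name_nfc)    # each syllable split into 3 jamo

print(len(name_nfc), len(name_nfd))             # 6 10
print(name_nfc == name_nfd)                     # False
print(normalize("NFC", name_nfd) == name_nfc)   # True
```

The two strings render the same but compare unequal, which is why lookups by file name can fail until both sides are normalized to the same form.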


python: rename a file, get file name / dir / ext, check whether a file exists, using the os package

 

get file name and ext

import os
os.path.splitext("/path/to/some/file.txt")[0]
#/path/to/some/file
base = os.path.basename('/root/dir/sub/file.ext')
#'file.ext'
os.path.splitext(base)
#('file', '.ext')
os.path.splitext(base)[0]
#'file'

get dir

os.path.dirname("/path/to/some/file.txt")
#'/path/to/some'

change file name 

os.rename(r'C:\Users\Ron\Desktop\Test\Products.txt',r'C:\Users\Ron\Desktop\Test\Shipped Products.txt')


check file exists

os.path.isfile('./path_of_file')
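
The same operations are also available through `pathlib`. A sketch: `PurePosixPath` is used so the printed parts look the same on every OS, and the rename runs inside a temporary directory instead of on real files.

```python
import tempfile
from pathlib import Path, PurePosixPath

p = PurePosixPath("/path/to/some/file.txt")
print(p.name)    # file.txt
print(p.stem)    # file
print(p.suffix)  # .txt
print(p.parent)  # /path/to/some

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "Products.txt"
    src.write_text("demo")
    dst = src.with_name("Shipped Products.txt")   # same dir, new file name
    src.rename(dst)
    renamed_ok = dst.is_file() and not src.exists()

print(renamed_ok)  # True
```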