example code in Python using OpenCV and scikit-learn to build a dataset for video classification

refer to code.


import cv2
import os
from sklearn.model_selection import train_test_split

# Define the classes for classification
classes = ['class1', 'class2', 'class3']

# Create a list to store the image paths and labels
data = []

# Loop through the videos and extract frames
for cls in classes:
video_dir = f'data/{cls}/'
for filename in os.listdir(video_dir):
video_path = os.path.join(video_dir, filename)
cap = cv2.VideoCapture(video_path)
while cap.isOpened():
ret, frame = cap.read()
if not ret:
frame_path = f'frames/{cls}/{filename}_{cap.get(cv2.CAP_PROP_POS_FRAMES)}.jpg'
cv2.imwrite(frame_path, frame)
data.append((frame_path, cls))

# Split the data into training, validation, and test sets
train_val_data, test_data = train_test_split(data, test_size=0.2, stratify=[x[1] for x in data])
train_data, val_data = train_test_split(train_val_data, test_size=0.2, stratify=[x[1] for x in train_val_data])

# Preprocess the images
def preprocess_image(image):
# Resize the image to (224, 224) and normalize the pixel values
image = cv2.resize(image, (224, 224))
image = image.astype('float32') / 255.0
return image

# Load the images and labels into memory
X_train = []
y_train = []
for image_path, label in train_data:
image = cv2.imread(image_path)
image = preprocess_image(image)

X_val = []
y_val = []
for image_path, label in val_data:
image = cv2.imread(image_path)
image = preprocess_image(image)

X_test = []
y_test = []
for image_path, label in test_data:
image = cv2.imread(image_path)
image = preprocess_image(image)

# Train and evaluate the model
# ...


In this example, we first define the classes for classification (in this case, 'class1', 'class2', and 'class3'). We then loop through the videos for each class, extract frames from each video, and store the frame paths and labels in a list called data.

Next, we split the data into training, validation, and test sets using train_test_split() from scikit-learn. We stratify the splits to ensure that each set has a proportional number of frames from each class.

We then define a function called preprocess_image() to resize the images to (224, 224) and normalize the pixel values. We load the images and labels into memory for each set using OpenCV, preprocess them using this function, and store them in lists called X_train, y_train, X_val, y_val, X_test, and y_test.

Finally, we can train and evaluate a video classification model using the preprocessed data. The specific code for this step will depend on the model architecture and training strategy you choose to use.

Thank you.