.screenrc
Computer Vision & Machine Learning Research Laboratory
This guide explains how to access data from Computer A (data server) on Computer B (GPU machine) for machine learning training workflows.
When training machine learning models, you often need:
This tutorial will show you how to securely connect these computers using SSH, allowing the GPU machine to access data without copying everything locally.
If you don't already have an SSH key on your GPU computer (Computer B):
# On Computer B (GPU)
ssh-keygen -t rsa -b 4096
Press Enter to accept default locations and add a passphrase if desired.
# On Computer B (GPU)
# View your public key
cat ~/.ssh/id_rsa.pub
# Copy the output to clipboard
Now transfer this key to Computer A (data server):
# Option 1: Using ssh-copy-id (easiest)
ssh-copy-id username@computerA
# Option 2: Manual setup
# First, SSH into Computer A
ssh username@computerA
# Then on Computer A, create .ssh directory if it doesn't exist
mkdir -p ~/.ssh
chmod 700 ~/.ssh
# Add your public key to authorized_keys
echo "ssh-rsa AAAA...your key here..." >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# Exit back to Computer B
exit
Ensure you can connect without a password:
# On Computer B (GPU)
ssh username@computerA
If successful, you should connect without entering a password.
Install SSHFS on your GPU computer:
# On Computer B (GPU)
# For Ubuntu/Debian
sudo apt-get update
sudo apt-get install sshfs
# For CentOS/RHEL/Fedora
sudo dnf install fuse-sshfs
Create a mount point and mount the remote directory:
# On Computer B (GPU)
# Create mount directory
mkdir -p ~/data_mount
# Mount the remote directory
sshfs username@computerA:/path/to/data ~/data_mount
# Verify the mount worked
ls ~/data_mount
Now you can access the data in your training scripts as if it were local:
# Example PyTorch script
import os
import torch
from torch.utils.data import Dataset, DataLoader
# Point to your mounted data directory (Python does not expand "~" on its own)
data_dir = os.path.expanduser("~/data_mount/dataset")
# Your training code...
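If the SSHFS mount has dropped, the mount point is just an empty local directory and a training script may silently see no data. The following is a minimal sketch of a guard for that case, using only the standard library. The function name `resolve_data_dir` and its arguments are hypothetical, not part of any existing tool.

```python
import os


def resolve_data_dir(mount_point: str, subdir: str = "") -> str:
    """Return the dataset path under an active mount, or raise if it is unavailable.

    `mount_point` and `subdir` are illustrative names (assumptions), e.g.
    "~/data_mount" and "dataset" from the steps above.
    """
    # Expand "~" and make the path absolute before checking it.
    mount_point = os.path.abspath(os.path.expanduser(mount_point))
    # os.path.ismount is False for a plain directory, i.e. when sshfs is not mounted.
    if not os.path.ismount(mount_point):
        raise RuntimeError(f"{mount_point} is not mounted; run sshfs first")
    data_dir = os.path.join(mount_point, subdir)
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(f"{data_dir} does not exist on the mounted share")
    return data_dir
```

Calling `resolve_data_dir("~/data_mount", "dataset")` at the top of a training script would then fail fast with a clear message instead of starting a run on an empty directory.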
To automatically mount the remote directory when your GPU computer starts:
Edit your fstab file:
sudo nano /etc/fstab
Add this line (all on one line):
username@computerA:/path/to/data /home/username/data_mount fuse.sshfs defaults,_netdev,user,idmap=user,follow_symlinks,identityfile=/home/username/.ssh/id_rsa,allow_other,reconnect 0 0
Save and exit
To unmount the remote directory:
# On Computer B (GPU)
fusermount -u ~/data_mount
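For better performance with large datasets, try these SSHFS options (big_writes applies to older SSHFS 2.x releases; versions built on FUSE 3 always use large writes and may reject the option, so drop it there):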
sshfs username@computerA:/path/to/data ~/data_mount -o Compression=no,big_writes,cache=yes,kernel_cache
If you experience frequent disconnections, add reconnect options:
sshfs username@computerA:/path/to/data ~/data_mount -o reconnect,ServerAliveInterval=15,ServerAliveCountMax=3
For production setups with large datasets, consider using NFS instead of SSHFS for better performance.
sudo systemctl status sshd
For any issues or questions, please contact your system administrator.
Let me explain how a specific normalized feature value is calculated using one concrete example.
Let's take the feature "GroupSize" which has:
These values are post-normalization, but we can work backwards to understand how they were calculated.
The normalization function you're using is:
normalized_features = (features - mean) / std
Where:
features are the original, raw values
mean is the average of all values for that feature in the training set
std is the standard deviation of all values for that feature in the training set
Let's say we have these raw values for GroupSize in the training set:
First, we calculate the mean:
Then we calculate the standard deviation:
Now, we can normalize each value:
Going back to your data:
For GroupSize, this extreme range suggests:
If we assume the mean of raw GroupSize is μ and standard deviation is σ, then:
This tells us that your maximum raw value is over 103 standard deviations away from the mean, which is extremely far! This confirms that your raw data has a heavily skewed distribution with significant outliers.
The fact that most normalized values for GroupSize are close to the minimum (-0.045121) suggests that the most common value is slightly below the mean, while a few extreme outliers are pulling the mean upward.
This type of skewed distribution is exactly why techniques like masking and autoencoder approaches are beneficial - they can help the model learn robust representations even with such extreme distributions.
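The effect described above can be reproduced with a small sketch. The raw values below are hypothetical (they are not the GroupSize data from this discussion), and the population standard deviation is assumed; a pipeline that uses the sample standard deviation would give slightly different numbers.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Normalize values as (value - mean) / std, using the population std (an assumption)."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        raise ValueError("standard deviation is zero; cannot normalize")
    return [(v - mean) / std for v in values]


# Hypothetical skewed feature: 99 small groups and one extreme outlier.
raw_group_sizes = [1] * 99 + [1000]
normalized = zscore(raw_group_sizes)

# The common value lands slightly below zero; the outlier lands far above it.
print("min normalized:", min(normalized))
print("max normalized:", max(normalized))
```

In this made-up sample the 99 common values end up about 0.1 standard deviations below the mean, while the single outlier sits almost 10 standard deviations above it. This is the same pattern as a minimum close to zero paired with a very large maximum.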
A very nice explanation of the convolution operation (in Korean):
https://gaussian37.github.io/dl-concept-covolution_operation/
✌️
This guide provides step-by-step instructions for setting up LabelMe with a custom login system and proper dataset management. We'll cover the entire workflow: login → annotation → save → logout.
We'll build a system with the following components:
# Update package lists
sudo apt update
sudo apt upgrade -y
# Install necessary packages
sudo apt install -y docker.io docker-compose apache2 php libapache2-mod-php php-json
sudo systemctl enable docker
sudo systemctl start docker
# Add your user to docker group to avoid using sudo with docker commands
sudo usermod -aG docker $USER
# Log out and log back in for this to take effect
# Create main project directory
mkdir -p ~/labelme-project
cd ~/labelme-project
# Create directories for different components
mkdir -p docker-labelme
mkdir -p web-portal
mkdir -p datasets/{project1,project2}
mkdir -p annotations
# Add some sample images to project1 (optional)
# You can replace this with your own dataset copying commands
mkdir -p datasets/project1/images
# Copy some sample images if you have them
# cp /path/to/your/images/*.jpg datasets/project1/images/
Create a file docker-labelme/docker-compose.yml:
cd ~/labelme-project/docker-labelme
nano docker-compose.yml
Add the following content:
version: '3'
services:
  labelme:
    image: wkentaro/labelme
    container_name: labelme-server
    ports:
      - "8080:8080"
    volumes:
      - ../datasets:/data
      - ../annotations:/home/developer/.labelmerc
    environment:
      - LABELME_SERVER=1
      - LABELME_PORT=8080
      - LABELME_HOST=0.0.0.0
    command: labelme --server --port 8080 --host 0.0.0.0 /data
    restart: unless-stopped
This step ensures annotations are saved in the proper format and location:
cd ~/labelme-project
nano annotations/.labelmerc
Add the following content:
{
"auto_save": true,
"display_label_popup": true,
"store_data": true,
"keep_prev": false,
"flags": null,
"flags_2": null,
"flags_3": null,
"label_flags": null,
"labels": ["person", "car", "bicycle", "dog", "cat", "tree", "building"],
"file_search": true,
"show_label_text": true
}
Customize the labels list according to your annotation needs.
cd ~/labelme-project/docker-labelme
docker-compose up -d
Verify it's running:
docker ps
You should see the labelme-server container running and listening on port 8080.
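Beyond `docker ps`, it can help to confirm that something is actually accepting connections on the published port. The sketch below only checks that a TCP connection can be opened; it does not verify that the service behind the port is LabelMe. The helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 8080 is the port published in the compose file above.
    print("port 8080 reachable:", is_port_open("localhost", 8080))
```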
cd ~/labelme-project/web-portal
nano index.php
Add the following content:
<?php
// Display errors during development (remove in production)
ini_set('display_errors', 1);
error_reporting(E_ALL);
// Start session
session_start();
// Check if there's an error message
$error_message = isset($_SESSION['error_message']) ? $_SESSION['error_message'] : '';
// Clear error message after displaying it
unset($_SESSION['error_message']);
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>LabelMe Login</title>
<style>
body {
font-family: Arial, sans-serif;
background-color: #f4f4f4;
margin: 0;
padding: 0;
display: flex;
justify-content: center;
align-items: center;
height: 100vh;
}
.login-container {
background-color: white;
padding: 30px;
border-radius: 8px;
box-shadow: 0 2px 10px rgba(0, 0, 0, 0.1);
width: 350px;
}
h2 {
text-align: center;
color: #333;
margin-bottom: 20px;
}
input[type="text"],
input[type="password"] {
width: 100%;
padding: 12px;
margin: 8px 0;
display: inline-block;
border: 1px solid #ccc;
box-sizing: border-box;
border-radius: 4px;
}
button {
background-color: #4CAF50;
color: white;
padding: 14px 20px;
margin: 10px 0;
border: none;
cursor: pointer;
width: 100%;
border-radius: 4px;
font-size: 16px;
}
button:hover {
opacity: 0.8;
}
.error-message {
color: #f44336;
text-align: center;
margin-top: 10px;
}
.logo {
text-align: center;
margin-bottom: 20px;
}
</style>
</head>
<body>
<div class="login-container">
<div class="logo">
<h2>LabelMe Annotation</h2>
</div>
<form id="loginForm" action="auth.php" method="post">
<div>
<label for="username"><b>Username</b></label>
<input type="text" placeholder="Enter Username" name="username" required>
</div>
<div>
<label for="password"><b>Password</b></label>
<input type="password" placeholder="Enter Password" name="password" required>
</div>
<div>
<label for="project"><b>Select Project</b></label>
<select name="project" style="width: 100%; padding: 12px; margin: 8px 0; display: inline-block; border: 1px solid #ccc; box-sizing: border-box; border-radius: 4px;">
<option value="project1">Project 1</option>
<option value="project2">Project 2</option>
</select>
</div>
<button type="submit">Login</button>
<?php if (!empty($error_message)): ?>
<div class="error-message"><?php echo htmlspecialchars($error_message); ?></div>
<?php endif; ?>
</form>
</div>
</body>
</html>
cd ~/labelme-project/web-portal
nano auth.php
Add the following content:
<?php
// Start session management
session_start();
// Display errors during development (remove in production)
ini_set('display_errors', 1);
error_reporting(E_ALL);
// Configuration - Store these securely in production
$users = [
'admin' => [
'password' => password_hash('admin123', PASSWORD_DEFAULT), // Use hashed passwords
'role' => 'admin'
],
'user1' => [
'password' => password_hash('user123', PASSWORD_DEFAULT),
'role' => 'annotator'
],
'user2' => [
'password' => password_hash('user456', PASSWORD_DEFAULT),
'role' => 'annotator'
]
];
// Base path to the LabelMe application
$labelme_base_url = 'http://localhost:8080'; // Change this to your LabelMe server address
// Handle login form submission
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
$username = isset($_POST['username']) ? $_POST['username'] : '';
$password = isset($_POST['password']) ? $_POST['password'] : '';
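$project = isset($_POST['project']) ? $_POST['project'] : 'project1';
// Only accept known project names, since this value is later used to build a URL
if (!in_array($project, ['project1', 'project2'], true)) {
    $project = 'project1';
}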
// Validate credentials
if (isset($users[$username]) && password_verify($password, $users[$username]['password'])) {
// Set session variables
$_SESSION['logged_in'] = true;
$_SESSION['username'] = $username;
$_SESSION['role'] = $users[$username]['role'];
$_SESSION['project'] = $project;
$_SESSION['last_activity'] = time();
// Redirect to LabelMe
header("Location: labelme.php");
exit;
} else {
// Failed login
$_SESSION['error_message'] = "Invalid username or password";
header("Location: index.php");
exit;
}
}
// For logout
if (isset($_GET['logout'])) {
// Log this logout
$log_file = 'user_activity.log';
$log_message = date('Y-m-d H:i:s') . " - User: " . ($_SESSION['username'] ?? 'unknown') .
" - Action: Logged out\n";
file_put_contents($log_file, $log_message, FILE_APPEND);
// Clear session data
session_unset();
session_destroy();
// Redirect to login page
header("Location: index.php");
exit;
}
?>
cd ~/labelme-project/web-portal
nano labelme.php
Add the following content:
<?php
// Start session management
session_start();
// Display errors during development (remove in production)
ini_set('display_errors', 1);
error_reporting(E_ALL);
// Check if user is logged in
if (!isset($_SESSION['logged_in']) || $_SESSION['logged_in'] !== true) {
// Not logged in, redirect to login page
header("Location: index.php");
exit;
}
// Security: Check for session timeout (30 minutes)
$timeout = 30 * 60; // 30 minutes in seconds
if (isset($_SESSION['last_activity']) && (time() - $_SESSION['last_activity'] > $timeout)) {
// Session has expired
session_unset();
session_destroy();
header("Location: index.php?timeout=1");
exit;
}
// Update last activity time
$_SESSION['last_activity'] = time();
// Configuration
$labelme_base_url = 'http://localhost:8080'; // Change this to your LabelMe server address
$project = $_SESSION['project'] ?? 'project1';
$labelme_url = $labelme_base_url . '/' . $project;
// Log user activity
$log_file = 'user_activity.log';
$log_message = date('Y-m-d H:i:s') . " - User: " . $_SESSION['username'] .
" - Role: " . $_SESSION['role'] .
" - Project: " . $project .
" - Action: Accessed LabelMe\n";
file_put_contents($log_file, $log_message, FILE_APPEND);
?>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>LabelMe Annotation Tool</title>
<style>
body, html {
margin: 0;
padding: 0;
height: 100%;
overflow: hidden;
}
.header {
background-color: #333;
color: white;
padding: 10px;
display: flex;
justify-content: space-between;
align-items: center;
}
.user-info {
font-size: 14px;
}
.logout-btn {
background-color: #f44336;
color: white;
border: none;
padding: 5px 10px;
cursor: pointer;
border-radius: 3px;
text-decoration: none;
margin-left: 10px;
}
.logout-btn:hover {
background-color: #d32f2f;
}
.project-selector {
margin-left: 20px;
}
iframe {
width: 100%;
height: calc(100% - 50px);
border: none;
}
</style>
</head>
<body>
<div class="header">
<div>
<h3 style="margin:0;">LabelMe Annotation Tool</h3>
<span>Project: <strong><?php echo htmlspecialchars($project); ?></strong></span>
</div>
<div class="user-info">
Logged in as: <strong><?php echo htmlspecialchars($_SESSION['username']); ?></strong>
(<?php echo htmlspecialchars($_SESSION['role']); ?>)
<form method="post" action="" style="display:inline-block">
<select name="project" class="project-selector" onchange="this.form.submit()">
<option value="project1" <?php echo $project == 'project1' ? 'selected' : ''; ?>>Project 1</option>
<option value="project2" <?php echo $project == 'project2' ? 'selected' : ''; ?>>Project 2</option>
</select>
</form>
<a href="auth.php?logout=1" class="logout-btn">Logout</a>
</div>
</div>
<iframe src="<?php echo htmlspecialchars($labelme_url); ?>" allow="fullscreen"></iframe>
</body>
</html>
<?php
// Handle project switching
if ($_SERVER['REQUEST_METHOD'] === 'POST' && isset($_POST['project'])) {
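$newProject = $_POST['project'];
// Only switch to a known project; the value ends up in the iframe URL and the log
if (!in_array($newProject, ['project1', 'project2'], true)) {
    $newProject = $_SESSION['project'] ?? 'project1';
}
$_SESSION['project'] = $newProject;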
// Log project switch
$log_message = date('Y-m-d H:i:s') . " - User: " . $_SESSION['username'] .
" - Action: Switched to project " . $newProject . "\n";
file_put_contents($log_file, $log_message, FILE_APPEND);
// Redirect to refresh the page with new project
header("Location: labelme.php");
exit;
}
?>
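The activity log written by auth.php and labelme.php is plain text with " - " between fields, so it is easy to summarize. The following is a rough sketch of a parser for that format; the function names and the sample lines are hypothetical, and it assumes usernames do not themselves contain " - ".

```python
from collections import Counter


def parse_log_line(line):
    """Split one log line into a dict with 'timestamp' plus lower-cased field names."""
    parts = line.strip().split(" - ")
    if len(parts) < 2:
        return None  # blank or malformed line
    entry = {"timestamp": parts[0]}
    for part in parts[1:]:
        key, sep, value = part.partition(": ")
        if sep:  # keep only "Key: value" fields
            entry[key.lower()] = value
    return entry


def actions_per_user(lines):
    """Count how often each (user, action) pair appears in the log."""
    counts = Counter()
    for line in lines:
        entry = parse_log_line(line)
        if entry and "user" in entry and "action" in entry:
            counts[(entry["user"], entry["action"])] += 1
    return counts


# Made-up sample lines in the format produced by the PHP code above.
sample = [
    "2024-01-01 10:00:00 - User: admin - Role: admin - Project: project1 - Action: Accessed LabelMe",
    "2024-01-01 10:05:00 - User: admin - Action: Switched to project project2",
    "2024-01-01 10:30:00 - User: admin - Action: Logged out",
]
for (user, action), n in actions_per_user(sample).items():
    print(user, "|", action, "|", n)
```

In practice the lines would come from `open("user_activity.log")` in the web-portal directory rather than from a hard-coded list.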
sudo nano /etc/apache2/sites-available/labelme-portal.conf
Add the following configuration:
<VirtualHost *:80>
# Change this to your domain or IP
ServerName labelme.yourdomain.com
# Update the two paths below with your actual path
# (Apache does not allow comments on the same line as a directive)
DocumentRoot /home/username/labelme-project/web-portal
<Directory /home/username/labelme-project/web-portal>
Options Indexes FollowSymLinks
AllowOverride All
Require all granted
</Directory>
ErrorLog ${APACHE_LOG_DIR}/labelme-error.log
CustomLog ${APACHE_LOG_DIR}/labelme-access.log combined
</VirtualHost>
Update the paths to match your actual user and directory structure.
sudo a2ensite labelme-portal.conf
sudo systemctl restart apache2
# Set appropriate permissions for the web files
cd ~/labelme-project
sudo chown -R www-data:www-data web-portal
sudo chmod -R 755 web-portal
# Ensure the annotation directory is writable
sudo chown -R www-data:www-data annotations
sudo chmod -R 777 annotations
# Ensure datasets are accessible
sudo chmod -R 755 datasets
Structure your dataset directories as follows:
datasets/
├── project1/
│ ├── images/
│ │ ├── image1.jpg
│ │ ├── image2.jpg
│ │ └── ...
│ └── annotations/ # LabelMe will save annotations here
├── project2/
│ ├── images/
│ │ ├── image1.jpg
│ │ └── ...
│ └── annotations/
└── ...
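With this layout, annotation progress for a project can be estimated by comparing the two folders. The sketch below assumes each annotation is a JSON file in annotations/ with the same base name as its image (for example image1.jpg and image1.json). That naming convention is an assumption, and the function name is hypothetical.

```python
from pathlib import Path

# Image extensions to count; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def annotation_progress(project_dir):
    """Return (total_images, annotated_images, names_of_unannotated_images)."""
    project = Path(project_dir)
    images = sorted(
        p for p in (project / "images").iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    # Base names (without extension) of all annotation files found.
    annotated = {p.stem for p in (project / "annotations").glob("*.json")}
    missing = [p.name for p in images if p.stem not in annotated]
    return len(images), len(images) - len(missing), missing
```

For example, `annotation_progress(Path.home() / "labelme-project/datasets/project1")` would report how many images in project1 still lack an annotation file.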
Create a script to add new projects:
cd ~/labelme-project
nano add-project.sh
Add the following content:
#!/bin/bash
# Script to add a new project to the LabelMe setup
# Check if a project name was provided
if [ -z "$1" ]; then
echo "Usage: $0 <project_name>"
exit 1
fi
PROJECT_NAME="$1"
PROJECT_DIR="$HOME/labelme-project/datasets/$PROJECT_NAME"
# Create project directory structure
mkdir -p "$PROJECT_DIR/images"
mkdir -p "$PROJECT_DIR/annotations"
# Set permissions
chmod -R 755 "$PROJECT_DIR"
# Update the web portal to include the new project
# (This is a simplified approach - you'll need to manually edit index.php and labelme.php)
echo "Project directory created at: $PROJECT_DIR"
echo "Now copy your images to: $PROJECT_DIR/images/"
echo "Remember to manually update index.php and labelme.php to include the new project"
Make the script executable:
chmod +x add-project.sh
Open your browser and navigate to:
http://your-server-ip/
or http://labelme.yourdomain.com/
sudo apt install certbot python3-certbot-apache
sudo certbot --apache -d labelme.yourdomain.com
Edit the auth.php file to use a database instead of hardcoded users.
If LabelMe doesn't load in the iframe:
docker ps
docker logs labelme-server
If you encounter permission issues with annotations:
sudo chmod -R 777 ~/labelme-project/annotations
sudo chown -R www-data:www-data ~/labelme-project/datasets
If annotations aren't saving properly:
Check the .labelmerc configuration file, then watch the Apache error log:
sudo tail -f /var/log/apache2/error.log
You now have a complete LabelMe annotation system with:
This setup allows your team to collaborate on annotation projects while maintaining control over who can access the system and what projects they can work on.
Thank you.
LightGBM (Light Gradient Boosting Machine) is a gradient boosting framework developed by Microsoft that uses tree-based learning algorithms. It's designed to be efficient, fast, and capable of handling large-scale data with high dimensionality.
Here's a visualization of how LightGBM works:
Key features of LightGBM that make it powerful:
Leaf-wise Tree Growth: Unlike traditional algorithms that grow trees level-wise, LightGBM grows trees leaf-wise, always splitting the leaf that yields the largest reduction in loss. For the same number of leaves this usually reaches a lower loss than level-wise growth, but the resulting deeper, unbalanced trees can overfit small datasets, so limits such as num_leaves and max_depth matter.
Gradient-based One-Side Sampling (GOSS): This technique keeps all instances with large gradients (those the current model fits poorly) and randomly samples from the instances with small gradients, up-weighting the sampled ones so the gradient statistics stay approximately unbiased. This lets LightGBM spend computation on the more informative examples with little loss of accuracy.
Exclusive Feature Bundling (EFB): For sparse datasets, many features are mutually exclusive (never take non-zero values simultaneously). LightGBM bundles these features together, treating them as a single feature. This reduces memory usage and speeds up training.
Gradient Boosting Framework: Like other boosting algorithms, LightGBM builds trees sequentially, with each new tree correcting the errors of the existing ensemble.
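To make the GOSS idea concrete, here is a simplified sketch in plain Python. It is not LightGBM's internal implementation. The parameter names `top_rate` and `other_rate` are borrowed from LightGBM's options, while the function name and the returned dictionary are illustrative choices.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Pick training instances GOSS-style and return {index: weight}.

    Keeps the top_rate fraction with the largest |gradient| (weight 1.0) and
    randomly samples an other_rate fraction (of the full set) from the rest,
    weighting those by (1 - top_rate) / other_rate.
    """
    n = len(gradients)
    # Indices sorted by gradient magnitude, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = int(top_rate * n)
    n_other = int(other_rate * n)
    top, rest = order[:n_top], order[n_top:]
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_other, len(rest)))
    # Up-weight the sampled small-gradient instances to compensate for the ones dropped.
    small_weight = (1.0 - top_rate) / other_rate
    weights = {i: 1.0 for i in top}
    weights.update({i: small_weight for i in sampled})
    return weights
```

With 100 instances and the default rates, 20 large-gradient instances are kept at weight 1 and 10 of the remaining 80 are sampled at weight 8, so only 30 instances are used to evaluate splits for that tree.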
LightGBM is particularly well-suited for your solver selection task because:
When properly tuned, LightGBM can often achieve better performance than neural networks for tabular data, especially with the right hyperparameters and sufficient boosting rounds.