Neural Networks in Java
Building and training MLP, CNN, and RNN networks with SuperML
SuperML Java 2.1.0 provides comprehensive support for neural networks including Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). This tutorial covers how to build, train, and deploy neural networks using Java with enterprise-grade performance.
What You'll Learn
- Multi-Layer Perceptron (MLP) - Deep feedforward networks for tabular data
- Convolutional Neural Networks (CNN) - Image processing and computer vision
- Recurrent Neural Networks (RNN) - Sequence processing with LSTM cells
- Neural Network Preprocessing - Specialized data preparation techniques
- Model Persistence - Saving and loading trained neural networks
- Performance Optimization - Real-time training and inference
- Enterprise Deployment - Production-ready neural network systems
Prerequisites
- Completion of "Introduction to SuperML Java" and "Java ML Setup"
- Basic understanding of linear algebra and calculus
- Familiarity with neural network concepts
- Java development environment with SuperML Java 2.1.0
Neural Network Architecture Overview
SuperML Java 2.1.0 provides three main types of neural networks:
import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
// Multi-Layer Perceptron
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(64, 32, 16)
.setActivation("relu")
.setLearningRate(0.01);
// Convolutional Neural Network
CNNClassifier cnn = new CNNClassifier()
.setInputShape(28, 28, 1)
.setLearningRate(0.001);
// Recurrent Neural Network
RNNClassifier rnn = new RNNClassifier()
.setHiddenSize(64)
.setCellType("LSTM")
.setNumLayers(2);
Multi-Layer Perceptron (MLP)
Basic MLP Implementation
import org.superml.neural.MLPClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
import org.superml.datasets.Datasets;
import org.superml.model_selection.ModelSelection;
import org.superml.metrics.Metrics;
public class MLPExample {
public static void main(String[] args) {
System.out.println("=== SuperML 2.1.0 - MLP Neural Network ===\n");
try {
// Load dataset
var dataset = Datasets.loadIris();
var split = ModelSelection.trainTestSplit(dataset.X, dataset.y, 0.2, 42);
// Apply MLP preprocessing
NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
NeuralNetworkPreprocessor.NetworkType.MLP).configureMLP();
double[][] XTrainProcessed = preprocessor.preprocessMLP(split.XTrain);
double[][] XTestProcessed = preprocessor.preprocessMLP(split.XTest);
System.out.println("Applied MLP preprocessing: standardization + outlier clipping");
// Create MLP with multiple hidden layers
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(128, 64, 32) // 3 hidden layers
.setActivation("relu") // ReLU activation
.setLearningRate(0.01) // Learning rate
.setMaxIter(200) // Maximum epochs
.setBatchSize(32) // Mini-batch size
.setEarlyStoppingPatience(10) // Early stopping
.setValidationFraction(0.2); // Validation split
System.out.println("Training MLP with architecture: 4 → 128 → 64 → 32 → 3");
// Train the model
long startTime = System.currentTimeMillis();
mlp.fit(XTrainProcessed, split.yTrain);
long trainingTime = System.currentTimeMillis() - startTime;
// Make predictions
double[] predictions = mlp.predict(XTestProcessed);
// Evaluate performance
double accuracy = Metrics.accuracy(split.yTest, predictions);
double precision = Metrics.precision(split.yTest, predictions);
double recall = Metrics.recall(split.yTest, predictions);
double f1 = Metrics.f1Score(split.yTest, predictions);
System.out.println("\n=== MLP Results ===");
System.out.println("Training time: " + trainingTime + " ms");
System.out.println("Accuracy: " + String.format("%.4f", accuracy));
System.out.println("Precision: " + String.format("%.4f", precision));
System.out.println("Recall: " + String.format("%.4f", recall));
System.out.println("F1 Score: " + String.format("%.4f", f1));
// Display training history
double[] trainingLoss = mlp.getTrainingHistory().getLoss();
double[] validationLoss = mlp.getTrainingHistory().getValidationLoss();
System.out.println("\nTraining History (last 5 epochs):");
for (int i = Math.max(0, trainingLoss.length - 5); i < trainingLoss.length; i++) {
System.out.printf("Epoch %d: Train Loss: %.4f, Val Loss: %.4f\n",
i + 1, trainingLoss[i], validationLoss[i]);
}
System.out.println("\nMLP training completed successfully!");
} catch (Exception e) {
System.err.println("Error in MLP training: " + e.getMessage());
e.printStackTrace();
}
}
}
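The example above reports "standardization + outlier clipping" as the MLP preprocessing. What `NeuralNetworkPreprocessor` does internally is not shown here, so the following is only a minimal sketch of that idea in plain Java. The class `StandardizeAndClip` and its clip threshold are hypothetical, not part of SuperML. The point it illustrates is that the mean and standard deviation are learned from the training rows and then reused for the test rows.

```java
import java.util.Arrays;

// Hypothetical helper, not part of SuperML: z-score standardization with clipping.
final class StandardizeAndClip {
    private double[] mean;
    private double[] std;
    private final double clip; // z-scores are limited to [-clip, +clip]

    StandardizeAndClip(double clip) {
        this.clip = clip;
    }

    // Learn per-feature mean and standard deviation from the training rows only.
    void fit(double[][] train) {
        int features = train[0].length;
        mean = new double[features];
        std = new double[features];
        for (double[] row : train) {
            for (int j = 0; j < features; j++) mean[j] += row[j];
        }
        for (int j = 0; j < features; j++) mean[j] /= train.length;
        for (double[] row : train) {
            for (int j = 0; j < features; j++) {
                double d = row[j] - mean[j];
                std[j] += d * d;
            }
        }
        for (int j = 0; j < features; j++) {
            std[j] = Math.sqrt(std[j] / train.length);
            if (std[j] == 0.0) std[j] = 1.0; // constant feature: avoid division by zero
        }
    }

    // Apply the training statistics to any data set (train or test).
    double[][] transform(double[][] data) {
        double[][] out = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            out[i] = new double[data[i].length];
            for (int j = 0; j < data[i].length; j++) {
                double z = (data[i][j] - mean[j]) / std[j];
                out[i][j] = Math.max(-clip, Math.min(clip, z));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] train = {{1.0, 10.0}, {2.0, 10.0}, {3.0, 10.0}};
        double[][] test = {{2.0, 10.0}, {100.0, 10.0}};
        StandardizeAndClip scaler = new StandardizeAndClip(3.0);
        scaler.fit(train);
        // The outlier 100.0 is far above the training mean, so its z-score is clipped to 3.0.
        System.out.println(Arrays.deepToString(scaler.transform(test)));
    }
}
```

Fitting on the training split alone keeps information from the test split out of the scaling constants.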
Advanced MLP Configuration
import org.superml.neural.MLPClassifier;
import org.superml.neural.optimizers.Adam;
import org.superml.neural.regularizers.L2Regularizer;
public class AdvancedMLPExample {
public static void main(String[] args) {
try {
// Advanced MLP with custom configuration
MLPClassifier advancedMLP = new MLPClassifier()
.setHiddenLayerSizes(256, 128, 64, 32)
.setActivation("relu")
.setOutputActivation("softmax")
.setLearningRate(0.001)
.setOptimizer(new Adam()
.setBeta1(0.9)
.setBeta2(0.999)
.setEpsilon(1e-8))
.setRegularizer(new L2Regularizer(0.01))
.setDropoutRate(0.2)
.setBatchSize(64)
.setMaxIter(300)
.setEarlyStoppingPatience(15)
.setValidationFraction(0.15)
.setShuffleBatches(true)
.setVerbose(true);
System.out.println("Advanced MLP Configuration:");
System.out.println("- Architecture: 256 → 128 → 64 → 32");
System.out.println("- Optimizer: Adam with β₁=0.9, β₂=0.999");
System.out.println("- Regularization: L2 with λ=0.01");
System.out.println("- Dropout: 20% during training");
System.out.println("- Early stopping with patience=15");
// Training would continue here...
} catch (Exception e) {
System.err.println("Error: " + e.getMessage());
}
}
}
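To show what the `beta1`, `beta2`, and `epsilon` settings above control, here is a minimal plain-Java sketch of the standard Adam update rule. The class `AdamSketch` is hypothetical and is not SuperML's `Adam` implementation, which may differ in detail.

```java
// Hypothetical sketch of the Adam update rule; not SuperML's implementation.
final class AdamSketch {
    private final double learningRate;
    private final double beta1;
    private final double beta2;
    private final double epsilon;
    private final double[] m; // running mean of gradients (first moment)
    private final double[] v; // running mean of squared gradients (second moment)
    private int step = 0;     // number of updates so far, used for bias correction

    AdamSketch(int size, double learningRate, double beta1, double beta2, double epsilon) {
        this.learningRate = learningRate;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.epsilon = epsilon;
        this.m = new double[size];
        this.v = new double[size];
    }

    // Applies one Adam update to params in place, given the loss gradient.
    void update(double[] params, double[] grad) {
        step++;
        double correction1 = 1.0 - Math.pow(beta1, step);
        double correction2 = 1.0 - Math.pow(beta2, step);
        for (int i = 0; i < params.length; i++) {
            m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];
            v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];
            double mHat = m[i] / correction1; // bias-corrected first moment
            double vHat = v[i] / correction2; // bias-corrected second moment
            params[i] -= learningRate * mHat / (Math.sqrt(vHat) + epsilon);
        }
    }

    public static void main(String[] args) {
        // Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
        double[] w = {0.0};
        AdamSketch adam = new AdamSketch(1, 0.001, 0.9, 0.999, 1e-8);
        for (int i = 0; i < 5; i++) {
            adam.update(w, new double[] {2.0 * (w[0] - 3.0)});
            System.out.printf("step %d: w = %.6f%n", i + 1, w[0]);
        }
    }
}
```

On the first update the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly one learning rate whatever the gradient's scale.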
Convolutional Neural Networks (CNN)
CNN for Image Classification
import org.superml.neural.CNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
public class CNNExample {
public static void main(String[] args) {
System.out.println("=== SuperML 2.1.0 - CNN for Image Classification ===\n");
try {
// Generate synthetic image data (28x28 grayscale images)
double[][] imageData = generateImageData(1000, 28, 28);
double[] labels = generateImageLabels(1000);
// Split data
var split = splitImageData(imageData, labels, 0.8);
// Apply CNN preprocessing
NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
NeuralNetworkPreprocessor.NetworkType.CNN).configureCNN(28, 28, 1);
double[][] XTrainProcessed = preprocessor.preprocessCNN(split.XTrain);
double[][] XTestProcessed = preprocessor.preprocessCNN(split.XTest);
System.out.println("Applied CNN preprocessing: pixel normalization to [-1,1]");
System.out.println("Training samples: " + XTrainProcessed.length);
System.out.println("Test samples: " + XTestProcessed.length);
// Create CNN with multiple layers
CNNClassifier cnn = new CNNClassifier()
.setInputShape(28, 28, 1) // 28x28 grayscale images
.addConvLayer(32, 3, 3, "relu") // 32 filters, 3x3 kernel
.addMaxPoolLayer(2, 2) // 2x2 max pooling
.addConvLayer(64, 3, 3, "relu") // 64 filters, 3x3 kernel
.addMaxPoolLayer(2, 2) // 2x2 max pooling
.addConvLayer(128, 3, 3, "relu") // 128 filters, 3x3 kernel
.addGlobalAveragePoolLayer() // Global average pooling
.addDenseLayer(128, "relu") // Dense layer with 128 units
.addDropoutLayer(0.5) // Dropout for regularization
.addDenseLayer(3, "softmax") // Output layer (3 classes)
.setLearningRate(0.001) // Learning rate
.setMaxEpochs(100) // Training epochs
.setBatchSize(32) // Batch size
.setEarlyStoppingPatience(10); // Early stopping
System.out.println("CNN Architecture:");
System.out.println("- Input: 28×28×1");
System.out.println("- Conv2D(32) → MaxPool → Conv2D(64) → MaxPool → Conv2D(128)");
System.out.println("- GlobalAvgPool → Dense(128) → Dropout(0.5) → Dense(3)");
// Train the CNN
long startTime = System.currentTimeMillis();
cnn.fit(XTrainProcessed, split.yTrain);
long trainingTime = System.currentTimeMillis() - startTime;
// Make predictions
double[] predictions = cnn.predict(XTestProcessed);
// Evaluate performance
double accuracy = calculateAccuracy(split.yTest, predictions);
System.out.println("\n=== CNN Results ===");
System.out.println("Training time: " + trainingTime + " ms");
System.out.println("Test accuracy: " + String.format("%.4f", accuracy));
// Display feature map sizes (assuming unpadded 3×3 convolutions with stride 1)
System.out.println("\nFeature Maps:");
System.out.println("- Conv Layer 1: 32 feature maps (26×26)");
System.out.println("- Conv Layer 2: 64 feature maps (11×11)");
System.out.println("- Conv Layer 3: 128 feature maps (3×3)");
System.out.println("\nCNN training completed successfully!");
} catch (Exception e) {
System.err.println("Error in CNN training: " + e.getMessage());
e.printStackTrace();
}
}
private static double[][] generateImageData(int samples, int height, int width) {
double[][] data = new double[samples][height * width];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
for (int j = 0; j < height * width; j++) {
data[i][j] = random.nextDouble(); // Pixel values 0-1
}
}
return data;
}
private static double[] generateImageLabels(int samples) {
double[] labels = new double[samples];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
labels[i] = random.nextInt(3); // 3 classes
}
return labels;
}
private static DataSplit splitImageData(double[][] X, double[] y, double trainRatio) {
int trainSize = (int) (X.length * trainRatio);
double[][] XTrain = new double[trainSize][];
double[][] XTest = new double[X.length - trainSize][];
double[] yTrain = new double[trainSize];
double[] yTest = new double[X.length - trainSize];
System.arraycopy(X, 0, XTrain, 0, trainSize);
System.arraycopy(X, trainSize, XTest, 0, X.length - trainSize);
System.arraycopy(y, 0, yTrain, 0, trainSize);
System.arraycopy(y, trainSize, yTest, 0, X.length - trainSize);
return new DataSplit(XTrain, XTest, yTrain, yTest);
}
private static class DataSplit {
final double[][] XTrain, XTest;
final double[] yTrain, yTest;
DataSplit(double[][] XTrain, double[][] XTest, double[] yTrain, double[] yTest) {
this.XTrain = XTrain;
this.XTest = XTest;
this.yTrain = yTrain;
this.yTest = yTest;
}
}
private static double calculateAccuracy(double[] actual, double[] predicted) {
int correct = 0;
for (int i = 0; i < actual.length; i++) {
if (Math.round(predicted[i]) == Math.round(actual[i])) {
correct++;
}
}
return (double) correct / actual.length;
}
}
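The feature-map sizes for this architecture can be checked with the usual output-size formula, floor((input + 2·padding − kernel) / stride) + 1. The sketch below assumes unpadded 3×3 convolutions with stride 1 and 2×2 pooling with stride 2. Whether `CNNClassifier` uses these defaults is an assumption, and `ConvShapeSketch` is a hypothetical helper, not a SuperML class.

```java
import java.util.Arrays;

// Hypothetical helper for checking layer output sizes; not part of SuperML.
final class ConvShapeSketch {
    // Output length along one spatial dimension for a convolution or pooling window.
    static int outputSize(int input, int kernel, int stride, int padding) {
        return (input + 2 * padding - kernel) / stride + 1;
    }

    // Sizes after conv1, pool1, conv2, pool2, conv3 for a square input.
    static int[] trace(int input) {
        int conv1 = outputSize(input, 3, 1, 0);
        int pool1 = outputSize(conv1, 2, 2, 0);
        int conv2 = outputSize(pool1, 3, 1, 0);
        int pool2 = outputSize(conv2, 2, 2, 0);
        int conv3 = outputSize(pool2, 3, 1, 0);
        return new int[] {conv1, pool1, conv2, pool2, conv3};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(trace(28)));
        // prints [26, 13, 11, 5, 3]
    }
}
```

Under these assumptions a 28×28 input yields 26×26, 11×11, and 3×3 feature maps after the three convolution layers.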
Recurrent Neural Networks (RNN)
RNN for Sequence Processing
import org.superml.neural.RNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
public class RNNExample {
public static void main(String[] args) {
System.out.println("=== SuperML 2.1.0 - RNN for Sequence Processing ===\n");
try {
// Generate synthetic sequence data
double[][] sequenceData = generateSequenceData(800, 30, 8);
double[] labels = generateSequenceLabels(800);
// Split data
var split = splitSequenceData(sequenceData, labels, 0.8);
// Apply RNN preprocessing
NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
NeuralNetworkPreprocessor.NetworkType.RNN).configureRNN(30, 8, false);
double[][] XTrainProcessed = preprocessor.preprocessRNN(split.XTrain);
double[][] XTestProcessed = preprocessor.preprocessRNN(split.XTest);
System.out.println("Applied RNN preprocessing: global scaling + temporal smoothing");
System.out.println("Sequence length: 30 timesteps");
System.out.println("Features per timestep: 8");
System.out.println("Training sequences: " + XTrainProcessed.length);
// Create RNN with LSTM cells
RNNClassifier rnn = new RNNClassifier()
.setHiddenSize(64) // LSTM hidden units
.setNumLayers(2) // 2 LSTM layers
.setCellType("LSTM") // LSTM cells
.setDropoutRate(0.2) // Dropout between layers
.setBidirectional(false) // Unidirectional
.setSequenceLength(30) // Input sequence length
.setInputSize(8) // Features per timestep
.setOutputSize(3) // Number of classes
.setLearningRate(0.01) // Learning rate
.setMaxEpochs(100) // Training epochs
.setBatchSize(32) // Batch size
.setEarlyStoppingPatience(15) // Early stopping
.setGradientClipping(1.0); // Gradient clipping
System.out.println("RNN Architecture:");
System.out.println("- Input: 30 timesteps × 8 features");
System.out.println("- LSTM Layer 1: 64 hidden units");
System.out.println("- LSTM Layer 2: 64 hidden units");
System.out.println("- Dropout: 20% between layers");
System.out.println("- Output: 3 classes");
// Train the RNN
long startTime = System.currentTimeMillis();
rnn.fit(XTrainProcessed, split.yTrain);
long trainingTime = System.currentTimeMillis() - startTime;
// Make predictions
double[] predictions = rnn.predict(XTestProcessed);
// Evaluate performance
double accuracy = calculateAccuracy(split.yTest, predictions);
System.out.println("\n=== RNN Results ===");
System.out.println("Training time: " + trainingTime + " ms");
System.out.println("Test accuracy: " + String.format("%.4f", accuracy));
// Display sequence processing info
System.out.println("\nSequence Processing:");
System.out.println("- Total parameters: ~" + rnn.getParameterCount());
System.out.println("- Memory cells: " + (rnn.getHiddenSize() * rnn.getNumLayers()));
System.out.println("- Gradient clipping: " + rnn.getGradientClipping());
System.out.println("\nRNN training completed successfully!");
} catch (Exception e) {
System.err.println("Error in RNN training: " + e.getMessage());
e.printStackTrace();
}
}
private static double[][] generateSequenceData(int samples, int sequenceLength, int features) {
double[][] data = new double[samples][sequenceLength * features];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
for (int t = 0; t < sequenceLength; t++) {
for (int f = 0; f < features; f++) {
int idx = t * features + f;
// Create time-dependent patterns
data[i][idx] = Math.sin(t * 0.1 + f) + random.nextGaussian() * 0.1;
}
}
}
return data;
}
private static double[] generateSequenceLabels(int samples) {
double[] labels = new double[samples];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
labels[i] = random.nextInt(3); // 3 classes
}
return labels;
}
private static DataSplit splitSequenceData(double[][] X, double[] y, double trainRatio) {
int trainSize = (int) (X.length * trainRatio);
double[][] XTrain = new double[trainSize][];
double[][] XTest = new double[X.length - trainSize][];
double[] yTrain = new double[trainSize];
double[] yTest = new double[X.length - trainSize];
System.arraycopy(X, 0, XTrain, 0, trainSize);
System.arraycopy(X, trainSize, XTest, 0, X.length - trainSize);
System.arraycopy(y, 0, yTrain, 0, trainSize);
System.arraycopy(y, trainSize, yTest, 0, X.length - trainSize);
return new DataSplit(XTrain, XTest, yTrain, yTest);
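}
// Helpers repeated from CNNExample so that this class compiles on its own
private static class DataSplit {
final double[][] XTrain, XTest;
final double[] yTrain, yTest;
DataSplit(double[][] XTrain, double[][] XTest, double[] yTrain, double[] yTest) {
this.XTrain = XTrain;
this.XTest = XTest;
this.yTrain = yTrain;
this.yTest = yTest;
}
}
private static double calculateAccuracy(double[] actual, double[] predicted) {
int correct = 0;
for (int i = 0; i < actual.length; i++) {
if (Math.round(predicted[i]) == Math.round(actual[i])) {
correct++;
}
}
return (double) correct / actual.length;
}
}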
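The RNN example prints `getParameterCount()` without showing where the number comes from. As a rough cross-check, the sketch below counts parameters for a stacked LSTM using the textbook layout: four gates per layer, each with input weights, recurrent weights, and a single bias vector, followed by a dense output layer. This layout is an assumption. SuperML's `RNNClassifier` may organize its weights differently, so its reported count need not match. `LstmParameterCountSketch` is a hypothetical helper.

```java
// Hypothetical parameter-count estimate; SuperML's internal layout may differ.
final class LstmParameterCountSketch {
    // Four gates, each with input weights, recurrent weights, and one bias vector.
    static long lstmLayerParams(int inputSize, int hiddenSize) {
        return 4L * ((long) hiddenSize * (inputSize + hiddenSize) + hiddenSize);
    }

    // Fully connected layer: weight matrix plus bias.
    static long denseParams(int inputSize, int outputSize) {
        return (long) inputSize * outputSize + outputSize;
    }

    // Stacked LSTM layers followed by a dense classification layer.
    static long total(int inputSize, int hiddenSize, int numLayers, int outputSize) {
        long total = 0;
        int layerInput = inputSize;
        for (int layer = 0; layer < numLayers; layer++) {
            total += lstmLayerParams(layerInput, hiddenSize);
            layerInput = hiddenSize; // deeper layers read the previous layer's hidden state
        }
        return total + denseParams(hiddenSize, outputSize);
    }

    public static void main(String[] args) {
        // 8 features per timestep, 64 hidden units, 2 layers, 3 classes
        System.out.println(total(8, 64, 2, 3)); // prints 51907
    }
}
```

The sequence length (30) does not appear in the count because the same weights are reused at every timestep.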
Model Persistence and Deployment
Saving and Loading Neural Networks
import org.superml.persistence.ModelPersistence;
import org.superml.neural.MLPClassifier;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
public class NeuralNetworkPersistence {
public static void main(String[] args) {
try {
// Train a neural network
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(128, 64, 32)
.setActivation("relu")
.setLearningRate(0.01)
.setMaxIter(100);
// Assume training data is available
// mlp.fit(XTrain, yTrain);
// Save model with metadata
Map<String, Object> metadata = new HashMap<>();
metadata.put("model_type", "MLPClassifier");
metadata.put("architecture", "128-64-32");
metadata.put("activation", "relu");
metadata.put("training_samples", 1000);
metadata.put("accuracy", 0.95);
metadata.put("created_date", new Date().toString());
String modelPath = "models/neural_network_model.superml";
ModelPersistence.save(mlp, modelPath, "Production MLP Model", metadata);
System.out.println("Model saved to: " + modelPath);
// Load model for inference
MLPClassifier loadedModel = ModelPersistence.load(modelPath, MLPClassifier.class);
System.out.println("Model loaded successfully");
System.out.println("Architecture: " + Arrays.toString(loadedModel.getHiddenLayerSizes()));
System.out.println("Activation: " + loadedModel.getActivation());
// Use loaded model for predictions
// double[] predictions = loadedModel.predict(XTest);
} catch (Exception e) {
System.err.println("Error in model persistence: " + e.getMessage());
e.printStackTrace();
}
}
}
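The example stores its metadata inside the model file through `ModelPersistence.save`. If the same key/value pairs should also be readable without loading the model, one option is a plain-text sidecar file. The sketch below uses only `java.util.Properties` from the JDK. The class `MetadataSidecarSketch` is hypothetical, and this is not how SuperML stores metadata internally.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical sidecar-file helper; independent of SuperML's ModelPersistence.
final class MetadataSidecarSketch {
    // Writes the metadata as a human-readable .properties file.
    static void write(Path file, Properties metadata) {
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            metadata.store(out, "model metadata");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the metadata back without touching the model file itself.
    static Properties read(Path file) {
        Properties metadata = new Properties();
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            metadata.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return metadata;
    }

    // Writes to a temporary file, reads it back, and cleans up.
    static Properties roundTrip(Properties metadata) {
        try {
            Path file = Files.createTempFile("model-metadata", ".properties");
            try {
                write(file, metadata);
                return read(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Properties metadata = new Properties();
        metadata.setProperty("model_type", "MLPClassifier");
        metadata.setProperty("architecture", "128-64-32");
        System.out.println(roundTrip(metadata).getProperty("architecture"));
    }
}
```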
Advanced Neural Network Features
Ensemble of Neural Networks
import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
import org.superml.ensemble.NeuralNetworkEnsemble;
// Named differently from the imported NeuralNetworkEnsemble to avoid a name clash
public class NeuralNetworkEnsembleExample {
public static void main(String[] args) {
try {
// Create ensemble of different neural networks
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(128, 64)
.setActivation("relu");
CNNClassifier cnn = new CNNClassifier()
.setInputShape(28, 28, 1)
.addConvLayer(32, 3, 3, "relu")
.addMaxPoolLayer(2, 2)
.addDenseLayer(64, "relu");
RNNClassifier rnn = new RNNClassifier()
.setHiddenSize(64)
.setCellType("LSTM")
.setNumLayers(2);
// Create ensemble
NeuralNetworkEnsemble ensemble = new NeuralNetworkEnsemble()
.addModel("mlp", mlp, 0.4) // 40% weight
.addModel("cnn", cnn, 0.35) // 35% weight
.addModel("rnn", rnn, 0.25) // 25% weight
.setVotingStrategy("weighted_average");
System.out.println("Neural Network Ensemble:");
System.out.println("- MLP: 40% weight");
System.out.println("- CNN: 35% weight");
System.out.println("- RNN: 25% weight");
System.out.println("- Strategy: Weighted Average");
// Train ensemble (each model on preprocessed data)
// ensemble.fit(XTrain, yTrain);
// Make ensemble predictions
// double[] predictions = ensemble.predict(XTest);
} catch (Exception e) {
System.err.println("Error in ensemble: " + e.getMessage());
}
}
}
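A "weighted_average" voting strategy is commonly understood as averaging each model's class probabilities with the given weights and picking the class with the highest combined score. The sketch below shows that computation in plain Java with the 0.4 / 0.35 / 0.25 weights from the example. The class `WeightedAverageSketch` and the probability values are made up for illustration, and whether SuperML's ensemble combines predictions exactly this way is an assumption.

```java
import java.util.Arrays;

// Hypothetical weighted-average voting; not SuperML's ensemble implementation.
final class WeightedAverageSketch {
    // probs[m][c] is model m's probability for class c; weights[m] is that model's weight.
    static double[] combine(double[][] probs, double[] weights) {
        double totalWeight = 0.0;
        for (double w : weights) totalWeight += w;
        double[] combined = new double[probs[0].length];
        for (int m = 0; m < probs.length; m++) {
            for (int c = 0; c < combined.length; c++) {
                combined[c] += weights[m] * probs[m][c] / totalWeight;
            }
        }
        return combined;
    }

    // Index of the largest value (first one wins on ties).
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] probs = {
            {0.7, 0.2, 0.1}, // illustrative MLP output
            {0.1, 0.8, 0.1}, // illustrative CNN output
            {0.2, 0.5, 0.3}  // illustrative RNN output
        };
        double[] weights = {0.4, 0.35, 0.25};
        double[] combined = combine(probs, weights);
        System.out.println(Arrays.toString(combined));
        System.out.println("Predicted class: " + argMax(combined));
    }
}
```

Dividing by the total weight keeps the combined scores a valid probability distribution even if the weights do not sum to exactly 1.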
Performance Optimization
Neural Network Performance Tips
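import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
public class NeuralNetworkOptimization {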
public static void main(String[] args) {
System.out.println("=== Neural Network Performance Optimization ===\n");
// 1. Batch Processing for High Throughput
MLPClassifier optimizedMLP = new MLPClassifier()
.setHiddenLayerSizes(256, 128, 64)
.setBatchSize(128) // Larger batch size
.setMaxIter(50) // Fewer epochs
.setEarlyStoppingPatience(5) // Early stopping
.setParallelTraining(true) // Parallel processing
.setGPUAcceleration(true); // GPU acceleration if available
// 2. Memory-Efficient Training
CNNClassifier memoryEfficientCNN = new CNNClassifier()
.setInputShape(224, 224, 3)
.setBatchSize(16) // Smaller batch for large images
.setGradientAccumulation(4) // Accumulate gradients
.setMixedPrecision(true) // Use FP16 for memory efficiency
.setMemoryOptimization(true);
// 3. Fast Inference Configuration
RNNClassifier fastRNN = new RNNClassifier()
.setHiddenSize(32) // Smaller hidden size
.setNumLayers(1) // Single layer
.setCellType("GRU") // Faster than LSTM
.setInferenceOptimization(true) // Optimize for inference
.setBatchInference(true); // Batch predictions
System.out.println("Performance Optimizations:");
System.out.println("- Batch processing: 128 samples/batch");
System.out.println("- Early stopping: Prevent overfitting");
System.out.println("- GPU acceleration: When available");
System.out.println("- Memory optimization: FP16 precision");
System.out.println("- Inference optimization: Fast predictions");
// 4. Monitoring and Profiling
long startTime = System.currentTimeMillis();
// Training code here...
long endTime = System.currentTimeMillis();
System.out.println("\nTraining completed in: " + (endTime - startTime) + " ms");
// Memory usage
Runtime runtime = Runtime.getRuntime();
long memoryUsed = runtime.totalMemory() - runtime.freeMemory();
System.out.println("Memory used: " + (memoryUsed / 1024 / 1024) + " MB");
}
}
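The memory-efficient configuration above pairs a batch size of 16 with `setGradientAccumulation(4)`. The general idea behind gradient accumulation is that averaging the gradients of several small batches gives the same update direction as one large batch, while only one small batch is held in memory at a time. The sketch below demonstrates this for a one-parameter least-squares model. `GradientAccumulationSketch` is a hypothetical class, and how SuperML implements accumulation is not shown in this tutorial.

```java
// Hypothetical demonstration of gradient accumulation; not SuperML code.
final class GradientAccumulationSketch {
    // Mean gradient of 0.5 * (w * x - y)^2 with respect to w over samples [from, to).
    static double meanGradient(double w, double[] x, double[] y, int from, int to) {
        double sum = 0.0;
        for (int i = from; i < to; i++) {
            sum += (w * x[i] - y[i]) * x[i];
        }
        return sum / (to - from);
    }

    // Accumulates micro-batch gradients, scaling each by 1 / (number of micro-batches).
    // Assumes x.length is divisible by microBatchSize.
    static double accumulatedGradient(double w, double[] x, double[] y, int microBatchSize) {
        int steps = x.length / microBatchSize;
        double accumulated = 0.0;
        for (int s = 0; s < steps; s++) {
            int from = s * microBatchSize;
            accumulated += meanGradient(w, x, y, from, from + microBatchSize) / steps;
        }
        return accumulated;
    }

    public static void main(String[] args) {
        double[] x = new double[64];
        double[] y = new double[64];
        for (int i = 0; i < 64; i++) {
            x[i] = i * 0.1;
            y[i] = 2.0 * x[i] + 1.0;
        }
        double full = meanGradient(0.5, x, y, 0, 64);        // one batch of 64
        double accumulated = accumulatedGradient(0.5, x, y, 16); // 4 micro-batches of 16
        System.out.println("Full batch:  " + full);
        System.out.println("Accumulated: " + accumulated);
    }
}
```

The two values agree up to floating-point rounding, so a batch of 16 accumulated over 4 steps behaves like an effective batch of 64.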
Best Practices
1. Data Preparation
- Normalization: Always normalize input data for neural networks
- Augmentation: Use data augmentation for image data
- Sequence Padding: Ensure consistent sequence lengths for RNNs
- Validation Split: Reserve data for validation during training
2. Architecture Design
- Start Simple: Begin with smaller networks and increase complexity
- Regularization: Use dropout and weight decay to prevent overfitting
- Batch Normalization: Improve training stability and speed
- Residual Connections: For very deep networks
3. Training Strategies
- Learning Rate Scheduling: Decrease learning rate during training
- Early Stopping: Monitor validation loss to prevent overfitting
- Gradient Clipping: Prevent exploding gradients in RNNs
- Checkpointing: Save model state during training
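Early stopping with a patience value, as configured through `setEarlyStoppingPatience` in the examples, can be sketched as follows: training stops once the validation loss has failed to improve for `patience` consecutive epochs. The class `EarlyStoppingSketch` is hypothetical, and SuperML's exact stopping criterion (for example, whether it requires a minimum improvement) is an assumption.

```java
// Hypothetical early-stopping rule; SuperML's criterion may differ in detail.
final class EarlyStoppingSketch {
    // Returns how many epochs run before stopping, given per-epoch validation losses.
    static int epochsRun(double[] validationLoss, int patience) {
        double best = Double.POSITIVE_INFINITY;
        int epochsSinceBest = 0;
        for (int epoch = 0; epoch < validationLoss.length; epoch++) {
            if (validationLoss[epoch] < best) {
                best = validationLoss[epoch]; // new best: reset the counter
                epochsSinceBest = 0;
            } else {
                epochsSinceBest++;
                if (epochsSinceBest >= patience) {
                    return epoch + 1; // stop after this epoch
                }
            }
        }
        return validationLoss.length; // patience never exhausted
    }

    public static void main(String[] args) {
        double[] losses = {0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50};
        System.out.println(epochsRun(losses, 3)); // prints 6
        System.out.println(epochsRun(losses, 5)); // prints 7
    }
}
```

With a patience of 3 the run stops before the late improvement at epoch 7, which illustrates the trade-off: a small patience saves time but can stop too early on noisy validation curves.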
4. Production Deployment
- Model Compression: Use pruning and quantization
- Batch Inference: Process multiple samples together
- Model Caching: Cache frequently used models
- Performance Monitoring: Track inference time and accuracy
Troubleshooting Common Issues
Training Problems
// Problem: Vanishing gradients
// Solution: Use ReLU activation and proper weight initialization
MLPClassifier mlp = new MLPClassifier()
.setActivation("relu")
.setWeightInitialization("xavier")
.setGradientClipping(1.0);
// Problem: Exploding gradients
// Solution: Gradient clipping
RNNClassifier rnn = new RNNClassifier()
.setGradientClipping(1.0)
.setLearningRate(0.001);
// Problem: Overfitting
// Solution: Regularization and dropout
CNNClassifier cnn = new CNNClassifier()
.addDropoutLayer(0.5)
.setL2Regularization(0.01)
.setEarlyStoppingPatience(10);
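For the exploding-gradient case, one common form of gradient clipping rescales the whole gradient vector whenever its L2 norm exceeds a threshold. The sketch below shows that variant in plain Java. Whether `setGradientClipping(1.0)` clips by norm or by individual value is not stated in this tutorial, so treat this as an illustration of the technique rather than of SuperML's behavior. `GradientClippingSketch` is a hypothetical class.

```java
import java.util.Arrays;

// Hypothetical clip-by-norm helper; not SuperML's implementation.
final class GradientClippingSketch {
    // Scales grad in place so its L2 norm is at most maxNorm; returns the original norm.
    static double clipByNorm(double[] grad, double maxNorm) {
        double sumOfSquares = 0.0;
        for (double g : grad) sumOfSquares += g * g;
        double norm = Math.sqrt(sumOfSquares);
        if (norm > maxNorm) {
            double scale = maxNorm / norm;
            for (int i = 0; i < grad.length; i++) grad[i] *= scale;
        }
        return norm;
    }

    public static void main(String[] args) {
        double[] grad = {3.0, 4.0}; // L2 norm is 5.0
        double originalNorm = clipByNorm(grad, 1.0);
        System.out.println("Original norm: " + originalNorm);
        System.out.println("Clipped gradient: " + Arrays.toString(grad));
    }
}
```

Clipping by norm preserves the gradient's direction and only limits the step size, which is why it is often preferred for recurrent networks.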
Summary
In this tutorial, you learned:
- MLP Implementation: Multi-layer perceptrons for tabular data
- CNN Architecture: Convolutional networks for image processing
- RNN with LSTM: Recurrent networks for sequence data
- Preprocessing: Specialized data preparation for neural networks
- Model Persistence: Saving and loading trained models
- Performance Optimization: Tips for faster training and inference
- Best Practices: Guidelines for production deployment
Neural networks in SuperML Java 2.1.0 provide enterprise-grade performance with real-time training capabilities. The framework handles the complexity of neural network implementation while providing you with simple, intuitive APIs.
Next Steps
- Try XGBoost: Learn gradient boosting for tabular data
- Explore AutoML: Automated neural architecture search
- Model Deployment: Production deployment with inference engine
- Advanced Preprocessing: Feature engineering for neural networks
- Ensemble Methods: Combining multiple neural networks
You're now ready to build sophisticated neural network applications with SuperML Java 2.1.0!