Neural Networks in Java
SuperML Java 2.1.0 provides comprehensive support for neural networks including Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). This tutorial covers how to build, train, and deploy neural networks using Java with enterprise-grade performance.
What You’ll Learn
- Multi-Layer Perceptron (MLP) - Deep feedforward networks for tabular data
- Convolutional Neural Networks (CNN) - Image processing and computer vision
- Recurrent Neural Networks (RNN) - Sequence processing with LSTM cells
- Neural Network Preprocessing - Specialized data preparation techniques
- Model Persistence - Saving and loading trained neural networks
- Performance Optimization - Real-time training and inference
- Enterprise Deployment - Production-ready neural network systems
Prerequisites
- Completion of “Introduction to SuperML Java” and “Java ML Setup”
- Basic understanding of linear algebra and calculus
- Familiarity with neural network concepts
- Java development environment with SuperML Java 2.1.0
Neural Network Architecture Overview
SuperML Java 2.1.0 provides three main types of neural networks:
import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
// Multi-Layer Perceptron
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(64, 32, 16)
.setActivation("relu")
.setLearningRate(0.01);
// Convolutional Neural Network
CNNClassifier cnn = new CNNClassifier()
.setInputShape(28, 28, 1)
.setLearningRate(0.001);
// Recurrent Neural Network
RNNClassifier rnn = new RNNClassifier()
.setHiddenSize(64)
.setCellType("LSTM")
.setNumLayers(2);
Multi-Layer Perceptron (MLP)
Basic MLP Implementation
import org.superml.neural.MLPClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
import org.superml.datasets.Datasets;
import org.superml.model_selection.ModelSelection;
import org.superml.metrics.Metrics;
public class MLPExample {
public static void main(String[] args) {
System.out.println("=== SuperML 2.1.0 - MLP Neural Network ===\n");
try {
// Load dataset
var dataset = Datasets.loadIris();
var split = ModelSelection.trainTestSplit(dataset.X, dataset.y, 0.2, 42);
// Apply MLP preprocessing
NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
NeuralNetworkPreprocessor.NetworkType.MLP).configureMLP();
double[][] XTrainProcessed = preprocessor.preprocessMLP(split.XTrain);
double[][] XTestProcessed = preprocessor.preprocessMLP(split.XTest);
System.out.println("📊 Applied MLP preprocessing: standardization + outlier clipping");
// Create MLP with multiple hidden layers
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(128, 64, 32) // 3 hidden layers
.setActivation("relu") // ReLU activation
.setLearningRate(0.01) // Learning rate
.setMaxIter(200) // Maximum epochs
.setBatchSize(32) // Mini-batch size
.setEarlyStoppingPatience(10) // Early stopping
.setValidationFraction(0.2); // Validation split
System.out.println("🧠 Training MLP with architecture: 4 → 128 → 64 → 32 → 3");
// Train the model
long startTime = System.currentTimeMillis();
mlp.fit(XTrainProcessed, split.yTrain);
long trainingTime = System.currentTimeMillis() - startTime;
// Make predictions
double[] predictions = mlp.predict(XTestProcessed);
// Evaluate performance
double accuracy = Metrics.accuracy(split.yTest, predictions);
double precision = Metrics.precision(split.yTest, predictions);
double recall = Metrics.recall(split.yTest, predictions);
double f1 = Metrics.f1Score(split.yTest, predictions);
System.out.println("\n=== MLP Results ===");
System.out.println("Training time: " + trainingTime + " ms");
System.out.println("Accuracy: " + String.format("%.4f", accuracy));
System.out.println("Precision: " + String.format("%.4f", precision));
System.out.println("Recall: " + String.format("%.4f", recall));
System.out.println("F1 Score: " + String.format("%.4f", f1));
// Display training history
double[] trainingLoss = mlp.getTrainingHistory().getLoss();
double[] validationLoss = mlp.getTrainingHistory().getValidationLoss();
System.out.println("\n📈 Training History (last 5 epochs):");
for (int i = Math.max(0, trainingLoss.length - 5); i < trainingLoss.length; i++) {
System.out.printf("Epoch %d: Train Loss: %.4f, Val Loss: %.4f%n",
i + 1, trainingLoss[i], validationLoss[i]);
}
System.out.println("\n✅ MLP training completed successfully!");
} catch (Exception e) {
System.err.println("❌ Error in MLP training: " + e.getMessage());
e.printStackTrace();
}
}
}
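The log message in the example mentions "standardization + outlier clipping". The tutorial does not show how NeuralNetworkPreprocessor implements this, so the following is only a plain-Java sketch of the general technique, assuming z-score standardization followed by clipping to a fixed range. The class name StandardizeAndClip and the clip threshold of 3.0 are hypothetical and not part of SuperML.

```java
import java.util.Arrays;

public class StandardizeAndClip {
    private final double[] mean;
    private final double[] std;
    private final double clip; // z-scores are limited to [-clip, clip]

    private StandardizeAndClip(double[] mean, double[] std, double clip) {
        this.mean = mean;
        this.std = std;
        this.clip = clip;
    }

    /** Learns per-column mean and standard deviation from the given rows. */
    public static StandardizeAndClip fit(double[][] rows, double clip) {
        int n = rows.length;
        int d = rows[0].length;
        double[] mean = new double[d];
        double[] std = new double[d];
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= n;
        }
        for (double[] row : rows) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                std[j] += diff * diff;
            }
        }
        for (int j = 0; j < d; j++) {
            std[j] = Math.sqrt(std[j] / n);
            if (std[j] == 0.0) {
                std[j] = 1.0; // constant column: avoid dividing by zero
            }
        }
        return new StandardizeAndClip(mean, std, clip);
    }

    /** Converts each value to a z-score, then clips it to [-clip, clip]. */
    public double[][] transform(double[][] rows) {
        double[][] out = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            out[i] = new double[rows[i].length];
            for (int j = 0; j < rows[i].length; j++) {
                double z = (rows[i][j] - mean[j]) / std[j];
                out[i][j] = Math.max(-clip, Math.min(clip, z));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] train = {{1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0}};
        double[][] test = {{2.0, 100.0}}; // second feature is an outlier
        StandardizeAndClip scaler = StandardizeAndClip.fit(train, 3.0);
        System.out.println(Arrays.deepToString(scaler.transform(test)));
    }
}
```

The statistics are learned from the training rows only and then reused for the test rows, which keeps test-set information out of the scaling step.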
Advanced MLP Configuration
import org.superml.neural.MLPClassifier;
import org.superml.neural.optimizers.Adam;
import org.superml.neural.regularizers.L2Regularizer;
public class AdvancedMLPExample {
public static void main(String[] args) {
try {
// Advanced MLP with custom configuration
MLPClassifier advancedMLP = new MLPClassifier()
.setHiddenLayerSizes(256, 128, 64, 32)
.setActivation("relu")
.setOutputActivation("softmax")
.setLearningRate(0.001)
.setOptimizer(new Adam()
.setBeta1(0.9)
.setBeta2(0.999)
.setEpsilon(1e-8))
.setRegularizer(new L2Regularizer(0.01))
.setDropoutRate(0.2)
.setBatchSize(64)
.setMaxIter(300)
.setEarlyStoppingPatience(15)
.setValidationFraction(0.15)
.setShuffleBatches(true)
.setVerbose(true);
System.out.println("🚀 Advanced MLP Configuration:");
System.out.println("- Architecture: 256 → 128 → 64 → 32");
System.out.println("- Optimizer: Adam with β₁=0.9, β₂=0.999");
System.out.println("- Regularization: L2 with λ=0.01");
System.out.println("- Dropout: 20% during training");
System.out.println("- Early stopping with patience=15");
// Training would continue here...
} catch (Exception e) {
System.err.println("❌ Error: " + e.getMessage());
}
}
}
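To make the β₁, β₂, and ε settings above less abstract, here is a minimal plain-Java sketch of one Adam update as it is usually defined, with bias-corrected first and second moment estimates. It is independent of SuperML's Adam class, whose internals the tutorial does not show; the class name AdamStep is hypothetical.

```java
public class AdamStep {
    private final double lr;
    private final double beta1;
    private final double beta2;
    private final double eps;
    private final double[] m; // running mean of gradients (first moment)
    private final double[] v; // running mean of squared gradients (second moment)
    private int t = 0;        // number of updates applied so far

    public AdamStep(int size, double lr, double beta1, double beta2, double eps) {
        this.lr = lr;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.eps = eps;
        this.m = new double[size];
        this.v = new double[size];
    }

    /** Applies one Adam update to params in place, given the current gradient. */
    public void update(double[] params, double[] grad) {
        t++;
        double correction1 = 1.0 - Math.pow(beta1, t);
        double correction2 = 1.0 - Math.pow(beta2, t);
        for (int i = 0; i < params.length; i++) {
            m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i];
            v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i];
            double mHat = m[i] / correction1; // bias-corrected first moment
            double vHat = v[i] / correction2; // bias-corrected second moment
            params[i] -= lr * mHat / (Math.sqrt(vHat) + eps);
        }
    }

    public static void main(String[] args) {
        // Minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
        double[] w = {0.0};
        AdamStep adam = new AdamStep(1, 0.1, 0.9, 0.999, 1e-8);
        for (int step = 0; step < 500; step++) {
            adam.update(w, new double[] {2.0 * (w[0] - 3.0)});
        }
        System.out.println("w after 500 steps: " + w[0]);
    }
}
```

Because of the bias correction, the very first step has a size close to the learning rate regardless of the gradient's magnitude, which is one reason Adam is less sensitive to gradient scale than plain SGD.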
Convolutional Neural Networks (CNN)
CNN for Image Classification
import org.superml.neural.CNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
public class CNNExample {
public static void main(String[] args) {
System.out.println("=== SuperML 2.1.0 - CNN for Image Classification ===\n");
try {
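// Generate synthetic image data (28x28 grayscale images).
// Pixels and labels are both random, so test accuracy should stay near chance (about 1/3);
// the example demonstrates the API, not a learnable task.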
double[][] imageData = generateImageData(1000, 28, 28);
double[] labels = generateImageLabels(1000);
// Split data
var split = splitImageData(imageData, labels, 0.8);
// Apply CNN preprocessing
NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
NeuralNetworkPreprocessor.NetworkType.CNN).configureCNN(28, 28, 1);
double[][] XTrainProcessed = preprocessor.preprocessCNN(split.XTrain);
double[][] XTestProcessed = preprocessor.preprocessCNN(split.XTest);
System.out.println("🖼️ Applied CNN preprocessing: pixel normalization to [-1,1]");
System.out.println("📊 Training samples: " + XTrainProcessed.length);
System.out.println("📊 Test samples: " + XTestProcessed.length);
// Create CNN with multiple layers
CNNClassifier cnn = new CNNClassifier()
.setInputShape(28, 28, 1) // 28x28 grayscale images
.addConvLayer(32, 3, 3, "relu") // 32 filters, 3x3 kernel
.addMaxPoolLayer(2, 2) // 2x2 max pooling
.addConvLayer(64, 3, 3, "relu") // 64 filters, 3x3 kernel
.addMaxPoolLayer(2, 2) // 2x2 max pooling
.addConvLayer(128, 3, 3, "relu") // 128 filters, 3x3 kernel
.addGlobalAveragePoolLayer() // Global average pooling
.addDenseLayer(128, "relu") // Dense layer with 128 units
.addDropoutLayer(0.5) // Dropout for regularization
.addDenseLayer(3, "softmax") // Output layer (3 classes)
.setLearningRate(0.001) // Learning rate
.setMaxEpochs(100) // Training epochs
.setBatchSize(32) // Batch size
.setEarlyStoppingPatience(10); // Early stopping
System.out.println("🏗️ CNN Architecture:");
System.out.println("- Input: 28×28×1");
System.out.println("- Conv2D(32) → MaxPool → Conv2D(64) → MaxPool → Conv2D(128)");
System.out.println("- GlobalAvgPool → Dense(128) → Dropout(0.5) → Dense(3)");
// Train the CNN
long startTime = System.currentTimeMillis();
cnn.fit(XTrainProcessed, split.yTrain);
long trainingTime = System.currentTimeMillis() - startTime;
// Make predictions
double[] predictions = cnn.predict(XTestProcessed);
// Evaluate performance
double accuracy = calculateAccuracy(split.yTest, predictions);
System.out.println("\n=== CNN Results ===");
System.out.println("Training time: " + trainingTime + " ms");
System.out.println("Test accuracy: " + String.format("%.4f", accuracy));
// Display feature maps info (sizes assume 3x3 kernels, stride 1, no padding, and 2x2 pooling)
System.out.println("\n🔍 Feature Maps:");
System.out.println("- Conv Layer 1: 32 feature maps (26×26)");
System.out.println("- Conv Layer 2: 64 feature maps (11×11)");
System.out.println("- Conv Layer 3: 128 feature maps (3×3)");
System.out.println("\n✅ CNN training completed successfully!");
} catch (Exception e) {
System.err.println("❌ Error in CNN training: " + e.getMessage());
e.printStackTrace();
}
}
private static double[][] generateImageData(int samples, int height, int width) {
double[][] data = new double[samples][height * width];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
for (int j = 0; j < height * width; j++) {
data[i][j] = random.nextDouble(); // Pixel values 0-1
}
}
return data;
}
private static double[] generateImageLabels(int samples) {
double[] labels = new double[samples];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
labels[i] = random.nextInt(3); // 3 classes
}
return labels;
}
private static DataSplit splitImageData(double[][] X, double[] y, double trainRatio) {
int trainSize = (int) (X.length * trainRatio);
double[][] XTrain = new double[trainSize][];
double[][] XTest = new double[X.length - trainSize][];
double[] yTrain = new double[trainSize];
double[] yTest = new double[X.length - trainSize];
System.arraycopy(X, 0, XTrain, 0, trainSize);
System.arraycopy(X, trainSize, XTest, 0, X.length - trainSize);
System.arraycopy(y, 0, yTrain, 0, trainSize);
System.arraycopy(y, trainSize, yTest, 0, X.length - trainSize);
return new DataSplit(XTrain, XTest, yTrain, yTest);
}
private static class DataSplit {
final double[][] XTrain, XTest;
final double[] yTrain, yTest;
DataSplit(double[][] XTrain, double[][] XTest, double[] yTrain, double[] yTest) {
this.XTrain = XTrain;
this.XTest = XTest;
this.yTrain = yTrain;
this.yTest = yTest;
}
}
private static double calculateAccuracy(double[] actual, double[] predicted) {
int correct = 0;
for (int i = 0; i < actual.length; i++) {
if (Math.round(predicted[i]) == Math.round(actual[i])) {
correct++;
}
}
return (double) correct / actual.length;
}
}
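The spatial size of each feature map follows from the standard convolution arithmetic: output = floor((input + 2 × padding − kernel) / stride) + 1. The sketch below applies that formula to the layer stack above, assuming unpadded 3×3 convolutions with stride 1 and 2×2 max pooling with stride 2. Whether CNNClassifier uses these defaults is an assumption; the class name FeatureMapSizes is hypothetical.

```java
public class FeatureMapSizes {
    /** Output size along one spatial dimension for a convolution or pooling window. */
    public static int outputSize(int input, int kernel, int stride, int padding) {
        // Integer division floors the result because all operands are non-negative here.
        return (input + 2 * padding - kernel) / stride + 1;
    }

    public static void main(String[] args) {
        int size = 28;
        size = outputSize(size, 3, 1, 0); // Conv 3x3
        System.out.println("After conv 1: " + size);
        size = outputSize(size, 2, 2, 0); // MaxPool 2x2
        System.out.println("After pool 1: " + size);
        size = outputSize(size, 3, 1, 0); // Conv 3x3
        System.out.println("After conv 2: " + size);
        size = outputSize(size, 2, 2, 0); // MaxPool 2x2
        System.out.println("After pool 2: " + size);
        size = outputSize(size, 3, 1, 0); // Conv 3x3
        System.out.println("After conv 3: " + size);
    }
}
```

Under these assumptions the sizes run 28 → 26 → 13 → 11 → 5 → 3, so the third convolution produces 3×3 feature maps. With "same" padding the numbers would differ, so it is worth checking which padding mode your layers use before sizing the dense layers.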
Recurrent Neural Networks (RNN)
RNN for Sequence Processing
import org.superml.neural.RNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
public class RNNExample {
public static void main(String[] args) {
System.out.println("=== SuperML 2.1.0 - RNN for Sequence Processing ===\n");
try {
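// Generate synthetic sequence data.
// The sequences do not depend on the (random) labels, so test accuracy should stay near
// chance (about 1/3); the example demonstrates the API, not a learnable task.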
double[][] sequenceData = generateSequenceData(800, 30, 8);
double[] labels = generateSequenceLabels(800);
// Split data
var split = splitSequenceData(sequenceData, labels, 0.8);
// Apply RNN preprocessing
NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
NeuralNetworkPreprocessor.NetworkType.RNN).configureRNN(30, 8, false);
double[][] XTrainProcessed = preprocessor.preprocessRNN(split.XTrain);
double[][] XTestProcessed = preprocessor.preprocessRNN(split.XTest);
System.out.println("📈 Applied RNN preprocessing: global scaling + temporal smoothing");
System.out.println("📊 Sequence length: 30 timesteps");
System.out.println("📊 Features per timestep: 8");
System.out.println("📊 Training sequences: " + XTrainProcessed.length);
// Create RNN with LSTM cells
RNNClassifier rnn = new RNNClassifier()
.setHiddenSize(64) // LSTM hidden units
.setNumLayers(2) // 2 LSTM layers
.setCellType("LSTM") // LSTM cells
.setDropoutRate(0.2) // Dropout between layers
.setBidirectional(false) // Unidirectional
.setSequenceLength(30) // Input sequence length
.setInputSize(8) // Features per timestep
.setOutputSize(3) // Number of classes
.setLearningRate(0.01) // Learning rate
.setMaxEpochs(100) // Training epochs
.setBatchSize(32) // Batch size
.setEarlyStoppingPatience(15) // Early stopping
.setGradientClipping(1.0); // Gradient clipping
System.out.println("🔄 RNN Architecture:");
System.out.println("- Input: 30 timesteps × 8 features");
System.out.println("- LSTM Layer 1: 64 hidden units");
System.out.println("- LSTM Layer 2: 64 hidden units");
System.out.println("- Dropout: 20% between layers");
System.out.println("- Output: 3 classes");
// Train the RNN
long startTime = System.currentTimeMillis();
rnn.fit(XTrainProcessed, split.yTrain);
long trainingTime = System.currentTimeMillis() - startTime;
// Make predictions
double[] predictions = rnn.predict(XTestProcessed);
// Evaluate performance
double accuracy = calculateAccuracy(split.yTest, predictions);
System.out.println("\n=== RNN Results ===");
System.out.println("Training time: " + trainingTime + " ms");
System.out.println("Test accuracy: " + String.format("%.4f", accuracy));
// Display sequence processing info
System.out.println("\n🔍 Sequence Processing:");
System.out.println("- Total parameters: ~" + rnn.getParameterCount());
System.out.println("- Memory cells: " + (rnn.getHiddenSize() * rnn.getNumLayers()));
System.out.println("- Gradient clipping: " + rnn.getGradientClipping());
System.out.println("\n✅ RNN training completed successfully!");
} catch (Exception e) {
System.err.println("❌ Error in RNN training: " + e.getMessage());
e.printStackTrace();
}
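}
// Helpers used above. CNNExample's versions are private to that class, so RNNExample needs its own copies.
private static double calculateAccuracy(double[] actual, double[] predicted) {
int correct = 0;
for (int i = 0; i < actual.length; i++) {
if (Math.round(predicted[i]) == Math.round(actual[i])) {
correct++;
}
}
return (double) correct / actual.length;
}
private static class DataSplit {
final double[][] XTrain, XTest;
final double[] yTrain, yTest;
DataSplit(double[][] XTrain, double[][] XTest, double[] yTrain, double[] yTest) {
this.XTrain = XTrain;
this.XTest = XTest;
this.yTrain = yTrain;
this.yTest = yTest;
}
}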
private static double[][] generateSequenceData(int samples, int sequenceLength, int features) {
double[][] data = new double[samples][sequenceLength * features];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
for (int t = 0; t < sequenceLength; t++) {
for (int f = 0; f < features; f++) {
int idx = t * features + f;
// Create time-dependent patterns
data[i][idx] = Math.sin(t * 0.1 + f) + random.nextGaussian() * 0.1;
}
}
}
return data;
}
private static double[] generateSequenceLabels(int samples) {
double[] labels = new double[samples];
java.util.Random random = new java.util.Random(42);
for (int i = 0; i < samples; i++) {
labels[i] = random.nextInt(3); // 3 classes
}
return labels;
}
private static DataSplit splitSequenceData(double[][] X, double[] y, double trainRatio) {
int trainSize = (int) (X.length * trainRatio);
double[][] XTrain = new double[trainSize][];
double[][] XTest = new double[X.length - trainSize][];
double[] yTrain = new double[trainSize];
double[] yTest = new double[X.length - trainSize];
System.arraycopy(X, 0, XTrain, 0, trainSize);
System.arraycopy(X, trainSize, XTest, 0, X.length - trainSize);
System.arraycopy(y, 0, yTrain, 0, trainSize);
System.arraycopy(y, trainSize, yTest, 0, X.length - trainSize);
return new DataSplit(XTrain, XTest, yTrain, yTest);
}
}
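The example sets setGradientClipping(1.0) without saying how the clipping is done. A common choice for recurrent networks is clipping by global L2 norm: if the norm of all gradients together exceeds the threshold, every gradient is scaled down by the same factor. The sketch below shows that variant in plain Java. Whether SuperML clips by norm or by value is not stated in this tutorial, so treat this as an illustration; the class name GradientClipping is hypothetical.

```java
import java.util.Arrays;

public class GradientClipping {
    /**
     * Rescales all gradients in place so that their combined L2 norm is at most maxNorm.
     * Returns the norm measured before clipping.
     */
    public static double clipByGlobalNorm(double[][] grads, double maxNorm) {
        double sumOfSquares = 0.0;
        for (double[] g : grads) {
            for (double value : g) {
                sumOfSquares += value * value;
            }
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm > maxNorm) {
            double scale = maxNorm / norm;
            for (double[] g : grads) {
                for (int i = 0; i < g.length; i++) {
                    g[i] *= scale;
                }
            }
        }
        return norm;
    }

    public static void main(String[] args) {
        double[][] grads = {{3.0, 4.0}, {0.0, 0.0}}; // global norm is 5
        double before = clipByGlobalNorm(grads, 1.0);
        System.out.println("Norm before clipping: " + before);
        System.out.println("Clipped gradients: " + Arrays.deepToString(grads));
    }
}
```

Scaling every gradient by the same factor preserves the update direction, which is why norm clipping is often preferred over clipping each value independently.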
Model Persistence and Deployment
Saving and Loading Neural Networks
import org.superml.persistence.ModelPersistence;
import org.superml.neural.MLPClassifier;
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
public class NeuralNetworkPersistence {
public static void main(String[] args) {
try {
// Train a neural network
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(128, 64, 32)
.setActivation("relu")
.setLearningRate(0.01)
.setMaxIter(100);
// Assume training data is available
// mlp.fit(XTrain, yTrain);
// Save model with metadata
Map<String, Object> metadata = new HashMap<>();
metadata.put("model_type", "MLPClassifier");
metadata.put("architecture", "128-64-32");
metadata.put("activation", "relu");
metadata.put("training_samples", 1000);
metadata.put("accuracy", 0.95);
metadata.put("created_date", new Date().toString());
String modelPath = "models/neural_network_model.superml";
ModelPersistence.save(mlp, modelPath, "Production MLP Model", metadata);
System.out.println("✅ Model saved to: " + modelPath);
// Load model for inference
MLPClassifier loadedModel = ModelPersistence.load(modelPath, MLPClassifier.class);
System.out.println("✅ Model loaded successfully");
System.out.println("🏗️ Architecture: " + Arrays.toString(loadedModel.getHiddenLayerSizes()));
System.out.println("⚡ Activation: " + loadedModel.getActivation());
// Use loaded model for predictions
// double[] predictions = loadedModel.predict(XTest);
} catch (Exception e) {
System.err.println("❌ Error in model persistence: " + e.getMessage());
e.printStackTrace();
}
}
}
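If you want to see what "weights plus metadata" persistence amounts to without any framework, the following sketch round-trips a weight matrix and a metadata map through standard Java serialization. It is not the .superml file format, which this tutorial does not describe; the class WeightsCheckpoint and its fields are hypothetical.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

public class WeightsCheckpoint implements Serializable {
    private static final long serialVersionUID = 1L;

    public final double[][] weights;
    public final HashMap<String, String> metadata;

    public WeightsCheckpoint(double[][] weights, HashMap<String, String> metadata) {
        this.weights = weights;
        this.metadata = metadata;
    }

    /** Writes this checkpoint to the given file. */
    public void save(Path path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path))) {
            out.writeObject(this);
        }
    }

    /** Reads a checkpoint back from the given file. */
    public static WeightsCheckpoint load(Path path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
            return (WeightsCheckpoint) in.readObject();
        }
    }

    /** Saves to a temporary file and loads it back; checked exceptions are wrapped for brevity. */
    public static WeightsCheckpoint roundTrip(WeightsCheckpoint checkpoint) {
        try {
            Path tmp = Files.createTempFile("checkpoint", ".bin");
            try {
                checkpoint.save(tmp);
                return load(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, String> meta = new HashMap<>();
        meta.put("architecture", "128-64-32");
        WeightsCheckpoint original = new WeightsCheckpoint(new double[][] {{0.1, -0.2}, {0.3, 0.4}}, meta);
        WeightsCheckpoint restored = roundTrip(original);
        System.out.println(restored.metadata.get("architecture") + " / " + restored.weights[1][0]);
    }
}
```

Java deserialization should only be used on files you produced yourself, since reading untrusted serialized data is a known security risk.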
Advanced Neural Network Features
Ensemble of Neural Networks
import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
import org.superml.ensemble.NeuralNetworkEnsemble;
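// Named differently from the imported org.superml.ensemble.NeuralNetworkEnsemble to avoid a name clash
public class NeuralNetworkEnsembleExample {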
public static void main(String[] args) {
try {
// Create ensemble of different neural networks
MLPClassifier mlp = new MLPClassifier()
.setHiddenLayerSizes(128, 64)
.setActivation("relu");
CNNClassifier cnn = new CNNClassifier()
.setInputShape(28, 28, 1)
.addConvLayer(32, 3, 3, "relu")
.addMaxPoolLayer(2, 2)
.addDenseLayer(64, "relu");
RNNClassifier rnn = new RNNClassifier()
.setHiddenSize(64)
.setCellType("LSTM")
.setNumLayers(2);
// Create ensemble
NeuralNetworkEnsemble ensemble = new NeuralNetworkEnsemble()
.addModel("mlp", mlp, 0.4) // 40% weight
.addModel("cnn", cnn, 0.35) // 35% weight
.addModel("rnn", rnn, 0.25) // 25% weight
.setVotingStrategy("weighted_average");
System.out.println("🔗 Neural Network Ensemble:");
System.out.println("- MLP: 40% weight");
System.out.println("- CNN: 35% weight");
System.out.println("- RNN: 25% weight");
System.out.println("- Strategy: Weighted Average");
// Train ensemble (each model on preprocessed data)
// ensemble.fit(XTrain, yTrain);
// Make ensemble predictions
// double[] predictions = ensemble.predict(XTest);
} catch (Exception e) {
System.err.println("❌ Error in ensemble: " + e.getMessage());
}
}
}
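The "weighted_average" strategy can be illustrated independently of the library. A minimal sketch, assuming each model outputs a probability per class and the ensemble takes the weight-scaled sum before picking the largest entry; the class name WeightedAverageVote and the sample probabilities are hypothetical.

```java
import java.util.Arrays;

public class WeightedAverageVote {
    /** Returns the weighted sum of the per-model class probabilities. */
    public static double[] combine(double[][] probsPerModel, double[] weights) {
        int numClasses = probsPerModel[0].length;
        double[] combined = new double[numClasses];
        for (int m = 0; m < probsPerModel.length; m++) {
            for (int c = 0; c < numClasses; c++) {
                combined[c] += weights[m] * probsPerModel[m][c];
            }
        }
        return combined;
    }

    /** Returns the index of the class with the highest combined score. */
    public static int predict(double[][] probsPerModel, double[] weights) {
        double[] combined = combine(probsPerModel, weights);
        int best = 0;
        for (int c = 1; c < combined.length; c++) {
            if (combined[c] > combined[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] weights = {0.4, 0.35, 0.25}; // MLP, CNN, RNN
        double[][] probs = {
            {0.6, 0.3, 0.1}, // MLP favours class 0
            {0.2, 0.7, 0.1}, // CNN favours class 1
            {0.1, 0.6, 0.3}  // RNN favours class 1
        };
        System.out.println("Combined scores: " + Arrays.toString(combine(probs, weights)));
        System.out.println("Predicted class: " + predict(probs, weights));
    }
}
```

In this example the highest-weighted model favours class 0, but the other two models together outweigh it and the ensemble predicts class 1.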
Performance Optimization
Neural Network Performance Tips
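import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
public class NeuralNetworkOptimization {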
public static void main(String[] args) {
System.out.println("=== Neural Network Performance Optimization ===\n");
// 1. Batch Processing for High Throughput
MLPClassifier optimizedMLP = new MLPClassifier()
.setHiddenLayerSizes(256, 128, 64)
.setBatchSize(128) // Larger batch size
.setMaxIter(50) // Fewer epochs
.setEarlyStoppingPatience(5) // Early stopping
.setParallelTraining(true) // Parallel processing
.setGPUAcceleration(true); // GPU acceleration if available
// 2. Memory-Efficient Training
CNNClassifier memoryEfficientCNN = new CNNClassifier()
.setInputShape(224, 224, 3)
.setBatchSize(16) // Smaller batch for large images
.setGradientAccumulation(4) // Accumulate gradients
.setMixedPrecision(true) // Use FP16 for memory efficiency
.setMemoryOptimization(true);
// 3. Fast Inference Configuration
RNNClassifier fastRNN = new RNNClassifier()
.setHiddenSize(32) // Smaller hidden size
.setNumLayers(1) // Single layer
.setCellType("GRU") // GRU has fewer gates than LSTM, so it is usually faster
.setInferenceOptimization(true) // Optimize for inference
.setBatchInference(true); // Batch predictions
System.out.println("🚀 Performance Optimizations:");
System.out.println("- Batch processing: 128 samples/batch");
System.out.println("- Early stopping: Prevent overfitting");
System.out.println("- GPU acceleration: When available");
System.out.println("- Memory optimization: FP16 precision");
System.out.println("- Inference optimization: Fast predictions");
// 4. Monitoring and Profiling
long startTime = System.currentTimeMillis();
// Training code here...
long endTime = System.currentTimeMillis();
System.out.println("\n⏱️ Training completed in: " + (endTime - startTime) + " ms");
// Memory usage
Runtime runtime = Runtime.getRuntime();
long memoryUsed = runtime.totalMemory() - runtime.freeMemory();
System.out.println("💾 Memory used: " + (memoryUsed / 1024 / 1024) + " MB");
}
}
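Gradient accumulation, as configured with setGradientAccumulation(4) above, trades memory for extra passes: gradients from several small batches are summed before a single weight update. The sketch below uses a one-parameter linear model to show that, for a mean loss, accumulating over micro-batches gives the same gradient as one large batch. It does not reflect SuperML internals; the class name GradientAccumulation is hypothetical.

```java
public class GradientAccumulation {
    /** Sum of per-sample gradients of 0.5 * (w * x - y)^2 with respect to w over rows [from, to). */
    public static double gradientSum(double w, double[] x, double[] y, int from, int to) {
        double sum = 0.0;
        for (int i = from; i < to; i++) {
            sum += (w * x[i] - y[i]) * x[i];
        }
        return sum;
    }

    /** Mean gradient computed in one pass over the whole batch. */
    public static double fullBatch(double w, double[] x, double[] y) {
        return gradientSum(w, x, y, 0, x.length) / x.length;
    }

    /** Mean gradient computed by accumulating over micro-batches of the given size. */
    public static double accumulated(double w, double[] x, double[] y, int microBatchSize) {
        double total = 0.0;
        for (int start = 0; start < x.length; start += microBatchSize) {
            int end = Math.min(start + microBatchSize, x.length);
            total += gradientSum(w, x, y, start, end); // only one micro-batch is "in memory" at a time
        }
        return total / x.length; // divide once, by the effective batch size
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8};
        double[] y = {2, 4, 6, 8, 10, 12, 14, 16}; // y = 2x
        double w = 0.5;
        System.out.println("Full batch gradient:  " + fullBatch(w, x, y));
        System.out.println("Accumulated (4 x 2):  " + accumulated(w, x, y, 2));
    }
}
```

The important detail is the normalisation: the accumulated sum is divided by the effective batch size, not by the micro-batch size, otherwise the update would be several times too large.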
Best Practices
1. Data Preparation
- Normalization: Always normalize input data for neural networks
- Augmentation: Use data augmentation for image data
- Sequence Padding: Ensure consistent sequence lengths for RNNs
- Validation Split: Reserve data for validation during training
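As an illustration of the sequence padding point above, here is a small plain-Java sketch that pads shorter sequences at the end and truncates longer ones so every row has the same length. The choice of post-padding and the class name SequencePadding are assumptions made for this example, not SuperML behaviour.

```java
import java.util.Arrays;

public class SequencePadding {
    /** Pads with padValue at the end, or truncates, so every sequence has exactly targetLength steps. */
    public static double[][] padOrTruncate(double[][] sequences, int targetLength, double padValue) {
        double[][] out = new double[sequences.length][targetLength];
        for (int i = 0; i < sequences.length; i++) {
            Arrays.fill(out[i], padValue);
            int copyLength = Math.min(sequences[i].length, targetLength);
            System.arraycopy(sequences[i], 0, out[i], 0, copyLength);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] raw = {{1, 2, 3}, {4}, {5, 6, 7, 8, 9}};
        double[][] padded = padOrTruncate(raw, 4, 0.0);
        System.out.println(Arrays.deepToString(padded));
    }
}
```

If zero is a meaningful value in your data, pick a pad value outside the data range or keep a separate mask so the model can distinguish padding from real timesteps.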
2. Architecture Design
- Start Simple: Begin with smaller networks and increase complexity
- Regularization: Use dropout and weight decay to prevent overfitting
- Batch Normalization: Improve training stability and speed
- Residual Connections: For very deep networks
3. Training Strategies
- Learning Rate Scheduling: Decrease learning rate during training
- Early Stopping: Monitor validation loss to prevent overfitting
- Gradient Clipping: Prevent exploding gradients in RNNs
- Checkpointing: Save model state during training
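Early stopping with a patience value, as used throughout the examples, can be sketched as a small monitor that tracks the best validation loss and counts epochs without improvement. This is a generic illustration, not SuperML's implementation; the class name EarlyStoppingMonitor is hypothetical.

```java
public class EarlyStoppingMonitor {
    private final int patience;
    private double bestLoss = Double.POSITIVE_INFINITY;
    private int bestEpoch = -1;
    private int epochsWithoutImprovement = 0;
    private int epoch = 0;

    public EarlyStoppingMonitor(int patience) {
        this.patience = patience;
    }

    /** Records one epoch's validation loss and returns true when training should stop. */
    public boolean shouldStop(double validationLoss) {
        if (validationLoss < bestLoss) {
            bestLoss = validationLoss;
            bestEpoch = epoch; // a checkpoint would typically be saved here
            epochsWithoutImprovement = 0;
        } else {
            epochsWithoutImprovement++;
        }
        epoch++;
        return epochsWithoutImprovement >= patience;
    }

    public int getBestEpoch() {
        return bestEpoch;
    }

    public double getBestLoss() {
        return bestLoss;
    }

    public static void main(String[] args) {
        double[] validationLosses = {1.0, 0.8, 0.9, 0.85, 0.7};
        EarlyStoppingMonitor monitor = new EarlyStoppingMonitor(2);
        for (int i = 0; i < validationLosses.length; i++) {
            if (monitor.shouldStop(validationLosses[i])) {
                System.out.println("Stopping after epoch index " + i
                        + ", best epoch index " + monitor.getBestEpoch());
                break;
            }
        }
    }
}
```

With a patience of 2, training in this example stops before the later improvement to 0.7 is seen, which shows the trade-off: a small patience saves time but can stop too early on noisy validation curves.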
4. Production Deployment
- Model Compression: Use pruning and quantization
- Batch Inference: Process multiple samples together
- Model Caching: Cache frequently used models
- Performance Monitoring: Track inference time and accuracy
Troubleshooting Common Issues
Training Problems
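// Problem: Vanishing gradients
// Solution: Use ReLU activation and proper weight initialization
// (Xavier is shown here; He initialization is the more common pairing with ReLU,
// if your SuperML version offers it)
MLPClassifier mlp = new MLPClassifier()
.setActivation("relu")
.setWeightInitialization("xavier")
.setGradientClipping(1.0); // guards against exploding gradients; it does not address vanishing ones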
// Problem: Exploding gradients
// Solution: Gradient clipping
RNNClassifier rnn = new RNNClassifier()
.setGradientClipping(1.0)
.setLearningRate(0.001);
// Problem: Overfitting
// Solution: Regularization and dropout
CNNClassifier cnn = new CNNClassifier()
.addDropoutLayer(0.5)
.setL2Regularization(0.01)
.setEarlyStoppingPatience(10);
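For background on the weight initialization option, the sketch below computes the uniform sampling bounds usually associated with Xavier (Glorot) and He initialization and draws a weight matrix from one of them. The formulas are the standard ones; how SuperML implements "xavier" internally is not shown in this tutorial, and the class name WeightInit is hypothetical.

```java
import java.util.Random;

public class WeightInit {
    /** Xavier/Glorot uniform bound: sqrt(6 / (fanIn + fanOut)), typically used with tanh or sigmoid. */
    public static double xavierLimit(int fanIn, int fanOut) {
        return Math.sqrt(6.0 / (fanIn + fanOut));
    }

    /** He uniform bound: sqrt(6 / fanIn), typically used with ReLU. */
    public static double heLimit(int fanIn) {
        return Math.sqrt(6.0 / fanIn);
    }

    /** Draws a fanIn x fanOut matrix with entries uniform in [-limit, limit). */
    public static double[][] uniform(int fanIn, int fanOut, double limit, Random rng) {
        double[][] weights = new double[fanIn][fanOut];
        for (int i = 0; i < fanIn; i++) {
            for (int j = 0; j < fanOut; j++) {
                weights[i][j] = (rng.nextDouble() * 2.0 - 1.0) * limit;
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        int fanIn = 4;
        int fanOut = 128;
        System.out.println("Xavier limit: " + xavierLimit(fanIn, fanOut));
        System.out.println("He limit:     " + heLimit(fanIn));
        double[][] w = uniform(fanIn, fanOut, xavierLimit(fanIn, fanOut), new Random(42));
        System.out.println("First weight: " + w[0][0]);
    }
}
```

Both schemes scale the initial weights to the layer width so that activations and gradients keep a similar variance from layer to layer, which is what helps against vanishing gradients at the start of training.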
Summary
In this tutorial, you learned:
- MLP Implementation: Multi-layer perceptrons for tabular data
- CNN Architecture: Convolutional networks for image processing
- RNN with LSTM: Recurrent networks for sequence data
- Preprocessing: Specialized data preparation for neural networks
- Model Persistence: Saving and loading trained models
- Performance Optimization: Tips for faster training and inference
- Best Practices: Guidelines for production deployment
Neural networks in SuperML Java 2.1.0 provide enterprise-grade performance with real-time training capabilities. The framework handles the complexity of neural network implementation while providing you with simple, intuitive APIs.
Next Steps
- Try XGBoost: Learn gradient boosting for tabular data
- Explore AutoML: Automated neural architecture search
- Model Deployment: Production deployment with inference engine
- Advanced Preprocessing: Feature engineering for neural networks
- Ensemble Methods: Combining multiple neural networks
You’re now ready to build sophisticated neural network applications with SuperML Java 2.1.0!