📖 Lesson ⏱️ 135 minutes

Neural Networks in Java

Building and training MLP, CNN, and RNN networks with SuperML

SuperML Java 2.1.0 supports three families of neural networks: Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). This tutorial shows how to build, train, save, and deploy each of them in Java.

What You’ll Learn

  • Multi-Layer Perceptron (MLP) - Deep feedforward networks for tabular data
  • Convolutional Neural Networks (CNN) - Image processing and computer vision
  • Recurrent Neural Networks (RNN) - Sequence processing with LSTM cells
  • Neural Network Preprocessing - Specialized data preparation techniques
  • Model Persistence - Saving and loading trained neural networks
  • Performance Optimization - Real-time training and inference
  • Enterprise Deployment - Production-ready neural network systems

Prerequisites

  • Completion of “Introduction to SuperML Java” and “Java ML Setup”
  • Basic understanding of linear algebra and calculus
  • Familiarity with neural network concepts
  • Java development environment with SuperML Java 2.1.0

Neural Network Architecture Overview

SuperML Java 2.1.0 provides three main types of neural networks:

import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;

// Multi-Layer Perceptron
MLPClassifier mlp = new MLPClassifier()
    .setHiddenLayerSizes(64, 32, 16)
    .setActivation("relu")
    .setLearningRate(0.01);

// Convolutional Neural Network
CNNClassifier cnn = new CNNClassifier()
    .setInputShape(28, 28, 1)
    .setLearningRate(0.001);

// Recurrent Neural Network
RNNClassifier rnn = new RNNClassifier()
    .setHiddenSize(64)
    .setCellType("LSTM")
    .setNumLayers(2);

Multi-Layer Perceptron (MLP)

Basic MLP Implementation

import org.superml.neural.MLPClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;
import org.superml.datasets.Datasets;
import org.superml.model_selection.ModelSelection;
import org.superml.metrics.Metrics;

public class MLPExample {
    public static void main(String[] args) {
        System.out.println("=== SuperML 2.1.0 - MLP Neural Network ===\n");
        
        try {
            // Load dataset
            var dataset = Datasets.loadIris();
            var split = ModelSelection.trainTestSplit(dataset.X, dataset.y, 0.2, 42);
            
            // Apply MLP preprocessing
            NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
                NeuralNetworkPreprocessor.NetworkType.MLP).configureMLP();
            
            double[][] XTrainProcessed = preprocessor.preprocessMLP(split.XTrain);
            double[][] XTestProcessed = preprocessor.preprocessMLP(split.XTest);
            
            System.out.println("📊 Applied MLP preprocessing: standardization + outlier clipping");
            
            // Create MLP with multiple hidden layers
            MLPClassifier mlp = new MLPClassifier()
                .setHiddenLayerSizes(128, 64, 32)  // 3 hidden layers
                .setActivation("relu")             // ReLU activation
                .setLearningRate(0.01)             // Learning rate
                .setMaxIter(200)                   // Maximum epochs
                .setBatchSize(32)                  // Mini-batch size
                .setEarlyStoppingPatience(10)      // Early stopping
                .setValidationFraction(0.2);       // Validation split
            
            System.out.println("🧠 Training MLP with architecture: 4 → 128 → 64 → 32 → 3");
            
            // Train the model
            long startTime = System.currentTimeMillis();
            mlp.fit(XTrainProcessed, split.yTrain);
            long trainingTime = System.currentTimeMillis() - startTime;
            
            // Make predictions
            double[] predictions = mlp.predict(XTestProcessed);
            
            // Evaluate performance
            double accuracy = Metrics.accuracy(split.yTest, predictions);
            double precision = Metrics.precision(split.yTest, predictions);
            double recall = Metrics.recall(split.yTest, predictions);
            double f1 = Metrics.f1Score(split.yTest, predictions);
            
            System.out.println("\n=== MLP Results ===");
            System.out.println("Training time: " + trainingTime + " ms");
            System.out.println("Accuracy: " + String.format("%.4f", accuracy));
            System.out.println("Precision: " + String.format("%.4f", precision));
            System.out.println("Recall: " + String.format("%.4f", recall));
            System.out.println("F1 Score: " + String.format("%.4f", f1));
            
            // Display training history
            double[] trainingLoss = mlp.getTrainingHistory().getLoss();
            double[] validationLoss = mlp.getTrainingHistory().getValidationLoss();
            
            System.out.println("\n📈 Training History (last 5 epochs):");
            for (int i = Math.max(0, trainingLoss.length - 5); i < trainingLoss.length; i++) {
                System.out.printf("Epoch %d: Train Loss: %.4f, Val Loss: %.4f\n", 
                    i + 1, trainingLoss[i], validationLoss[i]);
            }
            
            System.out.println("\n✅ MLP training completed successfully!");
            
        } catch (Exception e) {
            System.err.println("❌ Error in MLP training: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
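
The example prints that MLP preprocessing applies "standardization + outlier clipping" but does not show what that involves. The sketch below is one plausible reading in plain Java, not SuperML's implementation: the class name StandardizeSketch, its methods, and the clip threshold of 3.0 are all assumptions made for illustration. Statistics are learned from the training rows only and then reused for the test rows, which avoids leaking test information into training.

```java
import java.util.Arrays;

public class StandardizeSketch {
    /** Returns {means, stds} per column, computed from the training rows only. */
    public static double[][] fitStats(double[][] train) {
        int cols = train[0].length;
        double[] mean = new double[cols];
        double[] std = new double[cols];
        for (double[] row : train)
            for (int j = 0; j < cols; j++) mean[j] += row[j] / train.length;
        for (double[] row : train)
            for (int j = 0; j < cols; j++)
                std[j] += (row[j] - mean[j]) * (row[j] - mean[j]) / train.length;
        for (int j = 0; j < cols; j++)
            std[j] = Math.max(Math.sqrt(std[j]), 1e-12); // guard against constant columns
        return new double[][] {mean, std};
    }

    /** Z-scores every value with the given stats, then clips to [-clip, clip]. */
    public static double[][] transform(double[][] x, double[][] stats, double clip) {
        double[][] out = new double[x.length][];
        for (int i = 0; i < x.length; i++) {
            out[i] = new double[x[i].length];
            for (int j = 0; j < x[i].length; j++) {
                double z = (x[i][j] - stats[0][j]) / stats[1][j];
                out[i][j] = Math.max(-clip, Math.min(clip, z));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] train = {{1.0, 10.0}, {2.0, 20.0}, {3.0, 30.0}};
        double[][] test = {{2.0, 500.0}}; // second feature is an outlier
        double[][] stats = fitStats(train);
        System.out.println(Arrays.deepToString(transform(test, stats, 3.0)));
    }
}
```

In the test row, the first value sits at the training mean and the second is an extreme outlier, so after the transform the first is close to zero and the second is capped at the clip threshold.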

Advanced MLP Configuration

import org.superml.neural.MLPClassifier;
import org.superml.neural.optimizers.Adam;
import org.superml.neural.regularizers.L2Regularizer;

public class AdvancedMLPExample {
    public static void main(String[] args) {
        try {
            // Advanced MLP with custom configuration
            MLPClassifier advancedMLP = new MLPClassifier()
                .setHiddenLayerSizes(256, 128, 64, 32)
                .setActivation("relu")
                .setOutputActivation("softmax")
                .setLearningRate(0.001)
                .setOptimizer(new Adam()
                    .setBeta1(0.9)
                    .setBeta2(0.999)
                    .setEpsilon(1e-8))
                .setRegularizer(new L2Regularizer(0.01))
                .setDropoutRate(0.2)
                .setBatchSize(64)
                .setMaxIter(300)
                .setEarlyStoppingPatience(15)
                .setValidationFraction(0.15)
                .setShuffleBatches(true)
                .setVerbose(true);
            
            System.out.println("🚀 Advanced MLP Configuration:");
            System.out.println("- Architecture: 256 → 128 → 64 → 32");
            System.out.println("- Optimizer: Adam with β₁=0.9, β₂=0.999");
            System.out.println("- Regularization: L2 with λ=0.01");
            System.out.println("- Dropout: 20% during training");
            System.out.println("- Early stopping with patience=15");
            
            // Training would continue here...
            
        } catch (Exception e) {
            System.err.println("❌ Error: " + e.getMessage());
        }
    }
}
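
The configuration above passes β₁, β₂, and ε to an Adam optimizer. To make those three parameters concrete, here is a minimal sketch of the standard Adam update rule in plain Java. It is independent of SuperML: the class AdamSketch and its step method are hypothetical names, and the quadratic objective in main is only a toy problem.

```java
public class AdamSketch {
    private final double lr, beta1, beta2, eps;
    private final double[] m; // running mean of gradients (first moment)
    private final double[] v; // running mean of squared gradients (second moment)
    private int t = 0;        // step counter used for bias correction

    public AdamSketch(int size, double lr, double beta1, double beta2, double eps) {
        this.lr = lr;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.eps = eps;
        this.m = new double[size];
        this.v = new double[size];
    }

    /** Applies one Adam update to the weights in place. */
    public void step(double[] w, double[] grad) {
        t++;
        for (int i = 0; i < w.length; i++) {
            m[i] = beta1 * m[i] + (1 - beta1) * grad[i];
            v[i] = beta2 * v[i] + (1 - beta2) * grad[i] * grad[i];
            double mHat = m[i] / (1 - Math.pow(beta1, t)); // undo the bias toward zero
            double vHat = v[i] / (1 - Math.pow(beta2, t));
            w[i] -= lr * mHat / (Math.sqrt(vHat) + eps);
        }
    }

    public static void main(String[] args) {
        // Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
        AdamSketch opt = new AdamSketch(1, 0.01, 0.9, 0.999, 1e-8);
        double[] w = {0.0};
        for (int i = 0; i < 200; i++) {
            opt.step(w, new double[] {2 * (w[0] - 3.0)});
        }
        System.out.println("w after 200 steps: " + w[0]);
    }
}
```

Because of the bias correction, the very first step has a magnitude close to the learning rate regardless of the gradient's scale, which is one reason Adam is less sensitive to gradient magnitude than plain SGD.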

Convolutional Neural Networks (CNN)

CNN for Image Classification

import org.superml.neural.CNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;

public class CNNExample {
    public static void main(String[] args) {
        System.out.println("=== SuperML 2.1.0 - CNN for Image Classification ===\n");
        
        try {
            // Generate synthetic image data (28x28 grayscale images)
            double[][] imageData = generateImageData(1000, 28, 28);
            double[] labels = generateImageLabels(1000);
            
            // Split data
            var split = splitImageData(imageData, labels, 0.8);
            
            // Apply CNN preprocessing
            NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
                NeuralNetworkPreprocessor.NetworkType.CNN).configureCNN(28, 28, 1);
            
            double[][] XTrainProcessed = preprocessor.preprocessCNN(split.XTrain);
            double[][] XTestProcessed = preprocessor.preprocessCNN(split.XTest);
            
            System.out.println("🖼️ Applied CNN preprocessing: pixel normalization to [-1,1]");
            System.out.println("📊 Training samples: " + XTrainProcessed.length);
            System.out.println("📊 Test samples: " + XTestProcessed.length);
            
            // Create CNN with multiple layers
            CNNClassifier cnn = new CNNClassifier()
                .setInputShape(28, 28, 1)          // 28x28 grayscale images
                .addConvLayer(32, 3, 3, "relu")    // 32 filters, 3x3 kernel
                .addMaxPoolLayer(2, 2)             // 2x2 max pooling
                .addConvLayer(64, 3, 3, "relu")    // 64 filters, 3x3 kernel
                .addMaxPoolLayer(2, 2)             // 2x2 max pooling
                .addConvLayer(128, 3, 3, "relu")   // 128 filters, 3x3 kernel
                .addGlobalAveragePoolLayer()       // Global average pooling
                .addDenseLayer(128, "relu")        // Dense layer with 128 units
                .addDropoutLayer(0.5)              // Dropout for regularization
                .addDenseLayer(3, "softmax")       // Output layer (3 classes)
                .setLearningRate(0.001)            // Learning rate
                .setMaxEpochs(100)                 // Training epochs
                .setBatchSize(32)                  // Batch size
                .setEarlyStoppingPatience(10);     // Early stopping
            
            System.out.println("🏗️ CNN Architecture:");
            System.out.println("- Input: 28×28×1");
            System.out.println("- Conv2D(32) → MaxPool → Conv2D(64) → MaxPool → Conv2D(128)");
            System.out.println("- GlobalAvgPool → Dense(128) → Dropout(0.5) → Dense(3)");
            
            // Train the CNN
            long startTime = System.currentTimeMillis();
            cnn.fit(XTrainProcessed, split.yTrain);
            long trainingTime = System.currentTimeMillis() - startTime;
            
            // Make predictions
            double[] predictions = cnn.predict(XTestProcessed);
            
            // Evaluate performance
            double accuracy = calculateAccuracy(split.yTest, predictions);
            
            System.out.println("\n=== CNN Results ===");
            System.out.println("Training time: " + trainingTime + " ms");
            System.out.println("Test accuracy: " + String.format("%.4f", accuracy));
            
            // Display feature maps info
            // Sizes assume 3x3 kernels with stride 1 and no padding, and 2x2 pooling that floors odd sizes
            System.out.println("\n🔍 Feature Maps:");
            System.out.println("- Conv Layer 1: 32 feature maps (26×26)");
            System.out.println("- Conv Layer 2: 64 feature maps (11×11)");
            System.out.println("- Conv Layer 3: 128 feature maps (3×3)");
            
            System.out.println("\n✅ CNN training completed successfully!");
            
        } catch (Exception e) {
            System.err.println("❌ Error in CNN training: " + e.getMessage());
            e.printStackTrace();
        }
    }
    
    private static double[][] generateImageData(int samples, int height, int width) {
        double[][] data = new double[samples][height * width];
        java.util.Random random = new java.util.Random(42);
        
        for (int i = 0; i < samples; i++) {
            for (int j = 0; j < height * width; j++) {
                data[i][j] = random.nextDouble(); // Pixel values 0-1
            }
        }
        return data;
    }
    
    private static double[] generateImageLabels(int samples) {
        double[] labels = new double[samples];
        java.util.Random random = new java.util.Random(42);
        
        for (int i = 0; i < samples; i++) {
            labels[i] = random.nextInt(3); // 3 classes
        }
        return labels;
    }
    
    private static DataSplit splitImageData(double[][] X, double[] y, double trainRatio) {
        int trainSize = (int) (X.length * trainRatio);
        
        double[][] XTrain = new double[trainSize][];
        double[][] XTest = new double[X.length - trainSize][];
        double[] yTrain = new double[trainSize];
        double[] yTest = new double[X.length - trainSize];
        
        System.arraycopy(X, 0, XTrain, 0, trainSize);
        System.arraycopy(X, trainSize, XTest, 0, X.length - trainSize);
        System.arraycopy(y, 0, yTrain, 0, trainSize);
        System.arraycopy(y, trainSize, yTest, 0, X.length - trainSize);
        
        return new DataSplit(XTrain, XTest, yTrain, yTest);
    }
    
    private static class DataSplit {
        final double[][] XTrain, XTest;
        final double[] yTrain, yTest;
        
        DataSplit(double[][] XTrain, double[][] XTest, double[] yTrain, double[] yTest) {
            this.XTrain = XTrain;
            this.XTest = XTest;
            this.yTrain = yTrain;
            this.yTest = yTest;
        }
    }
    
    private static double calculateAccuracy(double[] actual, double[] predicted) {
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (Math.round(predicted[i]) == Math.round(actual[i])) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }
}
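
The feature-map sizes printed by the example follow from simple shape arithmetic. The sketch below works through it under assumptions the tutorial does not state explicitly: 3×3 kernels with stride 1 and no padding, and 2×2 pooling with stride 2 that floors odd sizes. The class ConvShapeSketch and its methods are illustrative helpers, not SuperML API.

```java
public class ConvShapeSketch {
    /** Convolution output size: floor((in + 2 * padding - kernel) / stride) + 1. */
    public static int convOut(int in, int kernel, int stride, int padding) {
        return (in + 2 * padding - kernel) / stride + 1;
    }

    /** Non-overlapping pooling where the window equals the stride; odd sizes are floored. */
    public static int poolOut(int in, int window) {
        return in / window;
    }

    public static void main(String[] args) {
        int size = 28;
        size = convOut(size, 3, 1, 0);
        System.out.println("Conv 1: " + size + "x" + size);
        size = poolOut(size, 2);
        System.out.println("Pool 1: " + size + "x" + size);
        size = convOut(size, 3, 1, 0);
        System.out.println("Conv 2: " + size + "x" + size);
        size = poolOut(size, 2);
        System.out.println("Pool 2: " + size + "x" + size);
        size = convOut(size, 3, 1, 0);
        System.out.println("Conv 3: " + size + "x" + size);
    }
}
```

With these assumptions the chain is 28 → 26 → 13 → 11 → 5 → 3, so the third convolution produces 3×3 maps. If the library pads its convolutions ("same" padding), the sizes would instead be 28, 14, and 7.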

Recurrent Neural Networks (RNN)

RNN for Sequence Processing

import org.superml.neural.RNNClassifier;
import org.superml.preprocessing.NeuralNetworkPreprocessor;

public class RNNExample {
    public static void main(String[] args) {
        System.out.println("=== SuperML 2.1.0 - RNN for Sequence Processing ===\n");
        
        try {
            // Generate synthetic sequence data
            double[][] sequenceData = generateSequenceData(800, 30, 8);
            double[] labels = generateSequenceLabels(800);
            
            // Split data
            var split = splitSequenceData(sequenceData, labels, 0.8);
            
            // Apply RNN preprocessing
            NeuralNetworkPreprocessor preprocessor = new NeuralNetworkPreprocessor(
                NeuralNetworkPreprocessor.NetworkType.RNN).configureRNN(30, 8, false);
            
            double[][] XTrainProcessed = preprocessor.preprocessRNN(split.XTrain);
            double[][] XTestProcessed = preprocessor.preprocessRNN(split.XTest);
            
            System.out.println("📈 Applied RNN preprocessing: global scaling + temporal smoothing");
            System.out.println("📊 Sequence length: 30 timesteps");
            System.out.println("📊 Features per timestep: 8");
            System.out.println("📊 Training sequences: " + XTrainProcessed.length);
            
            // Create RNN with LSTM cells
            RNNClassifier rnn = new RNNClassifier()
                .setHiddenSize(64)                 // LSTM hidden units
                .setNumLayers(2)                   // 2 LSTM layers
                .setCellType("LSTM")               // LSTM cells
                .setDropoutRate(0.2)               // Dropout between layers
                .setBidirectional(false)           // Unidirectional
                .setSequenceLength(30)             // Input sequence length
                .setInputSize(8)                   // Features per timestep
                .setOutputSize(3)                  // Number of classes
                .setLearningRate(0.01)             // Learning rate
                .setMaxEpochs(100)                 // Training epochs
                .setBatchSize(32)                  // Batch size
                .setEarlyStoppingPatience(15)      // Early stopping
                .setGradientClipping(1.0);         // Gradient clipping
            
            System.out.println("🔄 RNN Architecture:");
            System.out.println("- Input: 30 timesteps × 8 features");
            System.out.println("- LSTM Layer 1: 64 hidden units");
            System.out.println("- LSTM Layer 2: 64 hidden units");
            System.out.println("- Dropout: 20% between layers");
            System.out.println("- Output: 3 classes");
            
            // Train the RNN
            long startTime = System.currentTimeMillis();
            rnn.fit(XTrainProcessed, split.yTrain);
            long trainingTime = System.currentTimeMillis() - startTime;
            
            // Make predictions
            double[] predictions = rnn.predict(XTestProcessed);
            
            // Evaluate performance
            double accuracy = calculateAccuracy(split.yTest, predictions);
            
            System.out.println("\n=== RNN Results ===");
            System.out.println("Training time: " + trainingTime + " ms");
            System.out.println("Test accuracy: " + String.format("%.4f", accuracy));
            
            // Display sequence processing info
            System.out.println("\n🔍 Sequence Processing:");
            System.out.println("- Total parameters: ~" + rnn.getParameterCount());
            System.out.println("- Memory cells: " + (rnn.getHiddenSize() * rnn.getNumLayers()));
            System.out.println("- Gradient clipping: " + rnn.getGradientClipping());
            
            System.out.println("\n✅ RNN training completed successfully!");
            
        } catch (Exception e) {
            System.err.println("❌ Error in RNN training: " + e.getMessage());
            e.printStackTrace();
        }
    }
    
    private static double[][] generateSequenceData(int samples, int sequenceLength, int features) {
        double[][] data = new double[samples][sequenceLength * features];
        java.util.Random random = new java.util.Random(42);
        
        for (int i = 0; i < samples; i++) {
            for (int t = 0; t < sequenceLength; t++) {
                for (int f = 0; f < features; f++) {
                    int idx = t * features + f;
                    // Create time-dependent patterns
                    data[i][idx] = Math.sin(t * 0.1 + f) + random.nextGaussian() * 0.1;
                }
            }
        }
        return data;
    }
    
    private static double[] generateSequenceLabels(int samples) {
        double[] labels = new double[samples];
        java.util.Random random = new java.util.Random(42);
        
        for (int i = 0; i < samples; i++) {
            labels[i] = random.nextInt(3); // 3 classes
        }
        return labels;
    }
    
    private static DataSplit splitSequenceData(double[][] X, double[] y, double trainRatio) {
        int trainSize = (int) (X.length * trainRatio);
        
        double[][] XTrain = new double[trainSize][];
        double[][] XTest = new double[X.length - trainSize][];
        double[] yTrain = new double[trainSize];
        double[] yTest = new double[X.length - trainSize];
        
        System.arraycopy(X, 0, XTrain, 0, trainSize);
        System.arraycopy(X, trainSize, XTest, 0, X.length - trainSize);
        System.arraycopy(y, 0, yTrain, 0, trainSize);
        System.arraycopy(y, trainSize, yTest, 0, X.length - trainSize);
        
        return new DataSplit(XTrain, XTest, yTrain, yTest);
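    }
    
    // Same helpers as in CNNExample, repeated here so RNNExample compiles on its own
    private static class DataSplit {
        final double[][] XTrain, XTest;
        final double[] yTrain, yTest;
        
        DataSplit(double[][] XTrain, double[][] XTest, double[] yTrain, double[] yTest) {
            this.XTrain = XTrain;
            this.XTest = XTest;
            this.yTrain = yTrain;
            this.yTest = yTest;
        }
    }
    
    private static double calculateAccuracy(double[] actual, double[] predicted) {
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (Math.round(predicted[i]) == Math.round(actual[i])) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }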
}
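
The example prints a parameter count without showing where the number comes from. As a rough cross-check, the sketch below counts parameters for the configuration used above (8 input features, two LSTM layers of 64 units, 3 output classes) using the textbook LSTM formulation with four gates and one bias vector per gate. Whether SuperML's getParameterCount() uses the same convention is an assumption; some libraries keep two bias vectors per gate, which gives a slightly larger total. LstmParamSketch and its methods are hypothetical names.

```java
public class LstmParamSketch {
    /** Four gates, each with input weights, recurrent weights, and one bias vector. */
    public static long lstmLayerParams(int inputSize, int hiddenSize) {
        return 4L * ((long) hiddenSize * (inputSize + hiddenSize) + hiddenSize);
    }

    /** Fully connected layer: weight matrix plus bias vector. */
    public static long denseParams(int inputSize, int outputSize) {
        return (long) inputSize * outputSize + outputSize;
    }

    public static void main(String[] args) {
        long layer1 = lstmLayerParams(8, 64);  // first layer reads the 8 input features
        long layer2 = lstmLayerParams(64, 64); // second layer reads layer 1's hidden state
        long output = denseParams(64, 3);      // classification head
        System.out.println("LSTM layer 1: " + layer1);
        System.out.println("LSTM layer 2: " + layer2);
        System.out.println("Output layer: " + output);
        System.out.println("Total:        " + (layer1 + layer2 + output));
    }
}
```

Under this convention the layers contribute 18,688, 33,024, and 195 parameters, for a total of 51,907. Note that the count does not depend on the sequence length, because the same weights are reused at every timestep.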

Model Persistence and Deployment

Saving and Loading Neural Networks

import org.superml.persistence.ModelPersistence;
import org.superml.neural.MLPClassifier;

import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class NeuralNetworkPersistence {
    public static void main(String[] args) {
        try {
            // Train a neural network
            MLPClassifier mlp = new MLPClassifier()
                .setHiddenLayerSizes(128, 64, 32)
                .setActivation("relu")
                .setLearningRate(0.01)
                .setMaxIter(100);
            
            // Assume training data is available
            // mlp.fit(XTrain, yTrain);
            
            // Save model with metadata
            Map<String, Object> metadata = new HashMap<>();
            metadata.put("model_type", "MLPClassifier");
            metadata.put("architecture", "128-64-32");
            metadata.put("activation", "relu");
            metadata.put("training_samples", 1000);
            metadata.put("accuracy", 0.95);
            metadata.put("created_date", new Date().toString());
            
            String modelPath = "models/neural_network_model.superml";
            ModelPersistence.save(mlp, modelPath, "Production MLP Model", metadata);
            
            System.out.println("✅ Model saved to: " + modelPath);
            
            // Load model for inference
            MLPClassifier loadedModel = ModelPersistence.load(modelPath, MLPClassifier.class);
            
            System.out.println("✅ Model loaded successfully");
            System.out.println("🏗️ Architecture: " + Arrays.toString(loadedModel.getHiddenLayerSizes()));
            System.out.println("⚡ Activation: " + loadedModel.getActivation());
            
            // Use loaded model for predictions
            // double[] predictions = loadedModel.predict(XTest);
            
        } catch (Exception e) {
            System.err.println("❌ Error in model persistence: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
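
The on-disk format behind ModelPersistence is not described here. To illustrate the general idea of storing weights together with metadata and reading both back, the sketch below uses standard Java serialization. Everything in it is an assumption for illustration: the Snapshot class, the flat weight array, and the roundTrip helper are hypothetical and say nothing about how the .superml format works.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class PersistenceSketch {
    /** Hypothetical snapshot: flat weights plus free-form metadata. */
    public static class Snapshot implements Serializable {
        private static final long serialVersionUID = 1L;
        public final double[] weights;
        public final HashMap<String, String> metadata;

        public Snapshot(double[] weights, Map<String, String> metadata) {
            this.weights = weights.clone();          // defensive copies
            this.metadata = new HashMap<>(metadata);
        }
    }

    public static void save(Snapshot snapshot, Path path) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(path))) {
            out.writeObject(snapshot);
        }
    }

    public static Snapshot load(Path path) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(path))) {
            return (Snapshot) in.readObject();
        }
    }

    /** Writes the snapshot to a temporary file, reads it back, and cleans up. */
    public static Snapshot roundTrip(Snapshot snapshot) {
        try {
            Path path = Files.createTempFile("model", ".bin");
            try {
                save(snapshot, path);
                return load(path);
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("architecture", "128-64-32");
        Snapshot loaded = roundTrip(new Snapshot(new double[] {0.1, -0.2}, metadata));
        System.out.println(loaded.metadata.get("architecture") + ", weights: " + loaded.weights.length);
    }
}
```

Java serialization is convenient for a sketch, but it ties the file to the class layout and should only be used to read files from trusted sources. A versioned, documented format is a safer choice for long-lived production models.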

Advanced Neural Network Features

Ensemble of Neural Networks

import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;
import org.superml.ensemble.NeuralNetworkEnsemble;

// Renamed so the example class does not clash with the imported NeuralNetworkEnsemble
public class NeuralNetworkEnsembleExample {
    public static void main(String[] args) {
        try {
            // Create ensemble of different neural networks
            MLPClassifier mlp = new MLPClassifier()
                .setHiddenLayerSizes(128, 64)
                .setActivation("relu");
            
            CNNClassifier cnn = new CNNClassifier()
                .setInputShape(28, 28, 1)
                .addConvLayer(32, 3, 3, "relu")
                .addMaxPoolLayer(2, 2)
                .addDenseLayer(64, "relu");
            
            RNNClassifier rnn = new RNNClassifier()
                .setHiddenSize(64)
                .setCellType("LSTM")
                .setNumLayers(2);
            
            // Create ensemble
            NeuralNetworkEnsemble ensemble = new NeuralNetworkEnsemble()
                .addModel("mlp", mlp, 0.4)      // 40% weight
                .addModel("cnn", cnn, 0.35)     // 35% weight
                .addModel("rnn", rnn, 0.25)     // 25% weight
                .setVotingStrategy("weighted_average");
            
            System.out.println("🔗 Neural Network Ensemble:");
            System.out.println("- MLP: 40% weight");
            System.out.println("- CNN: 35% weight");
            System.out.println("- RNN: 25% weight");
            System.out.println("- Strategy: Weighted Average");
            
            // Train ensemble (each model on preprocessed data)
            // ensemble.fit(XTrain, yTrain);
            
            // Make ensemble predictions
            // double[] predictions = ensemble.predict(XTest);
            
        } catch (Exception e) {
            System.err.println("❌ Error in ensemble: " + e.getMessage());
        }
    }
}
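
A "weighted_average" voting strategy usually means averaging each model's class probabilities using the model weights and then picking the class with the highest combined score. The sketch below shows that computation with the 0.4 / 0.35 / 0.25 weights from the example. The probability values are invented for illustration, and WeightedVoteSketch is a hypothetical helper, not the ensemble class's actual implementation.

```java
public class WeightedVoteSketch {
    /** probs[m][c] is the probability model m assigns to class c; weights[m] is the model's weight. */
    public static double[] combine(double[][] probs, double[] weights) {
        double[] combined = new double[probs[0].length];
        double weightSum = 0.0;
        for (int m = 0; m < probs.length; m++) {
            weightSum += weights[m];
            for (int c = 0; c < combined.length; c++) {
                combined[c] += weights[m] * probs[m][c];
            }
        }
        for (int c = 0; c < combined.length; c++) {
            combined[c] /= weightSum; // normalise in case the weights do not sum to 1
        }
        return combined;
    }

    /** Index of the largest value; ties go to the lowest index. */
    public static int argmax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] probs = {
            {0.6, 0.3, 0.1},  // made-up MLP output
            {0.2, 0.7, 0.1},  // made-up CNN output
            {0.3, 0.3, 0.4}   // made-up RNN output
        };
        double[] weights = {0.4, 0.35, 0.25};
        double[] combined = combine(probs, weights);
        System.out.println("Predicted class: " + argmax(combined));
    }
}
```

In this made-up case the MLP alone would pick class 0, but the combined scores favour class 1, which shows how a lower-weighted but confident model can still change the ensemble's decision.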

Performance Optimization

Neural Network Performance Tips

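import org.superml.neural.MLPClassifier;
import org.superml.neural.CNNClassifier;
import org.superml.neural.RNNClassifier;

public class NeuralNetworkOptimization {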
    public static void main(String[] args) {
        System.out.println("=== Neural Network Performance Optimization ===\n");
        
        // 1. Batch Processing for High Throughput
        MLPClassifier optimizedMLP = new MLPClassifier()
            .setHiddenLayerSizes(256, 128, 64)
            .setBatchSize(128)              // Larger batch size
            .setMaxIter(50)                 // Fewer epochs
            .setEarlyStoppingPatience(5)    // Early stopping
            .setParallelTraining(true)      // Parallel processing
            .setGPUAcceleration(true);      // GPU acceleration if available
        
        // 2. Memory-Efficient Training
        CNNClassifier memoryEfficientCNN = new CNNClassifier()
            .setInputShape(224, 224, 3)
            .setBatchSize(16)               // Smaller batch for large images
            .setGradientAccumulation(4)     // Accumulate gradients
            .setMixedPrecision(true)        // Use FP16 for memory efficiency
            .setMemoryOptimization(true);
        
        // 3. Fast Inference Configuration
        RNNClassifier fastRNN = new RNNClassifier()
            .setHiddenSize(32)              // Smaller hidden size
            .setNumLayers(1)                // Single layer
            .setCellType("GRU")             // Faster than LSTM
            .setInferenceOptimization(true) // Optimize for inference
            .setBatchInference(true);       // Batch predictions
        
        System.out.println("🚀 Performance Optimizations:");
        System.out.println("- Batch processing: 128 samples/batch");
        System.out.println("- Early stopping: Prevent overfitting");
        System.out.println("- GPU acceleration: When available");
        System.out.println("- Memory optimization: FP16 precision");
        System.out.println("- Inference optimization: Fast predictions");
        
        // 4. Monitoring and Profiling
        long startTime = System.currentTimeMillis();
        
        // Training code here...
        
        long endTime = System.currentTimeMillis();
        System.out.println("\n⏱️ Training completed in: " + (endTime - startTime) + " ms");
        
        // Memory usage
        Runtime runtime = Runtime.getRuntime();
        long memoryUsed = runtime.totalMemory() - runtime.freeMemory();
        System.out.println("💾 Memory used: " + (memoryUsed / 1024 / 1024) + " MB");
    }
}
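
Gradient accumulation, as configured above with a batch size of 16 and 4 accumulation steps, trades memory for time: four small batches are processed one after another and their gradients are averaged before a single weight update. The sketch below demonstrates, for a one-parameter linear model with squared error, that this average matches the gradient of one batch of 64. The model, data, and class name GradAccumulationSketch are illustrative assumptions.

```java
public class GradAccumulationSketch {
    /** Mean gradient of (w * x - y)^2 with respect to w over rows [from, to). */
    public static double meanGrad(double w, double[] x, double[] y, int from, int to) {
        double sum = 0.0;
        for (int i = from; i < to; i++) {
            sum += 2 * (w * x[i] - y[i]) * x[i];
        }
        return sum / (to - from);
    }

    /** Averages the gradients of `steps` equally sized micro-batches. */
    public static double accumulatedGrad(double w, double[] x, double[] y, int steps) {
        int micro = x.length / steps;
        double accumulated = 0.0;
        for (int s = 0; s < steps; s++) {
            accumulated += meanGrad(w, x, y, s * micro, (s + 1) * micro) / steps;
        }
        return accumulated;
    }

    public static void main(String[] args) {
        double[] x = new double[64];
        double[] y = new double[64];
        for (int i = 0; i < x.length; i++) {
            x[i] = i * 0.1;
            y[i] = 3.0 * x[i]; // true slope is 3
        }
        System.out.println("Full batch of 64:      " + meanGrad(0.0, x, y, 0, x.length));
        System.out.println("4 micro-batches of 16: " + accumulatedGrad(0.0, x, y, 4));
    }
}
```

The two values agree up to floating-point rounding as long as the micro-batches are equally sized. Layers whose behaviour depends on batch statistics, such as batch normalization, are the usual exception to this equivalence.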

Best Practices

1. Data Preparation

  • Normalization: Always normalize input data for neural networks
  • Augmentation: Use data augmentation for image data
  • Sequence Padding: Ensure consistent sequence lengths for RNNs
  • Validation Split: Reserve data for validation during training
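
As one concrete case from the list above, sequence padding can be sketched in a few lines. The helper below truncates long sequences and right-pads short ones with a fill value so every row has the same length. PadSketch is a hypothetical utility written for this tutorial, and the choice of padding on the right with 0.0 is an assumption; some setups pad on the left instead.

```java
import java.util.Arrays;

public class PadSketch {
    /** Truncates or right-pads each sequence with padValue to exactly `length` steps. */
    public static double[][] padSequences(double[][] sequences, int length, double padValue) {
        double[][] out = new double[sequences.length][length];
        for (int i = 0; i < sequences.length; i++) {
            Arrays.fill(out[i], padValue);
            int copied = Math.min(length, sequences[i].length);
            System.arraycopy(sequences[i], 0, out[i], 0, copied);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] ragged = {{1, 2}, {1, 2, 3, 4, 5}};
        System.out.println(Arrays.deepToString(padSequences(ragged, 3, 0.0)));
    }
}
```

Here the two-step sequence gains one padding value and the five-step sequence loses its last two steps, so both end up with three steps.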

2. Architecture Design

  • Start Simple: Begin with smaller networks and increase complexity
  • Regularization: Use dropout and weight decay to prevent overfitting
  • Batch Normalization: Improve training stability and speed
  • Residual Connections: For very deep networks
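
Dropout, mentioned under regularization above, is commonly implemented as "inverted dropout": during training each activation is zeroed with probability equal to the dropout rate, and the survivors are scaled up so the expected activation stays the same. The sketch below shows that idea in isolation. DropoutSketch is a hypothetical helper, and it is an assumption that SuperML's dropout follows this variant.

```java
import java.util.Arrays;
import java.util.Random;

public class DropoutSketch {
    /** Zeroes each unit with probability `rate` and scales the rest by 1 / (1 - rate). */
    public static double[] dropout(double[] activations, double rate, Random rng) {
        double[] out = new double[activations.length];
        double keepScale = 1.0 / (1.0 - rate);
        for (int i = 0; i < activations.length; i++) {
            out[i] = rng.nextDouble() < rate ? 0.0 : activations[i] * keepScale;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] activations = {1.0, 2.0, 3.0, 4.0};
        // Training time: random units are dropped, survivors are doubled at rate 0.5
        System.out.println(Arrays.toString(dropout(activations, 0.5, new Random(42))));
        // Inference time: dropout is disabled, so activations pass through unchanged
        System.out.println(Arrays.toString(activations));
    }
}
```

Because the scaling happens during training, nothing needs to change at inference time, which keeps prediction code simple.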

3. Training Strategies

  • Learning Rate Scheduling: Decrease learning rate during training
  • Early Stopping: Monitor validation loss to prevent overfitting
  • Gradient Clipping: Prevent exploding gradients in RNNs
  • Checkpointing: Save model state during training
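
Two of the strategies above, early stopping and learning rate scheduling, reduce to very small pieces of logic. The sketch below shows a patience-based stopping rule and a step-decay schedule in plain Java. The class EarlyStoppingSketch, the validation losses, and the decay settings are all made up for illustration and are not taken from SuperML.

```java
public class EarlyStoppingSketch {
    /** Returns the epoch index at which training stops, given per-epoch validation losses. */
    public static int stopEpoch(double[] valLoss, int patience) {
        double best = Double.POSITIVE_INFINITY;
        int epochsSinceBest = 0;
        for (int epoch = 0; epoch < valLoss.length; epoch++) {
            if (valLoss[epoch] < best) {
                best = valLoss[epoch];
                epochsSinceBest = 0;      // improvement resets the counter
            } else if (++epochsSinceBest >= patience) {
                return epoch;             // no improvement for `patience` epochs
            }
        }
        return valLoss.length - 1;        // ran to the end without triggering
    }

    /** Step decay: multiply the rate by `factor` once every `every` epochs. */
    public static double stepDecay(double initialRate, double factor, int every, int epoch) {
        return initialRate * Math.pow(factor, epoch / every); // integer division is intended
    }

    public static void main(String[] args) {
        double[] valLoss = {1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9};
        System.out.println("Stops at epoch index: " + stopEpoch(valLoss, 3));
        System.out.println("Rate at epoch 25: " + stepDecay(0.01, 0.5, 10, 25));
    }
}
```

In this example the best loss occurs at epoch index 2, and the three following epochs fail to beat it, so training stops at index 5. In practice the weights from the best epoch are restored, which is where checkpointing comes in.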

4. Production Deployment

  • Model Compression: Use pruning and quantization
  • Batch Inference: Process multiple samples together
  • Model Caching: Cache frequently used models
  • Performance Monitoring: Track inference time and accuracy

Troubleshooting Common Issues

Training Problems

// Problem: Vanishing gradients
// Solution: Use ReLU activation and proper weight initialization
MLPClassifier mlp = new MLPClassifier()
    .setActivation("relu")
    .setWeightInitialization("xavier")
    .setGradientClipping(1.0);

// Problem: Exploding gradients
// Solution: Gradient clipping
RNNClassifier rnn = new RNNClassifier()
    .setGradientClipping(1.0)
    .setLearningRate(0.001);

// Problem: Overfitting
// Solution: Regularization and dropout
CNNClassifier cnn = new CNNClassifier()
    .addDropoutLayer(0.5)
    .setL2Regularization(0.01)
    .setEarlyStoppingPatience(10);
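
To show what a gradient clipping threshold of 1.0 can mean in practice, the sketch below clips a gradient vector by its L2 norm: if the norm exceeds the threshold, the whole vector is rescaled so its direction is preserved and its length equals the threshold. Whether setGradientClipping clips by norm or clips each value individually is not stated in this tutorial, so treat this as one common interpretation. ClipSketch is a hypothetical helper.

```java
import java.util.Arrays;

public class ClipSketch {
    /** Rescales grad in place so its L2 norm is at most maxNorm; returns the original norm. */
    public static double clipByNorm(double[] grad, double maxNorm) {
        double sumSquares = 0.0;
        for (double g : grad) {
            sumSquares += g * g;
        }
        double norm = Math.sqrt(sumSquares);
        if (norm > maxNorm) {
            double scale = maxNorm / norm;
            for (int i = 0; i < grad.length; i++) {
                grad[i] *= scale;
            }
        }
        return norm;
    }

    public static void main(String[] args) {
        double[] grad = {3.0, 4.0}; // L2 norm is 5
        double before = clipByNorm(grad, 1.0);
        System.out.println("Norm before clipping: " + before);
        System.out.println("Clipped gradient: " + Arrays.toString(grad));
    }
}
```

Gradients whose norm is already below the threshold are left untouched, so clipping only intervenes on the occasional very large update that would otherwise destabilise training.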

Summary

In this tutorial, you learned:

  • MLP Implementation: Multi-layer perceptrons for tabular data
  • CNN Architecture: Convolutional networks for image processing
  • RNN with LSTM: Recurrent networks for sequence data
  • Preprocessing: Specialized data preparation for neural networks
  • Model Persistence: Saving and loading trained models
  • Performance Optimization: Tips for faster training and inference
  • Best Practices: Guidelines for production deployment

SuperML Java 2.1.0 exposes MLP, CNN, and RNN models through the same fluent, builder-style API used throughout this tutorial. The framework handles the network implementation, so your code can stay focused on configuration, training, and evaluation.

Next Steps

  • Try XGBoost: Learn gradient boosting for tabular data
  • Explore AutoML: Automated neural architecture search
  • Model Deployment: Production deployment with inference engine
  • Advanced Preprocessing: Feature engineering for neural networks
  • Ensemble Methods: Combining multiple neural networks

You’re now ready to build sophisticated neural network applications with SuperML Java 2.1.0!