Thursday, January 30, 2020

[ebook] Deep Learning with JavaScript: Neural networks in TensorFlow.js

Deep learning has transformed the fields of computer vision, image processing, and natural language applications. Thanks to TensorFlow.js, now JavaScript developers can build deep learning apps without relying on Python or R. Deep Learning with JavaScript shows developers how they can bring DL technology to the web. Written by the main authors of the TensorFlow library, this new book provides fascinating use cases and in-depth instruction for deep learning apps in JavaScript in your browser or on Node.

about the technology

Running deep learning applications in the browser or on Node-based backends opens up exciting possibilities for smart web applications. With the TensorFlow.js library, you build and train deep learning models with JavaScript. Offering uncompromising production-quality scalability, modularity, and responsiveness, TensorFlow.js really shines for its portability. Its models run anywhere JavaScript runs, pushing ML farther up the application stack.

about the book

In Deep Learning with JavaScript, you’ll learn to use TensorFlow.js to build deep learning models that run directly in the browser. This fast-paced book, written by Google engineers, is practical, engaging, and easy to follow. Through diverse examples featuring text analysis, speech processing, image recognition, and self-learning game AI, you’ll master all the basics of deep learning and explore advanced concepts, like retraining existing models for transfer learning and image generation.

detailed TOC

PART 1: MOTIVATION AND BASIC CONCEPTS

1 DEEP LEARNING AND JAVASCRIPT

PART 2: A GENTLE INTRODUCTION TO TENSORFLOW.JS

2 GETTING STARTED: SIMPLE LINEAR REGRESSION IN TENSORFLOW.JS

3 ADDING NONLINEARITY: BEYOND WEIGHTED SUMS

4 RECOGNIZING IMAGES AND SOUNDS USING CONVNETS

5 TRANSFER LEARNING: REUSING PRETRAINED NEURAL NETWORKS

PART 3: ADVANCED DEEP LEARNING WITH TENSORFLOW.JS

6 WORKING WITH DATA

7 VISUALIZING DATA AND MODELS

8 UNDERFITTING, OVERFITTING, AND THE UNIVERSAL WORKFLOW OF MACHINE LEARNING

9 DEEP LEARNING FOR SEQUENCES AND TEXT

10 GENERATIVE DEEP LEARNING

11 BASICS OF DEEP REINFORCEMENT LEARNING

PART 4: SUMMARY AND CLOSING WORDS

12 TESTING, OPTIMIZING, AND DEPLOYING MODELS

13 SUMMARY, CONCLUSIONS, AND BEYOND

APPENDIXES

APPENDIX A: INSTALLING TFJS-NODE-GPU AND ITS DEPENDENCIES

APPENDIX B: A QUICK TUTORIAL OF TENSORS AND OPERATIONS IN TENSORFLOW.JS

what's inside

Image and language processing in the browser
Tuning ML models with client-side data
Text and image creation with generative deep learning
Source code samples to test and modify

about the reader

For JavaScript programmers interested in deep learning.

about the author

Shanqing Cai, Stanley Bileschi, and Eric D. Nielsen are software engineers with experience on the Google Brain team, and were crucial to the development of the high-level API of TensorFlow.js. This book is based in part on the classic Deep Learning with Python by François Chollet.

https://www.manning.com/books/deep-learning-with-javascript

Saturday, January 4, 2020

DeepLearning With TensorFlowJS 4 - The intuitions behind Gradient-Descent Optimization

A one-layer model fits a linear function f(input), defined as output = kernel * input + bias.

The kernel and bias are tunable parameters (the weights) of the dense layer.

These weights contain the information learned by the network from exposure to the training data.

Initially, these weights are filled with small random values (a step called random initialization).

To find a good setting for the kernel and bias (collectively, the weights) we need two things:

A measure that tells us how well we are doing at a given setting of the weights. This measure is provided by the loss function.

A method to update the weights’ values so that next time we will do better than we currently are doing, according to the measure previously mentioned. This is the job of the optimizer, i.e., the algorithm by which the network will update its weights (kernel and bias, in this case) based on the data and the loss function.

The compile() method specifies 'sgd' as the optimizer and 'meanAbsoluteError' as the loss.

'meanAbsoluteError' means that the loss function will calculate how far the predictions are from the targets, take their absolute values (making them all positive), and then return the average of those values:

meanAbsoluteError = average( absolute(modelOutput - targets))
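As a rough illustration of the formula above, the same calculation can be written by hand in plain JavaScript. The function name meanAbsoluteErrorPlain and the sample numbers are made up for this sketch; TensorFlow.js carries out the equivalent calculation on tensors.

```javascript
// Hypothetical helper (not part of TensorFlow.js): mean absolute error on plain arrays.
function meanAbsoluteErrorPlain(modelOutput, targets) {
  if (modelOutput.length !== targets.length || targets.length === 0) {
    throw new Error('modelOutput and targets must be non-empty and the same length');
  }
  let total = 0;
  for (let i = 0; i < targets.length; i++) {
    // Absolute value makes every error positive before averaging.
    total += Math.abs(modelOutput[i] - targets[i]);
  }
  return total / targets.length;
}

// Illustrative values only: the errors are 0.5, 0 and 1, so the average is 0.5.
const exampleLoss = meanAbsoluteErrorPlain([1.5, 2, 2], [1, 2, 3]);
console.log(exampleLoss); // 0.5
```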

'sgd' stands for stochastic gradient descent, an optimization algorithm that uses calculus (the gradient of the loss) to determine what adjustments should be made to the weights in order to reduce the loss.

The fit() method runs the training process of a model in TensorFlow.js. Training can often be long-running, lasting for seconds or minutes, so fit() is asynchronous: it returns a Promise, which is why the example code uses the async/await feature.

The evaluate() method calculates the loss function as applied to the provided example features and targets. It is similar to the fit() method in that it calculates the same loss, but evaluate() does not update the model’s weights.

The training loop iterates through the following steps:

1. Draw a batch of training samples x and corresponding targets y_true. A batch is simply a number of input examples put together as a tensor. The number of examples in a batch is called the batch size. In practical deep learning, it is often set to be a power of 2, such as 128 or 256. Examples are batched together to take advantage of the GPU’s parallel processing power and to make the calculated values of the gradients more stable.

2. Run the network on x (a step called the forward pass) to obtain predictions y_pred.

3. Compute the loss of the network on the batch, a measure of the mismatch between y_true and y_pred. Recall that the loss function is specified when model.compile() is called.

4. Update all the weights (parameters) in the network in a way that slightly reduces the loss on this batch. The detailed updates to the individual weights are managed by the optimizer, which was specified during the model.compile() call.
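The four steps above can be sketched without TensorFlow.js for the one-layer linear model. This is only an illustration under stated assumptions: the eight data points are copied from the download-time training data, batches are drawn in order rather than at random so that the run is repeatable, and all variable names are made up for the sketch. The update rule is the hand-derived gradient of the mean absolute error, not the internals of the library's optimizer.

```javascript
// Plain-JavaScript sketch of the four training-loop steps (no TensorFlow.js).
// Eight (sizeMB, timeSec) pairs taken from the download-time training data.
const loopXs = [9.0, 8.0, 5.0, 6.0, 2.0, 10.0, 7.0, 1.0];
const loopYs = [0.739, 0.646, 0.435, 0.497, 0.289, 0.744, 0.560, 0.153];

// Mean absolute error of the linear model kernel * x + bias on a set of examples.
function loopLoss(kernel, bias, xs, ys) {
  let total = 0;
  for (let i = 0; i < xs.length; i++) {
    total += Math.abs(kernel * xs[i] + bias - ys[i]);
  }
  return total / xs.length;
}

let loopKernel = 0;
let loopBias = 0;
const loopLearningRate = 0.0005; // small rate, chosen for illustration
const loopBatchSize = 4;         // a tiny power of 2, chosen for this toy data
const initialLoopLoss = loopLoss(loopKernel, loopBias, loopXs, loopYs);

for (let epoch = 0; epoch < 200; epoch++) {
  for (let start = 0; start < loopXs.length; start += loopBatchSize) {
    // Step 1: draw a batch of samples x and targets yTrue.
    const x = loopXs.slice(start, start + loopBatchSize);
    const yTrue = loopYs.slice(start, start + loopBatchSize);

    // Step 2: forward pass.
    const yPred = x.map((xi) => loopKernel * xi + loopBias);

    // Step 3: the batch loss is the mean of |yPred - yTrue|. Its gradient is
    // mean(sign(error) * x) for the kernel and mean(sign(error)) for the bias.
    let gradKernel = 0;
    let gradBias = 0;
    for (let i = 0; i < x.length; i++) {
      const sign = Math.sign(yPred[i] - yTrue[i]);
      gradKernel += (sign * x[i]) / x.length;
      gradBias += sign / x.length;
    }

    // Step 4: move each weight a small step against its gradient.
    loopKernel -= loopLearningRate * gradKernel;
    loopBias -= loopLearningRate * gradBias;
  }
}

const finalLoopLoss = loopLoss(loopKernel, loopBias, loopXs, loopYs);
console.log({ initialLoopLoss, finalLoopLoss, loopKernel, loopBias });
```

The loss printed after training should be well below the initial one, which is the only claim this sketch is meant to support.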

Viewed as a function of all tunable parameters, the loss forms what is known as the loss surface.

The loss surface for this example has a bowl shape, with a global minimum at the bottom of the bowl representing the best parameter settings.
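One way to get a feel for the bowl shape is to evaluate the loss at a grid of (kernel, bias) settings. The sketch below does that in plain JavaScript for four points copied from the download-time training data. The grid bounds and step are assumptions chosen for illustration, and the names are hypothetical.

```javascript
// Sketch: sample the MAE loss surface of output = k * x + b on a coarse grid.
const surfaceXs = [1.0, 2.0, 5.0, 10.0];
const surfaceYs = [0.153, 0.289, 0.435, 0.744];

function surfaceLoss(k, b) {
  let total = 0;
  for (let i = 0; i < surfaceXs.length; i++) {
    total += Math.abs(k * surfaceXs[i] + b - surfaceYs[i]);
  }
  return total / surfaceXs.length;
}

// Evaluate the loss for k and b in {0, 0.01, ..., 0.20} and remember the lowest point.
let best = { k: 0, b: 0, loss: surfaceLoss(0, 0) };
for (let i = 0; i <= 20; i++) {
  for (let j = 0; j <= 20; j++) {
    const k = i * 0.01;
    const b = j * 0.01;
    const loss = surfaceLoss(k, b);
    if (loss < best.loss) {
      best = { k, b, loss };
    }
  }
}

console.log('lowest grid point:', best);
console.log('loss at grid corner (0, 0):', surfaceLoss(0, 0));
console.log('loss at grid corner (0.2, 0.2):', surfaceLoss(0.2, 0.2));
```

The lowest grid point lies in the interior of the grid and the corners are noticeably higher, which is what a bowl-shaped surface looks like when sampled this way.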

In general, however, the loss surface of a deep-learning model is much more complex. It will have many more than two dimensions and could have many local minima, i.e., points that are lower than anything nearby but not the lowest overall.

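A naive way to improve the weights would be to try random adjustments and keep the ones that lower the loss. For larger problems, i.e., when optimizing millions of weights, the likelihood of randomly selecting a good direction becomes vanishingly small.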

A much better approach is to take advantage of the fact that all operations used in the network are differentiable and hence, to compute the gradient of the loss with regard to the network’s parameters.

The mathematical definition of a gradient specifies a direction along which the loss function increases. When training neural networks, the loss should gradually decrease. Therefore the weights should be moved in the direction opposite the gradient. This training process is aptly named gradient descent.
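The direction argument can be checked numerically. The sketch below estimates the gradient of the mean absolute error with finite differences, then compares a small step against the gradient with a small step along it. The data points come from the download-time training data; the step size, the epsilon, and all names are assumptions for illustration. Real frameworks compute gradients analytically rather than by finite differences.

```javascript
// Sketch: estimate the gradient of the MAE loss numerically and step against it.
const gradXs = [1.0, 2.0, 5.0, 10.0];
const gradYs = [0.153, 0.289, 0.435, 0.744];

function gradLoss(k, b) {
  let total = 0;
  for (let i = 0; i < gradXs.length; i++) {
    total += Math.abs(k * gradXs[i] + b - gradYs[i]);
  }
  return total / gradXs.length;
}

// Central finite differences; epsilon is an assumed small constant.
function numericalGradient(k, b, epsilon = 1e-6) {
  return {
    dk: (gradLoss(k + epsilon, b) - gradLoss(k - epsilon, b)) / (2 * epsilon),
    db: (gradLoss(k, b + epsilon) - gradLoss(k, b - epsilon)) / (2 * epsilon),
  };
}

const startK = 0;
const startB = 0;
const stepSize = 0.0005;

const grad = numericalGradient(startK, startB);
const lossBefore = gradLoss(startK, startB);
// Moving opposite the gradient (descent) versus along it (ascent).
const lossAfterDescent = gradLoss(startK - stepSize * grad.dk, startB - stepSize * grad.db);
const lossAfterAscent = gradLoss(startK + stepSize * grad.dk, startB + stepSize * grad.db);

console.log({ grad, lossBefore, lossAfterDescent, lossAfterAscent });
```

Starting from kernel = 0 and bias = 0, every prediction is too low, so both gradient components are negative; stepping against them raises the weights and lowers the loss, while stepping along them does the opposite.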

One of the most desirable properties of deep neural networks is that they are universal approximators, which means they should be able to cover non-convex functions as well. The problem with non-convex functions is that your initial guess might not be near the global minimum, and gradient descent might converge to a local minimum instead. One approach that can help with this problem is stochastic gradient descent.

(https://www.mltut.com/stochastic-gradient-descent-a-super-easy-complete-guide/)

The term “stochastic” means drawing random samples from the training data during each gradient-descent step for efficiency, as opposed to using every training data sample at every step. In short, stochastic gradient descent is simply a modification of gradient descent for computational efficiency.
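The sampling step can be sketched as follows. The helper name drawRandomBatchIndices is hypothetical, and the sketch only shows one simple way to pick a random batch without replacement (a Fisher-Yates shuffle followed by taking the first few indices); it is not a description of how TensorFlow.js selects batches internally.

```javascript
// Sketch: pick a random mini-batch of example indices without replacement.
function drawRandomBatchIndices(numExamples, batchSize) {
  const indices = Array.from({ length: numExamples }, (_, i) => i);
  // Fisher-Yates shuffle: every ordering of the indices is equally likely.
  for (let i = indices.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [indices[i], indices[j]] = [indices[j], indices[i]];
  }
  return indices.slice(0, Math.min(batchSize, numExamples));
}

// With 20 training examples and a batch size of 4, each gradient step would
// look at only 4 randomly chosen examples instead of all 20.
const batchIndices = drawRandomBatchIndices(20, 4);
console.log(batchIndices); // four distinct indices between 0 and 19
```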

Stochastic means nondeterministic or unpredictable. Random generally means unrecognizable, not adhering to a pattern. A random variable is also called a stochastic variable. (https://math.stackexchange.com/questions/114373/whats-the-difference-between-stochastic-and-random)

Friday, January 3, 2020

DeepLearning With TensorFlowJS 3 - Fitting The Model

This tutorial is based on the book Deep Learning With JavaScript (TensorFlowJS).

////
// Data Markers
////
const dataTraceTrain = {
  x: trainData.sizeMB,
  y: trainData.timeSec,
  name: 'trainData',
  mode: 'markers',
  type: 'scatter',
  marker: {symbol: "circle", size: 8}
};
const dataTraceTest = {
  x: testData.sizeMB,
  y: testData.timeSec,
  name: 'testData',
  mode: 'markers',
  type: 'scatter',
  marker: {symbol: "triangle-up", size: 10}
};
const dataTrace10Epochs = {
  x: [0, 2],
  y: [0, 0.01],
  name: 'model after N epochs',
  mode: 'lines',
  line: {color: 'blue', width: 1, dash: 'dot'},
};
const dataTrace20Epochs = {
  x: [0, 2],
  y: [0, 0.01],
  name: 'model after N epochs',
  mode: 'lines',
  line: {color: 'green', width: 2, dash: 'dash'}
};
const dataTrace100Epochs = {
  x: [0, 2],
  y: [0, 0.01],
  name: 'model after N epochs',
  mode: 'lines',
  line: {color: 'red', width: 3, dash: 'longdash'}
};
const dataTrace200Epochs = {
  x: [0, 2],
  y: [0, 0.01],
  name: 'model after N epochs',
  mode: 'lines',
  line: {color: 'black', width: 4, dash: 'solid'}
};

////
// Set up plotly plot.
////
Plotly.newPlot('dataSpaceWith4Lines', [dataTraceTrain, dataTraceTest, dataTrace10Epochs, dataTrace20Epochs, dataTrace100Epochs, dataTrace200Epochs], {
  width: 700,
  title: 'Model fit result',
  xaxis: {
     title: 'size (MB)'
   },
  yaxis: {
    title: 'time (sec)'
  }
});

////
// Construct and compile model.
////
const model = tf.sequential();
model.add(tf.layers.dense({
  units: 1,
  inputShape: [1],
}));
// Use a slower learning rate for illustration purposes.
const optimizer = tf.train.sgd(0.0005);
model.compile({optimizer: optimizer, loss: 'meanAbsoluteError'});

// Updates a specified line on the plot.
function updateScatterWithLines(dataTrace, k, b, N, traceIndex) {
  dataTrace.x = [0, 10];
  dataTrace.y = [b, b + (k * 10)];
  const update = {
    x: [dataTrace.x],
    y: [dataTrace.y],
    name: 'model after ' + N + ' epochs'
  };
  Plotly.restyle('dataSpaceWith4Lines', update, traceIndex);
}

// Initialize to k=0, b=0 for pretty illustration purposes.  
// You may want to remove this to see how the model looks with
// random initialization
let k = 0;
let b = 0;
model.setWeights([tf.tensor2d([k], [1, 1]), tf.tensor1d([b])]);

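////
// Train the model within an async block.
// trainTensors is not defined in this excerpt; as an assumption, it is built
// here from the trainData arrays listed in part 2 of this series.
////
const trainTensors = {
  sizeMB: tf.tensor2d(trainData.sizeMB, [20, 1]),
  timeSec: tf.tensor2d(trainData.timeSec, [20, 1])
};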
(async () => {
  await model.fit(trainTensors.sizeMB, trainTensors.timeSec, {
    epochs: 200,
    // Use callbacks to orchestrate certain functions at certain triggers.
    callbacks: {
      onEpochEnd: async (epoch, logs) => {
        k = model.getWeights()[0].dataSync()[0];
        b = model.getWeights()[1].dataSync()[0];
        // console.log(`epoch ${epoch}`);
        if (epoch === 9) {
          updateScatterWithLines(dataTrace10Epochs, k, b, 10,  2);
          console.log('wrote model 10');
        }        
        if (epoch === 19) {
          updateScatterWithLines(dataTrace20Epochs, k, b, 20,  3);
          console.log('wrote model 20');
        }        
        if (epoch === 99) {
          updateScatterWithLines(dataTrace100Epochs, k, b, 100, 4);
          console.log('wrote model 100');
        }        
        if (epoch === 199) {
          updateScatterWithLines(dataTrace200Epochs, k, b, 200, 5);
          console.log('wrote model 200');
        }        
        await tf.nextFrame();
      }
    }
  });
})();  
  

https://codepen.io/tfjs-book/pen/VEVMMd

Thursday, January 2, 2020

DeepLearning With TensorFlowJS 2 - Plotting Tensor Data

This tutorial is based on the book Deep Learning With JavaScript (TensorFlowJS).

Tensors

Tensors are the core data structure of TensorFlow.js.

Tensors can also be thought of as containers for numbers.

They are a generalization of vectors and matrices to potentially higher dimensions.

The number of dimensions is called the tensor’s rank, and the sizes of all its dimensions together are called the tensor’s shape.
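To make shape concrete without loading TensorFlow.js, the sketch below infers it from a nested JavaScript array. The helper inferShape is hypothetical and assumes the nesting is regular (every sub-array at a given depth has the same length), since it only inspects the first element at each level.

```javascript
// Hypothetical helper: infer the shape of a regular nested array.
// The number of entries in the returned shape is the number of dimensions.
function inferShape(values) {
  const shape = [];
  let current = values;
  while (Array.isArray(current)) {
    shape.push(current.length);
    current = current[0]; // assumes all siblings have the same length
  }
  return shape;
}

const vectorShape = inferShape([1, 2, 3, 4]);                       // [4]
const matrixShape = inferShape([[1, 2], [3, 4]]);                   // [2, 2]
const cubeShape = inferShape([[[1], [2]], [[3], [4]], [[5], [6]]]); // [3, 2, 1]

console.log(vectorShape, matrixShape, cubeShape);
```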

Declaring a tensor

// Pass an array of values to create a vector.
tf.tensor([1, 2, 3, 4]).print();

// Pass a nested array of values to make a matrix or a higher
// dimensional tensor.
tf.tensor([[1, 2], [3, 4]]).print();

// Creates rank-1 tf.Tensor with the provided values, shape and dtype.
tf.tensor1d([1, 2, 3]).print();

// Creates rank-2 tf.Tensor with the provided values, shape and dtype.
// Pass a nested array.
tf.tensor2d([[1, 2], [3, 4]]).print();

(REFERENCE: https://js.tensorflow.org/api/latest/#tensor)

Plotly.js is a charting library that comes with over 40 chart types, 3D charts, statistical graphs, and SVG maps.

(REFERENCE: https://www.w3schools.com/js/js_graphics_plotly.asp )

<p>Plot of download time (sec) vs. file size (MB)</p>
<div id="dataSpace" class="plots"></div>

const trainData = {
  sizeMB:  [0.080, 9.000, 0.001, 0.100, 8.000, 5.000, 0.100, 6.000, 0.050, 0.500,
            0.002, 2.000, 0.005, 10.00, 0.010, 7.000, 6.000, 5.000, 1.000, 1.000],
  timeSec: [0.135, 0.739, 0.067, 0.126, 0.646, 0.435, 0.069, 0.497, 0.068, 0.116,
            0.070, 0.289, 0.076, 0.744, 0.083, 0.560, 0.480, 0.399, 0.153, 0.149]
};
const testData = {
  sizeMB:  [5.000, 0.200, 0.001, 9.000, 0.002, 0.020, 0.008, 4.000, 0.001, 1.000,
            0.005, 0.080, 0.800, 0.200, 0.050, 7.000, 0.005, 0.002, 8.000, 0.008],
  timeSec: [0.425, 0.098, 0.052, 0.686, 0.066, 0.078, 0.070, 0.375, 0.058, 0.136,
            0.052, 0.063, 0.183, 0.087, 0.066, 0.558, 0.066, 0.068, 0.610, 0.057]
};

// Wrap the raw arrays as [20, 1] tensors (20 examples, 1 feature each).
const trainXs = tf.tensor2d(trainData.sizeMB, [20, 1]);
const trainYs = tf.tensor2d(trainData.timeSec, [20, 1]);
const testXs = tf.tensor2d(testData.sizeMB, [20, 1]);
const testYs = tf.tensor2d(testData.timeSec, [20, 1]);

const dataTraceTrain = {
  x: trainData.sizeMB,
  y: trainData.timeSec,
  name: 'trainData',
  mode: 'markers',
  type: 'scatter',
  marker: {symbol: "circle", size: 8}
};
const dataTraceTest = {
  x: testData.sizeMB,
  y: testData.timeSec,
  name: 'testData',
  mode: 'markers',
  type: 'scatter',
  marker: {symbol: "triangle-up", size: 10}
};

Plotly.newPlot('dataSpace', [dataTraceTrain, dataTraceTest], {
  width: 700,
  title: 'File download duration',
  xaxis: {
     title: 'size (MB)'
   },
  yaxis: {
    title: 'time (sec)'
  }
});

https://codepen.io/tfjs-book/pen/dgQVze

Polyglot Studio
