Nvidia Digits - The Beginner's Guide

Dataset

Dataset source

MNIST is a dataset of handwritten digits from 0 to 9. There are two ways to obtain it. The first is to download the files from the MNIST website (http://yann.lecun.com/exdb/mnist/) and convert them into images by following the instructions there. The second is to download the dataset from Kaggle (https://www.kaggle.com/). Three tested datasets are listed for reference in Table 1-1.

Table 1-1
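For the first method, the files on the MNIST website are gzip-compressed IDX files rather than images. The sketch below shows one way the pixel data and labels could be read once a file has been decompressed (for example with Python's gzip module). The function names are illustrative, and writing the pixels out as PNG files would need an image library that is not shown here.

```python
import struct


def parse_idx_images(data: bytes):
    """Parse an uncompressed IDX3 image file into a list of images.

    Each image is returned as a list of rows, each row a list of 0-255 ints.
    """
    # Header: four big-endian 32-bit integers (magic, count, rows, cols).
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError(f"not an IDX3 image file (magic={magic})")
    size = rows * cols
    if len(data) < 16 + count * size:
        raise ValueError("file is shorter than its header claims")
    images = []
    for i in range(count):
        start = 16 + i * size
        flat = data[start:start + size]
        images.append([list(flat[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images


def parse_idx_labels(data: bytes):
    """Parse an uncompressed IDX1 label file into a list of ints."""
    # Header: two big-endian 32-bit integers (magic, count).
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError(f"not an IDX1 label file (magic={magic})")
    return list(data[8:8 + count])
```

Pairing image i with label i gives the class folder each image belongs in.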

Nvidia digits dataset

An Nvidia DIGITS dataset classifies images by placing them in separate folders, one folder per class. For MNIST, the images of each digit from 0 to 9 go into the corresponding folder, as shown in Figure 1-1.

Figure 1-1
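A quick way to check the one-folder-per-class layout is to count the images in each class folder. This is a minimal sketch; the function name and the list of accepted file suffixes are assumptions made for illustration.

```python
from pathlib import Path

# File suffixes treated as images here (an assumption for this sketch).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images_per_class(dataset_dir):
    """Return {class_folder_name: image_count} for a folder-per-class dataset."""
    counts = {}
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

For MNIST, the result should contain ten keys, "0" through "9".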

Images in the dataset must be in either JPEG (image/jpeg) or PNG (image/png) format. The required image size depends on the model being trained, as listed in Table 1-2.

Table 1-2

Model Size
LeNet 28x28 Gray
AlexNet 256x256
GoogLeNet 224x224

For MNIST, the images should be 28x28 grayscale, as shown in Figure 1-2.

Figure 1-2
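Whether a PNG file matches the 28x28 grayscale requirement can be checked from its header alone. The sketch below reads the IHDR chunk, where color type 0 means grayscale. The function names are illustrative, and the check is deliberately strict: it does not accept other PNG color types, even though such images might still be usable after conversion.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes):
    """Return (width, height, bit_depth, color_type) from a PNG's IHDR chunk."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then the fields.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, bit_depth, color_type


def is_lenet_ready(data: bytes) -> bool:
    """True if the PNG is 28x28 with color type 0 (grayscale)."""
    width, height, _, color_type = read_png_header(data)
    return (width, height) == (28, 28) and color_type == 0
```

Only the first 26 bytes of the file are needed, so `open(path, "rb").read(26)` is enough input.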

To load a dataset, put the training images, in the format described above, into the data folder. MNIST includes both a training dataset and a testing dataset, so the two are placed in separate folders named training and testing inside the data folder.


Dataset instructions

The following instructions use the second method from the Dataset source section, downloading the dataset from Kaggle, and cover every step from downloading the dataset to placing it in the corresponding folders.

First, open a new browser tab and enter the URL listed in Table 1-1 (https://www.kaggle.com/prokaggler/mnistpng). The page is displayed as shown in Figure 1-3.

Figure 1-3

As shown in Figure 1-4, there is a Download(17MB) button in the upper-right corner of the page. In Google Chrome, the downloaded file, archive.zip, appears at the bottom left of the window. archive.zip contains both the training dataset and the testing dataset.

Figure 1-4

Unzip the downloaded archive.zip file. The resulting mnist_png folder contains a training folder and a testing folder, each holding handwritten samples of the digits 0 to 9. See Figure 1-5, Figure 1-6 and Figure 1-7 for reference.

Figure 1-5

Figure 1-6

Figure 1-7

Finally, to make the dataset available for loading, copy the training and testing folders into the data folder mentioned in the Nvidia digits dataset section, as shown in Figure 1-8.

Figure 1-8
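The unzip and copy steps can also be scripted. The sketch below assumes the archive contains a top-level mnist_png folder with training and testing subfolders, as described above. The function name and the three path parameters are illustrative, since the location of the data folder depends on how DIGITS was installed.

```python
import shutil
import zipfile
from pathlib import Path


def install_dataset(zip_path, data_dir, work_dir):
    """Extract archive.zip into work_dir, then copy both splits into data_dir.

    Returns the sorted names of the folders now present in data_dir.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(work_dir)
    source_root = Path(work_dir) / "mnist_png"
    for split in ("training", "testing"):
        # dirs_exist_ok (Python 3.8+) lets the copy be repeated safely.
        shutil.copytree(source_root / split, Path(data_dir) / split,
                        dirs_exist_ok=True)
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())
```

A call might look like `install_dataset("archive.zip", "/data", "/tmp/mnist")`, where both target paths are placeholders.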


Home

Open the DIGITS home page by entering http://127.0.0.1:5000 in the browser's address bar, as shown in Figure 2-1.

Figure 2-1


Load Dataset

1

On the Home page, click the Datasets tab, then click the Images button on the right side of the page and choose Classification, as shown in Figure 3-1 and Figure 3-2 respectively.

Figure 3-1

Figure 3-2

If you have not logged in yet, the browser redirects to the login page, as shown in Figure 3-3. Enter any user name you like; the login is only used to identify the owner of datasets and models.

Figure 3-3

2

Start with the left block and enter the dataset's image settings. As described in the Dataset section, the Image Type is Grayscale and the Image Size is 28x28; see Figure 3-4 for reference.

Figure 3-4

Next, fill in the right block with the folder path of the training dataset described in the Dataset section. The minimum and maximum number of samples per class can also be set here. A class is ignored entirely if its folder contains fewer samples than the minimum. If a class contains more samples than the maximum, only the maximum number of samples is loaded for that class. The dataset can be divided into training, validation and testing datasets, and the percentages of the whole dataset assigned to validation and to testing are also set in this block.

MNIST already includes separate training and testing datasets, so only the validation percentage needs to be set: the validation dataset is 25% and the testing dataset is 0%, as shown in Figure 3-5.

Figure 3-5
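The minimum, maximum and percentage rules described above can be sketched as follows. This only illustrates the stated rules and is not the DIGITS implementation: the function name, the shuffle and the order of truncation and splitting are assumptions.

```python
import random


def select_and_split(samples_by_class, min_samples, max_samples, pct_val, seed=0):
    """Apply min/max sample rules per class, then split off a validation share.

    samples_by_class maps a class label to a list of image paths.
    Returns (train, val), each a dict of label -> list of paths.
    """
    rng = random.Random(seed)
    train, val = {}, {}
    for label, samples in samples_by_class.items():
        if len(samples) < min_samples:
            continue  # too few samples: the class is ignored entirely
        kept = list(samples)
        if max_samples is not None:
            kept = kept[:max_samples]  # too many samples: load only the maximum
        rng.shuffle(kept)
        n_val = int(len(kept) * pct_val / 100)
        val[label] = kept[:n_val]
        train[label] = kept[n_val:]
    return train, val
```

With a validation percentage of 25 and a testing percentage of 0, every kept sample ends up in either the training or the validation dataset.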

Finally, in the bottom block, set DB Backend and Image Encoding according to your requirements. Group Name is optional and Dataset Name is required. Click the Create button to start loading the dataset.

In the MNIST case, as shown in Figure 3-6, DB Backend is LMDB, Image Encoding is PNG and Dataset Name is mnist.

Figure 3-6

3

After the previous step, the browser opens a new page, as shown in Figure 3-7. The Job Information and Parse Folder blocks show the settings made in the previous step, and the block on the right shows the current loading status.

Figure 3-7

Because MNIST already includes training and testing datasets, the validation percentage was set to 25% and the testing percentage to 0%. Scrolling down reveals two bar graphs, one for the training dataset and one for the validation dataset, each showing the number of samples in classes 0 to 9. No graph is shown for the testing dataset because its percentage was set to 0% in step 2. See Figure 3-8 and Figure 3-9 for reference.

In addition, download train.txt and val.txt from Input File (before shuffling) under each bar graph; they are used later to test trained models.

Figure 3-8

Figure 3-9
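The downloaded train.txt and val.txt files can be inspected with a short script. The sketch below assumes each line holds an image path followed by whitespace and an integer class index; that layout is an assumption here, so check it against the downloaded files before relying on it. The function names are illustrative.

```python
from collections import Counter


def parse_image_list(text):
    """Parse lines of the assumed form '<image path> <class index>'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so paths may contain spaces.
        path, label = line.rsplit(None, 1)
        entries.append((path, int(label)))
    return entries


def count_labels(entries):
    """Return {class index: number of images}, like the bar graphs."""
    return dict(Counter(label for _, label in entries))
```

The per-class counts should agree with the bar graphs on the dataset page.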

When loading finishes, the number of loaded entries is displayed under DB Entries, as shown in Figure 3-10. The training dataset has 49999 entries and the validation dataset has 15001, which roughly matches the percentage settings in step 2 (training: 49999/65000 = 77%; validation: 15001/65000 = 23%).

Figure 3-10
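The percentages quoted for the DB Entries can be reproduced with a few lines of arithmetic. The function name is illustrative.

```python
def split_shares(train_entries, val_entries):
    """Return (total, training %, validation %) for two entry counts."""
    total = train_entries + val_entries
    return total, 100 * train_entries / total, 100 * val_entries / total


total, train_pct, val_pct = split_shares(49999, 15001)
print(total, round(train_pct), round(val_pct))
```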

Return to the home page by clicking DIGITS in the upper-left corner of the page. The loaded dataset is listed under the Datasets tab, as shown in Figure 3-11.

Figure 3-11


Train Model

1

As shown in Figure 4-1 and Figure 4-2, click the Models tab on the home page, then click the Images drop-down button on the right side of the page and choose Classification to start setting up a model.

Figure 4-1

Figure 4-2

2

Choosing Classification opens a new page, as shown in Figure 4-3. Choose the dataset to train on in the left block; the training settings can be adjusted in the middle and right blocks. The MNIST example uses the default values.

Figure 4-3

Then scroll down to the network settings, as shown in Figure 4-4. For MNIST, choose LeNet in the Standard Networks block. Finally, fill in the optional Group Name and the required Model Name, then click the Create button to train the model.

Figure 4-4

3

Clicking the Create button opens a new page; see Figure 4-5 for reference. The Dataset block in the middle of the page shows the dataset chosen in step 2, and the Job Status block on the right shows the training status.

Figure 4-5

When training finishes, the x-axis of the graph runs from 0 to 30 epochs, because the default number of training epochs is 30. The graph shows the accuracy and loss information for the training and validation datasets, as shown in Figure 4-6.

Figure 4-6

The learning rate is plotted in the graph below it; see Figure 4-7 for reference.

Figure 4-7


Test Trained Model

Scroll down on the trained model's page to find the Trained Models block, shown in Figure 5-1.

Figure 5-1

A trained model can be tested in two ways: by classifying one image or by classifying multiple images. The following sections describe each method.


Classify one image

First, choose the trained model you want to test from the drop-down menu under Select Model. A single image can be supplied in two ways. The first is to give the path of an image in the test dataset in the data folder (see the Dataset instructions section). Because the test dataset was copied into the data folder, its path starts with /data/testing/, as shown in Figure 5-2. After entering the path, click the Classify One button to test the trained model.

Figure 5-2

Clicking the Classify One button opens the result page, with the test image displayed on the left. The table next to the image lists the five most probable classes with their percentages. In Figure 5-3 the ground truth of the test image is 5; the predicted classes are 5, 6, 8, 9 and 2, with percentages of 99.99%, 0.01%, 0%, 0% and 0% respectively.

Figure 5-3
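A table like the one on the result page can be derived from a model's per-class probabilities by sorting them and keeping the first five. The sketch below is illustrative only; the function name and the probability values in the usage example are hypothetical, not output from DIGITS.

```python
def top_k(probabilities, k=5):
    """Return the k most probable (class index, percentage) pairs.

    probabilities[i] is the predicted probability of class i.
    """
    ranked = sorted(enumerate(probabilities), key=lambda item: item[1],
                    reverse=True)
    return [(label, round(100 * p, 2)) for label, p in ranked[:k]]


# Hypothetical probabilities for classes 0 to 9, chosen for illustration.
example = [0.0, 0.0, 0.0, 0.0, 0.0, 0.9999, 0.0001, 0.0, 0.0, 0.0]
print(top_k(example))
```

The first pair returned is the model's prediction; the remaining pairs are the runner-up classes.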

The second way to classify one image is to upload it from your own computer. Choose the model you want to test as described above, click the Browse button under Upload image to choose the test image, then click the Classify One button to start the test.

Optionally, check Show visualization and statistics above the Classify One button, as shown in Figure 5-4, to have the analysis of the prediction displayed on the result page as well.

Figure 5-4

As before, the test image is displayed on the left of the page with the prediction result next to it, as shown in Figure 5-5.

Figure 5-5

Because Show visualization and statistics was checked, the analysis of the prediction is also displayed further down the same page, as shown in Figure 5-6.

Figure 5-6


Classify multiple images

Multiple images are classified from the right side of the Trained Models block, using the train.txt and val.txt files downloaded in step 3 of the Load Dataset section. The following example uses val.txt, as shown in Figure 5-7. Click Browse under Upload Image List in the Test a list of images block, choose val.txt, then click the Classify Many button.

Figure 5-7

As shown in Figure 5-8, the prediction summary is displayed at the top of the page. Top-1 accuracy is the percentage of images for which the most probable prediction is the correct answer. Top-5 accuracy is the percentage of images for which the correct answer is among the five most probable predictions.

Figure 5-8
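The two definitions can be written as a single function that checks whether the true label appears among the first k ranked predictions. This is a minimal sketch with an illustrative function name and hypothetical sample data.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Percentage of images whose true label is among the first k predictions.

    ranked_predictions[i] lists class indices for image i, most probable first.
    """
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return 100 * hits / len(true_labels)


# Hypothetical predictions for four images, for illustration only.
ranked = [[5, 6, 8, 9, 2], [3, 5, 8, 0, 1], [7, 1, 9, 4, 2], [0, 6, 2, 8, 5]]
truth = [5, 5, 4, 3]
print(top_k_accuracy(ranked, truth, 1), top_k_accuracy(ranked, truth, 5))
```

Top-5 accuracy can never be lower than Top-1 accuracy, because the first prediction is always one of the first five.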

Scrolling down the page reveals a table of prediction results per class, shown in Figure 5-9. Taking the first row, class 0, as an example: the accuracy for 0 is 99.73%, because 1477 images were correctly predicted as 0, 2 images were wrongly predicted as 2, and 2 images were wrongly predicted as 6.

Figure 5-9
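The accuracy for class 0 follows directly from the counts in its row of the table: correct predictions divided by all predictions made for images of that class. The function name below is illustrative; the counts are the ones quoted above.

```python
def per_class_accuracy(confusion_row, class_index):
    """Accuracy in percent for one class, from its row of prediction counts.

    confusion_row[j] is the number of images of this class predicted as j.
    """
    return 100 * confusion_row[class_index] / sum(confusion_row)


# Row for class 0: 1477 predicted as 0, 2 predicted as 2, 2 predicted as 6.
row_0 = [1477, 0, 2, 0, 0, 0, 2, 0, 0, 0]
print(round(per_class_accuracy(row_0, 0), 2))  # 99.73
```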

The bottom of the page lists the five most probable predictions for each image in val.txt, as shown in Figure 5-10.

Figure 5-10
