Reading a Hand-Written Bank Slip Without Using Any OCR or Cognitive Service

When I first thought about this project, I did not even know the hurdles that were coming. But eventually I found a way around each of them. Thanks to all those who shared their knowledge generously so that a newcomer can learn easily; now it is time to give back to the community. To keep this short, I have put only limited details here. The GitHub link will have more details…

The initial concept came from the following article. I took only the concept and the main steps from it, and found better reusable code on other sites. But it was the first step for me, and I really like the way the author presented the entire solution. Do refer to his article; it is also a good project to try.

https://medium.com/@surya.kommareddy/number-recognition-using-convolutional-neural-networks-part-1-5dc8a394b0cf

 

To keep it short, I will start with the hurdles, how I overcame each of them, and finally what actually increased the accuracy.

I started with:

  • A scanned copy of the actual slip

  • For detecting the digits, I built a CNN model based on LeNet-5 and trained it on the MNIST handwritten digits dataset (a minimal sketch follows below).
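
For anyone who wants to reproduce this step, here is a minimal sketch of a LeNet-5-style model trained on MNIST with Keras. The exact layer sizes, optimizer, and number of epochs I used may differ; treat the values below as reasonable assumptions, not the project's exact code.

```python
# A minimal LeNet-5-style CNN trained on MNIST (illustrative sketch, not the exact project model)
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST and scale pixel values to [0, 1]
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0

# Two conv/pool stages followed by dense layers, in the spirit of LeNet-5
model = models.Sequential([
    layers.Conv2D(6, kernel_size=5, activation="relu", padding="same",
                  input_shape=(28, 28, 1)),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(16, kernel_size=5, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dense(120, activation="relu"),
    layers.Dense(84, activation="relu"),
    layers.Dense(10, activation="softmax"),  # one class per digit 0-9
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```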

 

Hurdles

  • Some digits overlapped the box boundaries (which is quite normal). This led to a high probability of mis-predictions, as the engine could not isolate those digits cleanly and would treat part of the box line as part of the digit.
  • Even after everything, the accuracy of the model was not good.

Some digits overlapped the box boundaries (which is quite normal). This led to a high probability of mis-predictions, as the engine could not isolate those digits cleanly and would treat part of the box line as part of the digit.

Trick 1: After trying all possible options, I decided to simply extract the blue text from the image, which is possible using OpenCV and gives an image like this…

This is exactly what makes the rest of the task relatively easy. 🙂
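
For illustration, here is a rough sketch of how the blue ink could be isolated with an HSV colour mask in OpenCV. The file name and the HSV threshold values are my assumptions; tune them for your own scan.

```python
# Sketch: isolate blue handwriting with an HSV colour mask (thresholds are illustrative)
import cv2
import numpy as np

img = cv2.imread("bank_slip.jpg")              # hypothetical scanned slip
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Approximate HSV range for blue ink (OpenCV hue runs 0-179)
lower_blue = np.array([100, 60, 60])
upper_blue = np.array([130, 255, 255])
mask = cv2.inRange(hsv, lower_blue, upper_blue)

# Keep only the blue strokes, drawn as black ink on a white background
blue_only = np.full_like(img, 255)
blue_only[mask > 0] = (0, 0, 0)

cv2.imwrite("blue_text_only.png", blue_only)
```

With the box lines and any printed text removed this way, each digit can be cropped cleanly before being passed to the model.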

Even after everything, the accuracy of the model was not good at all 🙁

  • This put me under great stress, as my whole effort was about to go in vain. The model was detecting 7 as 2, 2 as 7, 9 as 7, and so on… not at all satisfactory.
  • So, what was actually wrong?
  • And here is what I found…

The issue with MNIST:

Yes, raw MNIST has some real quirks. If you look at the raw images carefully, you will notice them too.

If you look carefully at the 7s in MNIST, you will see that the bottom of most of them touches the border of the image, whereas a 2 seldom does. This kind of bias creates problems if the data you feed the model does not follow the same pattern.

Trick 2: As a solution, I applied the same pre-processing steps to the MNIST data that I apply to the digits detected from the bank slip, and retrained my model on the new dataset. Sample 7s and 2s are given below…

Processed MNIST Digit 2:

Processed MNIST Digit 7:
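
Since I have not listed the exact pre-processing steps here, the sketch below shows a typical pipeline for this kind of re-processing: threshold, crop each digit to its bounding box, resize, and re-centre it in a 28×28 canvas. The specific thresholds and margins are assumptions; the full details will be in the GitHub repository.

```python
# Sketch: re-process MNIST so it matches the preprocessing applied to the slip digits
# (threshold -> tight crop -> resize -> re-centre). Values below are assumptions.
import cv2
import numpy as np
import tensorflow as tf

def preprocess_digit(img28):
    """Crop the digit to its bounding box, resize to 20x20, re-centre in a 28x28 canvas."""
    _, thresh = cv2.threshold(img28, 30, 255, cv2.THRESH_BINARY)
    ys, xs = np.nonzero(thresh)
    if len(xs) == 0:                       # blank image, return unchanged
        return img28
    digit = img28[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    digit = cv2.resize(digit, (20, 20), interpolation=cv2.INTER_AREA)
    canvas = np.zeros((28, 28), dtype=np.uint8)
    canvas[4:24, 4:24] = digit             # keep a uniform 4-pixel margin
    return canvas

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = np.stack([preprocess_digit(d) for d in x_train])
```

The model is then recompiled and retrained on this re-processed dataset, so the training digits and the slip digits share the same framing.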

If you would like to embark on a similar kind of project, you can refer to the following useful websites/blogs…

etc.

I will upload the code to my GitHub repository, and you can download everything from there. Do let me know if you have ideas of your own and how you fixed these issues. Till then, happy learning!! 🙂

 
