When I first thought about this project, I had no idea of the hurdles that were coming. Eventually I found workarounds, thanks to everyone who shared their knowledge generously so that a newcomer can learn easily. Now it is time to give back to the community. To keep this short I have included limited details; the GitHub link will have more….
The initial concept came from the following website. I took only the concept and the main steps from it, and found better reusable code on other sites. Still, it was the first step for me, and I really like the way the author presented the entire solution. Do refer to his article; it is also a good project to try.
To keep it short, I will start with the hurdles, how I overcame each of them, and finally what actually increased the accuracy.
I started with:
- A scanned copy of the actual slip
- A CNN model based on LeNet-5, which I built and trained on the MNIST handwritten digits data to detect the digits.
Hurdles
- Some digits sat on the box boundaries (which is quite normal). This led to a high probability of mispredictions, because the engine could not isolate those digits and treated part of the boundary as part of the digit.
- Even after everything, the accuracy of the model was not good.
Some digits sat on the box boundaries (which is quite normal). This led to a high probability of mispredictions, because the engine could not isolate those digits and treated part of the boundary as part of the digit.
Trick 1: After trying all possible options, I decided to take just the “Blue” text out of the image, which is possible using OpenCV and gives an image like this…
This is exactly what makes things relatively easy. 🙂
Even after everything, the accuracy of the model was not good at all 🙁
- This put me under great stress, as my whole effort was about to go in vain. The model was detecting 7 as 2, 2 as 7, 9 as 7, and so on; not at all satisfactory.
- So, what was actually wrong in all this?
- And here is what I found…
The issue with MNIST:
Yes, the raw MNIST data has some real issues. If you look at the raw images carefully, you might notice them too.
If you look carefully at all the 7s in MNIST, you will see that the bottoms of most of them touch the border of the image, whereas a 2 seldom does. This kind of bias creates a problem if the data you send to the model does not follow the same pattern.
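One way to check this kind of bias numerically, rather than by eye, is to count how many samples of a digit have ink near the bottom edge. The sketch below is only an illustration: `fraction_touching_bottom`, `margin` and `threshold` are hypothetical names, and the tiny arrays stand in for the real data.

```python
import numpy as np

def fraction_touching_bottom(images, labels, digit, margin=1, threshold=0):
    """Share of `digit` samples with ink in the last `margin` rows."""
    selected = images[labels == digit]
    if len(selected) == 0:
        return 0.0
    # True for each image that has any pixel above `threshold` near the bottom.
    touches = (selected[:, -margin:, :] > threshold).any(axis=(1, 2))
    return float(touches.mean())

# Tiny synthetic stand-in for MNIST: four 28x28 images with vertical strokes.
images = np.zeros((4, 28, 28), dtype=np.uint8)
labels = np.array([7, 7, 2, 2])
images[0, 5:28, 14] = 255   # a "7" whose stroke reaches the last row
images[1, 5:27, 14] = 255   # a "7" that stops one row short
images[2, 5:20, 10] = 255   # a "2" well inside the frame
images[3, 8:22, 12] = 255   # another "2" well inside the frame

share_7 = fraction_touching_bottom(images, labels, 7)
share_2 = fraction_touching_bottom(images, labels, 2)

# With the real data (assuming TensorFlow is installed) the arrays could come from:
# (x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
```

Comparing the two shares on the real training set, possibly with a larger `margin`, would show whether the 7s and 2s really differ in this way.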
Trick 2: As a solution, I applied to the MNIST data the same pre-processing steps that I use on the digits detected from the bank slip, and retrained my model on the new dataset. Sample 7s and 2s are given below…
Processed MNIST Digit 2:
Processed MNIST Digit 7:
If you would like to embark on a similar kind of project, you can refer to the following useful websites/blogs…
- https://www.learnopencv.com/ (Highly Recommended)
- https://medium.com/@kananvyas
- https://medium.com/@surya.kommareddy/number-recognition-using-convolutional-neural-networks-part-1-5dc8a394b0cf
etc.
I will upload the code to my GitHub repository, and you may download all of it from there. Do let me know if you have any ideas, or how you fixed similar problems. Till then, Happy Learning!! 🙂

As the world moves fast from Robotic Process Automation to Intelligent Automation (IA), my current objective is to bring cutting-edge AI and machine learning capabilities into use for ongoing and upcoming projects, to build capabilities inside the team so that they can contribute to and support these kinds of high-end projects, and to perform POCs that increase customer focus and engagement in the IA space.
I have more than 14 years of experience in implementation, troubleshooting, and development across various technologies:
1) Blue Prism, UiPath, Nice RPA.
2) Tableau, Power BI, Cognos.
3) Nice Recording solution and Desktop Tagging.
4) Verint WFO & WFM solutions.
5) Avaya AES & PBX systems.
6) Nortel, Genesys & other contact center Products.
I can code in both high-level and low-level programming languages, and I have sound knowledge of Linux, Unix & Windows systems. I am also familiar with multiple scripting languages, which help me with product customization and modification.