Optical Character Recognition and Computer Vision Technology in UiPath

SINEM BATMACA | TEST SPECIALIST

Robotic Process Automation (RPA) refers to software that automates routine, repetitive processes. RPA is widely regarded as a technology with a high Return on Investment (ROI). Demand for RPA systems is increasing as businesses see that these systems improve performance. As a result, many new RPA vendors have entered the market in recent years, and among them UiPath is regarded as the market leader.

One of the most useful and versatile tools of the UiPath platform is the Record feature. Instead of designing a process step by step, the developer performs the process once and lets the recorder build it. The software robot tracks the user's clicks and actions on the screen and converts these steps into a workflow.

However, the Record function struggles to distinguish text fields and buttons, especially in remote desktop environments such as Citrix. When recording in these environments, the robot sees the entire application window as a single element, as if it were one large button.

In this context, Optical Character Recognition (OCR) technology is one of the solutions used to overcome the automation difficulties in remote environments. During process recording, a UiPath user can select an OCR engine in UiPath Studio, select the text to be automated in the window, and let the robot find that text on every run. Even if the text appears in a different place on the screen, the robot can still locate it. Advanced character and image recognition software is one of the main reasons why RPA is so successful.
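The idea that the robot finds the text wherever it appears can be illustrated with a minimal Python sketch. This is not UiPath's API: the `OcrWord` structure and the `find_click_point` function are hypothetical names, and the sketch assumes an OCR engine has already returned each recognized word together with its bounding box.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class OcrWord:
    """One recognized word and its bounding box (hypothetical structure)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_click_point(words: List[OcrWord], target: str) -> Optional[Tuple[int, int]]:
    """Return the centre of the first word whose text equals target, or None."""
    wanted = target.strip().lower()
    for word in words:
        if word.text.strip().lower() == wanted:
            return (word.left + word.width // 2, word.top + word.height // 2)
    return None


# The same "Submit" button at two different screen positions.
first_run = [OcrWord("Cancel", 10, 200, 60, 20), OcrWord("Submit", 100, 200, 60, 20)]
second_run = [OcrWord("Submit", 300, 420, 60, 20), OcrWord("Cancel", 380, 420, 60, 20)]

print(find_click_point(first_run, "Submit"))   # (130, 210)
print(find_click_point(second_run, "Submit"))  # (330, 430)
```

Because the lookup is driven by the recognized text rather than by fixed coordinates, moving the button does not break the step.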

On the other hand, automating virtual desktop environments such as Citrix, VMware, VNC and Windows Remote Desktop with high reliability has remained a challenge for OCR-based automation. Because even minor changes in the UI can disrupt the automation, reliability and maintenance issues arise.

Image matching is prone to errors because the appearance of the target elements can change. Computer vision is designed to overcome these challenges with high reliability.
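A small Python sketch shows why image matching is fragile. It is a deliberate simplification, not UiPath's algorithm: the screen and the button are tiny grids of made-up pixel values, `find_template` is a hypothetical helper, and the match is exact, whereas real tools typically allow some tolerance.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]


def find_template(screen: Grid, template: Grid) -> Optional[Tuple[int, int]]:
    """Return (row, col) of the first exact occurrence of template in screen."""
    t_rows, t_cols = len(template), len(template[0])
    for r in range(len(screen) - t_rows + 1):
        for c in range(len(screen[0]) - t_cols + 1):
            # Compare the template row by row against the screen window.
            if all(screen[r + i][c:c + t_cols] == template[i] for i in range(t_rows)):
                return (r, c)
    return None


button = [[9, 9],
          [9, 9]]

screen = [[0, 0, 0, 0],
          [0, 9, 9, 0],
          [0, 9, 9, 0]]

# The same layout after a theme change: the button colour shifts from 9 to 8.
restyled = [[8 if value == 9 else value for value in row] for row in screen]

print(find_template(screen, button))    # (1, 1)
print(find_template(restyled, button))  # None
```

The button has not moved, yet a one-step change in its pixel values is enough for the stored image to stop matching.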

Computer vision technology is essentially a feature that allows robots to “see” the screen and identify elements without relying on selectors or image matching.

This feature, which enables robots to recognize user interfaces as a human does, is provided by an algorithm developed using a combination of Artificial Intelligence (AI), Machine Learning, Optical Character Recognition (OCR), fuzzy text matching and a multi-anchoring system.
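Two of these ingredients, fuzzy text matching and anchoring, can be sketched in a few lines of Python. This is only an illustration of the general ideas, not UiPath's implementation: the function names, the 0.75 threshold and the sample data are assumptions, the fuzzy match uses `difflib.SequenceMatcher` from the standard library, and a single anchor stands in for the multi-anchoring system.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

Element = Tuple[str, int, int]  # (text or name, x, y)


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_label(labels: List[Element], wanted: str, threshold: float = 0.75) -> Optional[Element]:
    """Return the label most similar to wanted, or None if it is below threshold."""
    best = max(labels, key=lambda e: similarity(e[0], wanted), default=None)
    if best is None or similarity(best[0], wanted) < threshold:
        return None
    return best


def nearest_field(fields: List[Element], anchor: Element) -> Element:
    """Return the field closest to the anchor label (squared distance)."""
    _, ax, ay = anchor
    return min(fields, key=lambda f: (f[1] - ax) ** 2 + (f[2] - ay) ** 2)


# OCR misread "Username" as "Usernarne"; fuzzy matching still finds the label.
labels = [("Usernarne", 20, 50), ("Password", 20, 90)]
fields = [("input_1", 120, 50), ("input_2", 120, 90)]

anchor = find_label(labels, "Username")
print(anchor)                         # ('Usernarne', 20, 50)
print(nearest_field(fields, anchor))  # ('input_1', 120, 50)
```

The label acts as the anchor: once it is found despite the OCR error, the unlabeled input box next to it can be identified by position relative to that anchor instead of by a selector.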

In addition to virtual desktop environments, Computer Vision can be used to identify elements in a variety of scenarios where traditional UI automation methods struggle, including SAP, Flash, Silverlight, PDFs and even images.

Unlike traditional image automation, UiPath’s Computer Vision feature is not based on image matching. This makes it highly resistant to interface changes, including color, font, size and resolution changes. Computer Vision eliminates the dependence on selectors while preserving the workflows that RPA developers already know. Because more on-screen elements become visible to the robot, more processes can be automated.

With the development of such technologies, robots have become more perceptive in recent years, which suggests that we are one step closer to software robots that can make complex decisions.

 

Resources

  1. https://www.uipath.com/blog/what-a-robot-sees-using-ocr-in-rpa
  2. https://www.uipath.com/blog/introducing-new-uipath-ai-computer-vision

 
