Artificial vision for an android robot
Principles and frameworks to give a robot the ability to interact with its environment.
Our humanoid robot will be functional only when it has at least one arm with a prehensile hand and a vision system. That vision system can be built primarily on a camera coupled with image-processing software.
Principles of computer vision
This vision system must be able to accomplish a set of tasks:
- The robot must have a memory of all the objects in its environment.
- It must be able to identify new objects and add them to the bank.
- It must recognize an object in an image by matching it against the definition of that object already stored in the database.
- The robot must handle an object according to the objective assigned, anticipating the results of the actions that could achieve the goal and taking into account all other objects in the environment.
- It must be able to manipulate objects, and therefore evaluate their distance, their proportions relative to its artificial hands, and how to hold them.
- It must detect and identify gestures.
- It must learn and then generalize a process to apply to objects of various sizes, weights, and shapes.
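As an illustration of the distance and proportion evaluation listed above, here is a minimal sketch based on the pinhole camera model, where distance is the focal length times the real width divided by the apparent width in pixels. The `Detection` class, the focal length, and the grip values are hypothetical; a real robot would obtain them from camera calibration and from its object memory.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detector result: a label and a bounding-box width in pixels."""
    label: str
    pixel_width: float
    real_width_m: float  # known physical width, assumed to come from the object memory


def estimate_distance_m(det: Detection, focal_length_px: float) -> float:
    """Pinhole estimate: distance = focal length * real width / apparent width."""
    if det.pixel_width <= 0:
        raise ValueError("pixel_width must be positive")
    return focal_length_px * det.real_width_m / det.pixel_width


def fits_in_hand(det: Detection, max_grip_m: float, margin: float = 0.9) -> bool:
    """True when the object is narrower than the usable opening of the hand."""
    return det.real_width_m <= max_grip_m * margin


# Assumed values for illustration only.
cup = Detection(label="cup", pixel_width=160.0, real_width_m=0.08)
distance = estimate_distance_m(cup, focal_length_px=800.0)
graspable = fits_in_hand(cup, max_grip_m=0.10)
print(f"{cup.label}: about {distance:.2f} m away, graspable: {graspable}")
```

This estimate only works for objects whose real size is already known, which is one reason the object memory comes first in the list.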
To do this, it must have the following resources:
- An image capture device. It can be a still camera or a video camera.
- A database to store objects and retrieve them instantly. Software such as Redis or PostgreSQL can fit.
- A pattern recognition program to identify objects. It relies on computer vision algorithms capable of identifying already known objects, or even of recognizing a moving object that is not yet known (in a sequence of pictures or a video).
- A gesture detection device like Kinect.
- An interface to the recognition software giving the operator a view of, and control over, the operations.
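The object database and the recognition step can be sketched together as follows. This is only an illustration under several assumptions: an in-memory SQLite database stands in for Redis or PostgreSQL, each object is reduced to a short numeric descriptor (a hypothetical output of the vision software), and recognition is a simple nearest-neighbor search. The names `remember` and `recognize` are made up for this example.

```python
import json
import math
import sqlite3

# In-memory SQLite stands in for the Redis or PostgreSQL store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (name TEXT PRIMARY KEY, descriptor TEXT NOT NULL)")


def remember(name, descriptor):
    """Add an object to the bank, or update it if the name is already known."""
    conn.execute(
        "INSERT OR REPLACE INTO objects (name, descriptor) VALUES (?, ?)",
        (name, json.dumps(descriptor)),
    )


def recognize(descriptor, max_distance=0.5):
    """Return the name of the closest stored object, or None if none is close enough."""
    best_name, best_dist = None, max_distance
    for name, stored in conn.execute("SELECT name, descriptor FROM objects"):
        dist = math.dist(descriptor, json.loads(stored))  # Euclidean distance
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


# Hypothetical descriptors for two known objects.
remember("cup", [0.9, 0.1, 0.3])
remember("ball", [0.2, 0.8, 0.5])

seen = [0.85, 0.15, 0.3]
print(recognize(seen))  # closest stored descriptor is the cup

unknown = [5.0, 5.0, 5.0]
if recognize(unknown) is None:
    remember("object_3", unknown)  # new object: add it to the bank
```

A linear scan is enough for a handful of objects. A larger memory would need an index suited to similarity search, which is outside the scope of this sketch.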
Some of these resources may be provided by existing frameworks.
Frameworks for artificial vision
OpenCV is open source under the BSD license.
It offers a number of algorithms to identify objects and assign them to predefined classes such as people, faces, cars, houses, etc.
Examples of use:
- Pattern recognition.
- Face detection.
- Interactive art.
- Building an HDR image from multiple exposures. The framework includes a photography library.
- Various other image manipulations.
- Learning, with the machine learning module.
Works with programs written in C++, Java and Python on Windows, Linux, Android and iOS. The download is about 350 MB.
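To give an idea of what the simplest form of pattern recognition involves, here is a minimal pure-Python sketch of template matching by sum of squared differences. It is not the framework's own code: the function name and the tiny grayscale "image" are made up for illustration, and a real application would use the library's optimized routines on actual camera frames.

```python
def match_template(image, template):
    """Slide `template` over `image` and return (row, col, score) of the best match.

    Both arguments are 2D lists of grayscale values. The score is the sum of
    squared differences, so 0 means a perfect match. Returns None when the
    template is larger than the image.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    for row in range(img_h - tpl_h + 1):
        for col in range(img_w - tpl_w + 1):
            score = sum(
                (image[row + r][col + c] - template[r][c]) ** 2
                for r in range(tpl_h)
                for c in range(tpl_w)
            )
            if best is None or score < best[2]:
                best = (row, col, score)
    return best


# A tiny made-up scene with a bright 2x2 pattern on a dark background.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]
print(match_template(image, template))  # position and score of the best match
```

This brute-force search only finds a pattern at the same scale and orientation as the template, which is why frameworks also provide feature-based and learned detectors.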
SimpleCV is a framework for building applications that use artificial vision. It is based on OpenCV and thus provides a simplified interface to it.
It has a JavaScript and CoffeeScript version, SimpleCV-JS, which works with Node.js. The demonstration is not very impressive, so you will need to form your own opinion by trying it.
Library provided by Qualcomm, the processor maker. This computer vision module works with an augmented reality module from the same vendor. It is designed for mobile devices but could also suit robots, as long as they use ARM processors.
Its features:
- Gesture recognition.
- Face recognition.
- Association of information with the actual view of places (augmented reality).
This library requires Java and the Android SDK, along with the Android development tools.
More narrowly specialized than OpenCV, it offers a set of algorithms for object recognition:
- Facial recognition.
- Detection of difficult-to-identify objects.
- The Tracking-Learning-Detection (TLD) algorithm, also called Predator, which identifies an object at first sight and then follows its movements. This algorithm can also be downloaded separately from GitHub.
- And some others.
It is thus another alternative, easier to use than OpenCV but less comprehensive.
C++ libraries for computer vision, made up of several independent and lightweight modules.
Cambridge Video Dynamics/LibCVD
Library of C functions for computer vision and other image processing. LGPL license, runs on Windows and Linux.
These programs are all based on algorithms, and imperative programming appears to be the style most commonly used for this type of processing to date. This does not rule out other approaches, based on different programming paradigms, that facilitate machine learning. Declarative programming is particularly suited to describing an environment, and reactive programming to modeling a system of interacting objects.
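To make the idea of a declarative description of the environment more concrete, here is a small sketch written in Python for convenience. The scene, the fact names, and the `can_pick` rule are all hypothetical. The point is only that the environment is stated as facts and a rule is evaluated over them, rather than being encoded as a sequence of instructions.

```python
# Facts: a declarative description of a hypothetical scene.
facts = {
    ("on", "cup", "table"),
    ("on", "book", "table"),
    ("on", "lamp", "shelf"),
    ("within_reach", "table"),
    ("graspable", "cup"),
    ("graspable", "book"),
    ("graspable", "lamp"),
}


def can_pick(obj):
    """Rule: an object can be picked if it is graspable and rests on a surface within reach."""
    if ("graspable", obj) not in facts:
        return False
    return any(
        fact[0] == "on" and fact[1] == obj and ("within_reach", fact[2]) in facts
        for fact in facts
    )


# Query: which of the objects in the scene can the robot pick up?
objects = sorted({fact[1] for fact in facts if fact[0] == "on"})
pickable = [obj for obj in objects if can_pick(obj)]
print(pickable)
```

Adding or removing a fact, for example when the vision system sees the lamp moved to the table, changes the answer without any change to the rule. A logic programming language would express the same thing more directly.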