The architecture of a vision system is strongly related to the application it is meant to solve. Some systems are “stand-alone” machines designed to solve specific problems (e.g. measurement or identification), while others are integrated into a more complex framework that can include mechanical actuators, sensors, and so on. Nevertheless, all vision systems are characterized by these fundamental operations:

Image acquisition. The first and most important task of a vision system is to acquire an image, usually by means of a light-sensitive sensor. This image can be a traditional 2-D image, a 3-D point set, or an image sequence. A number of parameters can be configured in this phase, such as image triggering, camera exposure time, lens aperture, lighting geometry, and so on.
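Since acquisition is hardware-specific, the step is easiest to illustrate with a sketch. The function below is hypothetical: it simulates grabbing an 8-bit grayscale frame as a synthetic gradient, with `exposure` standing in for the camera exposure setting; in a real system the body would call the camera driver instead.

```python
def acquire_frame(width, height, exposure=1.0):
    """Simulate acquiring an 8-bit grayscale frame.

    A real system would read from a camera driver here; a synthetic
    horizontal gradient stands in for sensor data, and `exposure`
    simply scales pixel intensity (clipped to 255).
    """
    frame = []
    for y in range(height):
        row = []
        for x in range(width):
            value = int((x / max(width - 1, 1)) * 255 * exposure)
            row.append(min(value, 255))
        frame.append(row)
    return frame

frame = acquire_frame(4, 2, exposure=0.5)
```

Representing the frame as a plain list of rows keeps the sketch dependency-free; a production pipeline would typically use an array type instead.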

Feature extraction. In this phase, specific characteristics are extracted from the image: lines, edges, angles, regions of interest (ROIs), as well as more complex features, such as motion tracking, shapes and textures.
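As a minimal illustration of edge extraction, the sketch below marks pixels where the horizontal intensity gradient exceeds a threshold. This is a deliberately crude stand-in for a real edge detector (e.g. Sobel or Canny), not a recommendation of method.

```python
def horizontal_edges(frame, threshold=50):
    """Mark pixels where the horizontal intensity jump exceeds
    `threshold` -- a crude stand-in for an edge detector."""
    edges = []
    for row in frame:
        edge_row = [0] * len(row)
        for x in range(1, len(row)):
            if abs(row[x] - row[x - 1]) > threshold:
                edge_row[x] = 1
        edges.append(edge_row)
    return edges

frame = [[0, 0, 200, 200],
         [0, 0, 200, 200]]
print(horizontal_edges(frame))  # edge marked at the 0 -> 200 step
```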

Detection/segmentation. At this point of the process, the system must decide which of the information collected so far will be passed up the chain for further processing.
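One of the simplest forms this decision can take is intensity thresholding: keep only the coordinates of pixels bright enough to be foreground, and discard the rest. The sketch below assumes this scheme for illustration; real segmentation methods are usually far more elaborate.

```python
def segment(frame, threshold=128):
    """Binary segmentation: return the coordinates of pixels whose
    intensity exceeds `threshold`. Only this reduced data set is
    passed on for further processing."""
    return [(x, y)
            for y, row in enumerate(frame)
            for x, value in enumerate(row)
            if value > threshold]

frame = [[10, 10, 200],
         [10, 200, 200]]
print(segment(frame))  # [(2, 0), (1, 1), (2, 1)]
```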

High-level processing. The input at this point usually consists of a narrow set of data. The purpose of this last step can be to:

  • Classify objects or object features into a particular class
  • Verify that the input meets the specifications required by the model or class
  • Measure/estimate/calculate specific parameters, such as the position or dimensions of an object or of its features
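The measurement and verification purposes above can be sketched together. Assuming the narrow input is a list of pixel coordinates for one segmented region, the hypothetical helpers below estimate its position (centroid) and dimensions (bounding box), then check the dimensions against a model specification within a tolerance.

```python
def measure_region(pixels):
    """Estimate position (centroid) and dimensions (bounding box)
    of a segmented region given its pixel coordinates."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return centroid, (width, height)

def verify(dimensions, expected, tolerance=1):
    """Check measured dimensions against a model specification."""
    return all(abs(d - e) <= tolerance
               for d, e in zip(dimensions, expected))

pixels = [(2, 0), (1, 1), (2, 1)]
centroid, dims = measure_region(pixels)
print(dims)                   # (2, 2)
print(verify(dims, (2, 2)))   # True
```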