Machine Vision and Intelligent Systems

Professor Matti Pietikäinen and Professor Juha Röning, 
Information Processing Laboratory and Computer Engineering Laboratory, 
Department of Electrical Engineering, University of Oulu


Background and Mission

Machine vision is a continuously growing area of research dealing with the processing and analysis of image data. It plays a key role in the development of intelligent systems. The principal goal of research on intelligent systems is to develop machines that have the ability to perceive, reason, move, and learn from experience.

The mission of the Machine Vision and Intelligent Systems (MVIS) group is to carry out leading-edge, long-term research on machine vision and intelligent systems technology, with the aim of bringing the results of the research close to applications.

The research on machine vision focuses on problems in texture analysis, color and face image analysis, document image analysis, tracking and motion estimation, 3D modeling and camera calibration, and visualization-based user interfacing. The research on intelligent systems concentrates on context-aware mobile systems, intelligent service robots, neural network methodology for signal analysis, and analysis of biomedical ECG and EEG signals.

The research in the above areas provides the basis for original applied research in various areas, including visual inspection, mobile communications, multimedia, intelligent interfaces and medical engineering.

The group co-operates with many international and domestic partners. In applied research, the group has played central roles in European Esprit projects and several joint projects funded by the National Technology Agency (Tekes) and industry. The group and its members are active in the scientific community. For example, in 2001 the group co-organized with the MediaTeam group an international Infotech Workshop on Information Retrieval (IR 2001), Prof. Matti Pietikäinen served on the editorial boards of two major journals, and several members of the group were on the committees of international conferences.

Dr. Janne Heikkilä won the prestigious young investigator's prize of Tekniikan edistämissäätiö (the Finnish Foundation for Technical Advancement) and the senior investigator's prize of the University of Oulu. Heikki Pylkkö won the award for the best Finnish Master's thesis in pattern recognition in 2000 (Pattern Recognition Society of Finland), and Matti Niskanen won the award for the best Master's thesis in machine vision in 2000 (Vision Club of Finland). Our paper on unsupervised training in visual inspection by Matti Niskanen, Olli Silvén and Hannu Kauppinen received the best paper award at the QCAV 2001 conference in France.

The activities of the MVIS group are led by professors Matti Pietikäinen (Director), Juha Röning (Associate Director), Olli Silvén and Tapio Seppänen.

Scientific Progress

Machine vision

Image texture analysis is an important fundamental problem in machine vision. During the past few years, the group has developed theoretically and computationally simple but very efficient nonparametric approaches to texture analysis based on Local Binary Patterns (LBP) and signed gray level differences. In 2001, LBP-type operators for multiresolution gray scale and rotation invariant texture classification were further investigated. A paper describing this method was accepted for publication in the prestigious IEEE Transactions on Pattern Analysis and Machine Intelligence journal. Joint use of color and texture in classification was studied, suggesting separate processing of color and pattern information. A real-time color and texture based method for defect recognition was also developed. A new framework for empirical evaluation of texture analysis algorithms is under development. For this purpose, a very large texture image database, Outex, was created containing over 300 different textures imaged with three different illuminants, nine rotations and six spatial scales.
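To illustrate the idea behind LBP, the following is a minimal sketch of the basic 3x3 operator only, not the multiresolution, rotation invariant operators mentioned above. The function names, the neighbor ordering and the small test image are assumptions made for this illustration.

```python
def lbp_code(img, r, c):
    """Return the basic 8-neighbor LBP code of the pixel at (r, c).

    Each neighbor is thresholded against the center pixel, and the
    resulting bits are packed into one integer between 0 and 255.
    """
    center = img[r][c]
    # Neighbors are visited clockwise, starting from the top-left pixel.
    # This ordering is an arbitrary choice made for the sketch.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels of the image."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


# Hypothetical 4x4 gray level image used only as an example.
image = [
    [6, 5, 2, 1],
    [7, 6, 1, 3],
    [9, 8, 7, 2],
    [4, 3, 5, 8],
]
texture_feature = lbp_histogram(image)
```

Because only the sign of each difference to the center pixel is used, adding a constant to every gray level leaves the histogram unchanged, which is one reason this kind of operator tolerates monotonic gray scale changes.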

Techniques for color-based skin detection and face tracking in varying illumination conditions were further investigated. Our approach makes use of a chromaticity constraint called skin locus to select pixel candidates for adapting the face color distribution to the color shift caused by the illumination change. Skin color was studied not only in images but also by using spectral knowledge of it. The skin color signals, containing information both about reflectances of skin and about spectral power distributions of illumination, were also studied. A method combining skin locus based face modeling with the mean shift algorithm was developed for color-based face tracking. To provide a reference for comparing different tracking methods, a new face video database was created with different cameras. It contains face videos with drastic color changes, color images with different camera calibrations and illuminations, ground truths for face localization, and spectral data.
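The role of a chromaticity constraint in adapting a face color model can be sketched as follows. This is a simplified illustration under stated assumptions: the rectangular bounds R_RANGE and G_RANGE are hypothetical placeholders and do not reproduce the group's actual skin locus, and the update rule in adapt_mean is a plain running average chosen for brevity.

```python
def chromaticity(rgb):
    """Convert an (R, G, B) triple to normalized (r, g) chromaticities."""
    red, green, blue = rgb
    total = red + green + blue
    if total == 0:
        return (0.0, 0.0)  # black pixel: chromaticity is undefined, use zeros
    return (red / total, green / total)


# Hypothetical rectangular bounds standing in for a measured skin locus.
R_RANGE = (0.35, 0.60)
G_RANGE = (0.25, 0.37)


def in_locus(rgb):
    """True if the pixel's chromaticity falls inside the assumed locus."""
    r, g = chromaticity(rgb)
    return R_RANGE[0] <= r <= R_RANGE[1] and G_RANGE[0] <= g <= G_RANGE[1]


def adapt_mean(pixels, old_mean, rate=0.5):
    """Move the face color mean toward the pixels that pass the constraint.

    Only pixels inside the locus are allowed to update the model, so
    background colors cannot drag the model away from skin tones.
    """
    candidates = [chromaticity(p) for p in pixels if in_locus(p)]
    if not candidates:
        return old_mean
    new_r = sum(c[0] for c in candidates) / len(candidates)
    new_g = sum(c[1] for c in candidates) / len(candidates)
    return ((1 - rate) * old_mean[0] + rate * new_r,
            (1 - rate) * old_mean[1] + rate * new_g)


# One skin-like pixel and one bluish background pixel (made-up values).
frame_pixels = [(200, 120, 80), (50, 50, 200)]
updated_mean = adapt_mean(frame_pixels, old_mean=(0.4, 0.3))
```

Normalizing by R + G + B removes overall intensity, so a pixel and a darker copy of it map to the same chromaticity; what remains to be handled is the color shift of the illuminant, which is the part the locus constraint is meant to bound.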

A new approach to face detection in color images using a skin locus model and hierarchical filtering was developed. The method provides promising results in varying conditions, allowing the sizes and orientations of faces to differ in varying backgrounds and illumination. Approaches to face recognition were also investigated and some algorithms were implemented. In the search for ways to improve the face learning process, the properties of a new unsupervised learning approach, locally linear embedding, were investigated. The locally linear embedding algorithm also gave promising preliminary results in a visual inspection task.

In our research on grain size distribution determination, the purpose is to develop techniques for on-line grain size measurement in industrial processes. Methodology based on texture analysis and a non-segmenting approach is being developed. A new texture analysis method can efficiently discriminate different grain classes from each other.

In the research on plant vitality measurement, our concept of visualization-based user interfacing for inspection applications was adopted and further developed. With plants, the concept is used for investigating the effect of different features in spectral image segmentation. For observing the daily variations of plants, more test material was collected in a greenhouse environment.

In document image analysis, research on extracting text information from document images with complex colored and textured backgrounds was carried out. A simple texture-based method for text localization using edge information was proposed, as well as a method for finding horizontal and vertical text lines from complex binarized document images. New research concerning text extraction from scene images was begun.

In 2001, the research on image sequence analysis was focused on human tracking methods, transform domain video analysis, video coding, video surveillance, video indexing, video quality measurement, and geometric camera calibration.

Human tracking based on dominant motion extraction using particle filtering has been further developed. A real-time software module of this method was implemented for a PC platform as a part of a framework for a general purpose environment for visualizing and testing different tracking methods. Also, various regularization techniques have been explored for improving the spatial coherence of the human tracking results. The most prominent application area of this research is visual surveillance, but other applications, including video coding, have also been investigated.

In transform domain video analysis, a large number of image transforms have been studied for several purposes. The main interests have been in block motion estimation and in object recognition. For motion estimation, a novel solution based on a number theoretic transform has been developed. A highly optimized software module has been implemented for a general purpose processor that can be used in conjunction with a video encoder. A hardware implementation of this method is currently under development. Also, some other transforms have been found to be useful in motion estimation. For object recognition purposes, a completely new group of image transforms was proposed that can be invariant with respect to different linear transformations including affine transformation.
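For readers unfamiliar with block motion estimation, the problem itself can be sketched with conventional exhaustive block matching using the sum of absolute differences (SAD). This is a baseline shown for orientation only; it is not the number theoretic transform method described above, whose details are not given in this report. The function names, frame contents and search range are assumptions.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """Sum of absolute differences between a block in the current frame
    and the block displaced by (dy, dx) in the reference frame."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[top + y][left + x]
                         - ref[top + dy + y][left + dx + x])
    return total


def best_motion_vector(cur, ref, top, left, size, search):
    """Exhaustively test all displacements within +/- search pixels and
    return (dy, dx, cost) for the displacement with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the frame.
            if top + dy < 0 or left + dx < 0:
                continue
            if top + dy + size > height or left + dx + size > width:
                continue
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return (best[1], best[2], best[0])


# Synthetic 8x8 reference frame, and a current frame in which the content
# has moved two rows down and one column right (made-up test data).
ref_frame = [[10 * y + x for x in range(8)] for y in range(8)]
cur_frame = [[ref_frame[y - 2][x - 1] if y >= 2 and x >= 1 else 0
              for x in range(8)] for y in range(8)]

vector = best_motion_vector(cur_frame, ref_frame, top=3, left=3,
                            size=2, search=2)
```

The cost of this exhaustive search grows with the square of the search range, which is the kind of computational load that motivates faster formulations of motion estimation.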

In video coding, a content based rate control technique has been developed which is able to utilize classification between foreground and background objects for adjusting the compression quality according to the corresponding priorities and the channel capacity. The new solution is compatible with most of the current video coding standards and it is also transparent for the decoder, and thus, only the encoder has to be modified.

The research carried out in video surveillance is closely related to video coding work. The main objective in this area is to develop solutions for wireless video surveillance systems. As a result, a framework for a stand-alone system has been produced, and a test system has been implemented with capabilities for preselecting, compressing, and wirelessly transmitting high resolution video, for example, from a moving vehicle to a remote control centre.

In video indexing, a Java based software package Video Data Mining System (VDMS) has been implemented for content based video retrieval. The system is capable of analyzing MPEG video clips based on certain feature data that are computed directly from the video stream. Different types of videos, such as music, sports and news, are then indexed using this information, and the user can make interactive visual searches of these videos with the aid of a new kind of a user interface, which represents the video database as two dimensional maps where similar videos are located in proximity to each other.

In video quality measurement, a novel solution for analyzing the video transmission quality has been developed. It can be used for measuring the effects of the bit errors in digital video, and to indicate the needs for improving the level of error correction. The solution used does not require any reference or test video to be transmitted but it can basically utilize the normal transmission mode without knowing the original source. The tests have shown that the quality measure obtained is comparable with the traditional signal to noise ratio measurement using known source data.

Camera calibration methodology has also progressed during this period. A new calibration procedure based on a coplanar target model and a linear algorithm has been proposed. A common problem of camera calibration in industrial applications is the need for large calibration targets that are difficult to make and maintain in a known three dimensional shape. Using this new method it is possible to utilize targets where the control points are located on a single plane, and still to achieve good precision of the camera parameters in a short period of time, whereas the traditional methods would have required much more computation.

Intelligent systems

The research on intelligent systems concentrates on context-aware mobile systems, intelligent service robots, neural network methodology for signal analysis, and analysis of biomedical ECG and EEG signals. The research on context-aware systems focuses on software architecture, 3-D sensing technology, machine vision, context recognition and control methods. Distributed software architecture is being developed for mobile context-aware systems. The architecture offers well defined and reusable interfaces for different resources like sensors, actuators, computing devices and user interfaces. The location and implementation of a software component is transparent for the rest of the system.

The architecture is being applied both to systems controlling mobile service robots and to ubiquitous systems serving mobile users. In 2001, a general agent-based architecture, Genie of the Net, for managing services on behalf of the user was developed. A first prototype, the Family Calendar, was completed and demonstrated. This work was done in co-operation with VTT Electronics. The work on mobile robots continued on basic robot resources like motion control and vision. This was realized in a new, Finnish Academy funded project "Robots Serving Humans". The goal of this project is to develop methods and components for the next generation of intelligent service robots. In this project, we emphasize the role of robust senses (especially color vision) and learning (self-organization). The main application, and thus the test-bed for the developed methods, is intelligent telepresence, where the robot is a semi-autonomous agent providing a convenient way for a human user to monitor and access remote environments.

Evolutionary methods to evolve neural controllers for a mobile robot were studied. An approach called Evolutionary Neuron Migration was used to evolve neural control structures for the mobile robot. The neural structures were able to solve real problems in perception and control. The feasibility of the approach was demonstrated by evolving robust navigation behavior for a real robot.

A robust color tracking system has also been developed for a mobile robot, utilizing the expertise of the group in color vision. The system is capable of tracking objects of different colors in various lighting conditions. With this tracking system, a mobile system can focus its attention on the colored objects in the scene, and also track a person based on skin color. The first successful experiments were performed with the color tracking system to control a soccer robot. The robot was able to find a moving ball and approach it. An article about the color tracking system was published in the IEEE International Conference on Robotics and Automation.

Previous work on situated control architecture was continued. Our control system for playing simulated soccer acted as a qualifying competitor for RoboCup'99. A student group participated in the simulation league of the fifth world cup of robotic soccer, RoboCup'01, held in the USA. In this competition, the games of the simulation league were shown to the audience using the 3D visualization software developed by the student group.

The distributed software architecture was further developed. A common interface was developed for all robotics resources. Furthermore, methods to dynamically construct state machines for robot controllers were studied.

New data mining methods were developed for steel plant quality control. Data mining contains three closely related workflows: data preprocessing, steel strip quality prediction, and furnace control data study. The data preprocessing aims at finding variables and data items that make a significant contribution to defective coils. The methods for this purpose were statistical analysis, visualization of parallel coordinates, k-means clustering and self-organizing maps. The steel strip quality study utilized neural networks for predicting the temperatures and dimensions of steel strips after the finishing mill. The modeling was done with the information available before the finishing mill phase. In the furnace control data study, an adaptive neural network model was built for predicting the post roughing mill temperature of the pre-rolled steel strip. The furnace control software was written based on the results of the study. The development of an intelligent furnace control system continues.

Research into ECG signals has been conducted. The main goal in the project is to study heart rate variability of patients and healthy people in order to develop methods for automatic diagnosis and prognosis for clinical use. The medical part of the research is performed in the Oulu University Hospital, while our group is responsible for signal analysis algorithms. The work is highly international, including universities from Europe and the USA. Due to the success in the scientific research, the ECG team was granted the 3-year status of Quality Research Unit of the University of Oulu. This enabled the opening up of a new research area, the analysis of T-wave dynamics and morphology. During 2001, three journal papers on heart rate variability methods were published, and four journal papers were submitted for evaluation.
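As an illustration of what heart rate variability analysis computes, the sketch below derives three standard time domain measures from a series of RR intervals (the times between successive heartbeats). These are textbook measures; the report does not state which measures the group's methods rely on, and the interval values are invented for the example.

```python
import math


def mean_heart_rate(rr_ms):
    """Mean heart rate in beats per minute from RR intervals in ms."""
    return 60000.0 / (sum(rr_ms) / len(rr_ms))


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals (overall variability)."""
    mean = sum(rr_ms) / len(rr_ms)
    variance = sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1)
    return math.sqrt(variance)


def rmssd(rr_ms):
    """Root mean square of successive RR differences (beat-to-beat
    variability)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Made-up RR intervals in milliseconds, far shorter than a real recording.
rr_intervals = [800, 810, 790, 800]

heart_rate = mean_heart_rate(rr_intervals)
overall_variability = sdnn(rr_intervals)
beat_to_beat_variability = rmssd(rr_intervals)
```

Measures of this kind are sensitive to abnormal beats in the interval series, since a single misplaced beat produces two large successive differences, so such beats are normally edited before analysis.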

Research on EEG signals was launched in 1999. The goal of this study is to develop methods for automatic assessment of the depth of anesthesia and physical reaction to pain in clinical operations. The work is performed in cooperation with the Oulu University Hospital and our EEG team is responsible for algorithm development. During 2001, research was directed towards including other types of biosignals in the analysis system and performing sensor fusion to better estimate the above mentioned parameters. One journal article was published and two were accepted for publication during the year.


Exploitation of Results

The results of our research were applied to real-world problems in many projects, often in collaboration with industrial and other partners. Some examples of exploitation are described below.

In the research on wood surface inspection, a software package demonstrating the visualization-based user interfacing was delivered to some inspection system manufacturers and end users. The concept will be integrated into inspection devices. The approach provides easy and reliable training and maintenance of an inspection system classifier, and is suitable for real-time implementation.

Inspection methods developed by the group have been used in the wood grading system being developed for VTT Building Technology, and the non-supervised learning and user-interface approach developed by the group is used in the 3-D steel surface inspection systems made by Thermo Radiometrie.

The European Esprit project called "Intraoperative real-time visualization and instrument tracking for MRI" (IRVIT) carried out in 1998-99 demonstrated the feasibility of an interactive MRI intervention system, which features real-time visual feedback. A follow-up project "Advanced Minimally Invasive Therapy Using MRI" (AMIT) targeting clinical trials was in progress during the reporting period. A new proposal on exploiting part of the results in the form of open software was in preparation.

The group's expertise in robotics was applied in developing a mobile robot for domestic help. A teleoperated robot serves as the remote eyes of the elderly and those who take care of them. During the reporting period, the main task was to develop teleoperation capabilities for the robot. A voice controlled service robot was successfully demonstrated. The purpose of the robot is to assist elderly people in their homes and provide a communication link to the health care personnel. A design project was launched with the University of Lapland to further develop the appearance of the robot, and make it suitable for various applications and research studies regarding human-robot interaction.

The development of a wheeled walking aid offering services based on modern information technology has continued during 2001. The project is in co-operation with the University of Lapland. The services envisaged for such a system include video calls, safety monitoring and controlling the environment.


Future Goals

We will continue to strengthen our long term research and researcher training. We will also continuously seek opportunities for exploitation of our research results by collaborating with partners from industry and other research institutions in national and international research programs and projects. In order to further sharpen its focus, the MVIS group has been recently divided into two separate but closely cooperating groups: Machine Vision and Intelligent Systems. These groups together with our earlier "spin-off" group MediaTeam Oulu form the Machine Vision and Media Processing Unit (MVMP). The MVMP unit coordinates research and researcher training activities between its groups and has some joint facilities and support staff.



professors & doctors
graduate students
person years



External Funding

Source                            EUR
Academy of Finland            410 700
Ministry of Education         302 400
                              780 700
domestic private              355 000
EU + other international      180 100
Total                       2 029 000


Doctoral Theses

Alakuijala J (2001) Algorithms for modeling anatomic and target volumes in image-guided neurosurgery and radiotherapy. Acta Universitatis Ouluensis C 165.


Selected Publications

Heikkilä J (2001) A multi-view camera calibration method for coplanar targets. Proc. 12th Scandinavian Conference on Image Analysis, June 11-14, Bergen, Norway, 409-414.

Koskinen M, Seppänen T, Tuukkanen J, Yli-Hankala A & Jäntti V (2001) Propofol anaesthesia induces phase synchronization changes in EEG. Clinical Neurophysiology 112, 386-392.

Laurinen P, Röning J & Tuomela H (2001) Steel slab temperature modelling using neural and Bayesian networks. Fourth International ICSC Symposium on Soft Computing and Intelligent Systems for Industry, June 26-29, Paisley, Scotland, UK.

Martinkauppi B, Sangi P, Soriano M, Pietikäinen M, Huovinen S & Laaksonen M (2001) Illumination invariant face tracking with mean shift and skin locus. Proc. IEEE International Workshop on Cues in Communication (Cues 2001), December 9, Kauai, Hawaii, 44-49.

Niskanen M, Silvén O & Kauppinen H (2001) Experiments with SOM based inspection of wood. Proc. International Conference on Quality Control by Artificial Vision (QCAV 2001), May 21-23, Le Creusot, France, 2: 311-316.

Ojala T, Valkealahti K, Oja E & Pietikäinen M (2001) Texture discrimination with multidimensional distributions of signed gray level differences. Pattern Recognition 34(3): 727-739.

Pietikäinen M & Okun O (2001) Edge-based method for text detection from complex document images. Proc. Sixth International Conference on Document Analysis and Recognition (ICDAR 2001), September 10-13, Seattle, WA, USA, 286-291.

Pirilä-Parkkinen K, Pirttiniemi P, Alvesalo L, Silvén O, Heikkilä J & Osborne RH (2001) The relationship of handedness to asymmetry in the occlusal morphology of first permanent molars. European Journal of Morphology 39(2): 81-89.

Pirttikangas S, Riekki J, Kaartinen J, Miettinen J, Nissilä S & Röning J (2001) Genie of the Net: A New approach for a context-aware health club. Working notes of Workshop on Ubiquitous Data Mining of Mobile and Distributed Environments, September 7, Freiburg, Germany.

Pylkkö H, Riekki J & Röning J (2001) Real-time color-based tracking via a marker interface. Proc. IEEE International Conference on Robotics and Automation, May 21-26, Seoul, South Korea, 2: 1214-1219.

Salo MA, Seppänen T & Huikuri HV (2001) Ectopic beats in heart rate variability analysis: Effects of editing on time and frequency domain measures. Annals of Noninvasive Electrocardiology 6(1): 5-17.

Sangi P, Heikkilä J & Silvén O (2001) Extracting motion components from image sequences using particle filters. Proc. 12th Scandinavian Conference on Image Analysis, June 11-14, Bergen, Norway, 508-514.

Tulppo MP, Mäkikallio TH, Seppänen T, Shoemaker K, Tutungi E, Hughson RL & Huikuri HV (2001) Effects of pharmacological adrenergic and vagal modulation on fractal and complexity properties of heart rate dynamics. Clin Phys 21(5): 515-523.