Publications by Rita Cucchiara

Explore our research publications: papers, articles, and conference proceedings from AImageLab.

Multimedia Surveillance Systems

Authors: Cucchiara, Rita

The integration of video technology and sensor networks constitutes the fundamental infrastructure for new generations of multimedia surveillance systems, where many different media streams (audio, video, images, textual data, sensor signals) will concur to provide an automatic analysis of the controlled environment and a real-time interpretation of the scene. New solutions can be devised to enlarge the view of traditional surveillance systems by means of distributed architectures with fixed and active cameras, to enhance their view with other sensed data, and to explore multi-resolution views with zooming and omnidirectional cameras. Applications regard surveillance of wide indoor and outdoor areas and particularly people surveillance: in this case, multimedia surveillance systems can be enriched with biometric technology; the best views of detected persons and their extracted visual features (e.g. faces, voices, trajectories) can be exploited for people identification. VSSN05 is the third edition of the workshop, co-located with the ACM Multimedia Conference, that embraces research reports on video surveillance and, since the 2004 edition, sensor networks. This paper gives a short overview of the hot topics in multimedia surveillance systems and introduces some research activities currently under way worldwide and presented at VSSN05.

2005 Relazione in Atti di Convegno

Posture Classification in a Multi-camera Indoor Environment

Authors: Cucchiara, R.; Prati, A.; Vezzani, R.

Published in: PROCEEDINGS - INTERNATIONAL CONFERENCE ON IMAGE PROCESSING

Posture classification is a key process for analyzing people's behaviour. Computer vision techniques can be helpful in automating this process, but cluttered environments and consequent occlusions often make this task difficult. Different views provided by multiple cameras can be exploited to solve occlusions by warping known object appearance into the occluded view. To this aim, this paper describes an approach to posture classification based on projection histograms, reinforced by an HMM for assuring temporal coherence of the posture. The single-camera posture classification is then exploited in the multi-camera system to solve the cases in which the occlusions make the classification impossible. Experimental results of the classification from both the single-camera and the multi-camera system are provided.

2005 Relazione in Atti di Convegno

Probabilistic posture classification for human-behavior analysis

Authors: Cucchiara, Rita; Grana, Costantino; Prati, Andrea; Vezzani, Roberto

Published in: IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS

Computer vision and ubiquitous multimedia access nowadays make feasible the development of a mostly automated system for human-behavior analysis. In this context, our proposal is to analyze human behaviors by classifying the posture of the monitored person and, consequently, detecting corresponding events and alarm situations, such as a fall. To this aim, our approach can be divided into two phases: for each frame, the projection histograms (Haritaoglu et al., 1998) of each person are computed and compared with the probabilistic projection maps stored for each posture during the training phase; then, the obtained posture is further validated by exploiting the information extracted by a tracking module, in order to take into account the reliability of the classification of the first phase. Moreover, the tracking algorithm is used to handle occlusions, making the system particularly robust even in indoor environments. Extensive experimental results demonstrate a promising average accuracy of more than 95% in correctly classifying human postures, even under challenging conditions.
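The projection-histogram idea referenced above (Haritaoglu et al., 1998) can be sketched in a few lines: project the binary silhouette onto its rows and columns, then match the resulting histograms against per-posture templates. This is a minimal illustration of the general technique, not the paper's implementation; the L1 matching, the template names, and the toy silhouettes are all illustrative assumptions.

```python
# Minimal sketch of posture classification via projection histograms.
# Assumptions: L1 distance for matching, two toy posture templates.

def projection_histograms(mask):
    """Row and column sums of a binary silhouette mask (list of 0/1 lists)."""
    rows = [sum(r) for r in mask]
    cols = [sum(c) for c in zip(*mask)]
    return rows, cols

def l1(a, b):
    """L1 distance between two equal-length histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

def classify(mask, templates):
    """Pick the posture whose stored projection maps are closest."""
    rows, cols = projection_histograms(mask)
    return min(templates,
               key=lambda p: l1(rows, templates[p][0]) + l1(cols, templates[p][1]))

# Toy 4x4 silhouettes: a tall "standing" blob vs. a wide "lying" blob.
standing = [[0, 1, 1, 0]] * 4
lying = [[0] * 4, [1] * 4, [1] * 4, [0] * 4]
templates = {
    "standing": projection_histograms(standing),
    "lying": projection_histograms(lying),
}
print(classify([[0, 1, 1, 0]] * 4, templates))  # matches the standing template
```

In the paper this matching is probabilistic (maps learned during training) and the per-frame decision is further validated by the tracking module; the sketch only shows the frame-level histogram comparison.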

2005 Articolo su rivista

Real Time Semantic Adaptation of Sports Video with User-centred Performance Analysis

Authors: Bertini, M.; Cucchiara, Rita; Del Bimbo, A.; Prati, Andrea

Semantic video adaptation improves traditional adaptation by taking into account the degree of relevance of the different portions of the content. It employs solutions to detect the significant parts of the video and applies different compression ratios to elements of different importance. The performance of semantic adaptation depends heavily on the quality and precision of the automatic annotation, on whether it operates in strict or non-strict real time, and on the codec used to perform adaptation at the event or object level. It should consider the effects of errors in the automatic extraction of objects and events on the operation of the adaptation subsystem, and relate these effects to the user-defined preferences for the objects and events of the video program. In this paper, we present strict real-time annotation and adaptation of sports video and introduce two new performance measures, Viewing Quality Loss and Bit-rate Cost Increase, which are obtained from the classical PSNR and bit ratio but relate the results of semantic adaptation to the user's preferences and expectations.

2005 Relazione in Atti di Convegno

Shot detection and motion analysis for automatic MPEG-7 annotation of sports videos

Authors: Tardini, Giovanni; Grana, Costantino; Marchi, R.; Cucchiara, Rita

Published in: LECTURE NOTES IN COMPUTER SCIENCE

In this paper we describe general algorithms devised for MPEG-7 automatic annotation of Formula 1 videos, and in particular for camera-car shot detection. We employed a shot detection algorithm suitable for detecting cuts and linear transitions, which is able to precisely locate both the transition's center and length. Statistical features based on MPEG motion compensation vectors are then employed to provide motion characterization, using a subset of the motion types defined in MPEG-7, and shot type classification. Results on shot detection and classification are provided.

2005 Relazione in Atti di Convegno

Shot Detection for Formula 1 Video Digital Libraries

Authors: Cucchiara, Rita; Grana, Costantino; Tardini, Giovanni

Metadata extraction is one of the first tasks to be performed for automatic Digital Library annotation, and in particular shot detection has been widely explored in the literature. While many methods have been proposed for the detection of abrupt cuts, only a small number of them have explicitly addressed the problem of gradual transitions. In this paper we propose an algorithm that exploits a precise model of linear transitions. Experimental results on Formula 1 car race videos show the robustness of this method. These test videos are characterized by extreme situations such as fast camera and object motion and very different kinds of shots. The algorithm is able to estimate the exact length of the transition, and an error score is also given as a fitness measure to the linear model, to discriminate true transitions from false detections. The final shot segmentation is delivered as MPEG-7 compliant output.
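The linear-transition model above can be illustrated with a toy sketch: during a dissolve, a frame-level feature (here, mean intensity) moves roughly linearly from the value of the outgoing shot to that of the incoming one, so a least-squares line fit over a candidate window yields a small residual, which can serve as the fitness/error score. The feature choice, window, and thresholds below are illustrative assumptions, not the paper's actual algorithm.

```python
# Hedged illustration of a linear-transition fitness score: fit a line to a
# frame-level feature over a candidate window; a small mean residual suggests
# the window matches the linear dissolve model.

def linear_fit_error(values):
    """Least-squares line fit over evenly spaced samples; mean abs residual."""
    n = len(values)
    xs = list(range(n))
    mx = sum(xs) / n
    my = sum(values) / n
    denom = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, values)) / denom
    icept = my - slope * mx
    return sum(abs(y - (slope * x + icept)) for x, y in zip(xs, values)) / n

dissolve = [10, 20, 30, 40, 50, 60]  # feature ramps linearly: fits the model
hard_cut = [10, 10, 10, 55, 55, 55]  # abrupt jump: poor fit to a single line
print(linear_fit_error(dissolve))    # 0.0
print(linear_fit_error(hard_cut))    # large residual: rejected as a dissolve
```

A detector would slide such a window over the sequence and accept candidates whose error stays below a threshold, also reading off the transition length from the extent of the linear ramp.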

2005 Relazione in Atti di Convegno

T_PARK: Ambient Intelligence for Security in Public Parks

Authors: Cucchiara, Rita; Prati, Andrea; Benini, L.; Farella, E.

Published in: IEE CONFERENCE PUBLICATION

In this paper, we present joint research activities in computer vision and sensor networks for distributed surveillance of urban parks. Distributed visual surveillance of urban environments is one of the most interesting scenarios in Ambient Intelligence; moreover, the automated monitoring of public parks, often crowded with children and adults, is still a very difficult task due to the number of objects of interest. In this context, integrating the power of low-cost sensors with the information provided by cameras can lead to a more reliable solution to people tracking in wide areas. Specifically, the deficiencies of one approach can be (at least partially) covered by the advantages of the other. The goal is to perform people tracking in parks (to achieve trackable parks - T-Parks), both in zones covered by overlapped cameras and also, thanks to sensors, in areas not covered by any camera. In this paper, we propose a new technique for multi-camera people tracking based on a learning phase to automatically calibrate pairs of cameras and to build Areas of Field of View (AoFoVs) in order to establish consistent labelling of people. In addition, sensor networks distributed at the borders of the AoFoVs give an estimate of the probability of people overlapping, triggering specific algorithms of face detection or head counting to identify the single person. The research on T-Parks is part of a two-year Italian project called LAICA, intended to provide advanced services for citizens and public officers based on ambient intelligence technologies.

2005 Relazione in Atti di Convegno

Video Annotation with Pictorially Enriched Ontologies

Authors: Torniai, C.; Del Bimbo, A.; Cucchiara, Rita; Bertini, M.

Video annotation is typically performed by classifying video elements according to some pre-defined ontology of the video content domain. Ontologies are defined by establishing relationships between linguistic terms that specify domain concepts at different abstraction levels. However, although linguistic terms are appropriate to distinguish event and object categories, they are inadequate when they must describe specific patterns of events or video entities. Instead, in these cases, pattern specifications are better expressed through visual prototypes that capture the essence of the event or entity. Pictorially enriched ontologies, which include visual concepts together with linguistic keywords, are therefore needed to support video annotation up to the level of detail of pattern specification. This paper presents pictorially enriched ontologies and provides a solution for their implementation in the soccer video domain. The pictorially enriched ontology is used both to directly assign multimedia objects to concepts, providing a more meaningful definition than the linguistic terms, and to extend the initial knowledge of the domain, adding subclasses of highlights or new highlight classes that were not defined in the linguistic ontology. Automatic annotation of soccer clips up to the pattern specification level using a pictorially enriched ontology is discussed.

2005 Relazione in Atti di Convegno

Video understanding and content-based retrieval

Authors: Zhai, Y.; Liu, J.; Cao, X.; Basharat, A.; Hakeem, A.; Ali, S.; Shah, M.; Grana, Costantino; Cucchiara, Rita

This year, the joint team of UCF and the University of Modena has participated in the following tasks: (1) shot boundary detection, (2) low-level feature extraction, (3) high-level feature extraction, (4) topic search and (5) BBC rushes management. The shot boundary detection was contributed by the Image Lab at the University of Modena. The other tasks were performed by the Computer Vision Team at UCF.

2005 Relazione in Atti di Convegno

Page 51 of 51 • Total publications: 509