Browsing by Subject "Computer Vision"
Now showing 1 - 3 of 3
Results Per Page
Sort Options
Item Activity retrieval in closed captioned videos(2009-08) Gupta, Sonal; Mooney, Raymond J. (Raymond Joseph); Grauman, KristenRecognizing activities in real-world videos is a difficult problem exacerbated by background clutter, changes in camera angle & zoom, occlusion and rapid camera movements. Large corpora of labeled videos can be used to train automated activity recognition systems, but this requires expensive human labor and time. This thesis explores how closed captions that naturally accompany many videos can act as weak supervision that allows automatically collecting 'labeled' data for activity recognition. We show that such an approach can improve activity retrieval in soccer videos. Our system requires no manual labeling of video clips and needs minimal human supervision. We also present a novel caption classifier that uses additional linguistic information to determine whether a specific comment refers to an ongoing activity. We demonstrate that combining linguistic analysis and automatically trained activity recognizers can significantly improve the precision of video retrieval.Item Realization Of A Spatial Augmented Reality System - A Digital Whiteboard Using a Kinect Sensor and a PC Projector(2013-05-01) Kolomenski, Andrei ARecent rapid development of cost-effective, accurate digital imaging sensors, high-speed computational hardware, and tractable design software has given rise to the growing field of augmented reality in the computer vision realm. The system design of a 'Digital Whiteboard' system is presented with the intention of realizing a practical, cost-effective and publicly available spatial augmented reality system. A Microsoft Kinect sensor and a PC projector coupled with a desktop computer form a type of spatial augmented reality system that creates a projection based graphical user interface that can turn any wall or planar surface into a 'Digital Whiteboard'. The system supports two kinds of user inputs consisting of depth and infra-red information. An infra-red collimated light source, like that of a laser pointer pen, serves as a stylus for user input. The user can point and shine the infra-red stylus on the selected planar region and the reflection of the infra-red light source is registered by the system using the infra-red camera of the Kinect. Using the geometric transformation between the Kinect and the projector, obtained with system calibration, the projector displays contours corresponding to the movement of the stylus on the 'Digital Whiteboard' region, according to a smooth curve fitting algorithm. The described projector-based spatial augmented reality system provides new unique possibilities for user interaction with digital content.Item Two Case Studies on Vision-based Moving Objects Measurement(2012-10-19) Zhang, JiIn this thesis, we presented two case studies on vision-based moving objects measurement. In the first case, we used a monocular camera to perform ego-motion estimation for a robot in an urban area. We developed the algorithm based on vertical line features such as vertical edges of buildings and poles in an urban area, because vertical lines are easy to be extracted, insensitive to lighting conditions/shadows, and sensitive to camera/robot movements on the ground plane. We derived an incremental estimation algorithm based on the vertical line pairs. We analyzed how errors are introduced and propagated in the continuous estimation process by deriving the closed form representation of covariance matrix. Then, we formulated the minimum variance ego-motion estimation problem into a convex optimization problem, and solved the problem with the interior-point method. The algorithm was extensively tested in physical experiments and compared with two popular methods. Our estimation results consistently outperformed the two counterparts in robustness, speed, and accuracy. In the second case, we used a camera-mirror system to measure the swimming motion of a live fish and the extracted motion data was used to drive animation of fish behavior. The camera-mirror system captured three orthogonal views of the fish. We also built a virtual fish model to assist the measurement of the real fish. The fish model has a four-link spinal cord and meshes attached to the spinal cord. We projected the fish model into three orthogonal views and matched the projected views with the real views captured by the camera. Then, we maximized the overlapping area of the fish in the projected views and the real views. The maximization result gave us the position, orientation, and body bending angle for the fish model that was used for the fish movement measurement. Part of this algorithm is still under construction and will be updated in the future.