Advanced Computer Vision
1. Object Recognition I
Linda Shapiro, EE/CSE 576
2. Low- to High-Level
[Figure: low-level: edge image; mid-level: consistent line clusters; high-level: building recognition]
3. High-Level Computer Vision
• Detection of classes of objects (faces, motorbikes, trees, cheetahs) in images
• Recognition of specific objects such as George Bush or machine part #45732
• Classification of images or parts of images for medical or scientific applications
• Recognition of events in surveillance videos
• Measurement of distances for robotics
4. High-level vision uses techniques from AI
• Graph-Matching: A*, Constraint Satisfaction, Branch and Bound Search, Simulated Annealing
• Learning Methodologies: Decision Trees, Neural Nets, SVMs, EM Classifier
• Probabilistic Reasoning, Belief Propagation, Graphical Models
5. Graph Matching for Object Recognition
• For each specific object, we have a geometric model.
• The geometric model leads to a symbolic model in terms of image features and their spatial relationships.
• An image is represented by all of its features and their spatial relationships.
• This leads to a graph matching problem.
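As a minimal sketch of the graph matching formulation on this slide, the Python below enumerates one-to-one mappings from model features to image features that preserve a single binary relation. The names `symmetric` and `find_mappings`, and the toy segments S1 to S3 and Sa to Sd, are hypothetical and are not taken from the lecture. A practical matcher would also compare feature attributes and tolerate missing features.

```python
def symmetric(pairs):
    """Return the pairs plus their reversals (an undirected relation)."""
    return set(pairs) | {(b, a) for a, b in pairs}


def find_mappings(model_nodes, model_rel, image_nodes, image_rel):
    """Enumerate injective mappings f: model -> image such that every
    related model pair (a, b) maps to a related image pair (f[a], f[b])."""
    results = []

    def consistent(f, m, i):
        # Check only model pairs that involve m and an already-mapped node.
        for a, b in model_rel:
            if a == m and b in f and (i, f[b]) not in image_rel:
                return False
            if b == m and a in f and (f[a], i) not in image_rel:
                return False
        return True

    def extend(f, remaining):
        if not remaining:
            results.append(dict(f))
            return
        m = remaining[0]
        for i in image_nodes:
            if i in f.values():  # keep the mapping one-to-one
                continue
            if consistent(f, m, i):
                f[m] = i
                extend(f, remaining[1:])
                del f[m]  # backtrack

    extend({}, list(model_nodes))
    return results


# Toy data (hypothetical): a 3-segment chain in the model,
# a 4-segment chain in the image.
model_nodes = ["S1", "S2", "S3"]
model_rel = symmetric([("S1", "S2"), ("S2", "S3")])
image_nodes = ["Sa", "Sb", "Sc", "Sd"]
image_rel = symmetric([("Sa", "Sb"), ("Sb", "Sc"), ("Sc", "Sd")])

mappings = find_mappings(model_nodes, model_rel, image_nodes, image_rel)
for f in mappings:
    print(f)
```

Plain backtracking is used here only because it is short; the search techniques the lecture lists (A*, constraint satisfaction, branch and bound) prune the same search space more aggressively.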
6. House Example
[Figure: 2D model (segment set P) and 2D image (segment set L)]
RP and RL are connection relations.
f(S1)=Sj, f(S2)=Sa, f(S3)=Sb, f(S4)=Sn, f(S5)=Si, f(S6)=Sk, f(S7)=Sg, f(S8)=Sl, f(S9)=Sd, f(S10)=Sf, f(S11)=Sh
7. But this is too simplistic
• The model specifies all the features of the object that may appear in the image.
• Some of them don’t appear at all, due to occlusion or failures at low or mid level.
• Some of them are broken and not recognized.
• Some of them are distorted.
• Relationships don’t all hold.
8. TRIBORS: view class matching of polyhedral objects
[Figure panels: edges from image; model overlaid; improved location]
• A view class is a typical 2D view of a 3D object.
• Each object had 4-5 view classes (hand selected).
• The representation of a view class for matching included:
  - triplets of line segments visible in that class
  - the probability of detectability of each triplet
The first version of this program used iterative-deepening A* search.
9. RIO: Relational Indexing for Object Recognition
• RIO worked with more complex parts that could have
  - planar surfaces
  - cylindrical surfaces
  - threads
10. Object Representation in RIO
• 3D objects are represented by a 3D mesh and set of 2D view classes.
• Each view class is represented by an attributed graph whose nodes are features and whose attributed edges are relationships.
• For purposes of indexing, attributed graphs are stored as sets of 2-graphs, graphs with 2 nodes and 2 relationships.
[Figure: example 2-graph with nodes "ellipse" and "coaxial arc cluster", related by "share an arc"]
11. RIO Features
• ellipses
• coaxials
• coaxials-multi
• parallel lines (close and far)
• junctions (L, V, Y, Z, U)
• triples
12. RIO Relationships
• share one arc
• share one line
• share two lines
• coaxial
• close at extremal points
• bounding box encloses / enclosed by
13. Hexnut Object
How are 1, 2, and 3 related?
What other features and relationships can you find?
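14. Graph and 2-Graph Representations
[Figure: attributed graph with nodes 1 (coaxials-multi), 2 (ellipse), and 3 (parallel lines), with edges labeled "encloses" and "coaxial", shown next to its 2-graphs, whose edges are labeled e (encloses) and c (coaxial)]
RDF!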
15. Relational Indexing for Recognition
Preprocessing (off-line) phase
for each model view Mi in the database
• encode each 2-graph of Mi to produce an index
• store Mi and associated information in the indexed bin of a hash table H
16. Matching (on-line) phase
1. Construct a relational (2-graph) description D for the scene
2. For each 2-graph G of D
   • encode it, producing an index to access the hash table H
   • cast a vote for each Mi in the associated bin
3. Select the Mi’s with high votes as possible hypotheses
4. Verify or disprove via alignment, using the 3D meshes
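The off-line and on-line phases can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the RIO implementation: each 2-graph is reduced to a hypothetical `(feature, relation, feature)` tuple, the tuple itself serves as the hash index, and the model view names and their contents are invented. Verification by alignment (step 4) is not shown.

```python
from collections import Counter, defaultdict


def encode(two_graph):
    """Turn a 2-graph into a hashable index (assumed encoding)."""
    feature_a, relation, feature_b = two_graph
    return (feature_a, relation, feature_b)


def build_table(model_views):
    """Off-line phase: store each model view in the bin of each of its 2-graphs."""
    table = defaultdict(set)
    for view_name, two_graphs in model_views.items():
        for g in two_graphs:
            table[encode(g)].add(view_name)
    return table


def vote(table, scene_two_graphs):
    """On-line phase: each scene 2-graph votes for every view in its bin."""
    votes = Counter()
    for g in scene_two_graphs:
        for view_name in table.get(encode(g), ()):
            votes[view_name] += 1
    return votes


# Hypothetical model database with two view classes.
model_views = {
    "hexnut_view1": [
        ("coaxials-multi", "encloses", "ellipse"),
        ("ellipse", "encloses", "parallel lines"),
        ("parallel lines", "coaxial", "ellipse"),
    ],
    "bracket_view1": [
        ("parallel lines", "share one line", "junction"),
        ("ellipse", "encloses", "parallel lines"),
    ],
}
table = build_table(model_views)

# Hypothetical scene description D; the last 2-graph matches no model.
scene = [
    ("coaxials-multi", "encloses", "ellipse"),
    ("ellipse", "encloses", "parallel lines"),
    ("junction", "close at extremal points", "ellipse"),
]
votes = vote(table, scene)
print(votes.most_common())
```

Views with high vote counts become hypotheses; because a scene 2-graph can land in a bin shared by several views, the votes only rank candidates and do not confirm them.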
17. The Voting Process
18. RIO Verifications
1. The matched features of the hypothesized object are used to determine its pose.
2. The 3D mesh of the object is used to project all its features onto the image.
3. A verification procedure checks how well the object features line up with edges on the image.
[Figure: example of an incorrect hypothesis]
19. Use of classifiers is big in computer vision today.
• 2 Examples:
  – Rowley’s Face Detection using neural nets
  – Yi’s image classification using EM
20. Object Detection: Rowley’s Face Finder
1. convert to gray scale
2. normalize for lighting
3. histogram equalization
4. apply neural net(s) trained on 16K images
What data is fed to the classifier?
32 x 32 windows in a pyramid structure
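To make "windows in a pyramid structure" concrete, here is a rough sketch that enumerates the window positions a detector would visit. Only the 32 x 32 window size comes from the slide; the function name `pyramid_windows`, the stride of 16 pixels, and the downscale factor of 2.0 between levels are assumptions chosen for illustration. The sketch computes coordinates only and does not resample pixels or run a classifier.

```python
def pyramid_windows(width, height, window=32, stride=16, scale=2.0):
    """Yield (level, x, y, factor) for every window in an image pyramid.

    `factor` is how much the original image was shrunk at that level, so a
    window at (x, y) covers roughly a (window * factor)-pixel square whose
    corner is at (x * factor, y * factor) in the original image.
    """
    level = 0
    factor = 1.0
    w, h = width, height
    while w >= window and h >= window:
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                yield level, x, y, factor
        level += 1
        factor *= scale
        w, h = int(width / factor), int(height / factor)


# A 64 x 64 image gives a 3 x 3 grid of windows at full resolution
# and a single window at half resolution.
windows = list(pyramid_windows(64, 64))
print(len(windows))
```

Scanning a fixed-size window over progressively smaller copies of the image is what lets a classifier trained at one size find faces of different sizes.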
21. Object Class Recognition using Images of Abstract Regions
Yi Li, Jeff A. Bilmes, and Linda G. Shapiro
Department of Computer Science and Engineering
Department of Electrical Engineering
University of Washington
22. Problem Statement
Given: Some images and their corresponding descriptions
{trees, grass, cherry trees} {cheetah, trunk} {mountains, sky} {beach, sky, trees, water}
To solve: What object classes are present in new images?
23. Image Features for Object Recognition
• Color
• Texture
• Structure
• Context
24. Abstract Regions
[Figure: original images with their color regions, texture regions, and line clusters]
25. Abstract Regions
Multiple segmentations whose regions are not labeled; a list of labels is provided for each training image.
[Figure: an image, its various different segmentations, and the region attributes from several different types of regions; labels: {sky, building}]
26. Model Initial Estimation
• Estimate the initial model of an object using all the region features from all images that contain the object
[Figure: initial models for Tree and Sky]
27. EM Classifier: the Idea
[Figure: EM refines the initial model for “trees” into the final model for “trees”, and the initial model for “sky” into the final model for “sky”]
28. EM Algorithm
• Start with K clusters, each represented by a probability distribution
• Assuming a Gaussian or Normal distribution, each cluster is represented by its mean and variance (or covariance matrix) and has a weight.
• Go through the training data and soft-assign it to each cluster. Do this by computing the probability that each training vector belongs to each cluster.
• Using the results of the soft assignment, recompute the parameters of each cluster.
• Perform the last 2 steps iteratively.
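The loop described on this slide can be sketched for the 1-D Gaussian case in plain Python. This is a minimal illustration, not the classifier from the paper: the function names, the synthetic data, the fixed iteration count, and the small variance floor are all assumptions. The starting means are hand-picked instead of random so that the run is reproducible.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_1d(data, means, variances, weights, iterations=50):
    """Fit a 1-D Gaussian mixture by alternating soft assignment and re-estimation."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # Soft assignment: probability that each point belongs to each cluster.
        resp = []
        for x in data:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # Recompute each cluster's parameters from the soft assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsed cluster
            weights[j] = nj / len(data)
    return means, variances, weights


# Synthetic data (assumption): two clusters centered near 0 and 10.
random.seed(0)
data = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(10, 1) for _ in range(200)]

means, variances, weights = em_1d(data, [1.0, 9.0], [1.0, 1.0], [0.5, 0.5])
print(means, variances, weights)
```

The two inner blocks correspond to the "soft-assign" and "recompute the parameters" bullets; a production version would stop when the parameters or the likelihood stop changing instead of running a fixed number of iterations.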
29. 1-D EM with Gaussian Distributions
• Each cluster Cj is represented by a Gaussian distribution N(μj, σj).
• Initialization: For each cluster Cj initialize its mean μj, variance σj², and weight αj.
  N(μ1, σ1)  N(μ2, σ2)  N(μ3, σ3)
  α1 = P(C1)  α2 = P(C2)  α3 = P(C3)
• With no other knowledge, use random means and variances and equal weights.