20_Models_For_Words

1. Computer vision: models, learning and inference. Chapter 20: Models for Visual Words

2. Visual words (Computer vision: models, learning and inference, ©2011 Simon J.D. Prince)
- Most models treat data as continuous, with a likelihood based on the normal distribution
- Visual words are a discrete representation of the image, with a likelihood based on the categorical distribution
- Useful for difficult tasks such as scene recognition and object recognition

3. Motivation: scene recognition

4. Structure
- Computing visual words
- Bag of words model
- Latent Dirichlet allocation
- Single author-topic model
- Constellation model
- Scene model
- Applications

5. Computing a dictionary of visual words
- For each of the I training images, select a set of J_i spatial locations (interest points or a regular grid).
- Compute a descriptor at each spatial location in each image.
- Cluster all of these descriptor vectors into K groups using a method such as the K-means algorithm (or others!).
- The means of the K clusters are used as the K prototype vectors in the dictionary.
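The clustering step can be sketched with a minimal k-means in NumPy. This is a toy illustration under invented data (the helper `build_dictionary` and the 2-D "descriptors" are not from the slides; real descriptors would typically be high-dimensional, e.g. SIFT):

```python
import numpy as np

def build_dictionary(descriptors, K, n_iters=20, seed=0):
    """Cluster descriptor vectors into K groups; the K cluster means
    form the dictionary of visual-word prototypes."""
    rng = np.random.default_rng(seed)
    # Initialise prototypes with K randomly chosen descriptors
    means = descriptors[rng.choice(len(descriptors), K, replace=False)].astype(float)
    for _ in range(n_iters):
        # Assign each descriptor to its nearest prototype (Euclidean)
        d2 = ((descriptors[:, None, :] - means[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        # Update each prototype to the mean of its assigned descriptors
        for k in range(K):
            if np.any(labels == k):
                means[k] = descriptors[labels == k].mean(axis=0)
    return means

# Toy example: 2-D "descriptors" drawn from two well-separated blobs
rng = np.random.default_rng(1)
descs = np.vstack([rng.normal(0, 0.1, (50, 2)),
                   rng.normal(5, 0.1, (50, 2))])
dictionary = build_dictionary(descs, K=2)
```

In practice one would use an optimised implementation (e.g. scikit-learn's KMeans) and far more than two clusters.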

6. Encoding images as visual words
- Select a set of J spatial locations in the image using the same method as for the dictionary.
- Compute the descriptor at each of the J spatial locations.
- Compare each descriptor to the K prototype descriptors in the dictionary.
- Assign to this location the discrete index of the closest word in the dictionary.
- End result: a discrete feature index together with its x and y position at each location.
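The nearest-prototype assignment can be sketched as follows (the helper name `encode_image`, the 3-word dictionary and the descriptors are invented for illustration):

```python
import numpy as np

def encode_image(descriptors, dictionary):
    """Assign each descriptor the index of the closest dictionary
    prototype, yielding one discrete word index per spatial location."""
    d2 = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

# Hypothetical 3-word dictionary and four image descriptors
dictionary = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
descs = np.array([[0.1, 0.0], [0.9, 0.1], [0.0, 0.8], [0.1, 0.1]])
words = encode_image(descs, dictionary)
# words[j] is the discrete feature index at spatial location j
```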

7. Structure
- Computing visual words
- Bag of words model
- Latent Dirichlet allocation
- Single author-topic model
- Constellation model
- Scene model
- Applications

8. Bag of words model
Key idea: abandon all spatial information and represent the image only by the relative frequency (histogram) of words from the dictionary. The likelihood of the observed words is a categorical distribution, with a separate parameter vector lambda_n for the n'th class.
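The histogram representation is easy to sketch (toy word indices, invented for illustration):

```python
import numpy as np

def bag_of_words(word_indices, K):
    """Discard the word positions and keep only the relative frequency
    (normalised histogram) of the K dictionary words."""
    hist = np.bincount(word_indices, minlength=K).astype(float)
    return hist / hist.sum()

# Six word tokens over a K = 4 word dictionary
words = np.array([0, 1, 2, 0, 0, 1])
h = bag_of_words(words, K=4)
```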

9. Bag of words

10. Review: categorical distribution
The categorical distribution describes the situation where there are K possible outcomes y = 1 ... K. It takes K parameters lambda_k, where lambda_k >= 0 and the lambda_k sum to one. Alternatively, we can think of the data as a vector with all elements zero except the k'th, e.g. [0,0,0,1,0]. For short we write Pr(x) = Cat_x[lambda].
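A quick numerical illustration of the two equivalent views (parameters chosen arbitrarily):

```python
import numpy as np

# Categorical distribution over K = 4 outcomes; parameters sum to one
lam = np.array([0.1, 0.2, 0.3, 0.4])

# Direct view: Pr(y = k) is just the k'th parameter
p_direct = lam[3]

# One-hot view: with x = [0, 0, 0, 1], Pr(x) = prod_k lam_k ** x_k
x = np.eye(4)[3]
p_onehot = np.prod(lam ** x)
```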

11. Review: Dirichlet distribution
Defined over K values lambda_1 ... lambda_K, where lambda_k >= 0 and the lambda_k sum to one. It has K parameters alpha_k > 0. For short we write Pr(lambda) = Dir_lambda[alpha].
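A quick numerical check of the Dirichlet's key property, that each draw is itself a valid set of categorical parameters (toy alpha values):

```python
import numpy as np

alpha = np.array([2.0, 3.0, 5.0])        # K = 3 parameters, all > 0
rng = np.random.default_rng(0)
samples = rng.dirichlet(alpha, size=1000)

# Every draw lies on the simplex: non-negative, summing to one,
# so each sample is itself a valid set of categorical parameters
mean = samples.mean(axis=0)              # approaches alpha / alpha.sum()
```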

12. Review: categorical distribution, MAP estimate
Take the derivative of the log posterior, set it to zero and re-arrange to obtain
lambda_k = (N_k + alpha_k - 1) / (sum_m (N_m + alpha_m - 1)),
where N_k counts the observations of category k. With a uniform prior (alpha_1..K = 1) this gives the same result as maximum likelihood.
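The MAP formula can be checked numerically; `categorical_map` is a hypothetical helper implementing the re-arranged expression:

```python
import numpy as np

def categorical_map(counts, alpha):
    """MAP estimate of categorical parameters under a Dirichlet prior:
    lam_k = (N_k + alpha_k - 1) / (N + sum(alpha) - K)."""
    counts = np.asarray(counts, float)
    alpha = np.asarray(alpha, float)
    K = len(counts)
    return (counts + alpha - 1) / (counts.sum() + alpha.sum() - K)

counts = np.array([3, 1, 6])                      # word counts N_k
uniform = categorical_map(counts, np.ones(3))     # alpha = 1: equals ML
smoothed = categorical_map(counts, np.full(3, 2.0))
```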

13. Bag of words model: learning and inference
Learning (MAP solution): estimate the categorical word parameters for each class from the pooled word counts of that class's training images.
Inference: apply Bayes' rule to compute the posterior over classes given the words of a new image.
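A minimal sketch of this learning/inference pair, assuming equal class priors and a Dirichlet prior for smoothing (all names and toy data invented for illustration):

```python
import numpy as np

def learn_map(class_words, K, alpha=2.0):
    """Learning: MAP categorical parameters for each class from its
    pooled word counts (Dirichlet prior with parameter alpha)."""
    lam = np.zeros((len(class_words), K))
    for n, words in enumerate(class_words):
        counts = np.bincount(words, minlength=K).astype(float)
        lam[n] = (counts + alpha - 1) / (counts.sum() + K * (alpha - 1))
    return lam

def infer(words, lam, prior=None):
    """Inference: posterior over classes for a new image's words,
    via Bayes' rule with a categorical likelihood per class."""
    if prior is None:
        prior = np.full(len(lam), 1.0 / len(lam))
    loglik = np.log(lam)[:, words].sum(axis=1) + np.log(prior)
    post = np.exp(loglik - loglik.max())   # subtract max for stability
    return post / post.sum()

# Toy data: class 0 favours word 0, class 1 favours word 1 (K = 3 words)
class_words = [np.array([0, 0, 0, 2, 0]), np.array([1, 1, 2, 1, 1])]
lam = learn_map(class_words, K=3)
post = infer(np.array([0, 0, 2]), lam)     # should favour class 0
```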

14. Bag of words for object recognition

15. Problems with bag of words

16. Structure
- Computing visual words
- Bag of words model
- Latent Dirichlet allocation
- Single author-topic model
- Constellation model
- Scene model
- Applications

17. Latent Dirichlet allocation
- Describes the relative frequency of visual words across a set of images (no world term)
- Words are not generated independently (they are connected by a hidden variable)
- Analogy to text documents: each image contains a mixture of several topics (parts), and each topic induces a distribution over words

18. Latent Dirichlet allocation

19. Latent Dirichlet allocation
- Generative equations
- Marginal distribution over features
- Conjugate priors over parameters
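The generative equations can be sketched as ancestral sampling: draw per-image part proportions from a Dirichlet, then a part label and a word at each location (a toy model with invented parameters):

```python
import numpy as np

def sample_lda_image(J, alpha, lam, rng):
    """Generative sketch of LDA for one image: draw part proportions
    pi ~ Dirichlet(alpha); for each of J locations draw a part label
    p_j ~ Cat(pi) and then a word f_j ~ Cat(lam[p_j])."""
    M, K = lam.shape                     # M parts, K dictionary words
    pi = rng.dirichlet(alpha)            # per-image mixture over parts
    parts = rng.choice(M, size=J, p=pi)
    words = np.array([rng.choice(K, p=lam[p]) for p in parts])
    return parts, words

# Hypothetical model: 2 parts, 4 words; part 0 emits words {0, 1},
# part 1 emits words {2, 3}
lam = np.array([[0.5, 0.5, 0.0, 0.0],
                [0.0, 0.0, 0.5, 0.5]])
rng = np.random.default_rng(0)
parts, words = sample_lda_image(J=100, alpha=np.array([1.0, 1.0]),
                                lam=lam, rng=rng)
```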

20. Latent Dirichlet allocation

21. Learning the LDA model
The part labels p are hidden variables. If we knew them, it would be easy to estimate the MAP parameters. How about the EM algorithm? Unfortunately, the parts p within each image are not independent.

22. Latent Dirichlet allocation

23. Learning
Strategy:
1. Write an expression for the posterior distribution over part labels.
2. Draw samples from this posterior using MCMC.
3. Use the samples to estimate the parameters.

24. Step 1: Posterior over part labels
The two terms in the numerator (likelihood and prior) can be computed in closed form; it is "lucky" that we chose conjugate priors! The denominator, however, is intractable.

25. Step 2: Draw samples from the posterior
Gibbs sampling: fix all part labels except one and sample that label from its conditional distribution given the rest. This conditional can be computed in closed form.
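A sketch of one such sweep, in the collapsed count-based form commonly used for LDA-style models (this conditional is a standard variant, not copied from the slides; names and toy data are invented):

```python
import numpy as np

def gibbs_sweep(parts, words, image_ids, M, K, I, alpha, beta, rng):
    """One Gibbs sweep: fix all part labels except one and resample
    that label from its closed-form conditional given the rest."""
    # Count matrices implied by the current labels
    n_mw = np.zeros((M, K))              # part -> word counts
    n_im = np.zeros((I, M))              # image -> part counts
    for p, w, i in zip(parts, words, image_ids):
        n_mw[p, w] += 1
        n_im[i, p] += 1
    for j in range(len(words)):
        p, w, i = parts[j], words[j], image_ids[j]
        n_mw[p, w] -= 1                  # remove token j from the counts
        n_im[i, p] -= 1
        # Conditional over the part label for token j
        cond = ((n_mw[:, w] + beta) / (n_mw.sum(axis=1) + K * beta)
                * (n_im[i] + alpha))
        cond /= cond.sum()
        p = rng.choice(M, p=cond)
        parts[j] = p
        n_mw[p, w] += 1                  # add it back with the new label
        n_im[i, p] += 1
    return parts

# Toy run: two images, image 0 uses words {0, 1}, image 1 uses {2, 3}
words = np.array([0, 1, 0, 1, 2, 3, 2, 3])
image_ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])
rng = np.random.default_rng(0)
parts = rng.integers(0, 2, size=len(words))   # random initial labels
for _ in range(20):
    parts = gibbs_sweep(parts, words, image_ids, M=2, K=4, I=2,
                        alpha=0.5, beta=0.5, rng=rng)
```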

26. Review: Gibbs sampling example, bivariate normal distribution

27. Review: Gibbs sampling example, bivariate normal distribution (continued)
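The bivariate-normal example can be sketched directly: each conditional of a zero-mean, unit-variance bivariate normal with correlation rho is itself normal, x1 | x2 ~ Norm(rho*x2, 1 - rho^2):

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples, rng, burn_in=200):
    """Gibbs sampling for a bivariate normal with zero mean, unit
    variances and correlation rho: alternately sample each variable
    from its conditional given the other."""
    x1, x2 = 0.0, 0.0
    out = np.empty((n_samples, 2))
    sd = np.sqrt(1.0 - rho ** 2)         # conditional standard deviation
    for t in range(burn_in + n_samples):
        x1 = rng.normal(rho * x2, sd)    # sample x1 | x2
        x2 = rng.normal(rho * x1, sd)    # sample x2 | x1
        if t >= burn_in:
            out[t - burn_in] = (x1, x2)
    return out

rng = np.random.default_rng(0)
samples = gibbs_bivariate_normal(rho=0.9, n_samples=5000, rng=rng)
emp_rho = np.corrcoef(samples.T)[0, 1]   # should approach 0.9
```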

28. Step 3: Use samples to estimate parameters
The sampled part labels are substituted for the real (unknown) part labels in the original MAP equations.
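A sketch of the substitution step, pooling smoothed counts over the sampled label vectors (the helper and toy data are invented for illustration):

```python
import numpy as np

def estimate_from_samples(part_samples, words, image_ids, M, K, I, alpha=2.0):
    """Substitute sampled part labels into the MAP equations: pool
    counts over samples, then form smoothed (Dirichlet-prior) estimates
    of the part-to-word parameters lam and per-image proportions pi."""
    lam = np.zeros((M, K))
    pi = np.zeros((I, M))
    for parts in part_samples:               # one label vector per sample
        for p, w, i in zip(parts, words, image_ids):
            lam[p, w] += 1.0
            pi[i, p] += 1.0
    lam = (lam + alpha - 1) / (lam + alpha - 1).sum(axis=1, keepdims=True)
    pi = (pi + alpha - 1) / (pi + alpha - 1).sum(axis=1, keepdims=True)
    return lam, pi

# Toy example: 6 word tokens in one image, two MCMC samples of labels
words = np.array([0, 0, 1, 2, 2, 3])
image_ids = np.zeros(6, dtype=int)
part_samples = [np.array([0, 0, 0, 1, 1, 1]),
                np.array([0, 0, 0, 1, 1, 1])]
lam, pi = estimate_from_samples(part_samples, words, image_ids,
                                M=2, K=4, I=1)
```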

29. Structure
- Computing visual words
- Bag of words model
- Latent Dirichlet allocation
- Single author-topic model
- Constellation model
- Scene model
- Applications