
1. Bayesian Data Mining. University of Belgrade, School of Electrical Engineering, Department of Computer Engineering and Information Theory. Marko Stupar 11/3370, sm113370m@student.etf.rs.

2. Data mining problem. There are too many attributes in the training set (columns of the table), and existing algorithms need too much time to find a solution, yet we need to classify, estimate, and predict in real time. (Illustration: a table with a Target column followed by columns Value 1, Value 2, …, Value 100000.)

3. Problem importance. Find the relations between all diseases, all medications, and all symptoms.

4. Existing solutions.
- CART, C4.5: too many iterations; continuous attributes need binning.
- Rule induction: continuous attributes need binning.
- Neural networks: high computational time.
- K-nearest neighbor: the output depends only on distance-based close values.

5. Naïve Bayes algorithm – classification, estimation, prediction.
- Used for large data sets.
- Very easy to construct; no complicated iterative parameter estimation is needed.
- Often does surprisingly well, though it may not be the best possible classifier.
- Robust and fast: it can usually be relied on.

6. Naïve Bayes algorithm – reasoning. New information has arrived: a record with attribute values (a1, a2, …, an) whose Target is missing. Given the training set (a table of Attribute 1 … Attribute n plus a known Target in each row), how do we classify the new record's Target?

7. Naïve Bayes algorithm – reasoning. The Target can be one of the discrete values t1, t2, …, tn. How do we calculate P(Target = ti | a1, …, an)? Bayes' theorem gives
P(ti | a1, …, an) = P(a1, …, an | ti) · P(ti) / P(a1, …, an),
which by itself does not help very much: the joint likelihood P(a1, …, an | ti) is too hard to estimate directly. Naïve Bayes assumes that A1 … An are independent given the target, so:
P(a1, …, an | ti) = P(a1 | ti) · P(a2 | ti) · … · P(an | ti).

8. Naïve Bayes, discrete target – example.

    #   Age     Income  Student  Credit     Buys_Computer (Target)
    1   Youth   High    No       Fair       No
    2   Youth   High    No       Excellent  No
    3   Middle  High    No       Fair       Yes
    4   Senior  Medium  No       Fair       Yes
    5   Senior  Low     Yes      Fair       Yes
    6   Senior  Low     Yes      Excellent  No
    7   Middle  Low     Yes      Excellent  Yes
    8   Youth   Medium  No       Fair       No
    9   Youth   Low     Yes      Fair       Yes
    10  Senior  Medium  Yes      Fair       Yes
    11  Youth   Medium  Yes      Excellent  Yes
    12  Middle  Medium  No       Excellent  Yes
    13  Middle  High    Yes      Fair       Yes
    14  Senior  Medium  No       Excellent  No

Attributes = (Age = youth, Income = medium, Student = yes, Credit_rating = fair).

P(Attributes, Buys_Computer = yes)
  = P(Age = youth | yes) · P(Income = medium | yes) · P(Student = yes | yes) · P(Credit_rating = fair | yes) · P(Buys_Computer = yes)
  = 2/9 · 4/9 · 6/9 · 6/9 · 9/14 ≈ 0.028

P(Attributes, Buys_Computer = no)
  = P(Age = youth | no) · P(Income = medium | no) · P(Student = yes | no) · P(Credit_rating = fair | no) · P(Buys_Computer = no)
  = 3/5 · 2/5 · 1/5 · 2/5 · 5/14 ≈ 0.007

9. Naïve Bayes, discrete target – example (continued). For the same Attributes, Target = Buys_Computer = [yes | no]? From the previous slide, P(Attributes, Buys_Computer = yes) ≈ 0.028 > 0.007 ≈ P(Attributes, Buys_Computer = no), hence P(Buys_Computer = yes | Attributes) > P(Buys_Computer = no | Attributes). The naïve Bayesian classifier therefore predicts Buys_Computer = yes for the previously given Attributes (see the code sketch below).
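The following minimal Python sketch (not part of the original deck) reproduces the slide-8/9 computation; nothing beyond the 14 training rows above is assumed.

```python
# Discrete naive Bayes on the Buys_Computer table from slide 8.
from collections import Counter

rows = [  # (age, income, student, credit, buys_computer)
    ("youth", "high", "no", "fair", "no"), ("youth", "high", "no", "excellent", "no"),
    ("middle", "high", "no", "fair", "yes"), ("senior", "medium", "no", "fair", "yes"),
    ("senior", "low", "yes", "fair", "yes"), ("senior", "low", "yes", "excellent", "no"),
    ("middle", "low", "yes", "excellent", "yes"), ("youth", "medium", "no", "fair", "no"),
    ("youth", "low", "yes", "fair", "yes"), ("senior", "medium", "yes", "fair", "yes"),
    ("youth", "medium", "yes", "excellent", "yes"), ("middle", "medium", "no", "excellent", "yes"),
    ("middle", "high", "yes", "fair", "yes"), ("senior", "medium", "no", "excellent", "no"),
]
attrs = ("youth", "medium", "yes", "fair")     # the query from the slide

class_count = Counter(r[-1] for r in rows)     # 9 yes, 5 no

def joint(target):
    # P(Attributes, target) = P(target) * product of P(attribute_i | target)
    p = class_count[target] / len(rows)
    for i, a in enumerate(attrs):
        match = sum(1 for r in rows if r[-1] == target and r[i] == a)
        p *= match / class_count[target]
    return p

for t in ("yes", "no"):
    print(t, round(joint(t), 3))               # yes 0.028, no 0.007, as on the slide
print("prediction:", max(("yes", "no"), key=joint))
```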

10. Naïve Bayes, discrete target – spam filter. Attributes = text document = w1, w2, w3, … (an array of words). Target = Spam = [yes | no]?
P(wi | Spam) is the probability that the i-th word of a given document occurs in the training-set documents that are classified as spam.
P(w1, …, wn | Spam) = ∏i P(wi | Spam) is the probability that all words of the document occur in spam documents of the training set (by the naïve independence assumption).

11. Naïve Bayes, discrete target – spam filter. The decision can be written as a likelihood ratio (Bayes factor):
P(w1, …, wn | Spam) / P(w1, …, wn | ¬Spam) = ∏i [ P(wi | Spam) / P(wi | ¬Spam) ].
Sample correction: if the document contains a word that never occurred in the training set, the whole product becomes zero. The solution is to assign some small nonzero value to that word's probability (see the smoothing sketch below).
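A minimal sketch of such a filter, assuming a tiny made-up corpus (the messages below are illustrative, not from the slides); add-one (Laplace) smoothing plays the role of the sample correction.

```python
# Naive Bayes spam filter with add-one smoothing so unseen words
# do not zero out the whole product; log-space avoids underflow.
import math
from collections import Counter

train = [("buy cheap pills now", "spam"), ("cheap meds buy now", "spam"),
         ("meeting schedule for tomorrow", "ham"), ("project status meeting", "ham")]

words = {c: [w for text, cls in train if cls == c for w in text.split()]
         for c in ("spam", "ham")}
counts = {c: Counter(ws) for c, ws in words.items()}
vocab = {w for ws in words.values() for w in ws}
prior = {c: sum(1 for _, cls in train if cls == c) / len(train) for c in ("spam", "ham")}

def log_score(doc, c):
    # log P(c) + sum_i log P(w_i | c), with one extra slot reserved for unseen words
    s = math.log(prior[c])
    for w in doc:
        s += math.log((counts[c][w] + 1) / (len(words[c]) + len(vocab) + 1))
    return s

msg = "cheap pills tomorrow".split()
print(max(("spam", "ham"), key=lambda c: log_score(msg, c)))  # -> spam
```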

12. Gaussian naïve Bayes – continuous attributes. Continuous attributes do not need binning (unlike in CART and C4.5). Choose an adequate probability density function (PDF) for each attribute in the training set; the Gaussian PDF is the one most likely to be used to estimate an attribute's density. Calculate the PDF parameters with the maximum likelihood method. By the naïve Bayes assumption each attribute is independent of the others, so the joint PDF of all attributes is the product of the single-attribute PDFs.

13. Gaussian naïve Bayes, continuous attributes – example.

Training set:
    sex     height (feet)  weight (lbs)  foot size (inches)
    male    6              180           12
    male    5.92           190           11
    male    5.58           170           12
    male    5.92           165           10
    female  5              100           6
    female  5.5            150           8
    female  5.42           130           7
    female  5.75           150           9

Estimated parameters (mean, variance) per class:
                          Target = male        Target = female
    height (feet)         5.885, 0.027175      5.4175, 0.07291875
    weight (lbs)          176.25, 126.5625     132.5, 418.75
    foot size (inches)    11.25, 0.6875        7.5, 1.25

Validation set (sex is the Target to predict):
    sex  height (feet)  weight (lbs)  foot size (inches)
    ?    6              130           8
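A minimal sketch (not from the deck) scoring the validation sample under per-class Gaussians, using the means and variances from the table above.

```python
# Gaussian naive Bayes: multiply per-attribute normal densities per class.
import math

params = {  # attribute -> class -> (mean, variance), copied from the slide
    "height": {"male": (5.885, 0.027175), "female": (5.4175, 0.07291875)},
    "weight": {"male": (176.25, 126.5625), "female": (132.5, 418.75)},
    "foot":   {"male": (11.25, 0.6875),    "female": (7.5, 1.25)},
}
sample = {"height": 6.0, "weight": 130.0, "foot": 8.0}
prior = {"male": 0.5, "female": 0.5}       # four training rows of each class

def gauss_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def score(cls):
    p = prior[cls]
    for attr, x in sample.items():         # naive independence: multiply densities
        mean, var = params[attr][cls]
        p *= gauss_pdf(x, mean, var)
    return p

for cls in ("male", "female"):
    print(cls, score(cls))
print("prediction:", max(prior, key=score))  # female wins by several orders of magnitude
```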

14. Naïve Bayes – extensions. Naïve Bayes is easy to extend; Gaussian naïve Bayes is one sample extension. Estimating the target: if the target is a real number but in the training set it takes only a few acceptable discrete values t1, …, tn, we can estimate the target from the posterior probabilities of t1, …, tn (see the sketch below). A large number of modifications have been introduced by the statistical, data mining, machine learning, and pattern recognition communities in an attempt to make it more flexible, but modifications are necessarily complications, which detract from its basic simplicity.
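The slide's estimation formula was an image; one natural reading, assumed here, is the posterior mean over the discrete values.

```python
# Real-valued target estimated as sum_i t_i * P(t_i | attributes).
# The values and posteriors below are hypothetical placeholders.
ts = [10.0, 20.0, 40.0]          # discrete target values seen in training
post = [0.2, 0.5, 0.3]           # P(t_i | attributes) from a naive Bayes pass
estimate = sum(t * p for t, p in zip(ts, post))
print(estimate)                  # 10*0.2 + 20*0.5 + 40*0.3 = 24.0
```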

15. Naïve Bayes – extensions. Are the attributes always really independent? Consider A1 = weight, A2 = height, A3 = shoe size, Target = [male | female]: these attributes are clearly correlated with one another. How does that influence our naïve Bayes data mining? When the independence assumption fails, we replace it with a Bayesian network.

16. Bayesian network. A Bayesian network is a directed acyclic graph (DAG) with a probability table for each node. It contains nodes and arcs between them: the nodes represent attributes from the database, and the arcs between nodes represent their probabilistic dependencies. (Figure: an example DAG over Target and A1–A7, with tables such as P(A2 | Target), P(A6 | A4, A5), and P(A3 | Target, A4, A6).)

17. Bayesian network – what to do. Construct the network from the training set, read the network to obtain the joint probability distribution, and then continue as with naïve Bayes.

18. Bayesian network – read network. The chain rule of probability gives
P(A1, …, An) = P(A1) · P(A2 | A1) · … · P(An | A1, …, An−1).
A Bayesian network uses the Markov assumption: each node depends only on its parents, so each factor reduces to P(Ai | Parents(Ai)) and the joint probability distribution becomes
P(A1, …, An) = ∏i P(Ai | Parents(Ai)).
Example: if A7's parents are A2 and A5, then A7 depends only on A2 and A5, i.e. P(A7 | A1, …, A6) = P(A7 | A2, A5).

19. Bayesian network, read network – example. How do we get P(N | B) or P(B | M, T)? From expert knowledge, from data (relative frequency estimates), or a combination of both. The network: Medication (M) and Trauma (T) are parents of Blood Clot (B); Heart Attack (H), Nothing (N), and Stroke (S) are children of B.

    P(M) = 0.2   P(!M) = 0.8
    P(T) = 0.05  P(!T) = 0.95

    M  T  |  P(B)   P(!B)
    T  T  |  0.95   0.05
    T  F  |  0.3    0.7
    F  T  |  0.6    0.4
    F  F  |  0.9    0.1

    B  |  P(H)  P(!H)      B  |  P(N)  P(!N)      B  |  P(S)  P(!S)
    T  |  0.4   0.6        T  |  0.25  0.75       T  |  0.35  0.65
    F  |  0.15  0.85       F  |  0.75  0.25       F  |  0.1   0.9
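A minimal sketch of reading this network: by the Markov assumption the joint factorizes as P(M, T, B, H, N, S) = P(M) P(T) P(B | M, T) P(H | B) P(N | B) P(S | B), and any marginal follows by summing the joint. The numbers are copied from the tables above; Nothing (N) is dropped only to keep the sketch short.

```python
# Read a marginal, P(Heart Attack), out of the slide-19 network.
from itertools import product

p_m, p_t = 0.2, 0.05
p_b = {(True, True): 0.95, (True, False): 0.3,      # P(B = true | M, T)
       (False, True): 0.6, (False, False): 0.9}
p_h = {True: 0.4, False: 0.15}                      # P(H = true | B)
p_s = {True: 0.35, False: 0.1}                      # P(S = true | B)

def pr(flag, p_true):                               # P(X = flag) from P(X = true)
    return p_true if flag else 1 - p_true

def joint(m, t, b, h, s):
    return (pr(m, p_m) * pr(t, p_t) * pr(b, p_b[(m, t)])
            * pr(h, p_h[b]) * pr(s, p_s[b]))

p_heart = sum(joint(m, t, b, True, s)               # sum out M, T, B, S
              for m, t, b, s in product([True, False], repeat=4))
print(round(p_heart, 4))                            # 0.3436 with these tables
```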

20. Bayesian network – construct network. The network can be built manually, or automatically from the database. Two families of automatic algorithms:
- Heuristic algorithms: 1. use a heuristic search method to construct a model; 2. evaluate the model with a scoring method (a Bayesian scoring method, an entropy-based method, or a minimum description length method); 3. go back to step 1 until the score of the new model is no longer significantly better.
- Algorithms that analyze dependency among nodes: measure dependency with conditional independence (CI) tests.

21. Bayesian network – construct network.
- Heuristic algorithms. Advantage: lower time complexity in the worst case. Disadvantage: may not find the best solution, due to their heuristic nature.
- Algorithms that analyze dependency among nodes. Advantage: usually asymptotically correct. Disadvantage: CI tests with large condition sets may be unreliable unless the volume of data is enormous.

22. Bayesian network, construct network – example.
1. Choose an ordering of the variables X1, …, Xn.
2. For i = 1 to n: add Xi to the network and select its parents from X1, …, Xi−1 such that P(Xi | Parents(Xi)) = P(Xi | X1, …, Xi−1).
Chosen ordering: MaryCalls, JohnCalls, Alarm, Burglary, Earthquake.
P(J | M) = P(J)? No.
P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No.
P(B | A, J, M) = P(B | A)? Yes. P(B | A, J, M) = P(B)? No.
P(E | B, A, J, M) = P(E | A)? No. P(E | B, A, J, M) = P(E | A, B)? Yes.

23. Create network from database – d-separation. d-separation (directional separation) is a graphical criterion for deciding, from a given causal graph (DAG), whether disjoint sets of nodes, an X-set and a Y-set, are independent when we know the realization of a third Z-set. The Z-set is instantiated (the values of its nodes are known) before we test the d-separation (independence) of the X-set and the Y-set. The X-set and Y-set are d-separated by the given Z-set if all paths between them are blocked. Example of a path: N1 <- N2 -> N3 -> N4 -> N5 <- N6 <- N7, where N5 is a "head-to-head" node. A path is not blocked if every head-to-head node on it is in the Z-set or has a descendant in the Z-set, and all other nodes on the path are not in the Z-set.

24. Create network from database – d-separation example (the graph itself was shown as a figure; see the sketch after this slide for one consistent reconstruction).
1. Does D d-separate C and F? There are two undirected paths from C to F: (i) C-B-E-F, which is blocked given D by the node E, since E is not one of the given nodes and has both path arrows going into it; (ii) C-B-A-D-E-F, which is also blocked by E (and by D as well). So D does d-separate C and F.
2. Do D and E d-separate C and F? The path C-B-A-D-E-F is blocked by the node D given {D, E}. However, E no longer blocks the path C-B-E-F, since it is a "given" node. So D and E do not d-separate C and F.
3. Which pairs of nodes are independent of each other? Independent nodes are those d-separated by the empty set, so every path between them must contain at least one node with both path arrows going into it, which here is E. We find that F is independent of A, of B, of C, and of D. All other pairs of nodes are dependent on each other.
4. Which pairs of nodes are independent of each other given B? We need to find which nodes are d-separated by B. A, C, and D are all d-separated from F because of the node E. C is d-separated from all the other nodes (except B) given B. The independent pairs given B are hence: AF, AC, CD, CE, CF, DF.
5. Do we have P(A, F | E) = P(A | E) P(F | E), i.e. are A and F independent given E? No: A and F are NOT independent given E, since E does not d-separate A and F.
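The graph itself is lost from the text; the DAG below (A -> B, B -> C, A -> D, B -> E, D -> E, F -> E) is one assumed structure consistent with every answer above. The checker is a hand-rolled sketch of the standard ancestral-moralization test: X and Y are d-separated by Z iff Z separates X from Y in the moralized graph of the ancestors of X ∪ Y ∪ Z.

```python
# d-separation via ancestral closure + moralization + blocked BFS.
from itertools import combinations

dag = {"A": [], "B": ["A"], "C": ["B"], "D": ["A"],
       "E": ["B", "D", "F"], "F": []}        # node -> list of parents

def d_separated(xs, ys, zs, parents):
    keep = set(xs) | set(ys) | set(zs)       # ancestral closure of X, Y, Z
    frontier = list(keep)
    while frontier:
        for p in parents[frontier.pop()]:
            if p not in keep:
                keep.add(p)
                frontier.append(p)
    adj = {n: set() for n in keep}           # moralize: undirect + marry parents
    for n in keep:
        for p in parents[n]:
            adj[n].add(p); adj[p].add(n)
        for p, q in combinations(parents[n], 2):
            adj[p].add(q); adj[q].add(p)
    seen, stack = set(zs), list(xs)          # BFS from X; Z blocks traversal
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        if n in ys:
            return False                     # reached Y, so not d-separated
        stack.extend(adj[n] - seen)
    return True

print(d_separated({"C"}, {"F"}, {"D"}, dag))       # True: D d-separates C and F
print(d_separated({"C"}, {"F"}, {"D", "E"}, dag))  # False: given E, C-B-E-F opens
```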

25. Create network from database – Markov blanket. MB(A) is the set of nodes composed of A's parents, A's children, and those children's other parents. Given MB(A), every other node in the network is conditionally independent of (d-separated from) A. MB(A) is the only knowledge needed to predict the behavior of A (Pearl, 1988).

26. Create network from database – conditional independence (CI) test. Mutual information,
I(X, Y) = Σx,y P(x, y) log [ P(x, y) / (P(x) P(y)) ],
and conditional mutual information,
I(X, Y | Z) = Σx,y,z P(x, y, z) log [ P(x, y | z) / (P(x | z) P(y | z)) ],
are used to quantify the dependence between nodes X and Y. If I(X, Y | Z) is smaller than a certain threshold ε, we say that X and Y are d-separated by the condition set Z and that they are conditionally independent.
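A minimal sketch of estimating I(X, Y | Z) from relative frequencies; the three-variable sample below is made up for illustration.

```python
# Conditional mutual information from empirical counts:
# I(X;Y|Z) = sum_{x,y,z} P(x,y,z) * log[ P(x,y,z) P(z) / (P(x,z) P(y,z)) ]
import math
from collections import Counter

data = [(0, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0),
        (1, 1, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)]   # (x, y, z) samples
n = len(data)
pxyz = Counter(data)
pxz = Counter((x, z) for x, y, z in data)
pyz = Counter((y, z) for x, y, z in data)
pz = Counter(z for x, y, z in data)

cmi = sum((c / n) * math.log((c / n) * (pz[z] / n)
                             / ((pxz[(x, z)] / n) * (pyz[(y, z)] / n)))
          for (x, y, z), c in pxyz.items())
print(cmi)   # declare X, Y conditionally independent if this falls below eps
```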

27. Backup slides (not needed).

28. Create network from database – naïve Bayes structure. Very fast and very robust: the target node is the parent of all other nodes, only a low number of probabilities has to be estimated, and knowing the value of the target makes the nodes independent of each other.

29. Create network from database – augmented naïve Bayes. The naïve structure plus relations among the child nodes, given the value of the target node. More precise results than with the naïve architecture, but it costs more time. Models: pruned naïve Bayes (Naive Bayes Build), simplified decision tree (Single Feature Build), boosted (Multi Feature Build).

30. Create network from database – augmented naïve Bayes: the Tree Augmented Naive Bayes (TAN) model (see the sketch below).
(a) Compute I(Ai, Aj | Target) between each pair of attributes, i ≠ j.
(b) Build a complete undirected graph whose vertices are the attributes A1, A2, …; the weight of the edge connecting Ai and Aj is I(Ai, Aj | Target).
(c) Build a maximum weighted spanning tree.
(d) Transform the resulting undirected tree into a directed one by choosing a root variable and setting the direction of all edges to be outward from it.
(e) Construct the tree augmented naïve Bayes model by adding a vertex labeled by the class C (the target) and adding a directed edge from C to each Ai.
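A structural sketch of steps (b)-(e), assuming the pairwise conditional mutual information values of step (a) are already computed (the cmi weights below are hypothetical); it uses the networkx library for the spanning tree and the orientation.

```python
# TAN structure: max-weight spanning tree over attributes, rooted, plus class arcs.
import networkx as nx

attrs = ["A1", "A2", "A3", "A4"]
cmi = {("A1", "A2"): 0.30, ("A1", "A3"): 0.05, ("A1", "A4"): 0.10,
       ("A2", "A3"): 0.20, ("A2", "A4"): 0.02, ("A3", "A4"): 0.25}

g = nx.Graph()                                    # (b) complete weighted graph
g.add_weighted_edges_from((a, b, w) for (a, b), w in cmi.items())

tree = nx.maximum_spanning_tree(g)                # (c) maximum weighted spanning tree
directed = nx.bfs_tree(tree, "A1")                # (d) orient edges outward from root A1

directed.add_edges_from(("C", a) for a in attrs)  # (e) class C points to every attribute
print(sorted(directed.edges()))
```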

31. Create network from database – sons and spouses. The target node is the parent of a subset of nodes that may have other relationships among themselves; this shows the set of nodes indirectly linked to the target. The time cost is of the same order as for the augmented naïve Bayes.

32. Create network from database – Markov blanket. Gets the relevant nodes in a shorter time frame than the other two algorithms (augmented naïve Bayes and sons & spouses). It is a good tool for analyzing one variable: it searches for the nodes that belong to the Markov blanket, and observing those nodes makes the target node independent of all the other nodes.

33. Create network from database – augmented Markov blanket.

34. Create network from database – construction algorithm example: "An Algorithm for Bayesian Belief Network Construction from Data", Jie Cheng, David A. Bell, Weiru Liu, School of Information and Software Engineering, University of Ulster at Jordanstown, Northern Ireland, UK, BT37 0QB, e-mail: {j.cheng, da.bell, w.liu}@ulst.ac.uk.

Phase I (Drafting):
1. Initiate a graph G(V, E) where V = {all the nodes of a data set} and E = {}. Initiate two empty ordered sets S and R.
2. For each pair of nodes (vi, vj), where vi, vj ∈ V, compute the mutual information I(vi, vj) using equation (1). For the pairs of nodes whose mutual information is greater than a certain small value ε, sort them by their mutual information from large to small and put them into the ordered set S.
3. Get the first two pairs of nodes in S and remove them from S. Add the corresponding arcs to E. (The direction of the arcs in this algorithm is determined by the previously available node ordering.)
4. Get the first pair of nodes remaining in S and remove it from S. If there is no open path between the two nodes (i.e., they are d-separated given the empty set), add the corresponding arc to E; otherwise, add the pair of nodes to the end of the ordered set R.
5. Repeat step 4 until S is empty.

Phase II (Thickening):
6. Get the first pair of nodes in R and remove it from R.
7. Find a block set that blocks each open path between these two nodes with a minimum number of nodes (the procedure find_block_set(current graph, node1, node2) is given at the end of the paper's subsection). Conduct a CI test. If the two nodes are still dependent on each other given the block set, connect them by an arc.
8. Go to step 6 until R is empty.

Phase III (Thinning):
9. For each arc in E, if there are open paths between the two nodes besides this arc, remove the arc from E temporarily and call find_block_set(current graph, node1, node2). Conduct a CI test conditioned on the block set. If the two nodes are dependent, add the arc back to E; otherwise remove it permanently.


37. Bayesian network software. BayesiaLab; Weka (machine learning software in Java); AgenaRisk, Analytica, Banjo, Bassist, Bayda, BayesBuilder, Bayesware Discoverer, B-course, Belief Net PowerConstructor, BNT, BNJ, BucketElim, BUGS, Business Navigator 5, CABeN, Causal Discoverer, CoCo+Xlisp, CIspace, DBNbox, Deal, DeriveIt, Ergo, GDAGsim, Genie, GMRFsim, GMTk, gR, Grappa, Hugin Expert, Hydra, Ideal, JavaBayes, KBaseAI, LibB, MIM, MSBNx, Netica, Optimal Reinsertion, PMT.

38. Problem trend. History: the term "Bayesian networks" was coined by Judea Pearl in 1985. In the late 1980s, the seminal texts Probabilistic Reasoning in Intelligent Systems and Probabilistic Reasoning in Expert Systems summarized the properties of Bayesian networks. Fields of expansion: for naïve Bayes, choosing the optimal PDF; for Bayesian networks, finding new ways to construct the network.

39. Bibliography (borrowed parts):
- Naïve Bayes Classifiers, Andrew W. Moore, School of Computer Science, Carnegie Mellon University. www.cs.cmu.edu/~awm, awm@cs.cmu.edu, 412-268-7599.
- http://en.wikipedia.org/wiki/Bayesian_network
- Bayesian Measurement of Associations in Adverse Drug Reaction Databases, William DuMouchel, Shannon Laboratory, AT&T Labs Research, dumouchel@research.att.com. DIMACS Tutorial on Statistical Surveillance Methods, Rutgers University, June 20, 2003.
- http://download.oracle.com/docs/cd/B13789_01/datamine.101/b10698/3predict.htm#1005771
- CS/CNS/EE 155: Probabilistic Graphical Models, Problem Set 2, handed out 21 Oct 2009, due 4 Nov 2009.
- Learning Bayesian Networks from Data: An Efficient Approach Based on Information Theory, Jie Cheng (Dept. of Computing Science, University of Alberta, jcheng@cs.ualberta.ca), David Bell and Weiru Liu (Faculty of Informatics, University of Ulster, UK, {w.liu, da.bell}@ulst.ac.uk).
- http://www.bayesia.com/en/products/bayesialab/tutorial.php
- ISyE 8843A, Brani Vidakovic, Handout 17: Bayesian Networks.
- Bayesian Networks, Chapter 14, Sections 1-2.
- Naive Bayes Classification Algorithm, Lab4-NaiveBayes.pdf.
- Top 10 Algorithms in Data Mining, Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Bing Liu, Philip S. Yu, Zhi-Hua Zhou, Michael Steinbach, David J. Hand, Dan Steinberg. Published online 4 December 2007, Springer-Verlag London Limited.
- Causality, Computational Systems Biology Lab, Arizona State University, Michael Verdicchio, with some slides and slide content from Judea Pearl, Chitta Baral, and Xin Zhang.

40. Questions?
