How to Develop Open Research Data: Medical Images as an Example
Published by 白玉兰开源

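Data is one of the pillars behind the rapid progress of artificial intelligence and machine learning. Thanks to efforts in academia and industry, a wide variety of datasets already exists. However, because research questions are broad and keep evolving, new standard datasets are continually needed to support new research.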

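Jiancheng Yang (杨健程), PhD student, Shanghai Jiao Tong University

His research focuses on medical image analysis, 3D computer vision, and trustworthy machine learning. He has published more than 10 (co-)first-author papers in top journals and conferences, including Cancer Research, EBioMedicine, CVPR, MICCAI, and NeurIPS. He reviews for more than 10 academic journals and conferences, has repeatedly ranked among the top entries in international AI challenges, and was a lead organizer of the MICCAI 2020 rib fracture challenge.
Homepage: https://jiancheng-yang.com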


1. How to Develop Open Research Dataset: Examples of Medical Images (如何开发开源研究数据:以医学图像为例)
Jiancheng Yang (杨健程), Shanghai Jiao Tong University
Jan 26, 2021

2. Biography
• BEng '11-15, MEng '15-18, PhD '18- @SJTU
• Diplôme d'ingénieur (Master) '14-16 @IMT, FR
• Visiting research fellow '20-21 @Harvard (remotely)
• Incoming visiting researcher '21-22 @EPFL, CH
Medical Image Analysis, Clinical Science, Methodology, Data & Benchmark, 3D Vision, Trustworthy ML
Introduction – Reasons – Steps – Examples – Keys

3. Open Data Makes a Difference
Deep learning research is driven by datasets!
• Accelerate Research
• Benchmarking
• Quantitative
• Practicality
• …

4. Contents
• Introduction: Open Data Makes a Difference
• Reasons Why You Should Develop New Datasets
• Steps to Develop New Datasets
• Examples of Medical Images
  • RibFrac Dataset
  • MICCAI 2020 RibFrac Challenge
  • MedMNIST Dataset
• Keys to the Success

5. Why You Should Develop New Datasets
• Asking new research questions
  • No existing solution? How about developing a new one?
• Improving your own applications
  • Extending existing datasets for your own purpose
• Benchmarking existing methods
  • Which method performs best?
• Building influence to advance your career
  • Datasets and benchmarks are generally highly cited
• Understanding the pitfalls of existing materials and methods
  • Are existing methods good enough for real-world applications?
  • Do existing datasets cover the different aspects of model performance (e.g., subtle details, domain generalization, model calibration, …)?

6. Steps to Develop New Datasets
I. Finding Research Questions
II. Data Collection
III. Annotation
IV. Quality Control
V. Benchmarking & Evaluation

7. Examples of Medical Images
• RibFrac Dataset
• MICCAI 2020 RibFrac Challenge
• MedMNIST Dataset

8. Deep-Learning-Assisted Detection and Segmentation of Rib Fractures from CT Scans: Development and Validation of FracNet
Liang Jin*, Jiancheng Yang*, Kaiming Kuang, Bingbing Ni, et al. EBioMedicine 2020
https://m3dv.github.io/FracNet/

9. RibFrac Dataset

10. Network Architecture of FracNet

11. Model Performance

12. Human-Computer Collaboration

13–22. [No text extracted for these slides]

23. MedMNIST Classification Decathlon: A Lightweight AutoML Benchmark for Medical Image Analysis
Jiancheng Yang, Rui Shi, Bingbing Ni. ISBI 2021
https://medmnist.github.io/

24. Motivation
MedMNIST Classification Decathlon: Educational. Standardized. Diverse. Lightweight.
• Massive Data Formats: DICOM, NII, nrrd, …
• Massive Data Modalities: X-Ray, CT, OCT, DR, …
• Various Licenses
• Various Resolutions
• 2D or 3D
• Non-Standardized Pre-Processing
• Various Data Sizes

25. MedMNIST Overview

26. MedMNIST Overview
Name | Data Modality | Tasks (# Classes/Labels) | # Training | # Validation | # Test
PathMNIST | Pathology | Multi-Class (9) | 89,996 | 10,004 | 7,180
ChestMNIST | Chest X-ray | Multi-Label (14), Binary-Class (2) | 78,468 | 11,219 | 22,433
DermaMNIST | Dermatoscope | Multi-Class (7) | 7,007 | 1,003 | 2,005
OCTMNIST | OCT | Multi-Class (4) | 97,477 | 10,832 | 1,000
PneumoniaMNIST | Chest X-ray | Binary-Class (2) | 4,708 | 524 | 624
RetinaMNIST | Fundus Camera | Ordinal Regression (5) | 1,080 | 120 | 400
BreastMNIST | Breast Ultrasound | Binary-Class (2) | 546 | 78 | 156
OrganMNIST_Axial | Abdominal CT | Multi-Class (11) | 34,581 | 6,491 | 17,778
OrganMNIST_Coronal | Abdominal CT | Multi-Class (11) | 13,000 | 2,392 | 8,268
OrganMNIST_Sagittal | Abdominal CT | Multi-Class (11) | 13,940 | 2,452 | 8,829
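
The split sizes in the MedMNIST overview can be tallied to see how much the ten subsets differ in scale. The following is a minimal sketch in plain Python: the numbers are copied from the overview slide, while the names `SPLITS`, `total_samples`, and `split_fractions` are illustrative and are not part of the MedMNIST package.

```python
# Split sizes copied from the MedMNIST overview slide: (training, validation, test).
# The variable and function names below are illustrative, not a MedMNIST API.
SPLITS = {
    "PathMNIST": (89_996, 10_004, 7_180),
    "ChestMNIST": (78_468, 11_219, 22_433),
    "DermaMNIST": (7_007, 1_003, 2_005),
    "OCTMNIST": (97_477, 10_832, 1_000),
    "PneumoniaMNIST": (4_708, 524, 624),
    "RetinaMNIST": (1_080, 120, 400),
    "BreastMNIST": (546, 78, 156),
    "OrganMNIST_Axial": (34_581, 6_491, 17_778),
    "OrganMNIST_Coronal": (13_000, 2_392, 8_268),
    "OrganMNIST_Sagittal": (13_940, 2_452, 8_829),
}


def total_samples(name):
    """Return the total number of samples across all three splits."""
    return sum(SPLITS[name])


def split_fractions(name):
    """Return the (training, validation, test) shares of one subset."""
    total = total_samples(name)
    return tuple(count / total for count in SPLITS[name])


if __name__ == "__main__":
    # Print the subsets from smallest to largest.
    for name in sorted(SPLITS, key=total_samples):
        train, val, test = split_fractions(name)
        print(f"{name:<20} {total_samples(name):>8,}  "
              f"train={train:.1%} val={val:.1%} test={test:.1%}")
```

Sorting by total size makes the "Various Data Sizes" point concrete: the subsets range from a few hundred images to more than a hundred thousand.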

27. Benchmarking AutoML Algorithms
• Standard ResNets with Early-Stopping Strategy
• AutoML Tools
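
The slide names an early-stopping strategy without giving its settings. As a minimal, framework-agnostic sketch of the general idea (stop training once the validation score has not improved for a fixed number of epochs), one could write something like the following. The class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the example scores are assumptions for illustration, not the configuration used in the MedMNIST benchmark.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EarlyStopping:
    """Hypothetical helper: stop when the validation score stops improving."""
    patience: int = 5        # assumed value, not taken from the slides
    min_delta: float = 0.0   # minimum gain that counts as an improvement
    best_score: Optional[float] = None
    best_epoch: int = -1
    bad_epochs: int = 0

    def step(self, epoch: int, val_score: float) -> bool:
        """Record one epoch; return True when training should stop."""
        improved = (self.best_score is None
                    or val_score > self.best_score + self.min_delta)
        if improved:
            self.best_score = val_score
            self.best_epoch = epoch
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def run_with_early_stopping(val_scores, patience=3):
    """Replay a list of per-epoch validation scores and return the best epoch.

    In real training, each score would come from evaluating the model on the
    validation split after an epoch; a list stands in for that here.
    """
    stopper = EarlyStopping(patience=patience)
    for epoch, score in enumerate(val_scores):
        if stopper.step(epoch, score):
            break
    return stopper.best_epoch, stopper.best_score
```

With the scores `[0.70, 0.75, 0.74, 0.76, 0.75, 0.75, 0.74, 0.80]` and `patience=3`, the loop stops after epoch 6 and reports epoch 3 as the best, so the later score of 0.80 is never reached. That is the usual trade-off of early stopping: a small patience saves compute but can end training before a late improvement.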

28. Benchmarking AutoML Algorithms

29. Keys to the Success
I. Finding Research Questions
5 Steps:
I. Finding Research Questions
II. Data Collection
III. Annotation
IV. Quality Control
V. Benchmarking & Evaluation
