Rikai Core Design in 2022
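Rikai is an open-source engine for analyzing and understanding large-scale video content. With Rikai, users can call AI models directly from standard Spark SQL and analyze large-scale video data, and understand its content, much as they would analyze structured data.
Using Rikai as an example, we explore the convergence of data and AI.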
1. Core Design
2. Parquet-based ML data format optimized for working with unstructured data
● https://github.com/eto-ai/rikai
  ○ PySpark
  ○ PyTorch / TensorFlow / Apache MXNet / OneFlow / scikit-learn
  ○ Pandas
  ○ Parquet
Processing videos on Apache Spark
● https://github.com/eto-ai/spark-video
  ○ Apache Spark
    ■ https://github.com/bytedeco/javacv
      ● https://github.com/bytedeco/javacpp
  ○ https://github.com/FFmpeg/FFmpeg
上海素当信息技术有限公司
3. Rikai Core Design in 2022
● Rikai Intro
● Rikai Architecture
● Rikai Visualization
● Rikai Model Type
4. 1 Rikai Intro
5. Model Data Stack: Rikai
From https://getdbt.com
6. 2 Rikai Architecture
7. Architecture
● Storage:
  ○ Apache Parquet-based persistent storage
  ○ Extends Spark User Defined Types (UDT) to support semantic types
    ■ Tensors (numpy.ndarray, tf.Tensor, torch.Tensor)
    ■ Vision / video ML objects, e.g., 2D/3D bounding boxes, polygons, lidar point clouds
● SDK
  ○ Native loaders and tensor converters for Pandas, PyTorch and TensorFlow
  ○ Jupyter notebook
● SQL-ML Extension
  ○ Supports arbitrary PyTorch / TensorFlow / scikit-learn model inference via Rikai SQL-ML extensions
  ○ Model registry integration (e.g., MLflow)
8. Spark Extension
9. 3 Rikai Visualization
10. Mojito: DSL for Image Transformation
● How to add layers to the base Image?
  ○ Box2d
  ○ Text
  ○ (more to come…)
    image | Box2d(xmin, ymin, xmax, ymax)
    image | Text("label 100%", (xmin, ymin-10))
● How to add masks to the base Image?
    image | Mask
● How to scale the base Image? (Good First Issue)
    image * 2
    image * (2, 1)
    image * (1, 2)
    image * (0.5, 1)
    image * (1, 0.5)
    image * (0.6, 0.8)
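The `image | layer` and `image * factor` forms above map naturally onto Python operator overloading. The following is a minimal sketch of that idea, not Rikai's implementation: the `Image`, `Box2d`, and `Text` classes here are hypothetical stand-ins, and scaling only resizes the canvas (layer coordinates are left untouched for brevity).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box2d:
    """Hypothetical bounding-box layer."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


@dataclass(frozen=True)
class Text:
    """Hypothetical text layer anchored at an (x, y) position."""
    label: str
    xy: Tuple[float, float]


@dataclass(frozen=True)
class Image:
    """Hypothetical base image that records the layers drawn on top of it."""
    width: int
    height: int
    layers: Tuple[object, ...] = ()

    def __or__(self, layer):
        # `image | layer` returns a new image with the layer appended.
        return Image(self.width, self.height, self.layers + (layer,))

    def __mul__(self, factor):
        # `image * 2` scales both axes; `image * (sx, sy)` scales each axis.
        sx, sy = factor if isinstance(factor, tuple) else (factor, factor)
        return Image(round(self.width * sx), round(self.height * sy), self.layers)


base = Image(640, 480)
annotated = base | Box2d(10, 20, 110, 220) | Text("label 100%", (10, 10))
half_width = annotated * (0.5, 1)
```

Returning a new `Image` from each operator keeps the expressions chainable and leaves the base image unchanged.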
11. Mojito: Rikai SQL on Single Image
    CREATE (OR REPLACE)? MODEL (IF NOT EXISTS)? model=qualifiedName
      (FLAVOR flavor=identifier)?
      (MODEL_TYPE modeltype=qualifiedName)?
      (OPTIONS optionList)?
      (RETURNS datatype=dataType)?
      (USING uri=STRING)
    array<struct<box:box2d, score:float, label:string>>
12. Mojito: Rikai SQL on Video Data
Available Options:
● fps
● scalerFlag
● imageWidth
● imageHeight
Sampling Options:
● frameStepSize
● frameStepOffset
    0 100 200 300 400 500 …
    1 101 201 301 401 501 …
    2 102 202 302 402 502 …
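One plausible reading of the three index rows above is a step size of 100 with offsets 0, 1, and 2. The sketch below illustrates that reading only; the function name and its exact semantics are assumptions, not taken from spark-video.

```python
def sampled_frames(total_frames, frame_step_size=1, frame_step_offset=0):
    """Return the frame indices kept when sampling every `frame_step_size`
    frames, starting at `frame_step_offset` (assumed semantics)."""
    if frame_step_size < 1:
        raise ValueError("frame_step_size must be at least 1")
    return list(range(frame_step_offset, total_frames, frame_step_size))


row0 = sampled_frames(600, frame_step_size=100, frame_step_offset=0)
row1 = sampled_frames(600, frame_step_size=100, frame_step_offset=1)
row2 = sampled_frames(600, frame_step_size=100, frame_step_offset=2)
```

Under this reading, varying the offset while keeping the step size fixed yields disjoint sets of frames.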
13. 4 Rikai Model Type
14. Model Type v0: preprocessors/postprocessors/OUTPUT_SCHEMA
[Diagram: pre_processing steps, DataLoader, Model, post_processing, Application]
● CPU
● GPU
Output schema: array<struct<box:box2d, score:float, label:string>>
15. Real World Rikai: MLflow Registry
    CREATE OR REPLACE MODEL yolov5s USING 'mlflow:///da-yolov5s-model';
    CREATE OR REPLACE MODEL yolov5s USING 'mlflow:///da-yolov5s-model/12';
    CREATE OR REPLACE MODEL yolov5s OPTIONS ( 'iou_thres' 0.5 ) USING 'mlflow:///da-yolov5s-model';
    CREATE OR REPLACE MODEL yolov5s FLAVOR pytorch_6 USING 'mlflow:///da-yolov5s-model';
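The `USING` URIs above appear to carry a registered model name and, optionally, a trailing version (`/12`). A minimal sketch of splitting such a URI with the standard library follows. `ModelRef` and `parse_model_uri` are hypothetical helpers, and treating the second path segment as a version is an assumption based on the examples on this slide.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class ModelRef(NamedTuple):
    """Hypothetical parsed form of a model URI."""
    scheme: str
    name: str
    version: Optional[str]


def parse_model_uri(uri):
    """Split e.g. 'mlflow:///name/12' into scheme, model name, and version."""
    parsed = urlparse(uri)
    parts = [p for p in parsed.path.split("/") if p]
    if not parts:
        raise ValueError(f"no model name in URI: {uri!r}")
    # Assumption: an optional second path segment is the model version.
    version = parts[1] if len(parts) > 1 else None
    return ModelRef(parsed.scheme, parts[0], version)


latest = parse_model_uri("mlflow:///da-yolov5s-model")
pinned = parse_model_uri("mlflow:///da-yolov5s-model/12")
```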
16. Flavor and Pandas UDF Codegen
Built-in flavors
● rikai.spark.sql.codegen.pytorch.generate_udf
● rikai.spark.sql.codegen.tensorflow.generate_udf
● rikai.spark.sql.codegen.sklearn.generate_udf
Customized flavors
● rikai.contrib.flavor_name.codegen.generate_udf
SQL
    select ML_PREDICT(sklearn_m, to_image(uri))
Resolution path
    ai.eto.rikai.sql.spark.parser.RikaiExtSqlParser → ai.eto.rikai.sql.model.Registry.resolve → ai.eto.rikai.sql.model.mlflow.MlflowRegistry
    rikai.spark.sql.codegen.mlflow_registry.MlflowRegistry → rikai.spark.sql.codegen.base.Registry.resolve → rikai.spark.sql.codegen.sklearn.generate_udf
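The naming pattern on this slide suggests that a flavor name selects which `generate_udf` function is used. The sketch below shows one way such a lookup could work. The helper names are hypothetical, and reading `flavor_name` as a placeholder for the custom flavor is an assumption; Rikai's actual resolution logic may differ.

```python
import importlib

# Flavors listed as built-in on the slide.
BUILTIN_FLAVORS = {"pytorch", "tensorflow", "sklearn"}


def codegen_path(flavor):
    """Return the dotted path of the UDF generator for a flavor (assumed rule)."""
    if flavor in BUILTIN_FLAVORS:
        return f"rikai.spark.sql.codegen.{flavor}.generate_udf"
    # Assumption: custom flavors live under rikai.contrib.<flavor>.codegen.
    return f"rikai.contrib.{flavor}.codegen.generate_udf"


def load_generate_udf(flavor):
    """Import and return the generator function; requires Rikai to be installed."""
    module_name, func_name = codegen_path(flavor).rsplit(".", 1)
    return getattr(importlib.import_module(module_name), func_name)


sklearn_path = codegen_path("sklearn")
custom_path = codegen_path("my_flavor")  # "my_flavor" is a made-up flavor name
```

Resolving by dotted path means a new flavor only needs to ship a module in the expected location, with no central table to update.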
17. PySpark Pandas UDF in one slide
[Diagram]
● CPU / JVM / JNI / GPU: https://github.com/bytedeco/javacpp
● CPU / JVM / Python / GPU: https://github.com/py4j/py4j
18. Model Type v1: Building Abstraction using Python class
in one class and one python file
[Diagram: pre_processing steps, Model, post_processing, Application]
Output schema: array<struct<box:box2d, score:float, label:string>>
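The "one class, one Python file" idea can be illustrated by a class that bundles the output schema with the pre-processing and post-processing steps. This is a minimal sketch under that assumption. `ModelType`, `ToyDetector`, and `fake_model` are hypothetical names for illustration and do not reproduce Rikai's actual base class.

```python
from abc import ABC, abstractmethod


class ModelType(ABC):
    """Hypothetical abstraction: schema + pre/post-processing in one class."""

    @abstractmethod
    def schema(self):
        """Return the output schema as a Spark SQL type string."""

    @abstractmethod
    def pre_process(self, raw):
        """Convert raw input into what the model expects."""

    @abstractmethod
    def post_process(self, output):
        """Convert raw model output into rows matching `schema()`."""

    def predict(self, model, raw):
        # pre_processing -> Model -> post_processing, as in the diagram.
        return self.post_process(model(self.pre_process(raw)))


class ToyDetector(ModelType):
    def schema(self):
        return "array<struct<box:box2d, score:float, label:string>>"

    def pre_process(self, raw):
        # Normalize 8-bit pixel values to [0, 1].
        return [v / 255.0 for v in raw]

    def post_process(self, output):
        return [{"box": b, "score": s, "label": l} for b, s, l in output]


def fake_model(pixels):
    # Stand-in for a trained network: one detection scored by the max pixel.
    return [((0, 0, 10, 10), max(pixels), "object")]


detections = ToyDetector().predict(fake_model, [0, 255])
```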
19. Real World Rikai: Model Type
● Develop
● Test & Document
● Deployment