An introduction to best practices for using the Alibaba Cloud MongoDB service, along with MongoDB deployment and parameter tuning.

Notes

1. Alibaba Cloud MongoDB: A Look Inside, 2018.12.01

2. Agenda
  • What's the pain in MongoDB, and best practices
  • MongoDB improvements at AliCloud
  • Valuable services at AliCloud

3. About Me
{
  "name": " ",
  "company": "Alibaba Aliyun",
  "title": "NoSQL database developer",
  "work": "MongoDB kernel develop and services",
  "interests": ["MongoDB", "MySQL", "Rocksdb", "TiDB"],
  "team": {
    "responsibility": "support public & internal NoSQL database",
    "contributes": "over 15 pull requests and issue in 2016"
  },
  "find_me": "https://www.facebook.com/michaelliuxin"
}

4. MongoDB Architecture

5. MongoDB
  • ReplicaSet
  • Mongos
  • Shard cluster
  • Config server

6. MongoDB Internal

7. Painful
  • Short-lived connections
  • Hard to optimize; too many parameters: directoryPerDB, cacheSizeGB, journal.enabled, oplogSize
  • Read(Write)Preference / Read(Write)Majority

8. Improvements
  • Rewrote the AUTH logic; dropped the use of /dev/urandom
  • Don't use the local, admin, or config databases
  • cacheSize uses a fixed size
  • journal is always enabled
  • Write majority for important data; read majority for data that must not be rolled back

9. Painful – how to deploy (long-term)
  • Why use a ReplicaSet? Why sharding?
  • Evaluation: capacity, IO, CPUs
  • Correct MongoDB connection string
  • Use WiredTiger as the default engine, or something else?

      Feature             WiredTiger             MMAPv1
      Lock granularity    Document level         Collection level
      Write performance   Excellent              Good
      Read performance    Excellent              Excellent
      Compression         Yes (snappy, zlib…)    Not supported

10. Improvements & Suggestion
  • Use a shard cluster only if you need one
  • Best usages of Secondary / Hidden / Arbiter members
  • WiredTiger can support 95% of workloads. Please forget MMAP
  • Make specification upgrades as easy as possible
  • Use a VIP in the connection string to avoid updating configuration
  • Adaptive oplog size (in line with replication)
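The connection-string advice on slides 8 to 10 can be sketched as a small helper. This is a minimal illustration, not the presenter's tooling: the host names, user, database, and replica set name are hypothetical, while `replicaSet`, `w`, `readConcernLevel`, and `readPreference` are standard MongoDB connection string options.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(user, password, hosts, database, replica_set, **options):
    """Assemble a MongoDB connection URI from its parts."""
    # Percent-encode credentials so characters like '@' or '/' stay valid.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    host_list = ",".join(hosts)
    params = {"replicaSet": replica_set, **options}
    return f"mongodb://{credentials}@{host_list}/{database}?{urlencode(params)}"


# Hypothetical stable (VIP-style) host names, so clients need no reconfiguration.
uri = build_mongo_uri(
    user="app_user",
    password="p@ss/word",
    hosts=["mongo-a.example.internal:3717", "mongo-b.example.internal:3717"],
    database="orders",
    replica_set="rs0",
    w="majority",                 # write majority for important data
    readConcernLevel="majority",  # read only majority-committed data
    readPreference="secondaryPreferred",
)
print(uri)
```

Listing every member behind stable names and setting `replicaSet` lets the driver follow failovers; whether majority reads and writes are worth their latency depends on the data, as slide 8 suggests.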

11. Painful – how to scale
  • Efficient Hash or Range sharding
  • How to choose a proper ShardKey
  • Add secondary

12. Improvements & Suggestion
  • Think seriously about Hash or Range (hot server)
  • ShardKey => cardinality & frequency (avoid balancing)
  • Add a new Secondary by recovering from backup
  • Rebuild an existing Secondary by recovering from backup
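The hot-server and shard-key points can be illustrated with a small simulation. It is a simplified sketch under stated assumptions: four shards, hypothetical range boundaries, and a modulo over an md5 digest standing in for hashed sharding (MongoDB itself assigns ranges of hashed values to chunks rather than taking a modulo).

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4
# Hypothetical chunk boundaries: shard 0 holds keys < 1000, shard 1 < 2000, ...
RANGE_UPPER_BOUNDS = [1000, 2000, 3000]


def range_shard(key):
    """Pick a shard by comparing the key against sorted chunk boundaries."""
    for shard, upper in enumerate(RANGE_UPPER_BOUNDS):
        if key < upper:
            return shard
    return len(RANGE_UPPER_BOUNDS)  # the last shard takes everything above


def hash_shard(key):
    """Pick a shard from a hash of the key (md5 is only for illustration)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def key_stats(values):
    """Return (cardinality, share of the most frequent value) for a candidate key."""
    counts = Counter(values)
    return len(counts), max(counts.values()) / len(values)


# Monotonically increasing keys, e.g. timestamps or auto-increment ids.
new_keys = range(3000, 4000)
range_load = Counter(range_shard(k) for k in new_keys)
hash_load = Counter(hash_shard(k) for k in new_keys)
print("range:", dict(range_load))
print("hash: ", dict(sorted(hash_load.items())))

# A low-cardinality, skewed key is a poor shard key candidate.
print(key_stats(["cn", "cn", "cn", "us"]))
```

With range sharding every new key lands on the last shard, which is the hot-server case; hashing spreads the same writes across all shards at the cost of efficient range queries.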

13. Painful – Understand your Query
  • Explain()
  • update, delete, find
  • which indexes were used
  • covered index (for projection)
  • slow query stats
  • Optimizer
  • IDHack, Fetcher, IXScan, CollScan
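To see which stages a query actually used, the winning plan in explain output can be walked recursively. The sketch below runs against a hand-written, abbreviated sample document rather than a live server; real explain output carries many more fields and its layout varies by server version, so treat the shape here as an assumption.

```python
def plan_stages(plan):
    """Return stage names from the outermost stage down to the leaf stages."""
    stages = [plan["stage"]]
    children = list(plan.get("inputStages", []))
    if "inputStage" in plan:
        children.insert(0, plan["inputStage"])
    for child in children:
        stages.extend(plan_stages(child))
    return stages


# Abbreviated, hand-written sample of explain output (an assumption, not real output).
sample_explain = {
    "queryPlanner": {
        "winningPlan": {
            "stage": "FETCH",
            "inputStage": {
                "stage": "IXSCAN",
                "indexName": "status_1",
            },
        }
    }
}

stages = plan_stages(sample_explain["queryPlanner"]["winningPlan"])
print(stages)
print("full collection scan:", "COLLSCAN" in stages)
```

A plan containing `COLLSCAN` reads the whole collection and is usually the first thing to look for when a query shows up in the slow-query statistics.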

14. Improvements & Suggestion
  • Fetch no more fields than you actually need
  • Collect and monitor slow queries
  • Analyze indexes. Remove unused indexes to reduce IO and optimizer overhead
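Fetching only the fields you need pays off most when the query becomes covered, meaning it can be answered from the index alone. The check below is a simplified sketch of that rule: it handles only inclusion projections and ignores edge cases such as array fields, and the field and index names are hypothetical.

```python
def is_covered(index_fields, filter_fields, projection):
    """Rough check: can the index alone answer this query?

    Assumes an inclusion projection such as {"status": 1, "_id": 0}.
    """
    indexed = set(index_fields)
    included = {f for f, flag in projection.items() if flag and f != "_id"}
    if not included:
        # Exclusion-only or empty projections return whole documents.
        return False
    returned = set(included)
    if projection.get("_id", 1):
        # _id is returned unless it is explicitly excluded.
        returned.add("_id")
    return set(filter_fields) <= indexed and returned <= indexed


index = ["status", "created_at"]  # hypothetical compound index
print(is_covered(index, ["status"], {"status": 1, "created_at": 1, "_id": 0}))
print(is_covered(index, ["status"], {"status": 1, "created_at": 1}))
print(is_covered(index, ["status"], {"status": 1, "amount": 1, "_id": 0}))
```

Only the first query is covered: the second still returns `_id`, and the third asks for a field the index does not contain, so both need to fetch the documents.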

15. Painful – Profiling your DB
  • Monitor
      • hardware
      • conn, driver, read queue
      • replication
      • storage (cache, session)
  • Audit
      • CRUD, DDL, authentication
      • Statement, latency, and time

16. Suggestion
  • Inspect everything you should care about
  • Prometheus or Grafana integration
  • Analyze indexes. Don't create useless indexes (writes are sensitive to them)

17. Painful – Backup your data
  • Tools: restore, dump
  • Full backup & incremental backup
  • Physical and logical backup
  • Hot & cold

18. Improvements & Suggestion
  • Fine-grained backup policy
  • mongodump "--oplog" is always set
  • mongorestore can be resumed for large datasets
  • Hot physical backup is 3x~100x faster:
      1) WiredTiger checkpoint, then physical file copy
      2) Improved performance of the official checkpoint
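For the logical backup path, the commands can be assembled as below. This sketch only builds and prints the argument lists; it does not run them, and the URI and directory are hypothetical. `--oplog` (mongodump) and `--oplogReplay` (mongorestore) are the stock tool flags for a point-in-time consistent dump and restore; `--oplog` applies to a dump of the whole instance, so the URI names no database.

```python
import shlex


def dump_command(uri, out_dir):
    """Argument list for a full logical dump that also captures the oplog."""
    return ["mongodump", "--uri", uri, "--oplog", "--gzip", "--out", out_dir]


def restore_command(uri, dump_dir):
    """Argument list for restoring that dump and replaying the captured oplog."""
    return ["mongorestore", "--uri", uri, "--oplogReplay", "--gzip", dump_dir]


# Hypothetical connection string and backup directory.
uri = "mongodb://backup_user@mongo-a.example.internal:3717/?replicaSet=rs0"
backup_dir = "/backup/2018-12-01"

dump = dump_command(uri, backup_dir)
restore = restore_command(uri, backup_dir)
print(shlex.join(dump))
print(shlex.join(restore))
```

Passing the lists to `subprocess.run(dump, check=True)` would execute them; building lists rather than shell strings avoids quoting problems in the URI.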

19. MongoDB Cloud Services

20. Ecosystem Services

21. Services – Index Recommend https://help.aliyun.com/document_detail/98239.html
  • Collects slow queries as samples
  • Rule-based index analysis
  • Generates detailed reports and clear recommendations
  • Does not build indexes automatically
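The slides do not say which rules the service applies. As a stand-in, the sketch below uses the Equality, Sort, Range (ESR) ordering guideline from the MongoDB documentation to turn one sampled slow query into a candidate compound index; the function name, input shape, and field names are all hypothetical.

```python
def recommend_index(equality, sort, ranges):
    """Order candidate index keys as Equality, then Sort, then Range (ESR)."""
    candidates = [(field, 1) for field in equality]
    candidates += list(sort)
    candidates += [(field, 1) for field in ranges]

    # Keep the first occurrence of each field so no key appears twice.
    seen = set()
    keys = []
    for field, direction in candidates:
        if field not in seen:
            seen.add(field)
            keys.append((field, direction))
    return keys


# Hypothetical slow query: status == "paid", amount > 100, newest first.
suggestion = recommend_index(
    equality=["status"],
    sort=[("created_at", -1)],
    ranges=["amount"],
)
print(suggestion)
```

In line with the last bullet, the result is only a recommendation to review; building the index stays a deliberate step because every extra index costs write throughput.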

22. Services – Index Recommend

23. Services - Monitoring
  • 1s granularity; can be coarsened to 2s, 5s, 15s, or 1m as well
  • Metrics: mongo, wiredTiger, cpu, memory …
  • Metric aggregation and integration with alarm policies
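Rolling 1s samples up to a coarser granularity can be sketched as bucketed averaging. This illustrates the idea only and is not the service's aggregation code; the sample values and the choice of the mean as the aggregate are assumptions.

```python
def downsample(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    buckets = {}
    for timestamp, value in samples:
        start = timestamp - timestamp % bucket_seconds  # bucket start time
        buckets.setdefault(start, []).append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Ten hypothetical 1s samples at timestamps 0..9.
one_second = [(t, float(t)) for t in range(10)]
five_second = downsample(one_second, 5)
print(five_second)
```

For alarm policies a maximum or a percentile per bucket is often more useful than the mean, since averaging can hide short spikes.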

24. Services - Monitoring

25. Services – Disaster Recovery

26. MongoShake
  • Supports multiple data centers; each one holds the complete data
  • Apps can write anywhere (without conflicts). Routing is decided by the application
  • If some servers crash or an entire data center is lost, apps just route requests to another data center
  • Performance: replication at 300,000/s (30w/s), latency ~1s

27. MongoShake open source: https://github.com/aliyun/mongo-shake

