Showing posts from February, 2021

Index Responsibly

Indexes make queries efficient in MongoDB because, without them, MongoDB must scan every document in a collection. However, every index adds overhead to write operations, which must update it. Keyhole prints the details of all indexes and their usage with the -index option, which is useful for identifying redundant or unused indexes on collections.

Indexes and Usages

Below is an example showing the indexes of the collection mdb.numbers, and their usage, from a 2-shard sharded cluster. Keyhole highlights redundant indexes in red with a leading ‘x’, and indexes unused since the mongo server started in blue with a leading ‘?’. A shard key is marked with a leading ‘*’. Usually, redundant indexes are those whose keys form a prefix of another compound index, so the compound index can serve the same queries. Before dropping any unused indexes, on the other hand, check with the development teams about why they were created in the first place. See the example output below:

$ keyhole --index "mongodb+srv://user:
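Separately from Keyhole's own output, one common reading of "covered by another compound index" is that an index's keys form a leading prefix of the compound index's keys. The sketch below illustrates that rule only; it is not Keyhole's implementation, and the index names and the find_redundant helper are hypothetical. It also ignores index options (unique, sparse, partial, TTL), which can make a prefix index non-redundant in practice.

```python
from typing import Dict, List, Tuple

# An index is modelled as an ordered list of (field, direction) pairs.
IndexKeys = List[Tuple[str, int]]


def is_prefix(short: IndexKeys, long: IndexKeys) -> bool:
    """Return True if `short` is a strict leading prefix of `long`."""
    return len(short) < len(long) and long[:len(short)] == short


def find_redundant(indexes: Dict[str, IndexKeys]) -> List[str]:
    """Return names of indexes whose keys are a strict prefix of another index.

    Assumption: index options are ignored, so treat the result as a list of
    candidates to review, not a list of indexes that are safe to drop.
    """
    redundant = []
    for name, keys in indexes.items():
        if name == "_id_":
            continue  # the default _id index cannot be dropped
        others = [k for n, k in indexes.items() if n != name]
        if any(is_prefix(keys, other) for other in others):
            redundant.append(name)
    return redundant


# Hypothetical index definitions for illustration.
indexes = {
    "_id_": [("_id", 1)],
    "a_1": [("a", 1)],
    "a_1_b_1": [("a", 1), ("b", 1)],
    "b_1": [("b", 1)],
}
print(find_redundant(indexes))  # ['a_1']
```

Here a_1 is flagged because a_1_b_1 begins with the same key, while b_1 is not, since b is not the leading key of any other index.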

Feel The Pulse of Mongo

When diagnosing performance issues in a MongoDB cluster, I usually begin by reviewing the logs, which keep a history of server operations over time. This post focuses on slow database operations. With the -loginfo option, Keyhole reads mongo logs and prints a summary of slow operations grouped by query pattern. Below is an example.

keyhole -loginfo mongod.log

It writes the results to two files: one in compressed BSON format (with a -log.bson.gz suffix) and the other as tab-separated values (TSV). You can either import the TSV output into a spreadsheet to view the results or use Maobi to generate an HTML report. The -redact option can be used together with the -loginfo option to exclude the top n slowest database operations from the output file, which prevents exposing sensitive data such as PII or PHI. Other options are available; the usage is as follows: keyhole -loginfo [-collscan] [-regex {regex}] [-r
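As a rough illustration of what grouping slow operations by query pattern means, the sketch below replaces the constants in each query filter with a placeholder and aggregates durations per pattern. It is not Keyhole's code. It assumes the structured JSON log format of MongoDB 4.4 and later ("Slow query" messages with attr.ns, attr.command, and attr.durationMillis), it only looks at find-style filters, and the sample log lines are made up and heavily abbreviated.

```python
import json
from collections import defaultdict


def query_shape(value):
    """Replace leaf values with 1 so queries differing only in constants match."""
    if isinstance(value, dict):
        return {k: query_shape(v) for k, v in value.items()}
    if isinstance(value, list):
        return [query_shape(v) for v in value]
    return 1


def summarize(lines):
    """Group slow-query log entries by (namespace, filter pattern)."""
    stats = defaultdict(lambda: {"count": 0, "total_ms": 0, "max_ms": 0})
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not structured JSON
        if entry.get("msg") != "Slow query":
            continue
        attr = entry.get("attr", {})
        pattern = query_shape(attr.get("command", {}).get("filter", {}))
        key = (attr.get("ns", ""), json.dumps(pattern, sort_keys=True))
        millis = attr.get("durationMillis", 0)
        stats[key]["count"] += 1
        stats[key]["total_ms"] += millis
        stats[key]["max_ms"] = max(stats[key]["max_ms"], millis)
    return dict(stats)


# Made-up, abbreviated log lines for illustration only.
sample = [
    '{"msg": "Slow query", "attr": {"ns": "mdb.numbers", "command": {"find": "numbers", "filter": {"a": 5}}, "durationMillis": 120}}',
    '{"msg": "Slow query", "attr": {"ns": "mdb.numbers", "command": {"find": "numbers", "filter": {"a": 9}}, "durationMillis": 300}}',
    '{"msg": "Slow query", "attr": {"ns": "mdb.numbers", "command": {"find": "numbers", "filter": {"b": {"$gt": 1}}}, "durationMillis": 150}}',
    '{"msg": "Connection accepted", "attr": {}}',
    "not json",
]

for (ns, pattern), s in summarize(sample).items():
    print(ns, pattern, s["count"], s["total_ms"] // s["count"], s["max_ms"])
```

The two queries on field a collapse into one pattern with a count of 2, which is the kind of grouping that makes a recurring slow query stand out from one-off outliers.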

Survey Your Mongo Land

Keyhole has remained under active development since I published the Peek at your MongoDB Clusters like a Pro with Keyhole: Part 1 post in December 2019. Since then, I have added many features and changed the format of output files from JSON to BSON. I’ll use this post to summarize the new features and modified functions not covered in the previous post.

Recap

A couple of months into Keyhole development, a clear direction was surfacing: I needed a tool to fetch all information about a MongoDB cluster in one shot. I jokingly called this function the “Know the enemy and hit them where it hurts” feature. The -allinfo option is perhaps the one most commonly used. With the -allinfo option, Keyhole collects host and build info, replication stats, server status, size and storage stats of databases and collections, index details and usages, and sharding configurations. The information was collected from the primary member of a replica set or a mongos of a sharded cluster. Later,

MongoPush - Push-Based MongoDB Atlas Migration Tool

The MongoPush tool is another bold stroke in my serendipitous MongoDB career. It is a push-based migration tool for MongoDB clusters. It provides high parallelism, supports topology transformation, is resumable, offers progress monitoring, and supports migrating subsets of data. With MongoDB Atlas gaining popularity and the success of Keyhole, I was frequently asked whether a migration feature could be added to Keyhole. Other existing migration solutions often fail because of firewall restrictions or other limitations. Keyhole serves a different purpose as an analysis tool, so rather than confuse its function, I created MongoPush as a separate tool.

Push-Based Solution

My goal was simple: to develop a tool that can perform a sharded cluster migration when the Atlas Live Migration Service is not suitable. MongoDB provides the mongomirror tool to push data to MongoDB Atlas from behind firewalls, but mongomirror can only migrate data between replica sets. It does not yet p

Peek into WiredTiger Cache

WiredTiger is the default storage engine starting in MongoDB v3.2. In the diagram below, the red arrows show the flow of data through compression/decompression steps among the WiredTiger cache, file system cache, and data files. Ideally, if all data, including indexes, can fit into the WiredTiger cache, accessing documents is more efficient than having to read disk files frequently. In reality, memory is shared with other consumers such as the file system cache, user connections, and database operations such as sorting. At startup, if the WiredTiger cache size is not defined, the maximum memory allocated to the WiredTiger cache is a little less than 50% of the RAM. When experiencing performance degradation, users are often tempted to tune the WiredTiger cache size. It is recommended to use the default value unless you have a small working set that can fit into the WiredTiger cache or the compressed data can fit into the file system cache. In mos
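The "a little less than 50% of the RAM" figure follows from the default documented for MongoDB 3.4 and later: the larger of 50% of (RAM − 1 GB) and 256 MB. The sketch below works through that arithmetic for a few RAM sizes. The function name is hypothetical, and it assumes the server sees the full machine memory (container memory limits can change the input).

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wt_cache_bytes(ram_bytes: int) -> int:
    """Default WiredTiger cache size: max(50% of (RAM - 1 GiB), 256 MiB)."""
    return max((ram_bytes - GIB) // 2, 256 * MIB)


for ram_gib in (1, 4, 16, 64):
    cache = default_wt_cache_bytes(ram_gib * GIB)
    print(f"{ram_gib:>3} GiB RAM -> {cache / GIB:.2f} GiB cache")
```

On a 16 GiB host this yields 7.5 GiB, and on very small hosts the 256 MiB floor applies.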

Maobi - Reports Generator for Keyhole

The Maobi project was created to generate HTML reports from stats collected by Keyhole. The word maobi is a phonetic translation of the Chinese word for paintbrush or ink brush. Ink brushes are commonly used in Chinese and Japanese calligraphy, as well as in Chinese painting and other brush painting styles. When I was little, we were required in school to use an ink brush to write or paint pictures. It took practice and effort to create something pleasant to the eye. Naming the tool Maobi reflects the time we all spend preparing reports and presentations. Keyhole started as a command-line data collection and analytics tool. With more data to present, it became difficult to fit all metrics within a terminal window. So, the project began. Maobi is like a coworker Keyhole can only dream of: one who is smart, friendly, stylish but not flashy, and who has contributed greatly to Keyhole’s popularity.

Installation

The easiest way to