According to the sqlite3_analyzer tool, the three largest tables are CppDocComment (50.3%), CppAstNode (33.1%) and FileContent (5.0%). The measurement was taken on a database produced by parsing Xerces.
CppDocComment is this large because we store the comments in HTML format with a lot of styling. We should remove the HTML formatting. We should also consider storing these doc comments in compressed form (e.g. zip).
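To get a feel for how much each of the two ideas could save, here is a rough sketch in Python. It is not the parser's actual code: the helper names (`strip_html`, `compress_comment`, `decompress_comment`) and the sample comment are made up for illustration, and it uses zlib (DEFLATE, the method commonly used inside .zip archives) rather than a real .zip container.

```python
import zlib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment, dropping tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Collapse runs of whitespace left behind by the removed markup.
        return " ".join("".join(self._chunks).split())


def strip_html(html_comment: str) -> str:
    """Hypothetical helper: return the comment text without any HTML markup."""
    parser = _TextExtractor()
    parser.feed(html_comment)
    parser.close()  # flushes any text still buffered by the parser
    return parser.text()


def compress_comment(text: str) -> bytes:
    """Hypothetical helper: DEFLATE-compress the UTF-8 encoded comment."""
    return zlib.compress(text.encode("utf-8"), 9)


def decompress_comment(blob: bytes) -> str:
    """Inverse of compress_comment."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    # Made-up sample; real rows would come from the CppDocComment table.
    sample = (
        '<div class="doc"><p style="font-family:monospace;color:#333">'
        "Returns the <b>number</b> of child nodes.</p></div>"
    )
    plain = strip_html(sample)
    blob = compress_comment(plain)
    # Sizes in bytes: original HTML, stripped text, stripped + compressed.
    print(len(sample.encode("utf-8")), len(plain.encode("utf-8")), len(blob))
```

Running something like this over the real table contents would show which step gives the bigger win. Note that compression tends to pay off only on longer comments; for very short ones the zlib header and checksum can outweigh the savings, so stripping the HTML is probably the first thing to try.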
Related issues: #417, #21.