|
| 1 | + |
1 | 2 | # TypeScript Code Extractor and Analyzer |
2 | 3 |
|
3 | | -This project provides an advanced toolkit for parsing TypeScript code using the TypeScript Abstract Syntax Tree (AST) to extract, analyze, and map code structures. |
| 4 | +The **TypeScript Code Extractor and Analyzer** is a library that parses and analyzes TypeScript and JavaScript codebases using the TypeScript Abstract Syntax Tree (AST). It generates a structured, hierarchical representation of your codebase, detailing modules, classes, functions, properties, interfaces, enums, and dependencies. It is intended for developers building code analysis tools, documentation generators, or AI-driven systems such as Retrieval-Augmented Generation (RAG) over codebases. |
| 5 | + |
| 6 | +## Table of Contents |
| 7 | +- [TypeScript Code Extractor and Analyzer](#typescript-code-extractor-and-analyzer) |
| 8 | + - [Table of Contents](#table-of-contents) |
| 9 | + - [Key Features](#key-features) |
| 10 | + - [Installation](#installation) |
| 11 | + - [Getting Started](#getting-started) |
| 12 | + - [Basic Example](#basic-example) |
| 13 | + - [API Reference](#api-reference) |
| 14 | + - [`TypeScriptCodeMapper`](#typescriptcodemapper) |
| 15 | + - [Data Structures](#data-structures) |
| 16 | + - [Sample `ICodebaseMap` Structure](#sample-icodebasemap-structure) |
| 17 | + - [Examples](#examples) |
| 18 | + - [Analyzing a Single File's Dependencies](#analyzing-a-single-files-dependencies) |
| 19 | + - [Handling Errors](#handling-errors) |
| 20 | + - [Notes](#notes) |
| 21 | + - [Contributing](#contributing) |
| 22 | + - [License](#license) |
| 23 | + |
| 24 | +## Key Features |
| 25 | + |
| 26 | +- **AST-based Class Metadata Extraction**: Captures detailed metadata about classes, including methods, properties, interfaces, and enums. |
| 27 | +- **Function and Method Signature Analysis**: Parses function signatures to extract parameters, return types, and JSDoc comments. |
| 28 | +- **Interface and Enum Parsing**: Extracts TypeScript-specific constructs for comprehensive type system analysis. |
| 29 | +- **Dependency Graph Construction**: Builds a graph of file dependencies by analyzing import declarations. |
| 30 | +- **JavaScript Support**: Analyzes JavaScript files with type inference from JSDoc comments when `"allowJs": true` is set in `tsconfig.json`. |
| 31 | + |
| 32 | +## Installation |
| 33 | + |
| 34 | +Install the library using npm: |
| 35 | + |
| 36 | +```bash |
| 37 | +npm install @traversets/code-extractor |
| 38 | +``` |
| 39 | + |
| 40 | +Ensure your project includes a `tsconfig.json` file. For JavaScript projects, add the following to enable parsing: |
| 41 | + |
| 42 | +```json |
| 43 | +{ |
| 44 | + "compilerOptions": { |
| 45 | + "allowJs": true |
| 46 | + } |
| 47 | +} |
| 48 | +``` |
4 | 49 |
|
5 | | -TypeScript Code Extractor and Analyzer is a robust system that utilizes a TypeScript parser to navigate through the codebase's AST, extracting structured metadata about various components such as modules, classes, functions, interfaces, properties, and enums. |
| 50 | +## Getting Started |
6 | 51 |
|
7 | | -Key features: |
| 52 | +To begin analyzing your codebase, create an instance of `TypeScriptCodeMapper` and use the `buildCodebaseMap` method to generate a comprehensive map of your codebase. This map is returned as a `Result<ICodebaseMap>`, which you can inspect for success or errors. |
8 | 53 |
|
9 | | -- AST-based Class Metadata Extraction: Utilizes TypeScript's AST to gather comprehensive metadata on class methods, properties, interfaces, and enums. |
10 | | -- Function and Method Signature Analysis: Parses function signatures from the AST for details on parameters, return types, and inferred type information. |
11 | | -- Interface and Enum Parsing: Extracts information from AST nodes representing interfaces and enums in TypeScript. |
12 | | -- Dependency Graph Construction: Builds a graph of file dependencies by analyzing import declarations within the AST. |
| 54 | +### Basic Example |
13 | 55 |
|
14 | | -### Installation |
15 | | -To integrate this tool into your project, install it via npm: |
16 | | -``` |
17 | | -npm i @traversets/code-extractor |
| 56 | +```typescript |
| 57 | +import { TypeScriptCodeMapper } from '@traversets/code-extractor'; |
| 58 | + |
| 59 | +async function analyzeCodebase() { |
| 60 | + const codeMapper = new TypeScriptCodeMapper(); |
| 61 | + const result = await codeMapper.buildCodebaseMap(); |
| 62 | + if (result.isOk()) { |
| 63 | + console.log(JSON.stringify(result.getValue(), null, 2)); |
| 64 | + } else { |
| 65 | + console.error('Error:', result.getError()); |
| 66 | + } |
| 67 | +} |
| 68 | + |
| 69 | +analyzeCodebase(); |
18 | 70 | ``` |
19 | 71 |
|
20 | | -### Code Analysis |
| 72 | +This example outputs a JSON structure representing your codebase, including modules, classes, functions, and dependencies. |
21 | 73 |
|
22 | | -Below is an example of how to use the AST parser for code analysis: |
| 74 | +## API Reference |
23 | 75 |
|
24 | | -```typescript |
| 76 | +### `TypeScriptCodeMapper` |
25 | 77 |
|
26 | | -const codeMapper: TypeScriptCodeMapper = new TypeScriptCodeMapper(); |
| 78 | +The primary class for codebase analysis, offering methods to extract and navigate metadata. |
27 | 79 |
|
28 | | -// Get Root files |
29 | | -const rootFiles: readonly string[] = codeMapper.getRootFileNames(); |
| 80 | +| Method | Description | Parameters | Return Type | |
| 81 | +| --- | --- | --- | --- | |
| 82 | +| `getRootFileNames()` | Retrieves the list of root file names from the TypeScript program, as specified in `tsconfig.json`. | None | `readonly string[] \| undefined` |
| 83 | +| `getSourceFile(fileName: string)` | Retrieves the source file object for a given file name. | `fileName: string` | `ts.SourceFile \| undefined` |
| 84 | +| `buildDependencyGraph(sourceFile: ts.SourceFile)` | Builds a dependency graph by extracting import statements from a source file. | `sourceFile: ts.SourceFile` | `string[]` |
| 85 | +| `buildCodebaseMap()` | Generates a hierarchical map of the codebase, including modules, classes, functions, properties, interfaces, enums, and dependencies. | None | `Promise<Result<ICodebaseMap>>` |
| 86 | +| `getProgram()` | Returns the current TypeScript program instance. | None | `ts.Program \| undefined` |
| 87 | +| `getTypeChecker()` | Retrieves the TypeScript TypeChecker instance for type analysis. | None | `ts.TypeChecker \| undefined` |
30 | 88 |
|
31 | | -// Convert a rootFile into a sourceFile |
32 | | -const sourceFile: ts.SourceFile = codeMapper.getSourceFile(rootFiles[5]); |
| 89 | +**Note**: For `buildCodebaseMap`, check `result.isOk()` to confirm success before accessing `result.getValue()`. Use `result.getError()` to handle errors. |
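
The README does not show how `Result` is defined, so the following is only a minimal sketch of the check-before-access pattern described in the note. The class below is a hypothetical stand-in that models just the three methods named here (`isOk`, `getValue`, `getError`); the `ok`/`fail` factories and the `unwrapOr` helper are assumptions for illustration and are not part of the library's documented API.

```typescript
// Hypothetical stand-in for the library's Result type. Only isOk, getValue,
// and getError come from the README; everything else is an assumption.
class Result<T> {
  private readonly success: boolean;
  private readonly value: T | undefined;
  private readonly error: string | undefined;

  private constructor(success: boolean, value?: T, error?: string) {
    this.success = success;
    this.value = value;
    this.error = error;
  }

  // Assumed factory for a successful result.
  static ok<T>(value: T): Result<T> {
    return new Result<T>(true, value);
  }

  // Assumed factory for a failed result.
  static fail<T>(error: string): Result<T> {
    return new Result<T>(false, undefined, error);
  }

  isOk(): boolean {
    return this.success;
  }

  // Throws when called on a failed result, which is why callers
  // should check isOk() first.
  getValue(): T {
    if (!this.success) {
      throw new Error("Cannot read the value of a failed result");
    }
    return this.value as T;
  }

  getError(): string | undefined {
    return this.error;
  }
}

// Illustrative helper: return the value on success, a fallback on failure.
function unwrapOr<T>(result: Result<T>, fallback: T): T {
  return result.isOk() ? result.getValue() : fallback;
}
```

A helper such as `unwrapOr` keeps the `isOk()` check in one place, which can be convenient when an empty map is an acceptable fallback.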
33 | 90 |
|
34 | | -// Build a dependency graph |
35 | | -const getSourceFileDepencies: string[] = codeMapper.buildDependencyGraph(sourceFile); |
| 91 | +## Data Structures |
36 | 92 |
|
37 | | -// Build a codebase map |
38 | | -const codebaseMap = await codeMapper.buildCodebaseMap().getValue(); |
39 | | -``` |
| 93 | +The library uses interfaces to represent extracted metadata: |
40 | 94 |
|
41 | | -### Sample Response Structure |
42 | | -The resulting JSON structure reflects the TypeScript AST's hierarchical representation: |
43 | | -``` |
| 95 | +| Interface | Description | |
| 96 | +| --- | --- | |
| 97 | +| `IClassInfo` | Represents a class with its name, functions, properties, interfaces, and enums. | |
| 98 | +| `IModuleInfo` | Represents a module (file) with its path, classes, functions, interfaces, enums, and dependencies. | |
| 99 | +| `IFunctionInfo` | Represents a function with its name, content, parameters, return type, and comments. | |
| 100 | +| `IProperty` | Represents a property with its name and type. | |
| 101 | +| `IInterfaceInfo` | Represents an interface with its name, properties, and summary. | |
| 102 | +| `IEnumInfo` | Represents an enum with its name, members, and summary. | |
| 103 | +| `ICodebaseMap` | A hierarchical map of the codebase, mapping project names to modules. | |
| 104 | + |
| 105 | +### Sample `ICodebaseMap` Structure |
| 106 | + |
| 107 | +```json |
44 | 108 | { |
45 | | - "MyProject": { |
| 109 | + "projectName": { |
46 | 110 | "modules": { |
47 | | - "src/utils/logger.ts": { |
| 111 | + "src/index.ts": { |
| 112 | + "path": "src/index.ts", |
48 | 113 | "classes": [ |
49 | 114 | { |
50 | | - "name": "Logger", |
| 115 | + "name": "ExampleClass", |
51 | 116 | "functions": [ |
52 | 117 | { |
53 | | - "name": "log", |
54 | | - "parameters": [{ "name": "message", "type": "string" }], |
| 118 | + "name": "exampleMethod", |
| 119 | + "content": "function exampleMethod(param: string) { ... }", |
| 120 | + "parameters": [ |
| 121 | + { |
| 122 | + "name": "param", |
| 123 | + "type": "string" |
| 124 | + } |
| 125 | + ], |
55 | 126 | "returnType": "void", |
56 | | - "content": "", |
57 | | - "comment": "Logs application Error" |
| 127 | + "comments": "Example method description" |
58 | 128 | } |
59 | 129 | ], |
60 | 130 | "properties": [ |
61 | | - { "name": "logLevel", "type": "LogLevel" } |
62 | | - ] |
| 131 | + { |
| 132 | + "name": "exampleProperty", |
| 133 | + "type": "number" |
| 134 | + } |
| 135 | + ], |
| 136 | + "interfaces": [], |
| 137 | + "enums": [] |
63 | 138 | } |
64 | 139 | ], |
65 | 140 | "functions": [], |
66 | 141 | "interfaces": [], |
67 | 142 | "enums": [], |
68 | | - "dependencies": ["import { LogLevel } from './types';"] |
| 143 | + "dependencies": [ |
| 144 | + "import * as fs from 'fs';" |
| 145 | + ] |
69 | 146 | } |
70 | 147 | } |
71 | 148 | } |
72 | 149 | } |
| 150 | +``` |
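
As a rough illustration of how a map shaped like the sample above might be traversed, the sketch below lists every class method with its module path and return type. The `*Sketch` types are simplified assumptions inferred from the sample JSON, not the library's exported interfaces, and `listMethods` is a hypothetical helper.

```typescript
// Simplified shapes inferred from the sample JSON; the real interfaces
// (IFunctionInfo, IClassInfo, IModuleInfo, ICodebaseMap) may differ.
interface FunctionSketch {
  name: string;
  returnType: string;
}

interface ClassSketch {
  name: string;
  functions: FunctionSketch[];
}

interface ModuleSketch {
  path: string;
  classes: ClassSketch[];
  dependencies: string[];
}

type CodebaseMapSketch = Record<string, { modules: Record<string, ModuleSketch> }>;

// Hypothetical helper: walk project -> module -> class -> method and
// produce one summary line per method.
function listMethods(map: CodebaseMapSketch): string[] {
  const lines: string[] = [];
  for (const project of Object.values(map)) {
    for (const mod of Object.values(project.modules)) {
      for (const cls of mod.classes) {
        for (const fn of cls.functions) {
          lines.push(`${mod.path}: ${cls.name}.${fn.name}(): ${fn.returnType}`);
        }
      }
    }
  }
  return lines;
}

// Data mirroring the sample structure, reduced to the fields used here.
const sampleMap: CodebaseMapSketch = {
  projectName: {
    modules: {
      "src/index.ts": {
        path: "src/index.ts",
        classes: [
          {
            name: "ExampleClass",
            functions: [{ name: "exampleMethod", returnType: "void" }],
          },
        ],
        dependencies: ["import * as fs from 'fs';"],
      },
    },
  },
};

console.log(listMethods(sampleMap));
```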
| 151 | + |
| 152 | +## Examples |
| 153 | + |
| 154 | +### Analyzing a Single File's Dependencies |
73 | 155 |
|
| 156 | +```typescript |
| 157 | +import { TypeScriptCodeMapper } from '@traversets/code-extractor'; |
| 158 | + |
| 159 | +const codeMapper = new TypeScriptCodeMapper(); |
| 160 | +const rootFiles = codeMapper.getRootFileNames(); |
| 161 | +if (rootFiles && rootFiles.length > 0) { |
| 162 | + const sourceFile = codeMapper.getSourceFile(rootFiles[0]); |
| 163 | + if (sourceFile) { |
| 164 | + const dependencies = codeMapper.buildDependencyGraph(sourceFile); |
| 165 | + console.log('Dependencies:', dependencies); |
| 166 | + } |
| 167 | +} |
74 | 168 | ``` |
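
The sample structure suggests that dependencies are reported as the text of import statements (for example `import * as fs from 'fs';`). Assuming that format, a small sketch like the one below could reduce each entry to its module specifier. Both functions are hypothetical helpers, not part of the library, and the regular expression only covers static `import` statements.

```typescript
// Hypothetical helper: pull the module specifier out of an import statement
// such as "import * as fs from 'fs';" or "import './polyfill';".
// Returns undefined when the text is not a static import.
function moduleSpecifier(importText: string): string | undefined {
  const match = /(?:from\s+|^\s*import\s+)['"]([^'"]+)['"]/.exec(importText);
  return match ? match[1] : undefined;
}

// Hypothetical helper: relative specifiers point at files inside the project,
// everything else is treated as a package or built-in module.
function isRelativeSpecifier(specifier: string): boolean {
  return specifier.startsWith(".");
}

const dependencyTexts = [
  "import * as fs from 'fs';",
  "import { LogLevel } from './types';",
];

for (const text of dependencyTexts) {
  const specifier = moduleSpecifier(text);
  if (specifier !== undefined) {
    console.log(specifier, isRelativeSpecifier(specifier) ? "(local)" : "(external)");
  }
}
```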
75 | 169 |
|
76 | | -### Usage for Agentic RAG Systems |
77 | | -This tool enhances Retrieval-Augmented Generation (RAG) systems by: |
| 170 | +### Handling Errors |
| 171 | + |
| 172 | +```typescript |
| 173 | +import { TypeScriptCodeMapper } from '@traversets/code-extractor'; |
| 174 | + |
| 175 | +async function analyzeWithErrorHandling() { |
| 176 | + const codeMapper = new TypeScriptCodeMapper(); |
| 177 | + try { |
| 178 | + const result = await codeMapper.buildCodebaseMap(); |
| 179 | + if (result.isOk()) { |
| 180 | + console.log('Codebase Map:', JSON.stringify(result.getValue(), null, 2)); |
| 181 | + } else { |
| 182 | + console.error('Failed to build codebase map:', result.getError()); |
| 183 | + } |
| 184 | + } catch (error) { |
| 185 | + console.error('Unexpected error:', error); |
| 186 | + } |
| 187 | +} |
| 188 | + |
| 189 | +analyzeWithErrorHandling(); |
| 190 | +``` |
| 191 | + |
| 192 | +## Notes |
| 193 | + |
| 194 | +- **JavaScript Support**: JavaScript files are parsed when `"allowJs": true` is set in `tsconfig.json`. Use JSDoc comments (e.g., `/** @returns {number} */`) to improve type inference. |
| 195 | +- **Error Handling**: Methods like `buildCodebaseMap` return a `Result` type. Always check `isOk()` before accessing `getValue()` to handle errors gracefully. |
| 196 | +- **Performance**: For large codebases, narrow the `include` or `files` entries in `tsconfig.json` to the files you actually need analyzed, which reduces processing time. |
| 197 | + |
| 198 | +## Contributing |
| 199 | + |
| 200 | +Contributions are welcome! Please submit issues or pull requests to the [GitHub Repository](https://github.com/olasunkanmi-SE/ts-codebase-analyzer). Follow the contribution guidelines in the repository for coding standards and testing requirements. |
78 | 201 |
|
79 | | -- Parsing the TypeScript AST into embeddings for semantic code search and similarity matching |
80 | | -- Leveraging AST metadata for advanced code analysis, query resolution, or to aid in code generation, thereby improving the understanding and manipulation of TypeScript codebases within AI systems. |
| 202 | +## License |
81 | 203 |
|
| 204 | +This library is licensed under the MIT License. See the [LICENSE](https://github.com/olasunkanmi-SE/ts-codebase-analyzer/blob/main/LICENSE) file for details. |