diff --git a/README.md b/README.md
index 155752a..678987b 100644
--- a/README.md
+++ b/README.md
@@ -1,81 +1,204 @@
+
 # TypeScript Code Extractor and Analyzer
-This project provides an advanced toolkit for parsing TypeScript code using the TypeScript Abstract Syntax Tree (AST) to extract, analyze, and map code structures.
+The **TypeScript Code Extractor and Analyzer** is a robust library designed to parse and analyze TypeScript and JavaScript codebases using the TypeScript Abstract Syntax Tree (AST). It generates a structured, hierarchical representation of your codebase, detailing modules, classes, functions, properties, interfaces, enums, and dependencies. This tool is perfect for developers creating code analysis tools, documentation generators, or AI-driven systems like Retrieval-Augmented Generation (RAG) for codebases.
+
+## Table of Contents
+- [TypeScript Code Extractor and Analyzer](#typescript-code-extractor-and-analyzer)
+  - [Table of Contents](#table-of-contents)
+  - [Key Features](#key-features)
+  - [Installation](#installation)
+  - [Getting Started](#getting-started)
+    - [Basic Example](#basic-example)
+  - [API Reference](#api-reference)
+    - [`TypeScriptCodeMapper`](#typescriptcodemapper)
+  - [Data Structures](#data-structures)
+    - [Sample `ICodebaseMap` Structure](#sample-icodebasemap-structure)
+  - [Examples](#examples)
+    - [Analyzing a Single File's Dependencies](#analyzing-a-single-files-dependencies)
+    - [Handling Errors](#handling-errors)
+  - [Notes](#notes)
+  - [Contributing](#contributing)
+  - [License](#license)
+
+## Key Features
+
+- **AST-based Class Metadata Extraction**: Captures detailed metadata about classes, including methods, properties, interfaces, and enums.
+- **Function and Method Signature Analysis**: Parses function signatures to extract parameters, return types, and JSDoc comments.
+- **Interface and Enum Parsing**: Extracts TypeScript-specific constructs for comprehensive type system analysis.
+- **Dependency Graph Construction**: Builds a graph of file dependencies by analyzing import declarations.
+- **JavaScript Support**: Analyzes JavaScript files with type inference from JSDoc comments when `"allowJs": true` is set in `tsconfig.json`.
+
+## Installation
+
+Install the library using npm:
+
+```bash
+npm install @traversets/code-extractor
+```
+
+Ensure your project includes a `tsconfig.json` file. For JavaScript projects, add the following to enable parsing:
+
+```json
+{
+  "compilerOptions": {
+    "allowJs": true
+  }
+}
+```
-TypeScript Code Extractor and Analyzer is a robust system that utilizes a TypeScript parser to navigate through the codebase's AST, extracting structured metadata about various components such as modules, classes, functions, interfaces, properties, and enums.
+## Getting Started
-Key features:
+To begin analyzing your codebase, create an instance of `TypeScriptCodeMapper` and use the `buildCodebaseMap` method to generate a comprehensive map of your codebase. This map is returned as a `Result`, which you can inspect for success or errors.
-- AST-based Class Metadata Extraction: Utilizes TypeScript's AST to gather comprehensive metadata on class methods, properties, interfaces, and enums.
-- Function and Method Signature Analysis: Parses function signatures from the AST for details on parameters, return types, and inferred type information.
-- Interface and Enum Parsing: Extracts information from AST nodes representing interfaces and enums in TypeScript.
-- Dependency Graph Construction: Builds a graph of file dependencies by analyzing import declarations within the AST.
+### Basic Example
-### Installation
-To integrate this tool into your project, install it via npm:
-```
-npm i @traversets/code-extractor
+```typescript
+import { TypeScriptCodeMapper } from '@traversets/code-extractor';
+
+async function analyzeCodebase() {
+  const codeMapper = new TypeScriptCodeMapper();
+  const result = await codeMapper.buildCodebaseMap();
+  if (result.isOk()) {
+    console.log(JSON.stringify(result.getValue(), null, 2));
+  } else {
+    console.error('Error:', result.getError());
+  }
+}
+
+analyzeCodebase();
 ```
-### Code Analysis
+This example outputs a JSON structure representing your codebase, including modules, classes, functions, and dependencies.
-Below is an example of how to use the AST parser for code analysis:
+## API Reference
-```typescript
+### `TypeScriptCodeMapper`
-const codeMapper: TypeScriptCodeMapper = new TypeScriptCodeMapper();
+The primary class for codebase analysis, offering methods to extract and navigate metadata.
-// Get Root files
-const rootFiles: readonly string[] = codeMapper.getRootFileNames();
+| Method | Description | Parameters | Return Type |
+| --- | --- | --- | --- |
+| `getRootFileNames()` | Retrieves the list of root file names from the TypeScript program, as specified in `tsconfig.json`. | None | `readonly string[] | undefined` |
+| `getSourceFile(fileName: string)` | Retrieves the source file object for a given file name. | `fileName: string` | `ts.SourceFile | undefined` |
+| `buildDependencyGraph(sourceFile: ts.SourceFile)` | Builds a dependency graph by extracting import statements from a source file. | `sourceFile: ts.SourceFile` | `string[]` |
+| `buildCodebaseMap()` | Generates a hierarchical map of the codebase, including modules, classes, functions, properties, interfaces, enums, and dependencies. | None | `Promise<Result<ICodebaseMap>>` |
+| `getProgram()` | Returns the current TypeScript program instance. | None | `ts.Program | undefined` |
+| `getTypeChecker()` | Retrieves the TypeScript TypeChecker instance for type analysis. | None | `ts.TypeChecker | undefined` |
-// Convert a rootFile into a sourceFile
-const sourceFile: ts.SourceFile = codeMapper.getSourceFile(rootFiles[5]);
+**Note**: For `buildCodebaseMap`, check `result.isOk()` to confirm success before accessing `result.getValue()`. Use `result.getError()` to handle errors.
-// Build a dependency graph
-const getSourceFileDepencies: string[] = codeMapper.buildDependencyGraph(sourceFile);
+## Data Structures
-// Build a codebase map
-const codebaseMap = await codeMapper.buildCodebaseMap().getValue();
-```
+The library uses interfaces to represent extracted metadata:
-### Sample Response Structure
-The resulting JSON structure reflects the TypeScript AST's hierarchical representation:
-```
+| Interface | Description |
+| --- | --- |
+| `IClassInfo` | Represents a class with its name, functions, properties, interfaces, and enums. |
+| `IModuleInfo` | Represents a module (file) with its path, classes, functions, interfaces, enums, and dependencies. |
+| `IFunctionInfo` | Represents a function with its name, content, parameters, return type, and comments. |
+| `IProperty` | Represents a property with its name and type. |
+| `IInterfaceInfo` | Represents an interface with its name, properties, and summary. |
+| `IEnumInfo` | Represents an enum with its name, members, and summary. |
+| `ICodebaseMap` | A hierarchical map of the codebase, mapping project names to modules. |
+
+### Sample `ICodebaseMap` Structure
+
+```json
 {
-  "MyProject": {
+  "projectName": {
     "modules": {
-      "src/utils/logger.ts": {
+      "src/index.ts": {
+        "path": "src/index.ts",
         "classes": [
           {
-            "name": "Logger",
+            "name": "ExampleClass",
             "functions": [
               {
-                "name": "log",
-                "parameters": [{ "name": "message", "type": "string" }],
+                "name": "exampleMethod",
+                "content": "function exampleMethod(param: string) { ... }",
+                "parameters": [
+                  {
+                    "name": "param",
+                    "type": "string"
+                  }
+                ],
                 "returnType": "void",
-                "content": "",
-                "comment": "Logs application Error"
+                "comments": "Example method description"
               }
             ],
             "properties": [
-              { "name": "logLevel", "type": "LogLevel" }
-            ]
+              {
+                "name": "exampleProperty",
+                "type": "number"
+              }
+            ],
+            "interfaces": [],
+            "enums": []
           }
         ],
         "functions": [],
         "interfaces": [],
         "enums": [],
-        "dependencies": ["import { LogLevel } from './types';"]
+        "dependencies": [
+          "import * as fs from 'fs';"
+        ]
       }
     }
   }
 }
+```
+
+## Examples
+
+### Analyzing a Single File's Dependencies
+```typescript
+import { TypeScriptCodeMapper } from '@traversets/code-extractor';
+
+const codeMapper = new TypeScriptCodeMapper();
+const rootFiles = codeMapper.getRootFileNames();
+if (rootFiles && rootFiles.length > 0) {
+  const sourceFile = codeMapper.getSourceFile(rootFiles[0]);
+  if (sourceFile) {
+    const dependencies = codeMapper.buildDependencyGraph(sourceFile);
+    console.log('Dependencies:', dependencies);
+  }
+}
 ```
-### Usage for Agentic RAG Systems
-This tool enhances Retrieval-Augmented Generation (RAG) systems by:
+### Handling Errors
+
+```typescript
+import { TypeScriptCodeMapper } from '@traversets/code-extractor';
+
+async function analyzeWithErrorHandling() {
+  const codeMapper = new TypeScriptCodeMapper();
+  try {
+    const result = await codeMapper.buildCodebaseMap();
+    if (result.isOk()) {
+      console.log('Codebase Map:', JSON.stringify(result.getValue(), null, 2));
+    } else {
+      console.error('Failed to build codebase map:', result.getError());
+    }
+  } catch (error) {
+    console.error('Unexpected error:', error);
+  }
+}
+
+analyzeWithErrorHandling();
+```
+
+## Notes
+
+- **JavaScript Support**: The library supports JavaScript parsing by enabling `"allowJs": true` in `tsconfig.json`. Use JSDoc comments (e.g., `/** @returns {number} */`) to enhance type inference.
+- **Error Handling**: Methods like `buildCodebaseMap` return a `Result` type. Always check `isOk()` before accessing `getValue()` to handle errors gracefully.
+- **Performance**: For large codebases, optimize `tsconfig.json` to include only necessary files, reducing processing time.
+
+## Contributing
+
+Contributions are welcome! Please submit issues or pull requests to the [GitHub Repository](https://github.com/olasunkanmi-SE/ts-codebase-analyzer). Follow the contribution guidelines in the repository for coding standards and testing requirements.
-- Parsing the TypeScript AST into embeddings for semantic code search and similarity matching
-- Leveraging AST metadata for advanced code analysis, query resolution, or to aid in code generation, thereby improving the understanding and manipulation of TypeScript codebases within AI systems.
+## License
+This library is licensed under the MIT License. See the [LICENSE](https://github.com/olasunkanmi-SE/ts-codebase-analyzer/blob/main/LICENSE) file for details.
\ No newline at end of file
diff --git a/package.json b/package.json
index ad0fca1..597b605 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
-  "name": "@ts-toolbox/code-extractor",
-  "version": "0.0.1",
+  "name": "@traversets/code-extractor",
+  "version": "0.0.8",
   "description": "The TypeScript Code Extractor and Analyzer can be handy for RAG (Retrieval-Augmented Generation) systems for codebases. It provides a detailed and structured representation of the codebase that can be converted into embeddings, enabling more effective advanced code analysis, retrieval, and generation tasks.",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",