122 changes: 122 additions & 0 deletions docs/ComprehensiveCodeAnalysisReport.md
@@ -0,0 +1,122 @@
# Comprehensive Code Analysis Report

## Top-Level Files

### `./README.md`
* **Purpose**: Main entry point for project information, setup, usage.
* **Completion Status/Key Observations**: Largely up to date with the recent "Sprint 2" achievements (class support, Union types). Covers installation, usage for the simple and class examples, the output structure, and the basic project layout, and lists supported features and areas for further development.
* **Key Relations**: Links to LICENSE, references `requirements.txt`, `examples/`, `src/main.py`.
* **Potential Enhancements/Improvements**:
* Explicitly state that `class_example.py` is the primary example for current advanced features.
* Link to or summarize key findings from `docs/` for a fuller picture of limitations.

### `./requirements.txt`
* **Purpose**: Lists Python dependencies.
* **Completion Status/Key Observations**: Contains standard tools for analysis, testing, formatting (`astroid`, `pylint`, `mypy`, `pytest`, `black`, `networkx`, `typing-extensions`). Appears complete for current needs.
* **Key Relations**: Used in `CONTRIBUTING.md` for setup, essential for development environment.
* **Potential Enhancements/Improvements**: Consider version pinning for more reproducible builds if issues arise.

### `./CONTRIBUTING.md`
* **Purpose**: Provides guidelines for contributing to the project.
* **Completion Status/Key Observations**: Outlines setup, coding standards (Black, Pylint, Mypy), testing procedures, and commit message format. Appears comprehensive.
* **Key Relations**: References `requirements.txt`, `tox.ini`.
* **Potential Enhancements/Improvements**: None apparent at this time.

### `./LICENSE`
* **Purpose**: Specifies the legal terms under which the project is distributed.
* **Completion Status/Key Observations**: Uses the MIT License, a permissive open-source license.
* **Key Relations**: Referenced in `README.md`.
* **Potential Enhancements/Improvements**: None.

### `./tox.ini`
* **Purpose**: Configuration file for tox, an automation tool for Python testing.
* **Completion Status/Key Observations**: Defines test environments for linting (Pylint, Mypy, Black) and unit testing (pytest). Includes commands and dependencies for each environment.
* **Key Relations**: Used by `tox` for automated testing and linting. Crucial for CI/CD.
* **Potential Enhancements/Improvements**: Could be expanded with more specific test targets or coverage analysis.

### `./.gitignore`
* **Purpose**: Specifies intentionally untracked files that Git should ignore.
* **Completion Status/Key Observations**: Includes common Python-related files/directories (`__pycache__`, `*.pyc`, `.env`), virtual environment directories (`venv`, `env`), build artifacts (`dist`, `build`), and IDE-specific files. Seems well-configured.
* **Key Relations**: Standard Git configuration file.
* **Potential Enhancements/Improvements**: None apparent.

## `src/` Directory

### `src/main.py`
* **Purpose**: Main executable script for the Python to DOT graph conversion. Handles command-line arguments, file processing, and DOT graph generation.
* **Completion Status/Key Observations**: Core logic for parsing Python code using `astroid`, building a graph with `networkx`, and outputting DOT format. Supports basic types, functions, classes, and modules. Recent additions include handling of Union types and improved class member representation.
* **Key Relations**: Uses `astroid` for AST parsing, `networkx` for graph representation. Interacts with `src/output_graphs.py`. Reads Python files from `examples/`.
* **Potential Enhancements/Improvements**:
* Refactor large functions for better modularity.
* Enhance error handling for malformed Python inputs.
* Add support for more complex type hints and Python features.
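The parse → graph → DOT pipeline described above can be sketched with the standard-library `ast` module standing in for `astroid` (a simplified illustration only; `build_edges`, `to_dot`, and the `"module"` node name are hypothetical, not the project's actual implementation):

```python
import ast


def build_edges(source: str) -> list[tuple[str, str]]:
    """Collect (module, definition) edges from top-level defs in the source."""
    tree = ast.parse(source)
    edges = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            edges.append(("module", node.name))
    return edges


def to_dot(edges: list[tuple[str, str]]) -> str:
    """Render an edge list as a minimal DOT digraph."""
    lines = ["digraph G {"]
    for src, dst in edges:
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot(build_edges("def greet(name: str) -> str:\n    return name\n"))
```

The real implementation additionally carries type information on nodes and delegates DOT formatting to `src/output_graphs.py`.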

### `src/output_graphs.py`
* **Purpose**: Responsible for generating the DOT language output from the `networkx` graph.
* **Completion Status/Key Observations**: Contains functions to format nodes and edges according to DOT syntax, including styling for different Python constructs (classes, functions, modules, variables, types).
* **Key Relations**: Consumes `networkx` graph objects generated by `src/main.py`.
* **Potential Enhancements/Improvements**:
* Offer more customization options for graph appearance (colors, shapes).
* Support different output formats beyond DOT (e.g., GML, GraphML).
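The node-styling responsibility described above might look like the following sketch (hedged: `format_node` and its shape mapping are illustrative stand-ins, not the module's actual API):

```python
def format_node(name: str, kind: str) -> str:
    """Format one DOT node, choosing a shape per Python construct (illustrative mapping)."""
    shapes = {"class": "box", "function": "ellipse", "module": "folder", "variable": "note"}
    shape = shapes.get(kind, "plaintext")
    return f'    "{name}" [shape={shape}, label="{name}"];'


line = format_node("MyClass", "class")
```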

## `examples/` Directory

### `examples/simple_example.py`
* **Purpose**: Provides a basic Python script for demonstrating the tool's functionality with simple functions, variables, and type hints.
* **Completion Status/Key Observations**: Contains straightforward examples of global variables, functions with typed arguments and return values.
* **Key Relations**: Used as an input for `src/main.py` for testing and demonstration.
* **Potential Enhancements/Improvements**: Could include a slightly more complex function or a basic class to showcase more features.

### `examples/class_example.py`
* **Purpose**: Demonstrates the tool's capabilities with Python classes, including methods, attributes, inheritance, and Union type hints.
* **Completion Status/Key Observations**: Contains classes with constructors, methods (with `self`), class variables, instance variables, and inheritance. Uses `Union` and `Optional` type hints. This is the primary example for current advanced features.
* **Key Relations**: Used as a key input for `src/main.py` for testing class-related feature support.
* **Potential Enhancements/Improvements**: Add examples of multiple inheritance or more complex class interactions if those features are further developed.

### `examples/module_example/`
* **Purpose**: Directory containing multiple Python files (`module1.py`, `module2.py`) to demonstrate inter-module dependencies and imports.
* **Completion Status/Key Observations**: `module1.py` defines functions and classes, `module2.py` imports and uses them.
* **Key Relations**: Shows how `src/main.py` handles imports and represents module relationships in the graph.
* **Potential Enhancements/Improvements**: Could include more complex import scenarios (e.g., `from ... import ... as ...`, wildcard imports if supported).

## `tests/` Directory

### `tests/test_main.py`
* **Purpose**: Contains unit tests for the core functionality in `src/main.py`.
* **Completion Status/Key Observations**: Uses `pytest`. Tests cover graph generation for simple types, functions, classes, and basic module imports. Mocks file system operations and `astroid` parsing where necessary. Checks for expected nodes and edges in the generated `networkx` graph.
* **Key Relations**: Tests the logic within `src/main.py`. Relies on example files in `examples/` as input for some tests.
* **Potential Enhancements/Improvements**:
* Increase test coverage, especially for error conditions and edge cases.
* Add tests for newly supported features (e.g., specific Union type scenarios).
* Test DOT output validation more rigorously if `src/output_graphs.py` becomes more complex.

## `docs/` Directory

### `docs/DevelopmentLog.md`
* **Purpose**: Tracks development progress, decisions, and future plans.
* **Completion Status/Key Observations**: Contains entries for "Sprint 1" and "Sprint 2", detailing features implemented (basic types, functions, classes, Union types, module handling), bugs fixed, and next steps.
* **Key Relations**: Internal development document.
* **Potential Enhancements/Improvements**: Maintain regular updates as development progresses.

### `docs/Limitations.md`
* **Purpose**: Documents known limitations and unsupported features of the tool.
* **Completion Status/Key Observations**: Lists issues like lack of support for decorators, generators, context managers, advanced `typing` features (Generics, Protocols), and dynamic aspects of Python.
* **Key Relations**: Important for users to understand the current scope of the tool.
* **Potential Enhancements/Improvements**: Update as new limitations are discovered or existing ones are addressed.

### `docs/sprint2_notes.md`
* **Purpose**: Contains detailed notes and findings from the "Sprint 2" development cycle, focusing on class and Union type support.
* **Completion Status/Key Observations**: Records observations about `astroid` behavior with classes, methods, attributes, inheritance, and Union types. Discusses how to represent these in the graph.
* **Key Relations**: Informal notes supporting `DevelopmentLog.md` and guiding implementation in `src/main.py`.
* **Potential Enhancements/Improvements**: Key insights should be summarized and moved to more permanent documentation like `DevelopmentLog.md` or design documents if they exist.

## `generated/` Directory

### `generated/example_graphs/`
* **Purpose**: Stores the output DOT graph files generated by `src/main.py` when run on the example Python scripts.
* **Completion Status/Key Observations**: Contains `.dot` files like `simple_example.dot`, `class_example.dot`, `module_example.dot`. These serve as visual references and can be used for regression testing (though not formally done yet).
* **Key Relations**: Outputs of `src/main.py` using inputs from `examples/`.
* **Potential Enhancements/Improvements**:
* Implement automated visual diffing or structural comparison of DOT files for regression testing.
* Ensure graphs are kept up-to-date with code changes.
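The structural-comparison idea suggested above can be sketched by treating DOT output as an order-independent edge set (a hedged sketch; `dot_edges` is a hypothetical helper, and real DOT files with edge attributes would need a proper parser such as `pydot`):

```python
def dot_edges(dot: str) -> set[tuple[str, str]]:
    """Extract the edge set from a simple DOT digraph, ignoring order and styling."""
    edges = set()
    for raw in dot.splitlines():
        line = raw.strip().rstrip(";")
        if "->" in line:
            src, dst = (part.strip().strip('"') for part in line.split("->", 1))
            edges.add((src, dst))
    return edges


# Two DOT files with the same edges in different order compare equal.
old = 'digraph G {\n  "a" -> "b";\n  "b" -> "c";\n}'
new = 'digraph G {\n  "b" -> "c";\n  "a" -> "b";\n}'
same = dot_edges(old) == dot_edges(new)
```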
6 changes: 3 additions & 3 deletions docs/implementation_gaps_report.md
@@ -66,8 +66,8 @@ The current implementation can only handle a limited subset of Python features:
 | **Method overriding** | ✅ **Complete** | **Proper virtual methods and polymorphism** |
 | Error handling | ⚠️ Minimal | Simple try/except structure |
 | Context managers | ⚠️ Minimal | Basic structure without resource management |
-| List comprehensions | ❌ Missing | Not implemented |
-| Dictionary operations | ⚠️ Partial | Simple creation and access |
+| List comprehensions | ✅ **Implemented** | Vector creation with push_back |
+| Dictionary operations | ⚠️ Partial | Basic creation plus comprehension support |
 | String operations | ✅ **Improved** | Advanced f-string support, concatenation |
 | Regular expressions | ❌ Missing | Not implemented |
 | File I/O | ❌ Missing | Not implemented |
@@ -143,7 +143,7 @@ The current implementation can only handle a limited subset of Python features:
 
 **Prioritized Based on Recent Progress:**
 1. Support for Python standard library mapping to C++ equivalents
-2. Add support for list comprehensions and dictionary comprehensions
+2. ~~Add support for list comprehensions~~ ✓ **List comprehensions implemented**; add dictionary comprehensions
 3. Implement regular expression pattern translation
 4. Add code generation for file I/O operations
 5. Develop optimized C++ code patterns for common Python idioms
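The "vector creation with push_back" strategy noted in the table can be illustrated with a small translator sketch (hypothetical helper name; the real generator lives in `src/converter/code_generator_fixed.py` and resolves types via its own inference):

```python
def list_comp_to_cpp(elt_type: str, target: str, iterable: str, expr: str) -> str:
    """Emit a C++ loop that mirrors the Python list comprehension [expr for target in iterable]."""
    return (
        f"std::vector<{elt_type}> result;\n"
        f"for (const auto& {target} : {iterable}) {{\n"
        f"    result.push_back({expr});\n"
        f"}}"
    )


# Python: [x * 2 for x in nums]  ->  C++ loop filling a std::vector<int>
cpp = list_comp_to_cpp("int", "x", "nums", "x * 2")
```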
12 changes: 10 additions & 2 deletions src/analyzer/code_analyzer.py
@@ -141,9 +141,17 @@ def _infer_variable_type(self, node: ast.Assign) -> None:
                         elt_types.append(f'std::tuple<{", ".join(nested_types)}>')
                     else:
                         elt_types.append(self._infer_expression_type(elt))
-                self.type_info[node.targets[0].id] = f'std::tuple<{", ".join(elt_types)}>'
+                if isinstance(node.targets[0], ast.Name):
+                    self.type_info[node.targets[0].id] = (
+                        f"std::tuple<{', '.join(elt_types)}>"
+                    )
+                elif isinstance(node.targets[0], ast.Tuple):
+                    for tgt, typ in zip(node.targets[0].elts, elt_types):
+                        if isinstance(tgt, ast.Name):
+                            self.type_info[tgt.id] = typ
             else:
-                self.type_info[node.targets[0].id] = 'std::tuple<>'
+                if isinstance(node.targets[0], ast.Name):
+                    self.type_info[node.targets[0].id] = 'std::tuple<>'
         elif isinstance(node.value, ast.Call):
             # Try to infer type from function call
             if isinstance(node.value.func, ast.Name):
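The new `ast.Tuple` branch enables per-name inference for unpacking assignments such as `a, b = (1, "hello")`. A standalone sketch of the idea (hypothetical helper; the real analyzer uses its full `_infer_expression_type` rather than this constants-only mapping):

```python
import ast


def infer_tuple_targets(source: str) -> dict[str, str]:
    """Map each unpacked name to a C++ type for a tuple assignment of constants."""
    cpp = {int: "int", str: "std::string", float: "double", bool: "bool"}
    assign = ast.parse(source).body[0]
    target = assign.targets[0]
    types = {}
    if isinstance(target, ast.Tuple) and isinstance(assign.value, ast.Tuple):
        # Pair each target name with the element assigned to it.
        for tgt, val in zip(target.elts, assign.value.elts):
            if isinstance(tgt, ast.Name) and isinstance(val, ast.Constant):
                types[tgt.id] = cpp.get(type(val.value), "auto")
    return types


info = infer_tuple_targets('a, b = (1, "hello")')
```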
11 changes: 11 additions & 0 deletions src/analyzer/code_analyzer_fixed.py
@@ -301,6 +301,10 @@ def _infer_variable_type(self, node: ast.Assign) -> None:
self._store_type_for_target(node.targets[0], f'std::map<{key_type}, {value_type}>')
else:
self._store_type_for_target(node.targets[0], 'std::map<std::string, int>') # Default
elif isinstance(node.value, ast.DictComp):
key_type = self._infer_expression_type(node.value.key)
value_type = self._infer_expression_type(node.value.value)
self._store_type_for_target(node.targets[0], f'std::map<{key_type}, {value_type}>')
elif isinstance(node.value, ast.Set):
# Try to infer set element type
if node.value.elts:
@@ -474,6 +478,13 @@ def _infer_expression_type(self, node: ast.AST) -> str:
elt_type = self._infer_expression_type(node.elts[0])
return f'std::vector<{elt_type}>'
return 'std::vector<int>'
elif isinstance(node, ast.ListComp):
elt_type = self._infer_expression_type(node.elt)
return f'std::vector<{elt_type}>'
elif isinstance(node, ast.DictComp):
key_type = self._infer_expression_type(node.key)
value_type = self._infer_expression_type(node.value)
return f'std::map<{key_type}, {value_type}>'
elif isinstance(node, ast.Dict):
if node.keys and node.values:
key_type = self._infer_expression_type(node.keys[0])
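The comprehension-inference rules added in this hunk can be exercised in isolation with a minimal mirror of the logic (a hedged sketch: this `infer_type` collapses everything non-string to `int`, unlike the analyzer's fuller inference):

```python
import ast


def infer_type(node: ast.AST) -> str:
    """Minimal mirror of the new ListComp/DictComp rules; non-string leaves default to int."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return "std::string"
    if isinstance(node, ast.ListComp):
        return f"std::vector<{infer_type(node.elt)}>"
    if isinstance(node, ast.DictComp):
        return f"std::map<{infer_type(node.key)}, {infer_type(node.value)}>"
    return "int"


# {x: x * 2 for x in nums}  ->  std::map<int, int>
expr = ast.parse("{x: x * 2 for x in nums}", mode="eval").body
t = infer_type(expr)
```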
25 changes: 18 additions & 7 deletions src/converter/code_generator.py
@@ -86,7 +86,7 @@ def generate_code(self, analysis_result: AnalysisResult, output_dir: Path) -> None:
         with open(output_dir / "setup.py", "w") as f:
             f.write('\n'.join(setup_content))
 
-    def _generate_header(self, analysis_result: Dict) -> str:
+    def _generate_header(self, analysis_result: AnalysisResult) -> str:
         """Generate C++ header file."""
         header = """#pragma once
 
@@ -103,22 +103,29 @@ def _generate_header(self, analysis_result: Dict) -> str:
 namespace pytocpp {
 
 """
+        type_info = analysis_result.type_info if hasattr(
+            analysis_result, "type_info"
+        ) else analysis_result.get("functions", {})
+
         # Add function declarations
-        for func_name, func_info in analysis_result.get('functions', {}).items():
+        for func_name, func_info in type_info.items():
             if func_name.startswith('calculate_'):
                 # Get return type
-                return_type = func_info.get('return_type', 'int')
+                return_type = (
+                    func_info.get('return_type', 'int') if isinstance(func_info, dict) else 'int'
+                )
                 # Get parameter types
                 params = []
-                for param_name, param_type in func_info.get('params', {}).items():
-                    params.append(f"{param_type} {param_name}")
+                if isinstance(func_info, dict):
+                    for param_name, param_type in func_info.get('params', {}).items():
+                        params.append(f"{param_type} {param_name}")
                 # Add function declaration
                 header += f" {return_type} {func_name}({', '.join(params)});\n\n"
 
         header += "} // namespace pytocpp\n"
         return header
 
-    def _generate_implementation(self, analysis_result: Dict) -> str:
+    def _generate_implementation(self, analysis_result: AnalysisResult) -> str:
         """Generate C++ implementation file."""
         impl = """#include "generated.hpp"
 #include <vector>
@@ -132,8 +139,12 @@ def _generate_implementation(self, analysis_result: Dict) -> str:
 namespace pytocpp {
 
 """
+        type_info = analysis_result.type_info if hasattr(
+            analysis_result, "type_info"
+        ) else analysis_result.get("functions", {})
+
         # Add function implementations
-        for func_name, func_info in analysis_result.get('functions', {}).items():
+        for func_name, func_info in type_info.items():
             if func_name.startswith('calculate_'):
                 impl += self._generate_function_impl(func_name, func_info)
 
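The `hasattr`-based fallback introduced here lets the generator accept either the typed result object or the legacy dict shape. A standalone sketch of the pattern (the `AnalysisResult` stand-in below is illustrative, not the project's actual class):

```python
class AnalysisResult:
    """Illustrative stand-in for the analyzer's result object."""

    def __init__(self, type_info):
        self.type_info = type_info


def get_type_info(analysis_result):
    """Accept either an AnalysisResult or a legacy dict, as the generator now does."""
    if hasattr(analysis_result, "type_info"):
        return analysis_result.type_info
    return analysis_result.get("functions", {})


# Both call shapes resolve to the same mapping.
new_style = get_type_info(AnalysisResult({"calculate_sum": {"return_type": "int"}}))
old_style = get_type_info({"functions": {"calculate_sum": {"return_type": "int"}}})
```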
20 changes: 19 additions & 1 deletion src/converter/code_generator_fixed.py
@@ -1083,8 +1083,26 @@ def _translate_expression(self, node: ast.AST, local_vars: Dict[str, str]) -> str:
# Try to infer element type from the first element if available
if node.elts:
element_type = self._infer_cpp_type(node.elts[0], local_vars)

return f"std::vector<{element_type}>{{{', '.join(elements)}}}"
elif isinstance(node, ast.ListComp):
elt_type = self._infer_cpp_type(node.elt, local_vars)
if not node.generators:
raise ValueError("List comprehension node has no generators. Malformed AST.")
> **Copilot AI** (Sep 28, 2025): The error message should be more specific about which comprehension type is being processed. Consider changing to 'List comprehension has no generators' for clarity.
> Suggested change: `raise ValueError("List comprehension has no generators")`
target = self._translate_expression(node.generators[0].target, local_vars)
iterable = self._translate_expression(node.generators[0].iter, local_vars)
expr = self._translate_expression(node.elt, local_vars)
return self._generate_list_comprehension(elt_type, target, iterable, expr)
elif isinstance(node, ast.DictComp):
key_type = self._infer_cpp_type(node.key, local_vars)
value_type = self._infer_cpp_type(node.value, local_vars)
target = self._translate_expression(node.generators[0].target, local_vars)
iterable = self._translate_expression(node.generators[0].iter, local_vars)
key_expr = self._translate_expression(node.key, local_vars)
value_expr = self._translate_expression(node.value, local_vars)
> **Copilot AI** (Sep 28, 2025), on lines +1096 to +1102: Missing validation for empty generators list in DictComp. Add the same generator validation as in ListComp to prevent a potential IndexError when accessing node.generators[0].

return self._generate_dict_comprehension(
key_type, value_type, target, iterable, key_expr, value_expr
)
elif isinstance(node, ast.Dict):
# Handle dict literals
if not node.keys:
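A defensive version of the dict-comprehension branch, including the generator check the review comment asks for, might look like this sketch (hypothetical free functions; the real method also resolves types via `_infer_cpp_type` and handles nested generators):

```python
def dict_comp_to_cpp(key_type, value_type, target, iterable, key_expr, value_expr):
    """Emit a C++ loop mirroring {key_expr: value_expr for target in iterable}."""
    return (
        f"std::map<{key_type}, {value_type}> result;\n"
        f"for (const auto& {target} : {iterable}) {{\n"
        f"    result[{key_expr}] = {value_expr};\n"
        f"}}"
    )


def translate_dict_comp(generators, **parts):
    """Validate the generators list before translating, avoiding an IndexError."""
    if not generators:
        raise ValueError("Dict comprehension has no generators")
    return dict_comp_to_cpp(**parts)


cpp = translate_dict_comp(
    ["gen"],
    key_type="int", value_type="int",
    target="x", iterable="nums",
    key_expr="x", value_expr="x * 2",
)
```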
37 changes: 37 additions & 0 deletions tests/test_code_analyzer_fixed.py
@@ -310,6 +310,43 @@ def test_inference_expressions(self):
values=[ast.Constant(value=True), ast.Constant(value=False)]
)
assert analyzer._infer_expression_type(bool_op) == 'bool'

list_comp = ast.ListComp(
elt=ast.BinOp(
left=ast.Name(id='x', ctx=ast.Load()),
op=ast.Mult(),
right=ast.Constant(value=2),
),
generators=[
ast.comprehension(
target=ast.Name(id='x', ctx=ast.Store()),
iter=ast.Name(id='nums', ctx=ast.Load()),
ifs=[],
is_async=0,
)
],
)
analyzer.type_info['nums'] = 'std::vector<int>'
assert analyzer._infer_expression_type(list_comp) == 'std::vector<int>'

dict_comp = ast.DictComp(
key=ast.Name(id='x', ctx=ast.Load()),
value=ast.BinOp(
left=ast.Name(id='x', ctx=ast.Load()),
op=ast.Mult(),
right=ast.Constant(value=2),
),
generators=[
ast.comprehension(
target=ast.Name(id='x', ctx=ast.Store()),
iter=ast.Name(id='nums', ctx=ast.Load()),
ifs=[],
is_async=0,
)
],
)
analyzer.type_info['nums'] = 'std::vector<int>'
assert analyzer._infer_expression_type(dict_comp) == 'std::map<int, int>'

def test_type_annotation_handling(self):
"""Test handling of Python type annotations."""