An AI-powered tool to optimize Amazon Bedrock Data Automation (BDA) blueprint instructions using advanced language models. The optimizer analyzes your extraction instructions and generates improved, more specific prompts to enhance data extraction accuracy.
- Professional AWS Cloudscape Design: Clean, modern interface matching AWS Console styling
- Real-time Monitoring: Live log viewing and status updates during optimization
- Blueprint Integration: Direct integration with AWS BDA to fetch and optimize existing blueprints
- Theme Support: Light/dark mode toggle for better user experience
- Instruction Enhancement: Automatically improves extraction instructions using Claude models
- Context-Aware: Analyzes document content to generate more specific prompts
- Iterative Refinement: Multiple optimization rounds for continuous improvement
- Performance Tracking: Monitors extraction accuracy and suggests improvements
- Blueprint Fetching: Direct integration with AWS Bedrock Data Automation APIs
- Schema Management: Automatic schema generation and validation
- Multi-Region Support: Configurable AWS region settings
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    React UI     │    │     FastAPI     │    │  AI Optimizer   │
│   (Port 3000)   │◄──►│     Backend     │◄──►│    (Claude)     │
│                 │    │   (Port 8000)   │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ AWS Cloudscape  │    │  AWS BDA APIs   │    │  Local Storage  │
│   Components    │    │                 │    │ (Schemas/Logs)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
- AWS Account: Active AWS account with appropriate billing setup
- AWS CLI: Version 2.0+ installed and configured
aws --version
aws configure
aws sts get-caller-identity  # Verify configuration
- Model Access: Request access to Claude models in AWS Bedrock
- Navigate to AWS Bedrock Console → Model Access
- Request access to the following models:
  - anthropic.claude-3-sonnet-20240229-v1:0 (recommended)
  - anthropic.claude-3-haiku-20240307-v1:0 (faster, lower cost)
  - anthropic.claude-3-opus-20240229-v1:0 (highest quality)
- Wait for approval (typically 1-2 business days)
- Verify access:
aws bedrock list-foundation-models --region us-west-2
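The same check can be scripted with boto3 if you prefer. A minimal sketch, assuming default credentials and the us-west-2 region:

import boto3

# List the Anthropic models your account can see in this region
bedrock = boto3.client("bedrock", region_name="us-west-2")
response = bedrock.list_foundation_models(byProvider="Anthropic")
for model in response["modelSummaries"]:
    print(model["modelId"])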
- BDA Access: Ensure your AWS account has access to Bedrock Data Automation
- Project Setup: Create a BDA project with appropriate blueprints
- IAM Permissions: Required permissions for BDA operations:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "bedrock-data-automation:GetDataAutomationProject", "bedrock-data-automation:ListDataAutomationProjects", "bedrock-data-automation:GetBlueprint", "bedrock-data-automation:ListBlueprints", "bedrock-data-automation:CreateBlueprint", "bedrock-data-automation:UpdateBlueprint" ], "Resource": "*" } ] }
- S3 Bucket: Dedicated S3 bucket for document storage and processing
  - Recommended naming: your-org-bda-optimizer-{region}-{account-id}
  - Enable versioning for document history
  - Configure appropriate lifecycle policies
- S3 Permissions: Required IAM permissions:
{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket", "s3:GetBucketLocation" ], "Resource": [ "arn:aws:s3:::your-bda-bucket", "arn:aws:s3:::your-bda-bucket/*" ] } ] }
Create an IAM role or user with the following managed policies:
- AmazonBedrockFullAccess (or custom policy with specific model access)
- AmazonS3FullAccess (or custom policy for your specific bucket)
- Custom policy for BDA operations (see above)
- VPC Configuration: If running in VPC, ensure:
- Internet gateway for external API calls
- NAT gateway for private subnet access
- Security groups allowing HTTPS (443) outbound
- Endpoint Access: Consider VPC endpoints for:
  - S3 (com.amazonaws.region.s3)
  - Bedrock (com.amazonaws.region.bedrock)
  - Bedrock Runtime (com.amazonaws.region.bedrock-runtime)
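If you decide to add the interface endpoints, a hedged boto3 sketch is shown below (the VPC, subnet, and security group IDs are placeholders; the S3 endpoint is usually created as a Gateway endpoint attached to route tables instead):

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# Interface endpoint for Bedrock Runtime (IDs below are placeholders)
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-west-2.bedrock-runtime",
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,
)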
- Python 3.8+
- Node.js 16+ and npm
- AWS CLI configured with appropriate permissions
- AWS Bedrock Data Automation access
- Environment variables configured (see Configuration section)
git clone <repository-url>
cd bda-blueprint-optimizer
pip install -r requirements.txt
cd src/frontend/react
npm install
cd ../../..
Create a .env file in the root directory:
# AWS Configuration
AWS_REGION=us-west-2
ACCOUNT=your-aws-account-id
AWS_MAX_RETRIES=3
AWS_CONNECT_TIMEOUT=500
AWS_READ_TIMEOUT=1000
# Model Configuration
DEFAULT_MODEL=anthropic.claude-3-sonnet-20240229-v1:0
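For reference, these are plain environment variables. A minimal sketch of reading them from Python, assuming python-dotenv is available (the project's actual loader may differ):

import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # loads variables from a nearby .env file

aws_region = os.getenv("AWS_REGION", "us-west-2")
default_model = os.getenv("DEFAULT_MODEL", "anthropic.claude-3-sonnet-20240229-v1:0")
print(aws_region, default_model)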
Start both React frontend and FastAPI backend:
bash run_dev.sh
This will start:
- FastAPI Backend: http://localhost:8000
- React Frontend: http://localhost:3000
- Legacy UI: http://localhost:8000/legacy
Start services individually:
Backend:
python -m uvicorn src.frontend.app:app --host 0.0.0.0 --port 8000 --reload
Frontend:
cd src/frontend/react
npm run dev
Sample document for testing: https://s2.q4cdn.com/299287126/files/doc_financials/2025/ar/2024-Shareholder-Letter-Final.pdf
- Project ARN: Enter your AWS BDA project ARN
- Blueprint ID: Specify the blueprint you want to optimize
- Output Location: S3 location for results
The application now includes a built-in document upload feature:
- Select S3 Bucket: Choose from your available S3 buckets
- Set S3 Prefix: Optionally specify a folder path (e.g., documents/input/)
- Bucket Validation: Automatic validation of read/write permissions
- File Upload: Drag and drop or select files up to 100MB
- Supported Formats: PDF, DOC, DOCX, TXT, PNG, JPG, JPEG, TIFF
- Auto-Configuration: Uploaded document S3 URI is automatically set in configuration
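If you prefer to stage a document in S3 yourself instead of using the upload form, a boto3 sketch (bucket name and key are placeholders):

import boto3

s3 = boto3.client("s3")
bucket = "your-org-bda-optimizer-us-west-2-123456789012"  # placeholder
key = "documents/input/2024-Shareholder-Letter-Final.pdf"

s3.upload_file("2024-Shareholder-Letter-Final.pdf", bucket, key)
print(f"Document available at s3://{bucket}/{key}")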
Click "Fetch Blueprint" to download the current blueprint schema from AWS BDA. This populates the instructions table with existing extraction fields.
Fill in the "Expected Output" column with sample values for each field. This helps the AI understand what you're trying to extract.
- Threshold: Similarity threshold for optimization (0.0-1.0)
- Max Iterations: Maximum number of optimization rounds
- Model: Claude model to use for optimization
- Use Document Strategy: Whether to analyze document content
- Clean Logs: Clear previous run logs
Click "Run Optimizer" to start the AI optimization process. Monitor progress in real-time through:
- Status Indicator: Shows current optimization state
- Live Logs: Real-time log output with auto-refresh
- Progress Tracking: Iteration progress and performance metrics
Once complete, the "Final Schema" section displays the optimized blueprint with improved instructions.
- Connects directly to AWS Bedrock Data Automation APIs
- Downloads existing blueprint schemas
- Auto-populates configuration fields
- Supports multiple project stages (LIVE, DEVELOPMENT)
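The same blueprint schema can also be fetched directly with boto3 if needed. A hedged sketch (the ARN is illustrative, and the assumption is that the schema field comes back as a JSON string):

import boto3
import json

bda = boto3.client("bedrock-data-automation", region_name="us-west-2")

# Blueprint ARN below is illustrative; use the one from your BDA project
response = bda.get_blueprint(
    blueprintArn="arn:aws:bedrock:us-west-2:123456789012:blueprint/your-blueprint-id",
    blueprintStage="LIVE",
)
schema = json.loads(response["blueprint"]["schema"])  # assumed response shape
print(json.dumps(schema, indent=2))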
- Analysis: AI analyzes your current instructions and expected outputs
- Enhancement: Generates more specific, context-aware prompts
- Validation: Tests improved instructions against sample data
- Iteration: Refines instructions through multiple rounds
- Finalization: Produces optimized schema ready for deployment
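Conceptually, the loop looks something like the sketch below. This is illustrative pseudocode only, with hypothetical helpers enhance_with_claude and score_extraction; it is not the project's actual implementation:

def optimize_instructions(fields, document_text, threshold=0.8, max_iterations=5):
    """Illustrative sketch of the optimization loop, not the project's actual code."""
    best = fields
    for _ in range(max_iterations):
        # Enhancement: ask the model for a more specific prompt per field
        improved = {
            name: enhance_with_claude(instruction, document_text)  # hypothetical helper
            for name, instruction in best.items()
        }
        # Validation: compare extracted values against the expected outputs
        score = score_extraction(improved, document_text)          # hypothetical helper
        best = improved
        if score >= threshold:
            break  # similarity threshold reached
    return best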
- Live Status Updates: Automatic status polling every 2 seconds
- Log Streaming: Real-time log viewing with 1-second refresh
- Progress Indicators: Visual feedback on optimization progress
- Error Handling: Clear error messages and troubleshooting guidance
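The status endpoint can also be polled from a script. A minimal sketch (the response field shown is an assumption, not the documented contract):

import time
import requests

BASE_URL = "http://localhost:8000"

while True:
    status = requests.get(f"{BASE_URL}/api/optimizer-status").json()
    print(status)
    if not status.get("running", False):  # hypothetical field name
        break
    time.sleep(2)  # matches the UI's 2-second polling interval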
bda-blueprint-optimizer/
├── src/
│   ├── frontend/
│   │   ├── react/                # Modern React UI
│   │   │   ├── src/
│   │   │   │   ├── components/   # React components
│   │   │   │   ├── contexts/     # State management
│   │   │   │   └── services/     # API services
│   │   │   └── package.json
│   │   ├── app.py                # FastAPI backend
│   │   └── templates/            # Legacy UI templates
│   ├── aws_clients.py            # AWS API integration
│   └── ...                       # Core optimization logic
├── output/                       # Generated schemas and results
├── logs/                         # Optimization logs
├── requirements.txt              # Python dependencies
├── run_dev.sh                    # Development startup script
└── README.md
- POST /api/update-config - Update optimization configuration
- POST /api/fetch-blueprint - Fetch blueprint from AWS BDA
- POST /api/upload-document - Upload document to S3
- GET /api/list-s3-buckets - List available S3 buckets
- POST /api/validate-s3-access - Validate S3 bucket access and permissions
- POST /api/run-optimizer - Start optimization process
- GET /api/optimizer-status - Check optimization status
- POST /api/stop-optimizer - Stop running optimization
- GET /api/final-schema - Get optimized schema
- GET /api/list-logs - List available log files
- GET /api/view-log/{log_file} - View specific log file
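As a quick usage example, the two log endpoints can be called from a script (response shapes are assumptions; the log file name should come from the list-logs response):

import requests

BASE_URL = "http://localhost:8000"

# List available log files (response shape is an assumption)
logs = requests.get(f"{BASE_URL}/api/list-logs").json()
print(logs)

# View a specific log file; replace the name with one returned above
log_text = requests.get(f"{BASE_URL}/api/view-log/optimizer.log")
print(log_text.text)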
CORS Errors
- Ensure FastAPI backend is running on port 8000
- Check that CORS middleware is properly configured
AWS Authentication
- Verify AWS CLI is configured: aws sts get-caller-identity
- Ensure proper IAM permissions for Bedrock Data Automation
- Check region configuration matches your project ARN
Blueprint Fetching Fails
- Verify project ARN and blueprint ID are correct
- Ensure AWS region matches the project region
- Check IAM permissions for bedrock-data-automation:GetDataAutomationProject
Document Upload Issues
- Verify S3 bucket exists and is accessible
- Check IAM permissions for S3 operations (GetObject, PutObject, ListBucket)
- Ensure file size is under 100MB limit
- Verify supported file formats: PDF, DOC, DOCX, TXT, PNG, JPG, JPEG, TIFF
S3 Access Validation Fails
- Check bucket permissions and policies
- Verify AWS credentials have S3 access
- Ensure bucket is in the same region as your AWS profile
- Check for bucket-level restrictions or VPC endpoint configurations
Optimization Hangs
- Check Claude model availability in your region
- Verify sufficient AWS Bedrock quotas
- Monitor logs for specific error messages
- Use "Auto-Refresh" toggle for real-time log monitoring
- Check the logs/ directory for detailed optimization traces
- Review output/schemas/ for generated schema files
- File Path Validation: All file operations are restricted to project subdirectories
- S3 Access Control: Bucket validation ensures proper read/write permissions
- Input Sanitization: File names and paths are validated to prevent directory traversal
- Size Limits: File uploads are limited to 100MB to prevent resource exhaustion
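As an illustration of the path-validation idea (not the project's actual code), a minimal sketch that confines file access to known subdirectories:

from pathlib import Path

PROJECT_ROOT = Path(__file__).resolve().parent
ALLOWED_DIRS = [PROJECT_ROOT / "logs", PROJECT_ROOT / "output"]

def is_safe_path(requested: str) -> bool:
    """Reject any path that resolves outside the allowed project subdirectories."""
    resolved = Path(requested).resolve()
    for allowed in ALLOWED_DIRS:
        try:
            resolved.relative_to(allowed)  # raises ValueError if outside `allowed`
            return True
        except ValueError:
            continue
    return False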
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Make your changes
- Test thoroughly with both UIs
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Check the troubleshooting section above
- Review log files for specific error messages
- Ensure all prerequisites are properly configured
- Contact the development team for additional support