xobuilder-extractor

hancrypto · 10k · 3.1.4

A sophisticated web application that converts HTML and CSS to JSON layout through an intelligent multi-step flow process with AI validation.

XOBuilder Extractor

Flow Diagram

🔥 Overview

XOBuilder Extractor converts HTML and CSS code into structured JSON layouts through a visual, multi-step processing flow. It uses ReactFlow to render the flow interactively and integrates with OpenAI to convert and validate content.

🏗️ Architecture

The project follows Clean Architecture principles with clear separation of concerns:

src/
├── application/          # Business logic and use cases
│   ├── dto/             # Data Transfer Objects
│   └── use-case/        # Application use cases
├── domains/             # Core business entities and models
│   └── entities/        # Domain entities
├── utils/               # Utility functions and helpers
├── constants/           # Application constants
└── index.ts            # Main entry point and exported functions
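
As a rough illustration of how these layers might relate, the sketch below wires a DTO, a domain entity, and a use case together. Every name in it (`ConvertLayoutDto`, `LayoutNode`, `LayoutParser`, `ConvertLayoutUseCase`) is hypothetical and not taken from the package source; it only shows the dependency direction that Clean Architecture implies.

```typescript
// application/dto: plain data crossing the application boundary (hypothetical shape).
interface ConvertLayoutDto {
  html: string;
  css: string;
}

// domains/entities: a core model with no framework dependencies (hypothetical shape).
interface LayoutNode {
  tag: string;
  children: LayoutNode[];
}

// The use case depends on this abstraction, not on a concrete parser.
interface LayoutParser {
  parse(html: string, css: string): LayoutNode;
}

// application/use-case: validates input and delegates to the injected parser.
class ConvertLayoutUseCase {
  private readonly parser: LayoutParser;

  constructor(parser: LayoutParser) {
    this.parser = parser;
  }

  execute(dto: ConvertLayoutDto): LayoutNode {
    if (dto.html.trim() === "") {
      throw new Error("html must not be empty");
    }
    return this.parser.parse(dto.html, dto.css);
  }
}

// A stub parser stands in for real HTML/CSS processing.
const stubParser: LayoutParser = {
  parse: () => ({ tag: "div", children: [] }),
};

const layout = new ConvertLayoutUseCase(stubParser).execute({
  html: "<div></div>",
  css: "",
});
```

Because the use case only sees the `LayoutParser` interface, a real parser can replace the stub without touching the use case.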

🎯 Core Features

📊 Visual Flow Processing

  • Interactive Flow Visualization: Built with ReactFlow for intuitive process monitoring
  • Real-time Status Updates: Visual feedback with animated borders and progress indicators
  • Step-by-Step Execution: Clear progression through processing stages

🤖 AI-Powered Conversion

  • OpenAI Integration: Intelligent HTML/CSS to JSON conversion
  • Validation Loop: Automatic comparison and re-generation for accuracy
  • Smart Retry Logic: Up to 5 automatic retries, each using feedback from the previous attempt

🎨 Advanced UI States

  • Processing State: Animated running border with real-time counter
  • Success State: Green success border indicating completion
  • Failure State: Red error border for failed operations
  • Connection Types:
    • Solid arrows for step-to-step flow
    • Dashed arrows for AI agent connections

🔄 Processing Flow

The application processes each input in four steps:

Step 1: Data Extraction 📥

Purpose: Extract and prepare source materials

  • Input: JSON file containing array of objects
  • Processing: Extract HTML, CSS, and screenshot data
  • Output: Structured data ready for conversion
  • Status: Foundation step - must complete successfully
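
The extraction step could look roughly like the sketch below. The README does not document the input schema, so the field names `html`, `css`, and `screenshot`, the `ExtractedSection` type, and the `extractSections` helper are all assumptions made for illustration.

```typescript
// Hypothetical shape of one extracted item; the real field names may differ.
interface ExtractedSection {
  html: string;
  css: string;
  screenshot: string; // assumed to be a data URL or file path
}

// Parse the JSON text and keep only entries that carry all three fields as strings.
function extractSections(jsonText: string): ExtractedSection[] {
  const parsed: unknown = JSON.parse(jsonText);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of objects");
  }
  const sections: ExtractedSection[] = [];
  for (const item of parsed) {
    if (typeof item !== "object" || item === null) continue;
    const { html, css, screenshot } = item as Record<string, unknown>;
    if (
      typeof html === "string" &&
      typeof css === "string" &&
      typeof screenshot === "string"
    ) {
      sections.push({ html, css, screenshot });
    }
  }
  return sections;
}

// Two entries: the second lacks a screenshot and is skipped.
const sample = JSON.stringify([
  { html: "<p>Hi</p>", css: "p { color: red; }", screenshot: "data:image/png;base64,AAAA" },
  { html: "<p>No screenshot</p>", css: "" },
]);
const sections = extractSections(sample);
```

Skipping incomplete entries is one possible policy; since this step must complete successfully, a stricter variant could throw on the first malformed entry instead.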

Step 2: AI Conversion 🤖

Purpose: Transform HTML/CSS to raw JSON layout

  • Input: HTML and CSS from Step 1
  • AI Agent: OpenAI integration (dashed connection)
  • Processing: Conversion performed by the OpenAI model
  • Output: Raw JSON layout structure
  • Retry Logic: Supports up to 5 regeneration attempts

Step 3: Validation & Comparison 🔍

Purpose: Ensure conversion accuracy through visual comparison

  • Input: Raw JSON from Step 2 + Screenshot from Step 1
  • Processing:
    • Render JSON layout as image
    • Compare with original screenshot
    • Calculate similarity score
  • Decision Logic:
    • Similar: Proceed to Step 4
    • Different: Return to Step 2 (max 5 times)
  • Smart Retry: Iterative improvement with feedback
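
The README does not say how the similarity score is computed. One simple possibility, sketched here, is the fraction of matching pixels between two images that have already been decoded into equally sized RGBA byte buffers. The function names, the per-channel `tolerance`, and the `SIMILARITY_THRESHOLD` value of 0.95 are assumptions; only the limit of 5 retries comes from the text above.

```typescript
// Fraction of pixels whose four RGBA channels all differ by at most `tolerance`.
function similarityScore(a: Uint8Array, b: Uint8Array, tolerance = 8): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("Images must be RGBA buffers of equal size");
  }
  const pixelCount = a.length / 4;
  if (pixelCount === 0) return 1;
  let matching = 0;
  for (let i = 0; i < a.length; i += 4) {
    let same = true;
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > tolerance) {
        same = false;
        break;
      }
    }
    if (same) matching++;
  }
  return matching / pixelCount;
}

type NextStep = "finalize-success" | "retry-conversion" | "finalize-failure";

const SIMILARITY_THRESHOLD = 0.95; // hypothetical value
const MAX_RETRIES = 5; // retry limit stated in the README

// Decision logic: similar -> Step 4, different -> back to Step 2 until retries run out.
function decideNextStep(score: number, retriesUsed: number): NextStep {
  if (score >= SIMILARITY_THRESHOLD) return "finalize-success";
  return retriesUsed < MAX_RETRIES ? "retry-conversion" : "finalize-failure";
}
```

A pixel-match ratio is easy to reason about but sensitive to anti-aliasing and small offsets; a perceptual metric may suit real screenshots better.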

Step 4: Finalization ✅

Purpose: Complete the conversion process

  • Success Path: From Step 3 validation success
  • Failure Path: When Step 3 reaches maximum retry limit
  • Output: Final result status and processed data
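
Steps 2 through 4 can be sketched as a single loop. This is a minimal illustration rather than the package's implementation: `runFlow`, `FlowDeps`, and `FlowResult` are hypothetical names, the AI call and the visual comparison are injected as functions, and "max 5 times" is read as one initial conversion plus up to 5 regenerations.

```typescript
type FinalStatus = "success" | "failure";

interface FlowResult<T> {
  status: FinalStatus;
  layout: T | null;
  conversions: number; // how many times Step 2 ran
}

interface FlowDeps<T> {
  // Step 2: produce a layout, optionally guided by feedback from the last validation.
  convert(feedback: string | null): Promise<T>;
  // Step 3: compare the rendered layout with the screenshot.
  validate(layout: T): Promise<{ similar: boolean; feedback: string }>;
}

const MAX_REGENERATIONS = 5; // retry limit stated in the README

async function runFlow<T>(deps: FlowDeps<T>): Promise<FlowResult<T>> {
  let feedback: string | null = null;
  let layout: T | null = null;
  const maxConversions = MAX_REGENERATIONS + 1; // assumption: initial attempt + 5 retries
  for (let conversions = 1; conversions <= maxConversions; conversions++) {
    layout = await deps.convert(feedback);
    const verdict = await deps.validate(layout);
    if (verdict.similar) {
      // Step 4, success path.
      return { status: "success", layout, conversions };
    }
    feedback = verdict.feedback;
  }
  // Step 4, failure path: the retry limit was reached.
  return { status: "failure", layout, conversions: maxConversions };
}
```

Injecting `convert` and `validate` keeps the loop testable without network access or image rendering.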

🎨 Visual Flow Design

Connection Types

  • Step Flow: Step 1 ──→ Step 2 ──→ Step 3 ──→ Step 4
  • AI Integration: Step 2 ┈┈┈→ OpenAI Agent
  • Retry Loop: Step 3 ──→ Step 2 (conditional, max 5 times)
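
These connections could be described as edge objects in the shape ReactFlow expects. The sketch declares a small local `FlowEdge` type that mirrors a subset of ReactFlow's edge fields instead of importing the library, so it runs on its own. The node ids and the dash pattern are hypothetical.

```typescript
// Local stand-in for the subset of ReactFlow edge fields used here.
interface FlowEdge {
  id: string;
  source: string;
  target: string;
  label?: string;
  style?: { strokeDasharray?: string };
}

const DASHED = { strokeDasharray: "5 5" }; // hypothetical dash pattern

const edges: FlowEdge[] = [
  // Solid arrows: step-to-step flow.
  { id: "e1-2", source: "step-1", target: "step-2" },
  { id: "e2-3", source: "step-2", target: "step-3" },
  { id: "e3-4", source: "step-3", target: "step-4" },
  // Dashed arrow: Step 2 calls the AI agent.
  { id: "e2-ai", source: "step-2", target: "openai-agent", style: DASHED },
  // Conditional retry: Step 3 sends the flow back to Step 2.
  { id: "e3-2", source: "step-3", target: "step-2", label: "retry (max 5)" },
];

// Collect the ids of all dashed edges, e.g. to style or filter them together.
const dashedEdgeIds = edges
  .filter((edge) => edge.style?.strokeDasharray !== undefined)
  .map((edge) => edge.id);
```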

Visual States

| State | Visual Indicator | Description |
|---|---|---|
| Processing | 🔄 Animated border + timer | Step currently executing |
| Success | ✅ Green border | Step completed successfully |
| Failure | ❌ Red border | Step failed to complete |
| Pending | ⚪ Default border | Step waiting to execute |
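
The four states map naturally onto a string union with a lookup table. In the sketch below, the `StepStatus` type, the `borderClass` names, and the `describeStep` helper are hypothetical; only the states and icons come from the table above.

```typescript
type StepStatus = "pending" | "processing" | "success" | "failure";

interface StepVisual {
  icon: string;
  borderClass: string; // hypothetical CSS class name
  animated: boolean;
}

// Record<StepStatus, ...> makes the compiler flag any state left unmapped.
const STEP_VISUALS: Record<StepStatus, StepVisual> = {
  pending: { icon: "⚪", borderClass: "border-default", animated: false },
  processing: { icon: "🔄", borderClass: "border-running", animated: true },
  success: { icon: "✅", borderClass: "border-success", animated: false },
  failure: { icon: "❌", borderClass: "border-failure", animated: false },
};

function describeStep(name: string, status: StepStatus): string {
  const visual = STEP_VISUALS[status];
  return `${visual.icon} ${name} [${visual.borderClass}]`;
}
```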

🛠️ Technology Stack

Core Dependencies

  • Build System: Vite + TypeScript
  • DOM Processing: jsdom, parse5
  • CSS Processing: css, csso
  • Utilities: lodash
  • Runtime: Node.js with TypeScript support

Development Tools

  • Linting: ESLint with TypeScript support
  • Type Checking: TypeScript 5.7+
  • Package Manager: pnpm
  • Build Tool: Rollup with TypeScript plugin

📦 API Reference

Core Functions

parseHtml(html: string, css: string)

Parses HTML and CSS into structured format for processing.

const result = parseHtml(htmlContent, cssContent);

getEntities({ html, raw }: GetEntitiesProps)

Fills element attributes using parsed HTML and raw JSON data.

const entities = getEntities({
  html: htmlString,
  raw: sectionRawData,
});

getStyles(input: GetStylesInput)

Extracts and processes CSS styles from input data.

const styles = getStyles(styleInput);

getStylesAEP(type: string, cssText: string)

Analyzes element properties from CSS text.

const analyzedStyles = getStylesAEP("div", cssText);

getEEP(data: RawDataEEPV2[])

Expands element properties from raw data.

const expandedProperties = getEEP(rawDataArray);

🚀 Getting Started

Prerequisites

  • Node.js 18+
  • pnpm (recommended) or npm

Installation

# Clone the repository
git clone <repository-url>
cd xobuilder-extractor

# Install dependencies
pnpm install

Development

# Start development server
pnpm dev

# Build for production
pnpm build

# Run production build
pnpm start

🔧 Configuration

Build Configuration

The project uses Vite for modern, fast builds with TypeScript support. Configuration can be found in:

  • vite.config.ts - Build configuration
  • tsconfig.json - TypeScript settings
  • eslint.config.js - Code quality rules

Environment Setup

Ensure your environment supports:

  • ES2020+ features
  • TypeScript 5.7+
  • Modern DOM APIs through jsdom

📈 Performance & Scalability

Optimization Features

  • Tree Shaking: Eliminates unused code in production builds
  • Code Splitting: Modular architecture supports lazy loading
  • Type Safety: Full TypeScript coverage catches many errors at compile time
  • Efficient Parsing: Optimized HTML/CSS processing with specialized libraries

Scalability Considerations

  • Clean Architecture: Easy to extend with new processing steps
  • Modular Design: Independent use cases and entities
  • Retry Logic: Robust error handling and recovery
  • AI Integration: Pluggable AI providers through abstraction layer
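
A pluggable provider layer might look like the sketch below. The `AiProvider` interface, the registry functions, and the `echo` provider are hypothetical; the README only states that providers sit behind an abstraction layer and does not describe its shape.

```typescript
// Hypothetical provider contract: takes HTML and CSS, resolves to JSON layout text.
interface AiProvider {
  readonly name: string;
  convert(html: string, css: string): Promise<string>;
}

const providers = new Map<string, AiProvider>();

function registerProvider(provider: AiProvider): void {
  providers.set(provider.name, provider);
}

function getProvider(name: string): AiProvider {
  const provider = providers.get(name);
  if (!provider) {
    throw new Error(`Unknown AI provider: ${name}`);
  }
  return provider;
}

// Offline stand-in for tests; an OpenAI-backed provider would implement the same interface.
const echoProvider: AiProvider = {
  name: "echo",
  convert: async (html, css) =>
    JSON.stringify({ htmlLength: html.length, cssLength: css.length }),
};

registerProvider(echoProvider);
```

Callers look providers up by name, so swapping the model vendor becomes a registration change rather than an edit to the conversion step.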

🐛 Troubleshooting

Common Issues

AI Conversion Failures

  • Cause: Network issues or API limitations
  • Solution: Check API keys and network connectivity
  • Mitigation: Automatic retry logic handles temporary failures

Visual Comparison Mismatches

  • Cause: Rendering differences or screenshot quality
  • Solution: Adjust similarity threshold in validation logic
  • Monitoring: Track retry patterns for optimization

Memory Usage

  • Cause: Large HTML/CSS files or many concurrent processes
  • Solution: Implement chunking for large files
  • Prevention: Monitor and limit concurrent processing
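
One way to limit concurrent processing is a small worker-pool helper, sketched here. `mapWithConcurrency` is a hypothetical utility, not part of the package; it runs at most `limit` tasks at a time and returns results in input order.

```typescript
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item, shared by all runners

  // Each runner claims items one at a time until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

For example, `mapWithConcurrency(sections, 2, convertSection)` would process a list of sections two at a time, where `sections` and `convertSection` are placeholders for the caller's own data and conversion function.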

🤝 Contributing

Development Workflow

  1. Create feature branch from main
  2. Implement changes following Clean Architecture
  3. Add/update tests for new functionality
  4. Ensure TypeScript compliance
  5. Submit pull request with detailed description

Code Standards

  • Follow existing architectural patterns
  • Maintain type safety throughout
  • Document complex business logic
  • Use meaningful variable and function names

📄 License

This project is licensed under the terms specified in the package.json file.


Built with ❤️ by HaiUTC

For questions or support, please refer to the project documentation or create an issue in the repository.