✦ Extraction Engine
How PIE uses Google Gemini to transform raw permit data into structured operational intelligence.
Enrichment Pipeline
Permit Ingestion
Raw permit data enters the system from jurisdiction APIs, manual entry, or AI Auto-Populate. Each permit includes a work description that serves as the primary intelligence input.
Gemini Analysis
The work description is sent to Google Gemini 2.0 Flash with a structured prompt that instructs the model to act as a PIE intelligence analyst. The model analyzes the text to identify:
- Building systems being modified (HVAC, electrical, fire, plumbing)
- Specific asset classes involved (chiller, VFD, generator, fire pump)
- Lifecycle event type (replace, install, repair, upgrade)
- Criticality assessment based on system importance
- Insurance and operational relevance scoring (0-10 scale)
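As a rough illustration of the analysis step, a structured prompt covering the items listed above could be assembled as follows. This is a minimal sketch, not PIE's actual prompt: the function name `build_enrichment_prompt`, the JSON key names, and the two vocabularies (copied from the examples in the list) are assumptions made for illustration.

```python
import json

# Hypothetical vocabularies, taken from the examples in the list above.
SYSTEMS = ["HVAC", "electrical", "fire", "plumbing"]
LIFECYCLE_EVENTS = ["replace", "install", "repair", "upgrade"]


def build_enrichment_prompt(work_description: str) -> str:
    """Assemble a structured analysis prompt for one permit work description."""
    instructions = [
        "You are a PIE intelligence analyst.",
        "Analyze the permit work description and respond with JSON only.",
        f"systems_impacted: subset of {json.dumps(SYSTEMS)}",
        "asset_classes: specific equipment named or implied (e.g. chiller, VFD)",
        f"lifecycle_event: one of {json.dumps(LIFECYCLE_EVENTS)}",
        "criticality: assessment based on system importance",
        "insurance_relevance and operational_relevance: integers from 0 to 10",
    ]
    # The work description goes last so the instructions always come first.
    return "\n".join(instructions) + "\n\nWork description:\n" + work_description.strip()
```

Listing the allowed values in the prompt narrows what the model can return, which makes the downstream JSON easier to validate.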
Structured Output
Gemini returns a JSON object with more than 15 fields, including summary, systems impacted, asset classes, lifecycle event, criticality, compliance domains, relevance scores, probable existing and new assets, and reasoning.
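A minimal sketch of how such a response might be parsed and checked before use. The key names below are assumptions derived from the field descriptions above, and only a subset of the fields is modeled; the real schema may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class Enrichment:
    # Subset of the returned fields; key names are illustrative assumptions.
    summary: str
    systems_impacted: list
    asset_classes: list
    lifecycle_event: str
    criticality: str
    insurance_relevance: int
    operational_relevance: int
    reasoning: str


def parse_enrichment(raw: str) -> Enrichment:
    """Parse the model's JSON text and reject out-of-range relevance scores."""
    data = json.loads(raw)
    for key in ("insurance_relevance", "operational_relevance"):
        score = data[key]
        if not isinstance(score, int) or not 0 <= score <= 10:
            raise ValueError(f"{key} must be an integer from 0 to 10, got {score!r}")
    return Enrichment(
        summary=data["summary"],
        systems_impacted=list(data["systems_impacted"]),
        asset_classes=list(data["asset_classes"]),
        lifecycle_event=data["lifecycle_event"],
        criticality=data["criticality"],
        insurance_relevance=data["insurance_relevance"],
        operational_relevance=data["operational_relevance"],
        reasoning=data["reasoning"],
    )
```

A missing key raises `KeyError` and a malformed score raises `ValueError`, so an incomplete response fails loudly instead of being stored.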
Database Persistence
The enrichment result is stored in the permit_enrichments table, linked back to the normalized permit. The AI confidence score and model version are recorded for audit purposes.
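The persistence step might look roughly like the sketch below, which uses SQLite purely for illustration. Only the table name `permit_enrichments` comes from this page; the column names, the 0 to 1 confidence scale, and the `save_enrichment` helper are assumptions.

```python
import json
import sqlite3


def save_enrichment(conn, permit_id, result, confidence, model_version):
    """Insert one enrichment row linked to a normalized permit; returns the row id."""
    # Hypothetical schema: column names are not taken from the real database.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS permit_enrichments (
               id INTEGER PRIMARY KEY,
               permit_id INTEGER NOT NULL,
               result_json TEXT NOT NULL,
               ai_confidence REAL NOT NULL,
               model_version TEXT NOT NULL,
               review_status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    cur = conn.execute(
        "INSERT INTO permit_enrichments "
        "(permit_id, result_json, ai_confidence, model_version) VALUES (?, ?, ?, ?)",
        (permit_id, json.dumps(result), confidence, model_version),
    )
    conn.commit()
    return cur.lastrowid
```

Storing the model version next to each result lets an auditor tell which model produced a given enrichment after the model is upgraded.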
Human Review
All AI-generated intelligence enters the review queue with a "pending" status. Human analysts approve, reject, or correct the enrichment before it influences downstream insurance signals and operational decisions.
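The review gate can be pictured as a small state machine. In this sketch only the "pending" status is named on this page; the other status names, the `review` and `feeds_downstream` helpers, and the rule that corrected enrichments count as usable are assumptions.

```python
from enum import Enum


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"    # assumed name
    REJECTED = "rejected"    # assumed name
    CORRECTED = "corrected"  # assumed name


# Analyst decisions available from the pending state.
DECISIONS = {
    "approve": ReviewStatus.APPROVED,
    "reject": ReviewStatus.REJECTED,
    "correct": ReviewStatus.CORRECTED,
}


def review(current: ReviewStatus, decision: str) -> ReviewStatus:
    """Apply an analyst decision; only pending enrichments can be reviewed."""
    if current is not ReviewStatus.PENDING:
        raise ValueError(f"already reviewed: {current.value}")
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    return DECISIONS[decision]


def feeds_downstream(status: ReviewStatus) -> bool:
    """Assumption: only approved or corrected enrichments influence signals."""
    return status in (ReviewStatus.APPROVED, ReviewStatus.CORRECTED)
```

Under these assumptions, a pending or rejected enrichment never reaches insurance signals, which matches the intent described above.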
AI Auto-Populate System
How It Works
Every create form includes a "✦ AI Auto-Populate" bar. Users type a natural-language clue, and Gemini generates structured data for all form fields, drawing on the model's general real-world knowledge.
Supported Entity Types
| Entity | Example Clue | Generated Fields |
|---|---|---|
| Properties | "Park Hyatt NYC" | Address, type, year built, sq ft, floors, owner, class |
| Assets | "Trane 300 ton chiller" | Name, type, manufacturer, model, serial #, condition |
| Systems | "Building automation for data center" | Name, type, domain, criticality, install year |
| Permits | "Elevator modernization in SF" | Permit #, address, type, description, valuation |
| Contractors | "Large HVAC contractor in LA" | Company, license #, license type, contact |
| Jurisdictions | "Orange County California" | Name, state, type, data source URL |
| Inspections | "Failed fire sprinkler test" | Type, date, inspector, result, notes |
| Findings | "Low refrigerant charge on CRAC unit" | Title, type, severity, domain, description |
| Evidence | "Generator load bank test certificate" | Title, type, strength, relevance scores |
| Insurance Signals | "Major chiller replacement completed" | Title, signal type, direction, severity, narrative |
| Obligations | "Re-inspect failed CRAC unit by April" | Title, type, domain, priority, due date |
| Lifecycle Events | "25-year-old elevator modernized" | Event type, date, title, summary |
| Locations | "Rooftop mechanical penthouse" | Name, floor, zone, location type |
Technical Details
- Model: Google Gemini 2.0 Flash
- Temperature: 0.3 (low randomness for consistent results)
- Max Tokens: 2,048
- Response Format: Structured JSON with field-level validation
- Latency: Typical response in 1-3 seconds
- API Endpoint: POST /api/ai-populate
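To make these settings concrete, the sketch below builds a request body for the public Gemini REST `generateContent` method using the documented temperature and token limit. Whether PIE calls the REST API directly or goes through an SDK is not stated here, and the prompt wording and the `build_gemini_request` helper are assumptions.

```python
MODEL = "gemini-2.0-flash"
# Public Gemini REST endpoint pattern; PIE's actual integration path is an assumption.
GEMINI_URL = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_gemini_request(entity_type: str, clue: str) -> dict:
    """Build a generateContent body for one auto-populate clue."""
    # Illustrative prompt wording, not the production prompt.
    prompt = (
        f"Generate realistic form data for a {entity_type} record "
        f"based on this clue: {clue}. Respond with a single JSON object."
    )
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "temperature": 0.3,        # low randomness for consistent results
            "maxOutputTokens": 2048,   # matches the documented max tokens
            "responseMimeType": "application/json",
        },
    }
```

For example, `build_gemini_request("Assets", "Trane 300 ton chiller")` yields a body that a server-side handler could send to `GEMINI_URL` along with an API key.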
Confidence Scoring & Human Review
AI Confidence Scores
Every AI-generated data point carries a confidence score, which is recorded alongside the model version for audit purposes.
Review Statuses
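All AI-generated intelligence goes through a human review workflow: each item enters the queue with a "pending" status, and an analyst then approves, rejects, or corrects it.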
Insurance Signal Generation Logic
Signal Derivation Rules
| Trigger Condition | Signal Type | Direction | Typical Severity |
|---|---|---|---|
| New equipment installed (value > $100K) | equipment_upgrade | positive | medium-high |
| Generator/UPS capacity increased | capacity_change | positive | high |
| Inspection result = fail | failed_inspection | negative | high |
| Finding type = code_violation | compliance_gap | negative | high-critical |
| Asset condition = poor, age > 20 years | deferred_maintenance | negative | medium |
| Permit type = Building, value > $1M | major_renovation | positive | medium |
| Emergency/after-hours permit filed | emergency_repair | negative | high |
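The rules in the table can be read as a simple rule function. The following is a minimal sketch: the input dictionary keys and the `derive_signals` name are hypothetical, and rows that list two conditions are assumed to require both.

```python
def derive_signals(event: dict) -> list:
    """Return (signal_type, direction, typical_severity) for every matching rule."""
    signals = []
    if event.get("new_equipment_installed") and event.get("value", 0) > 100_000:
        signals.append(("equipment_upgrade", "positive", "medium-high"))
    if event.get("backup_power_capacity_increased"):
        signals.append(("capacity_change", "positive", "high"))
    if event.get("inspection_result") == "fail":
        signals.append(("failed_inspection", "negative", "high"))
    if event.get("finding_type") == "code_violation":
        signals.append(("compliance_gap", "negative", "high-critical"))
    # Assumption: the comma in the table row means both conditions must hold.
    if event.get("asset_condition") == "poor" and event.get("asset_age_years", 0) > 20:
        signals.append(("deferred_maintenance", "negative", "medium"))
    if event.get("permit_type") == "Building" and event.get("value", 0) > 1_000_000:
        signals.append(("major_renovation", "positive", "medium"))
    if event.get("emergency_permit"):
        signals.append(("emergency_repair", "negative", "high"))
    return signals
```

Because the rules are checked independently, one event can produce several signals, for example a high-value building permit that also installs new equipment.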