✦ Extraction Engine

How PIE uses Google Gemini to transform raw permit data into structured operational intelligence.

Enrichment Pipeline

1

Permit Ingestion

Raw permit data enters the system from jurisdiction APIs, manual entry, or AI auto-populate. Each permit includes a work description that serves as the primary intelligence input.

2

Gemini Analysis

The work description is sent to Google Gemini 2.0 Flash with a structured prompt that instructs the model to act as a PIE intelligence analyst. The model analyzes the text to identify:

  • Building systems being modified (HVAC, electrical, fire, plumbing)
  • Specific asset classes involved (chiller, VFD, generator, fire pump)
  • Lifecycle event type (replace, install, repair, upgrade)
  • Criticality assessment based on system importance
  • Insurance and operational relevance scoring (0-10 scale)

3

Structured Output

Gemini returns a JSON object with 15+ fields including summary, systems impacted, asset classes, lifecycle event, criticality, compliance domains, relevance scores, probable existing and new assets, and reasoning.
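A minimal sketch of how such a response might be parsed and checked before it is stored. The field names below (summary, systems_impacted, insurance_relevance, and so on) are assumptions for illustration; the actual PIE schema has 15+ fields and may name them differently.

```python
import json
from dataclasses import dataclass, fields

# Hypothetical subset of the enrichment schema; the real field names may differ.
REQUIRED_FIELDS = ("summary", "systems_impacted", "asset_classes",
                   "lifecycle_event", "criticality", "reasoning")
SCORE_FIELDS = ("insurance_relevance", "operational_relevance")


@dataclass
class Enrichment:
    summary: str
    systems_impacted: list
    asset_classes: list
    lifecycle_event: str
    criticality: str
    reasoning: str
    insurance_relevance: int = 0    # 0-10 scale
    operational_relevance: int = 0  # 0-10 scale


def parse_enrichment(raw: str) -> Enrichment:
    """Parse the model's JSON text and reject incomplete or out-of-range output."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for name in SCORE_FIELDS:
        score = data.get(name, 0)
        if not isinstance(score, int) or not 0 <= score <= 10:
            raise ValueError(f"{name} must be an integer from 0 to 10")
    # Keep only the fields this sketch models; ignore any extras.
    known = {f.name for f in fields(Enrichment)}
    return Enrichment(**{k: v for k, v in data.items() if k in known})
```

Rejecting malformed output at this point keeps bad records out of the review queue rather than leaving analysts to catch them.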

4

Database Persistence

The enrichment result is stored in the permit_enrichments table, linked back to the normalized permit. The AI confidence score and model version are recorded for audit purposes.
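The persistence step could look roughly like the sketch below. Only the table name permit_enrichments comes from this page; the column names, the use of SQLite, and the save_enrichment helper are assumptions made to keep the example self-contained.

```python
import json
import sqlite3


def save_enrichment(conn, permit_id, enrichment, confidence, model_version):
    """Store one enrichment linked to its permit and return the new row id."""
    # Hypothetical column layout; the real table definition is not shown here.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS permit_enrichments (
               id INTEGER PRIMARY KEY,
               permit_id INTEGER NOT NULL,
               payload TEXT NOT NULL,
               confidence REAL NOT NULL,
               model_version TEXT NOT NULL,
               review_status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    cur = conn.execute(
        "INSERT INTO permit_enrichments "
        "(permit_id, payload, confidence, model_version) VALUES (?, ?, ?, ?)",
        (permit_id, json.dumps(enrichment), confidence, model_version),
    )
    conn.commit()
    return cur.lastrowid
```

Recording the model version alongside the confidence score lets an auditor tell which model produced a given enrichment.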

5

Human Review

All AI-generated intelligence enters the review queue with a "pending" status. Human analysts approve, reject, or correct the enrichment before it influences downstream insurance signals and operational decisions.

AI Auto-Populate System

How It Works

Every create form includes a "✦ AI Auto-Populate" bar. Users type a natural-language clue, and Gemini generates structured values for the form's fields from its general knowledge of the real world.

Supported Entity Types

Entity | Example Clue | Generated Fields
Properties | "Park Hyatt NYC" | Address, type, year built, sq ft, floors, owner, class
Assets | "Trane 300 ton chiller" | Name, type, manufacturer, model, serial #, condition
Systems | "Building automation for data center" | Name, type, domain, criticality, install year
Permits | "Elevator modernization in SF" | Permit #, address, type, description, valuation
Contractors | "Large HVAC contractor in LA" | Company, license #, license type, contact
Jurisdictions | "Orange County California" | Name, state, type, data source URL
Inspections | "Failed fire sprinkler test" | Type, date, inspector, result, notes
Findings | "Low refrigerant charge on CRAC unit" | Title, type, severity, domain, description
Evidence | "Generator load bank test certificate" | Title, type, strength, relevance scores
Insurance Signals | "Major chiller replacement completed" | Title, signal type, direction, severity, narrative
Obligations | "Re-inspect failed CRAC unit by April" | Title, type, domain, priority, due date
Lifecycle Events | "25-year-old elevator modernized" | Event type, date, title, summary
Locations | "Rooftop mechanical penthouse" | Name, floor, zone, location type

Technical Details

  • Model: Google Gemini 2.0 Flash
  • Temperature: 0.3 (low randomness for consistent results)
  • Max Tokens: 2,048
  • Response Format: Structured JSON with field-level validation
  • Latency: Typical response in 1-3 seconds
  • API Endpoint: POST /api/ai-populate
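To make these settings concrete, here is a sketch of a request body that a handler behind POST /api/ai-populate might send to the Gemini REST generateContent method. It assumes the endpoint calls that method directly; the prompt wording and the build_generate_request helper are hypothetical, and the payload accepted by /api/ai-populate itself is not documented here.

```python
GEMINI_MODEL = "gemini-2.0-flash"


def build_generate_request(entity_type: str, clue: str) -> dict:
    """Build a generateContent request body using the settings listed above."""
    # Hypothetical prompt; the real PIE prompt is not shown in this document.
    prompt = (
        f"Generate form field values for a new {entity_type} record "
        f"from this clue: {clue!r}. Respond with a single JSON object."
    )
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "temperature": 0.3,          # low randomness for consistent results
            "maxOutputTokens": 2048,     # cap on response length
            "responseMimeType": "application/json",  # ask for JSON output
        },
    }
```

The body is built as a plain dictionary so it can be inspected or logged before any network call is made.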

Confidence Scoring & Human Review

AI Confidence Scores

Every AI-generated data point carries a confidence score:

70-100%: High confidence. The model reports strong support for the enrichment.
40-69%: Medium confidence. Review recommended.
0-39%: Low confidence. Manual verification required.
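The bands above can be expressed as a small helper. This is a sketch: the confidence_band name is hypothetical, and it assumes that fractional scores fall into the band below the next threshold (so 69.5 counts as medium).

```python
def confidence_band(score: float) -> str:
    """Map a 0-100 confidence score to the band used in the review workflow."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "high"    # 70-100%
    if score >= 40:
        return "medium"  # 40-69%: review recommended
    return "low"         # 0-39%: manual verification required
```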

Review Statuses

All AI-generated intelligence goes through a human review workflow:

pending: Initial AI output
needs_review: Flagged for human attention
approved: Human confirmed
rejected: Human rejected
corrected: Human edited and approved
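One way to enforce this workflow is a small transition table. The five status values come from the list above; which moves are allowed between them is an assumption in this sketch (for example, that approved, rejected, and corrected are final).

```python
from enum import Enum


class ReviewStatus(str, Enum):
    PENDING = "pending"
    NEEDS_REVIEW = "needs_review"
    APPROVED = "approved"
    REJECTED = "rejected"
    CORRECTED = "corrected"


# Assumed transitions: human decisions are treated as terminal states.
_DECISIONS = {ReviewStatus.APPROVED, ReviewStatus.REJECTED, ReviewStatus.CORRECTED}
ALLOWED = {
    ReviewStatus.PENDING: _DECISIONS | {ReviewStatus.NEEDS_REVIEW},
    ReviewStatus.NEEDS_REVIEW: _DECISIONS,
    ReviewStatus.APPROVED: set(),
    ReviewStatus.REJECTED: set(),
    ReviewStatus.CORRECTED: set(),
}


def transition(current: ReviewStatus, new: ReviewStatus) -> ReviewStatus:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {new.value}")
    return new
```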

Insurance Signal Generation Logic

Signal Derivation Rules

Trigger Condition | Signal Type | Direction | Typical Severity
New equipment installed (value > $100K) | equipment_upgrade | positive | medium-high
Generator/UPS capacity increased | capacity_change | positive | high
Inspection result = fail | failed_inspection | negative | high
Finding type = code_violation | compliance_gap | negative | high-critical
Asset condition = poor, age > 20 years | deferred_maintenance | negative | medium
Permit type = Building, value > $1M | major_renovation | positive | medium
Emergency/after-hours permit filed | emergency_repair | negative | high
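The rules in this table can be sketched as a single function over an event record. The signal types, directions, and severities are taken from the table; the event keys (new_equipment_installed, value, asset_age_years, and so on) are hypothetical names chosen for this example, and the sketch assumes one event can trigger several signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    signal_type: str
    direction: str
    typical_severity: str


def derive_signals(event: dict) -> list:
    """Apply each derivation rule in table order and collect matching signals."""
    signals = []
    value = event.get("value", 0)  # dollar value of the permit or equipment
    if event.get("new_equipment_installed") and value > 100_000:
        signals.append(Signal("equipment_upgrade", "positive", "medium-high"))
    if event.get("backup_power_capacity_increased"):  # generator or UPS
        signals.append(Signal("capacity_change", "positive", "high"))
    if event.get("inspection_result") == "fail":
        signals.append(Signal("failed_inspection", "negative", "high"))
    if event.get("finding_type") == "code_violation":
        signals.append(Signal("compliance_gap", "negative", "high-critical"))
    if event.get("asset_condition") == "poor" and event.get("asset_age_years", 0) > 20:
        signals.append(Signal("deferred_maintenance", "negative", "medium"))
    if event.get("permit_type") == "Building" and value > 1_000_000:
        signals.append(Signal("major_renovation", "positive", "medium"))
    if event.get("emergency_permit"):  # emergency or after-hours filing
        signals.append(Signal("emergency_repair", "negative", "high"))
    return signals
```

For example, a Building permit valued at $1.5M with no other flags yields a single major_renovation signal, while a poor-condition asset that is exactly 20 years old yields none because the age rule is strict.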