# Memory

## Me
DaveO (David). Runs the MJ Hickey Plant Hire portal (portal.hickeyplanthire.co.uk).

## Conventions / Preferences
- **SQL migration & one-off import scripts go in `/migrations` at the repo root — NEVER inside the live `CRUD/` folder structure.** Name them `YYYY-MM-DD_description.sql`. `/migrations` is gitignored, so these stay local-only; run them on the live DB manually via phpMyAdmin.
- This local folder (C:\xampp\htdocs\...) is the dev copy; changes mirror to the live server via git.
- Concise, direct communication preferred.

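The naming rule above can be sketched as a small helper. `migration_filename` is a hypothetical name, not something in the repo, and the slug rule (lowercase, underscores) is an assumption about what `description` should look like.

```python
import re
from datetime import date
from typing import Optional


def migration_filename(description: str, day: Optional[date] = None) -> str:
    """Build a `YYYY-MM-DD_description.sql` name for a file in /migrations.

    Assumption: the description is slugged to lowercase words joined by "_".
    """
    day = day or date.today()
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    return f"{day.isoformat()}_{slug}.sql"
```

For example, `migration_filename("Add GHG table", date(2024, 5, 1))` gives `2024-05-01_add_ghg_table.sql`.
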
## Environment
| Thing | Detail |
|-------|--------|
| Live server | Linux cPanel, `/home/hickeyhub/public_html/portal.hickeyplanthire.co.uk`, user `hickeyhub` |
| Live PHP | `exec()` disabled, `popen()` allowed, no COM; terminal is CageFS-jailed (can't see web-spawned processes) |
| Local dev | Windows XAMPP (this folder) — code must work on BOTH (e.g. no hardcoded Windows paths, guard with `os.name`/`PHP_OS_FAMILY`) |
| Live Python | 3.9 at `/usr/bin/python3`, so **no PEP 604 `str \| None` syntax** in extractor services (use `Optional[str]`); venv with uvicorn/fastapi at `/home/hickeyhub/ocr_app/venv` |
| Tesseract | Live: `/usr/bin/tesseract`; local: `C:\Program Files\Tesseract-OCR\` |

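A minimal sketch of the "works on both" and Python 3.9 rows above. `tesseract_path` is a hypothetical helper, and the `tesseract.exe` filename inside the local install folder is an assumption (the table gives only the directory).

```python
import os
from typing import Optional  # Python 3.9: write Optional[str], not `str | None`

LIVE_TESSERACT = "/usr/bin/tesseract"
LOCAL_TESSERACT_DIR = r"C:\Program Files\Tesseract-OCR"


def tesseract_path(override: Optional[str] = None) -> str:
    """Pick the Tesseract binary for the current OS instead of hardcoding one."""
    if override:
        return override
    if os.name == "nt":
        # Local Windows XAMPP dev box; executable name is assumed.
        return LOCAL_TESSERACT_DIR + "\\tesseract.exe"
    # Live Linux cPanel server.
    return LIVE_TESSERACT
```
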
## Key Systems
| System | Where |
|--------|-------|
| OCR extractors | FastAPI/uvicorn daemons: invoice on 127.0.0.1:8011, POD on 8012; sources in `CRUD/suppliers/supplier_invoices/extractor/`; status page `CRUD/suppliers/supplier_invoices/ocr_status.php` |
| Supplier invoices | `CRUD/suppliers/supplier_invoices/` — upload → run_extract → review → save; duplicate invoice numbers blocked per supplier |
| Invoice review (internal) | `CRUD/invoice_review/` — queue sorted by invoice number (middle segment of `CUST - NUM - PO.pdf`) |
| Hire updates | `CRUD/hire_list/hire_updates/update.php` & `update_pl.php` — additions/buckets use shared "pill" UI, kept in sync |
| GHG emissions | `CRUD/ghg/scope2_index.php` (table `ghg_scope2_electricity`, auto-creates; FY Apr–Mar) |

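Two rules from the table above, sketched in Python for illustration only (the portal itself is PHP). All function names are hypothetical. The `" - "` separator, the numeric-first ordering, and the handling of non-matching filenames are assumptions.

```python
from datetime import date
from pathlib import PurePath
from typing import Optional, Tuple


def invoice_number(filename: str) -> Optional[str]:
    """Return the middle segment of `CUST - NUM - PO.pdf`, or None if no match."""
    stem = PurePath(filename).stem
    parts = [p.strip() for p in stem.split(" - ")]
    if len(parts) != 3 or not parts[1]:
        return None
    return parts[1]


def queue_sort_key(filename: str) -> Tuple[int, int, str]:
    """Sort numeric invoice numbers first, then other numbers, then non-matches."""
    number = invoice_number(filename)
    if number is None:
        return (2, 0, filename)
    if number.isascii() and number.isdigit():
        return (0, int(number), filename)
    return (1, 0, number)


def fy_start_year(day: date) -> int:
    """Financial year runs Apr-Mar, so Jan-Mar belong to the previous year's FY."""
    return day.year if day.month >= 4 else day.year - 1
```

Sorting on the integer value rather than the raw string keeps invoice 9 ahead of invoice 10.
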
## Gotchas
- PHP that shells out to Python for OCR/PDF work must use `findOcrPython()` from `CRUD/functions/ocr_python.php` — NEVER hardcode `/usr/bin/python3` (system Python has no pymupdf/pytesseract; the libs live in `~/ocr_app/venv`).
- PHP session locking: any slow endpoint must call `session_write_close()` after auth, or it hangs the whole site for that user (their other requests queue behind the session lock).
- Long-running daemons spawned from a web request MUST be launched with `setsid` + `</dev/null`; otherwise they pin LiteSpeed connections.
- Browser caching bites: data fetches need `cache:'no-store'` + `?t=` cache-buster; pages restored from bfcache need `pageshow` handlers.

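The daemon gotcha above is about PHP web requests, but the same idea can be sketched in Python in case a Python helper ever launches a service. `spawn_detached` is a hypothetical name. `start_new_session=True` makes the child call `setsid()` (POSIX only), and redirecting stdout/stderr as well as stdin goes beyond what the note requires.

```python
import subprocess
import sys
from typing import List


def spawn_detached(argv: List[str]) -> subprocess.Popen:
    """Start a process in its own session with no inherited stdio."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # equivalent of `</dev/null`
        stdout=subprocess.DEVNULL,  # extra: do not inherit the caller's output
        stderr=subprocess.DEVNULL,
        start_new_session=True,     # equivalent of `setsid` on POSIX
        close_fds=True,
    )


if __name__ == "__main__":
    # Placeholder command; a real call would pass the daemon's argv.
    proc = spawn_detached([sys.executable, "-c", "pass"])
    print("started pid", proc.pid)
```
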