RFQ Aurora DB — Full Data Audit

Cluster rfq-aurora-prod · Aurora PostgreSQL 17.7 Serverless v2 · DB rfq · 40 tables / 532 columns · generated 2026-07-02

3.60M

buyer_records (total buyers)

99.9% loaded in June (one-shot) · all status=new

43,040

buyers with an email

1.2% of all · verified 1,827 (0.05%)

32.6%

buyer HS-code fill rate

all AI-derived · rfqs HS = 0%

$4,635

agent_runs total spend

failed runs: classifier 83% · match 84%

Scale distribution

Table size (Top 8, ~28 GB total)

Buyer country distribution (Top 12)

Field fill rates — how complete is the data

buyer_records key-column fill rate (n=3.60M)

Email acquisition funnel — reachability

Of 3.6M buyers, only 43,040 (1.2%) have an email and only 1,827 pass verification. The actionable outreach base is extremely thin.
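The funnel percentages can be re-derived from the three counts quoted above. This is a minimal sketch: the variable names are illustrative, and the 3.60M total is the rounded figure from the audit, not an exact row count.

```python
# Counts taken from the audit text; TOTAL_BUYERS is the rounded 3.60M figure.
TOTAL_BUYERS = 3_600_000
WITH_EMAIL = 43_040
VERIFIED = 1_827


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


email_pct = share(WITH_EMAIL, TOTAL_BUYERS)          # buyers with an email
verified_pct = share(VERIFIED, TOTAL_BUYERS)         # buyers with a verified email
verified_of_email_pct = share(VERIFIED, WITH_EMAIL)  # verified among those with an email

print(f"with email: {email_pct:.1f}% of buyers")
print(f"verified: {verified_pct:.2f}% of buyers")
print(f"verified among buyers with an email: {verified_of_email_pct:.1f}%")
```

The third ratio is not stated in the audit; it follows directly from the two counts.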

HS-code & email quality detail

HS-code source distribution (buyer_records)

Authoritative sources (customs DB etc.) = 0. The filled 32.6% is entirely AI-derived (estimated); the box below explains exactly what that means. In rfqs, hs_codes and product_category are 100% empty.

Email verification result distribution

85% of enrichment email_verification values are null (unverified). Of the 8.4k rows in the separate email_verifications table, 4,541 are valid (ok) and 3,914 are invalid or unknown.

What "AI derived (estimated)" HS codes actually means

hs_source = 'derived' does not mean the HS code came from any trade, invoice, or customs record. It is produced by a two-step guess and only reaches 2-digit chapter granularity:

① AI classification

An LLM reads the company name / brief and assigns industry + category (e.g. beauty / skincare). Any misclassification propagates downstream.

→

② Static lookup

A fixed industry_taxonomy table maps that industry to an HS chapter — e.g. beauty→33, home_appliances→84,85, food_beverage→02–22. No product-level detail.

③ 2-digit guess

Result is a coarse chapter, not a real 6–10 digit HS code. 618,683 rows match the taxonomy map exactly — proof it is a pure lookup, not evidence-based.

Granularity: chapter-level (2 digits) only. Customs / tariff classification needs 6–10 digits.
Trust: inherits the AI classifier's error rate; no product, invoice, or shipment evidence backs it.
Usable for: rough segmentation / filtering. Not usable for customs declarations, duty calc, or compliance.
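The lookup step described above can be sketched as follows. The dictionary is a hypothetical stand-in for the industry_taxonomy table and contains only the two mappings the audit quotes in full; the function names are illustrative, not the pipeline's actual code.

```python
# Hypothetical stand-in for the industry_taxonomy table.
# Only mappings quoted in full by the audit are included.
INDUSTRY_TO_CHAPTERS = {
    "beauty": ["33"],
    "home_appliances": ["84", "85"],
}


def derive_hs_chapters(industry):
    """Static lookup from an AI-assigned industry to 2-digit HS chapters."""
    if industry is None or industry == "unknown":
        return []  # no industry, so nothing can be derived
    return list(INDUSTRY_TO_CHAPTERS.get(industry, []))


def is_tariff_grade(code: str) -> bool:
    """Customs / tariff classification needs 6 to 10 digits, per the audit."""
    return code.isdigit() and 6 <= len(code) <= 10


chapters = derive_hs_chapters("home_appliances")
print(chapters, [is_tariff_grade(c) for c in chapters])
```

Because the output depends only on the industry label, every record with the same label receives the same chapters, and no derived value can pass the 6 to 10 digit check.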

Deep dive — why HS-code analysis is not fit for purpose

Chapters per record — ambiguity (n=1.17M with HS)

Only 15.4% resolve to a single chapter. 41% span 5–16 chapters. food_beverage records all get the same 16-chapter blob (02–22) = "somewhere in food" — analytically worthless.

Usability tiers across all 3.60M buyers

Only 181k (5.0%) of all buyers have a clean single-chapter HS — and even those are chapter-level, not a real 6–10 digit code.
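One way to reproduce this kind of tiering is to bucket each record by how many chapters it carries. The sketch below is an assumption about how the buckets could be drawn (none, single, 2 to 4, 5 or more), based on the thresholds quoted in this section; the sample records are made up for illustration.

```python
from collections import Counter


def usability_tier(chapters):
    """Bucket a record by the number of distinct HS chapters it carries."""
    n = len(set(chapters or []))
    if n == 0:
        return "no_hs"
    if n == 1:
        return "single_chapter"
    if n < 5:
        return "2_to_4_chapters"
    return "5_plus_chapters"


# Made-up sample records, for illustration only.
sample = [
    [],                                   # industry unknown, nothing derived
    ["33"],                               # beauty
    ["84", "85"],                         # home_appliances
    [f"{c:02d}" for c in range(2, 18)],   # placeholder 16-chapter blob
]

tiers = Counter(usability_tier(record) for record in sample)
print(tiers)
```

The chapter numbers in the 16-chapter example are placeholders; the audit does not list which 16 chapters within 02–22 are assigned.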

Five quantified reasons it fails

Root cause

Coverage capped by classification, not HS logic

2,322,715 buyers (64.5%) have industry = unknown / null, so no HS can be derived at all. The limiter is the upstream classifier, not the HS mapping.

Granularity

2-digit chapter only — never a usable code

100% of 6.73M stored values are 2 digits (61 distinct chapters). Customs / tariff / duty needs HS6–HSK10. Zero records qualify.

Breadth

Most "codes" are broad blobs

481,427 HS-filled records (41%) carry 5+ chapters; 276k food_beverage rows each carry 16. A 16-chapter "code" cannot discriminate products.

Trust

Inherits low classifier confidence

~758k classifications sit at confidence ≤0.35, yet HS is derived from them unconditionally — no confidence gate.
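A confidence gate of the kind that is missing could look like the sketch below. The 0.35 cutoff reuses the value the audit flags as low; whether that is the right threshold is an assumption, and the function and taxonomy names are hypothetical.

```python
LOW_CONFIDENCE = 0.35  # the audit flags classifications at or below this value


def derive_with_gate(industry, confidence, taxonomy, threshold=LOW_CONFIDENCE):
    """Derive HS chapters only when classifier confidence exceeds the threshold."""
    if confidence is None or confidence <= threshold:
        return []  # too uncertain: derive nothing rather than guess
    return list(taxonomy.get(industry, []))


# Hypothetical taxonomy with the two mappings quoted in full by the audit.
taxonomy = {"beauty": ["33"], "home_appliances": ["84", "85"]}

print(derive_with_gate("beauty", 0.90, taxonomy))
print(derive_with_gate("beauty", 0.35, taxonomy))
```

A gate like this trades coverage for trust: the HS fill rate would drop, but the remaining values would no longer rest on the weakest classifications.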

Pipeline gap

rfqs never got HS at all

Of 257k rfqs, 52,493 already have an industry and 60,815 a beauty flag — yet hs_codes fill = 0. The derivation step simply never ran on the rfqs side.

Nuance

Still OK for one thing: beauty vs non-beauty

The core vertical is clean — 134,438 beauty records map to the single chapter 33. Chapter-level HS is adequate for coarse beauty/non-beauty segmentation, but not for cross-industry matching or trade use.

Pipeline outputs

agent_runs: type × status (2.37M total)

Of 1.74M failures, most have error=null, so it is unclear whether "failed" means a real error or a filtered-out verdict (a schema issue).
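Until the schema distinguishes the two cases, failed runs can at least be split by whether an error was recorded. The sketch below assumes agent_runs rows expose type, status, and error fields as the audit implies; the rows and the "completed" status value are made up for illustration.

```python
from collections import Counter


def failure_kind(run):
    """Label one agent_runs row by whether its failure carries an error."""
    if run.get("status") != "failed":
        return "not_failed"
    if run.get("error"):
        return "failed_with_error"
    # No error recorded: could be a real error or a filtered-out verdict.
    return "failed_without_error"


# Made-up rows; "completed" is a hypothetical non-failed status value.
runs = [
    {"type": "classifier", "status": "failed", "error": None},
    {"type": "match_haiku", "status": "failed", "error": "timeout"},
    {"type": "classifier", "status": "completed", "error": None},
]

kinds = Counter((run["type"], failure_kind(run)) for run in runs)
print(kinds)
```

The failed_without_error bucket is the one that needs a schema decision, for example a separate status for filtered-out verdicts.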

buyer_supplier_matches: score distribution (943k total)

850k high-score (80%+) matches exist, yet 943,348 matches are pending and only 2 are accepted. The output is never consumed downstream.

Per-source pipeline health (classify % / embed %)

Data-quality issues found

Critical

Reachable base collapse

Of 3.6M buyers, 1.2% (43k) have an email and 0.05% (1.8k) pass verification. Almost no data is actually usable for outreach.

Critical

rfqs HS-code & product_category entirely empty

Across all 234k rfqs, hs_codes / product_category fill = 0. The columns underpinning classification/matching are blank.

Critical

Match output not consumed

Of 943k buyer_supplier_matches, only 2 are accepted. 850k high-score matches sit in pending, so the pipeline dead-ends.

Warning

Agent failure rate 83–84% with error=null

1.39M classifier runs and 327k match_haiku runs failed, mostly with no error message. The "failed" status is semantically ambiguous (filter vs error). Much of the $4,635 spend went to failed runs.

Warning

HS codes all AI-derived

Zero authoritative-source codes. Not trustworthy for customs/tariff use. A verification layer is needed.

Warning

Uneven embedding coverage

Embedding coverage varies widely by source, e.g. bizmaps_jp at 42% vs usaspending at 0.1%. Gaps this large can bias vector search results toward well-embedded sources.
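A per-source coverage check along these lines could flag such gaps automatically. This is a sketch only: the counts are made up and chosen merely to reproduce the 42% and 0.1% figures, and the 10% alert threshold is an arbitrary assumption.

```python
def coverage_pct(embedded: int, total: int) -> float:
    """Percentage of a source's records that have an embedding."""
    if total == 0:
        return 0.0
    return 100.0 * embedded / total


def flag_gaps(counts, min_pct):
    """Return the sources whose embedding coverage is below min_pct."""
    return sorted(
        source
        for source, (embedded, total) in counts.items()
        if coverage_pct(embedded, total) < min_pct
    )


# Made-up (embedded, total) counts that reproduce the quoted percentages.
counts = {"bizmaps_jp": (420, 1000), "usaspending": (1, 1000)}

low_coverage = flag_gaps(counts, min_pct=10.0)  # 10% is an arbitrary threshold
print(low_coverage)
```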

Minor

Legacy / unused tables

enrichments (rfq side) has 1 row; identity_backup_20260603 and other backup/duplicate schemas remain. Cleanup candidates.

Minor

One-shot bulk ingestion

Ingestion is concentrated in June bulk imports (bizmaps_jp, australia_asic_abr, etc.). No continuous collection pipeline is running.