HSBC FRAUD PILOT · IEEE-CIS PUBLIC DATA · IBM FEZ EXECUTION

QPC PC-QRC-FD: Real-World Fraud Task Pilot

A transparent pilot for fraud-risk modeling: task framing, data preparation protocol, simulator and IBM run outcomes, and clear interpretation boundaries.

Reader's guide

Start here for the narrative and metric tables on this page. Follow the chain below for outcomes and job IDs.

This report — executive summary: task, data contract, simulator/classical/K-ablation, IBM Fez max-q pilot (156 qubits), interpretation boundaries.
Pilot report — workspace paths, reproducibility commands, IBM platform pitfalls, link back to artifacts.
Machine-readable evidence — JSON produced alongside scripts on your machine (not hosted here): e.g. HSBC IBM evidence bundle, HSBC tuning bundle, HSBC classical match bundle under partner workspace / . Publish copies with your submission bundle if the jury requires downloads.
Highlights — short HSBC card with cross-links.


Task and Goal

The task is HSBC-style fraud detection on card transactions, using the public IEEE-CIS Fraud Detection data as an accessible real-world benchmark. The core objective is not to show immediate production superiority but to evaluate whether QPC's polycontextural reservoir architecture can produce meaningful fraud-risk features under a strict, leakage-controlled protocol.

Goal of this pilot:

Run PC-QRC-FD end-to-end with time-respecting splits and train-only preprocessing fits.
Compare against classical references honestly (full-split and matched-slice).
Execute the same architecture family on IBM hardware and archive the real job IDs.

Data Preparation and Computation Contract

Dataset source: Kaggle IEEE-CIS Fraud Detection (`train_transaction.csv` + `train_identity.csv`). Records are sorted by `TransactionDT` to keep chronological structure.

Step	Protocol used in this pilot
Time split	80% train timeline / 20% test timeline (`time_split_quantile=0.8`)
Pilot train pool	1800 rows total = 1500 train-fit + 300 time-ordered validation
Test slice	500 rows (stratified subsample on post-cutoff timeline)
Feature selection	Mutual information fit on first 1500 train rows only, top 16 numeric columns
Scaling / angle mapping	Fit on train only; apply to val/test (no fit leakage)
Reservoir readout	Primarily Z+ZZ expectations with balanced logistic readout
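The split and scaling rows of the protocol above can be illustrated with a minimal sketch. It assumes the 0.8 quantile is taken over chronologically sorted rows and that features are min-max scaled to angles in [0, π]; the helper names (`time_split`, `fit_angle_scaler`, `to_angles`), the toy `amt` column, and the angle range are illustrative assumptions, not the pilot's actual code. The mutual-information selection step is omitted.

```python
import math

def time_split(rows, time_key="TransactionDT", quantile=0.8):
    """Sort rows chronologically and cut at the given fraction of the timeline."""
    ordered = sorted(rows, key=lambda r: r[time_key])
    cut = int(len(ordered) * quantile)
    return ordered[:cut], ordered[cut:]

def fit_angle_scaler(train_rows, columns):
    """Record per-column min/max using the train-fit rows only."""
    stats = {}
    for col in columns:
        values = [r[col] for r in train_rows]
        stats[col] = (min(values), max(values))
    return stats

def to_angles(rows, stats):
    """Map features to [0, pi] with train-fit stats; clip values outside the fit range."""
    out = []
    for r in rows:
        angles = []
        for col, (lo, hi) in stats.items():
            span = hi - lo
            unit = 0.0 if span == 0 else (r[col] - lo) / span
            unit = min(1.0, max(0.0, unit))
            angles.append(unit * math.pi)
        out.append(angles)
    return out

# Toy data (hypothetical): 100 rows supplied in reverse time order.
rows = [{"TransactionDT": t, "amt": float(t % 7)} for t in range(100, 0, -1)]
train, test = time_split(rows)
fit, val = train[:60], train[60:]          # time-ordered fit / validation
stats = fit_angle_scaler(fit, ["amt"])     # fitted on train-fit rows only
val_angles = to_angles(val, stats)
test_angles = to_angles(test, stats)
```

Because the scaler statistics come from the train-fit rows alone, validation and test values are transformed without contributing to the fit, which is the "no fit leakage" property the table describes.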

All numbers shown below are pulled from generated artifacts in `Competition 2/Cursor`.

Architecture (qubits / contextures / depth)	Test ROC-AUC	Test PR-AUC
8q / 2c / d4	0.7226	0.2648
12q / 3c / d6	0.7435	0.1509
16q / 4c / d6	0.7102	0.2230
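The protocol table describes the reservoir readout as primarily Z+ZZ expectations. As a rough illustration of what such features look like, the sketch below estimates single-qubit Z and pairwise ZZ expectations from measured bitstring counts. The function name `z_zz_features`, the use of all qubit pairs, and the convention that string position i is qubit i are assumptions made for this example; the pilot's actual feature set and bit ordering may differ.

```python
from itertools import combinations

def z_zz_features(counts):
    """Estimate <Z_i> and <Z_i Z_j> from a dict of bitstring -> shot count."""
    shots = sum(counts.values())
    n = len(next(iter(counts)))
    z = [0.0] * n
    zz = {pair: 0.0 for pair in combinations(range(n), 2)}
    for bits, c in counts.items():
        # Z eigenvalue per qubit: bit '0' -> +1, bit '1' -> -1.
        eig = [1 if b == "0" else -1 for b in bits]
        for i in range(n):
            z[i] += eig[i] * c
        for i, j in zz:
            zz[(i, j)] += eig[i] * eig[j] * c
    z = [v / shots for v in z]
    zz_list = [zz[p] / shots for p in sorted(zz)]
    return z + zz_list

# Toy counts (hypothetical): perfectly correlated two-qubit outcomes.
features = z_zz_features({"00": 1024, "11": 1024})
# features is [<Z_0>, <Z_1>, <Z_0 Z_1>] = [0.0, 0.0, 1.0]
```

In this toy case each qubit looks random on its own, yet the ZZ term is 1.0, which shows why pairwise terms can carry information that single-qubit terms miss. A feature vector of this kind would then feed the logistic readout.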

Classical References and Matched-Slice Check

Reference	Protocol	ROC-AUC	PR-AUC
Full classical XGBoost	~472k train / 118k test (full split)	0.9004	0.5036
Matched XGBoost	Same 1800/500 protocol as quantum slice	0.7196	0.2883
PC-QRC best (12q/3c/d6)	Same 1800/500 protocol	0.7435	0.1509

Matched-slice takeaway: on this small 500-row test slice, PC-QRC leads on ROC-AUC (0.7435 vs 0.7196) while XGBoost leads on PR-AUC (0.2883 vs 0.1509). The comparison is included for fair protocol disclosure, not to support production ROI claims.
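The two metrics can rank models differently because ROC-AUC rewards overall pairwise ordering while PR-AUC is dominated by precision near the top of the ranking. The sketch below shows this on toy data. It assumes PR-AUC is computed as average precision and that scores have no ties; the source does not state which PR-AUC estimator the pilot used, and the labels and scores here are invented for illustration.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean precision at each positive, scanning scores from high to low (no ties assumed)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy slice (hypothetical): 2 fraud rows among 10.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Model A ranks one fraud row first and the other last.
scores_a = [1.0, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
# Model B ranks both fraud rows third and fourth.
scores_b = [0.8, 0.7, 1.0, 0.9, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

auc_a, auc_b = roc_auc(labels, scores_a), roc_auc(labels, scores_b)
ap_a, ap_b = average_precision(labels, scores_a), average_precision(labels, scores_b)
```

Here model B has the higher ROC-AUC (0.75 vs 0.5) while model A has the higher average precision (0.6 vs about 0.417), the same kind of split seen in the matched-slice table.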

K (contextures)	Block size (qubits per contexture)	Test ROC-AUC	Test PR-AUC
1	12	0.7391	0.2229
2	6	0.7345	0.1858
3	4	0.7435	0.1509
4	3	0.7232	0.1783
6	2	0.7456	0.0919
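In every row of the ablation table, K times the block size equals 12. A minimal sketch of that partitioning follows, assuming the contextures are contiguous, equal-sized blocks of qubit indices; the contiguity and the helper name `contexture_blocks` are assumptions, since the source gives only K and the block size.

```python
def contexture_blocks(n_qubits, k):
    """Split qubit indices 0..n_qubits-1 into k contiguous, equal-sized blocks."""
    if n_qubits % k != 0:
        raise ValueError("k must divide n_qubits evenly")
    size = n_qubits // k
    return [list(range(i * size, (i + 1) * size)) for i in range(k)]

# Block sizes for the K values in the ablation table (12 qubits total).
block_sizes = {k: len(contexture_blocks(12, k)[0]) for k in (1, 2, 3, 4, 6)}
# block_sizes == {1: 12, 2: 6, 3: 4, 4: 3, 6: 2}
```

Under the same assumption, `contexture_blocks(156, 12)` would give 13-qubit blocks for the 156q/12c hardware label.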

IBM Hardware Pilot (ibm_fez, max logical width)

Run archived as the HSBC IBM evidence bundle. The IBM Quantum Platform instance was routed as open-instance (open-plan quota); where Runtime discovery by pinned name failed, the script fell back to the aggregate listing.

Summary metric	Value
Architecture label (depth 6)	156q/12c
IBM capped-test ROC-AUC	0.7717
IBM capped-test PR-AUC	0.1750
Shots per circuit	2048
Capped train/val/test rows	48/12/48
Sampler jobs	12 train + 3 val + 12 test batches

IBM pilot result	Value
Backend	ibm_fez
Mode	`[internal mode]`, proprietary readout
Test ROC-AUC	0.7717
Test PR-AUC	0.1750
Reservoir seed	1000
Sample job IDs	`d7uc5kcinasc738smc80` (train), `d7uc6nnmrars73d6puqg` (val), `d7uc707mrars73d6pv50` (test); full list in JSON (`ibm_job_ids_all`)
Post-processing	Proprietary QPC readout aggregation (optional)

Execution friction (why this took iterations). IBM Quantum Platform authentication is sensitive to instance routing (short names vs CRNs), stale credential files, paid-vs-open quotas, and Runtime SDK changes (the `ibm_quantum_platform` channel). Early attempts returned “invalid instance” errors or empty backend lists until the workspace script prioritized open-instance, fell back to aggregate listing where pinning failed, and cleaned up the environment consistently. That operational noise is normal for multi-instance IBM Cloud accounts; it is not a weakness of the quantum circuit math itself.
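The "pinned lookup first, aggregate listing as fallback" pattern described above can be sketched without reference to any particular SDK. In the example below, `resolve_backend`, `get_pinned`, `list_all`, and the instance name `stale-crn` are all hypothetical; the two callables stand in for whatever Runtime calls the workspace script actually makes, which the source does not show.

```python
def resolve_backend(name, instances, get_pinned, list_all):
    """Try a pinned lookup per instance, then fall back to an aggregate listing.

    get_pinned(name, instance) and list_all() are hypothetical callables that
    wrap the real SDK calls; they are injected so the logic stays testable.
    """
    errors = []
    for instance in instances:
        try:
            return get_pinned(name, instance), instance
        except Exception as exc:  # e.g. an "invalid instance" error
            errors.append((instance, str(exc)))
    # Pinning failed everywhere: scan the aggregate listing instead.
    for backend in list_all():
        if backend == name:
            return backend, None
    raise LookupError(f"{name} not found; pinned errors: {errors}")

def fake_pinned(name, instance):
    # Stand-in that only accepts the open instance.
    if instance != "open-instance":
        raise ValueError("invalid instance")
    return name

# Open instance listed first, so the pinned lookup succeeds immediately.
first = resolve_backend("ibm_fez", ["open-instance", "stale-crn"], fake_pinned, lambda: [])
# Pinning fails, so the aggregate listing supplies the backend.
fallback = resolve_backend("ibm_fez", ["stale-crn"], fake_pinned, lambda: ["ibm_fez"])
```

Collecting the per-instance errors before falling back keeps the failure reasons available for the run log rather than silently discarding them.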

Interpretation boundary: this is job-backed hardware execution at full device logical width on a small capped slice (48/12/48), not the same row counts as the 1800/500 simulator tuning contract. Metrics reflect shot noise plus stratified sampling variance; they support feasibility and archival credibility for Phase I, not production ROI.
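To give a sense of scale for the shot-noise part of that caveat, the sketch below computes the standard error of a single Z expectation estimated from independent shots. It covers only idealized sampling noise on one ±1-valued observable; it does not model hardware error or the sampling variance of a 48-row test slice, and the helper name `z_standard_error` is illustrative.

```python
import math

def z_standard_error(z_mean, shots):
    """Standard error of a +/-1-valued Z estimate averaged over independent shots.

    A +/-1 outcome with mean z has variance 1 - z**2, so the mean over
    `shots` samples has standard error sqrt((1 - z**2) / shots).
    """
    return math.sqrt((1.0 - z_mean ** 2) / shots)

# Worst case (z = 0) at the pilot's 2048 shots per circuit: about 0.0221.
worst_case = z_standard_error(0.0, 2048)
```

At 2048 shots, each expectation therefore carries an uncertainty of up to roughly 0.02 from sampling alone, before any device noise is considered.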

Data and Source Artifacts

Main data folder used by scripts: `Competition 2/data`

`train_transaction.csv`, `train_identity.csv` (primary modeling data)
`test_transaction.csv`, `test_identity.csv`, `sample_submission.csv` (Kaggle package files)

Numbers in this page come from:

HSBC tuning bundle
HSBC classical match bundle
HSBC ablation bundle
HSBC IBM evidence bundle

These artifacts are generated in the `Competition 2/Cursor` workspace and should be versioned together with this page when publishing evidence snapshots.
