A transparent pilot for fraud-risk modeling: task framing, data preparation protocol, simulator and IBM run outcomes, and clear interpretation boundaries.
Start here for the narrative and metric tables on this page. Follow the chain below if you audit or reproduce the work.
Artifact chain: `cursor_hsbc_pcqrc_fd_ibm.json`, `cursor_hsbc_pcqrc_fd_tuning.json`, and `cursor_hsbc_matched_subset_xgb.json` under `Competition 2/Cursor/`. Publish copies with your submission bundle if the jury requires downloads.
The task is HSBC-style fraud detection on card transactions, using the public IEEE-CIS Fraud Detection data as an accessible real-world benchmark. The core objective is not immediate production superiority but an evaluation of whether QPC's polycontextural reservoir architecture can produce meaningful fraud-risk features under a strict, leakage-controlled protocol.
Goal of this pilot:
Dataset source: Kaggle IEEE-CIS Fraud Detection (`train_transaction.csv` + `train_identity.csv`). Records are sorted by `TransactionDT` to keep chronological structure.
| Step | Protocol used in this pilot |
|---|---|
| Time split | 80% train timeline / 20% test timeline (`time_split_quantile=0.8`) |
| Pilot train pool | 1800 rows total = 1500 train-fit + 300 time-ordered validation |
| Test slice | 500 rows (stratified subsample on post-cutoff timeline) |
| Feature selection | Mutual information fit on first 1500 train rows only, top 16 numeric columns |
| Scaling / angle mapping | Fit on train only; apply to val/test (no fit leakage) |
| Reservoir readout | Primarily Z+ZZ expectations with balanced logistic readout |
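
The split and scaling rows above can be sketched as follows. This is a minimal illustration, not the pilot script: it assumes the 0.8 cut is taken by row position on rows sorted by `TransactionDT`, and that features are min-max scaled into angles in [0, π]. The function names, the `amt` field, and the angle range are hypothetical; the source only states that scaling and angle mapping are fit on train data.

```python
import math

def time_split(rows, quantile=0.8, key="TransactionDT"):
    """Sort rows chronologically and cut at a fraction of the ordered rows."""
    ordered = sorted(rows, key=lambda r: r[key])
    cut = int(len(ordered) * quantile)
    return ordered[:cut], ordered[cut:]

def fit_minmax(train_values):
    """Learn scaling bounds from TRAIN values only (no val/test statistics)."""
    return min(train_values), max(train_values)

def to_angle(value, lo, hi):
    """Map a value to [0, pi] using train bounds; clip out-of-range values."""
    if hi == lo:
        return 0.0
    frac = (value - lo) / (hi - lo)
    frac = min(1.0, max(0.0, frac))
    return math.pi * frac

# Toy rows in shuffled time order; "amt" is a hypothetical numeric feature.
rows = [{"TransactionDT": t, "amt": float(t * 10)}
        for t in (7, 2, 9, 4, 1, 8, 3, 10, 6, 5)]
train, test = time_split(rows, quantile=0.8)
lo, hi = fit_minmax([r["amt"] for r in train])
test_angles = [to_angle(r["amt"], lo, hi) for r in test]
```

Because the bounds come from the train segment alone, later transactions that exceed the train range are clipped rather than allowed to reshape the mapping.
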
All numbers shown below are pulled from generated artifacts in `Competition 2/Cursor`.
| Architecture (qubits / contextures / depth) | Test ROC-AUC | Test PR-AUC |
|---|---|---|
| 8q / 2c / d4 | 0.7226 | 0.2648 |
| 12q / 3c / d6 | 0.7435 | 0.1509 |
| 16q / 4c / d6 | 0.7102 | 0.2230 |
| Reference | Protocol | ROC-AUC | PR-AUC |
|---|---|---|---|
| Full classical XGBoost | ~472k train / 118k test (full split) | 0.9004 | 0.5036 |
| Matched XGBoost | Same 1800/500 protocol as quantum slice | 0.7196 | 0.2883 |
| PC-QRC best (12q/3c/d6) | Same 1800/500 protocol | 0.7435 | 0.1509 |
Matched-slice readout: in this small 500-row test slice, PC-QRC leads on ROC-AUC while XGBoost leads on PR-AUC. This is useful for fair protocol disclosure, not for production ROI claims.
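
The two metrics can rank models differently because ROC-AUC weighs every positive-negative pair equally, while PR-AUC rewards placing positives at the very top of the ranking. The sketch below uses toy labels and scores (not pilot data) to show one model winning on each metric. It assumes PR-AUC is computed as average precision and that scores have no ties; the source does not state which estimator the pilot used.

```python
def roc_auc(y_true, scores):
    """Fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean precision at the rank of each positive, scanning high to low."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy example: 2 positives, 6 negatives.
y = [1, 1, 0, 0, 0, 0, 0, 0]
scores_a = [0.8, 0.7, 0.9, 0.6, 0.5, 0.4, 0.3, 0.2]  # positives at ranks 2 and 3
scores_b = [0.9, 0.5, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2]  # positives at ranks 1 and 5

auc_a, ap_a = roc_auc(y, scores_a), average_precision(y, scores_a)
auc_b, ap_b = roc_auc(y, scores_b), average_precision(y, scores_b)
```

Here model A has the higher ROC-AUC and model B the higher average precision, the same kind of split seen between PC-QRC and matched XGBoost on the 500-row slice.
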
The ablation run keeps total qubits (12) and depth fixed and varies the contexture count K. K=1 is a single wide chain; K>1 splits the qubits into disjoint blocks (polycontextural blocking).
| K (contextures) | Block size | Test ROC-AUC | Test PR-AUC |
|---|---|---|---|
| 1 | 12 | 0.7391 | 0.2229 |
| 2 | 6 | 0.7345 | 0.1858 |
| 3 | 4 | 0.7435 | 0.1509 |
| 4 | 3 | 0.7232 | 0.1783 |
| 6 | 2 | 0.7456 | 0.0919 |
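
The block sizes in the table follow from dividing a fixed qubit budget evenly among K contextures. A minimal sketch, assuming blocks are contiguous index ranges (the source only says they are disjoint; `contexture_blocks` is a hypothetical helper name):

```python
def contexture_blocks(n_qubits, k):
    """Partition qubit indices 0..n_qubits-1 into k disjoint, equal blocks."""
    if k <= 0 or n_qubits % k != 0:
        raise ValueError("k must be positive and divide n_qubits evenly")
    size = n_qubits // k
    return [list(range(b * size, (b + 1) * size)) for b in range(k)]

# Block sizes for the K values used in the table, at 12 qubits.
block_sizes = {k: len(contexture_blocks(12, k)[0]) for k in (1, 2, 3, 4, 6)}
```
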
Run archived as `cursor_hsbc_pcqrc_fd_ibm.json`. The IBM Quantum Platform instance was routed as open-instance (open-plan quota); where Runtime discovery by pinned name failed, it fell through to the aggregate listing.
| IBM pilot result | Value |
|---|---|
| Backend | ibm_fez |
| Mode | --max-qubits-mode, readout z+ctxpool |
| Test ROC-AUC | 0.7717 |
| Test PR-AUC | 0.1750 |
| Reservoir seed | 1000 |
| Sample job IDs | d7uc5kcinasc738smc80 (train), d7uc6nnmrars73d6puqg (val), d7uc707mrars73d6pv50 (test); full list in JSON (ibm_job_ids_all) |
| Wide-layout readout | z+ctxpool — per-qubit Z plus pooled statistics per contexture so feature size stays tractable at 156 qubits (full pairwise ZZ is not used here). |
| QPC noise reducer | Optional Python helper module qpc_noise_reducer.py (same repo/site bundle), not a separate IBM product. When enabled (--use-qpc-noise-reducer), the HSBC script can average measurement counts across repeated batches (--runs-per-batch). Matrix readout mitigation inside that module applies only up to 16 qubits; at 156 qubits the practical lever is multi-run aggregation + pooled readout + shots. |
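
The multi-run aggregation and pooled readout described in the last two rows can be sketched as below. This is an illustration under stated assumptions, not the contents of `qpc_noise_reducer.py` or the HSBC script: the function names are hypothetical, the pooled statistic is taken to be the mean Z per contexture (the source says only "pooled statistics"), and character `i` of each bitstring is taken to be qubit `i`. Qiskit reports count bitstrings with qubit 0 as the rightmost character, so real counts would need reversing first.

```python
from collections import Counter
from statistics import mean

def merge_counts(runs):
    """Sum measurement counts across repeated runs of the same batch."""
    total = Counter()
    for counts in runs:
        total.update(counts)
    return dict(total)

def z_expectations(counts, n_qubits):
    """Per-qubit <Z>: outcome '0' contributes +1, outcome '1' contributes -1."""
    shots = sum(counts.values())
    z = [0.0] * n_qubits
    for bits, c in counts.items():
        for q in range(n_qubits):
            z[q] += c if bits[q] == "0" else -c
    return [v / shots for v in z]

def z_ctxpool(counts, blocks):
    """Feature vector: per-qubit <Z> followed by one pooled mean per block."""
    n_qubits = sum(len(block) for block in blocks)
    z = z_expectations(counts, n_qubits)
    pooled = [mean([z[q] for q in block]) for block in blocks]
    return z + pooled

# Two toy runs on 2 qubits forming a single contexture.
merged = merge_counts([{"00": 3, "11": 1}, {"00": 1, "01": 3}])
features = z_ctxpool(merged, blocks=[[0, 1]])
```

The feature length is n + K for n qubits and K contextures. Full pairwise ZZ would instead add n(n-1)/2 terms, which is 12,090 at n = 156.
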
Execution friction (why this took iterations). IBM Quantum Platform authentication is sensitive to instance routing (short names vs. CRNs), stale credential files, paid-vs-open quotas, and Runtime SDK changes (the `ibm_quantum_platform` channel). Early attempts returned "invalid instance" errors or empty backend lists until the workspace script prioritized open-instance routing, fell back to the aggregate listing where pinning fails, and cleaned up the environment consistently. That operational noise is normal for multi-instance IBM Cloud accounts; it is not a weakness of the quantum circuit math itself.
Interpretation boundary: this is job-backed hardware execution at full device logical width on a small capped slice (48/12/48 train/val/test rows), not the same row counts as the 1800/500 simulator tuning contract. Metrics reflect shot noise plus stratified sampling variance; they support feasibility and archival credibility for Phase I, not production ROI.
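
To give a sense of scale for the shot-noise component, the standard error of a single-qubit Z expectation estimated from N independent shots is sqrt((1 - ⟨Z⟩²) / N), since each shot yields ±1. The sketch below evaluates it for a hypothetical shot count; the source does not state how many shots the pilot used.

```python
import math

def z_standard_error(z, shots):
    """Standard error of an estimated <Z> from `shots` independent +/-1 outcomes."""
    if shots <= 0 or not -1.0 <= z <= 1.0:
        raise ValueError("need shots > 0 and z in [-1, 1]")
    return math.sqrt((1.0 - z * z) / shots)

# Hypothetical shot budget, for illustration only.
se_unbiased = z_standard_error(0.0, 4096)   # worst case: <Z> = 0
se_polarized = z_standard_error(1.0, 4096)  # deterministic outcome
```

Averaging counts over R repeated runs multiplies the effective shot count by R, shrinking this error by a factor of sqrt(R). It does not reduce the sampling variance that comes from the small number of test rows.
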
Main data folder used by scripts: Competition 2/data
- `train_transaction.csv`, `train_identity.csv` (primary modeling data)
- `test_transaction.csv`, `test_identity.csv`, `sample_submission.csv` (Kaggle package files)

Numbers in this page come from:

- `cursor_hsbc_pcqrc_fd_tuning.json`
- `cursor_hsbc_matched_subset_xgb.json`
- `cursor_hsbc_pcqrc_fd_ablation.json`
- `cursor_hsbc_pcqrc_fd_ibm.json`

These artifacts are generated in the `Competition 2/Cursor` workspace and should be versioned together with this page when publishing evidence snapshots.