TECHNICAL COMPANION · REPRODUCIBILITY

HSBC Fraud Pilot — Data & Sources

Operational artifact list, run commands, and protocol boundaries for independent technical review.

Reader's guide:

1) HSBC Fraud Pilot Report (story + tables) → 2) this page (commands + pitfalls) → 3) JSON artifacts in Competition 2/Cursor/ (e.g. cursor_hsbc_pcqrc_fd_ibm.json) → 4) Highlights · Home

Workspace Layout

Path                              Purpose
~/Desktop/Competition 2/data/     Kaggle IEEE-CIS source CSV files
~/Desktop/Competition 2/Cursor/   Canonical scripts, JSON outputs, concept notes
~/Desktop/Competition 2/venv/     Python environment used for all runs

Input Data Files

File                     Role
train_transaction.csv    Primary transaction records (contains isFraud)
train_identity.csv       Identity/device join data by TransactionID
test_transaction.csv     Kaggle test package file (not used for labeled pilot metrics)
test_identity.csv        Kaggle test identity package file
sample_submission.csv    Kaggle template file
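
Before launching a run, it can help to confirm that the five Kaggle files listed above are actually present. The following is a minimal standard-library sketch, not part of the pilot scripts: the function name missing_inputs is hypothetical, and the directory passed in the example simply mirrors the workspace layout above.

```python
from pathlib import Path

# File names taken from the Input Data Files table.
EXPECTED_INPUTS = (
    "train_transaction.csv",
    "train_identity.csv",
    "test_transaction.csv",
    "test_identity.csv",
    "sample_submission.csv",
)


def missing_inputs(data_dir):
    """Return the expected input files that are absent from data_dir."""
    root = Path(data_dir).expanduser()
    return [name for name in EXPECTED_INPUTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs("~/Desktop/Competition 2/data")
    print("missing:", missing if missing else "none")
```

An empty list means every expected file was found; the check does not validate file contents.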

Primary Artifacts (Generated Numbers)

Artifact                                Contains
cursor_hsbc_pcqrc_fd_tuning.json        Architecture sweep (8q/12q/16q), seed screening, main simulator metrics
cursor_hsbc_matched_subset_xgb.json     Matched-slice XGBoost baseline (same MI/split protocol)
cursor_hsbc_pcqrc_fd_ablation.json      K-ablation at fixed 12 qubits/depth (K=1,2,3,4,6)
cursor_hsbc_pcqrc_fd_ibm.json           IBM Fez run, job IDs, capped-slice hardware metrics
cursor_hsbc_classical_baseline.json     Full-split classical reference metrics
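
Because this page does not document the internal schema of the artifacts, a reviewer may want to list their top-level keys before reading any numbers. The sketch below does only that and assumes nothing about the contents beyond valid JSON; the function name top_level_keys is hypothetical.

```python
import json
from pathlib import Path

# Artifact names taken from the Primary Artifacts table.
ARTIFACTS = (
    "cursor_hsbc_pcqrc_fd_tuning.json",
    "cursor_hsbc_matched_subset_xgb.json",
    "cursor_hsbc_pcqrc_fd_ablation.json",
    "cursor_hsbc_pcqrc_fd_ibm.json",
    "cursor_hsbc_classical_baseline.json",
)


def top_level_keys(cursor_dir):
    """Map each artifact to its sorted top-level keys, or None if the file is missing."""
    root = Path(cursor_dir).expanduser()
    summary = {}
    for name in ARTIFACTS:
        path = root / name
        if not path.is_file():
            summary[name] = None
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        # Non-dict JSON (e.g. a bare list) is reported by its Python type name.
        summary[name] = sorted(data) if isinstance(data, dict) else type(data).__name__
    return summary
```

Calling top_level_keys("~/Desktop/Competition 2/Cursor") gives a quick inventory of which artifacts exist and what each one exposes.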

Run Commands (Reproducibility)

Always execute from ~/Desktop/Competition 2/Cursor with the shared venv:

cd ~/Desktop/Competition\ 2/Cursor
../venv/bin/python cursor_hsbc_pcqrc_fd_tuning.py
../venv/bin/python cursor_hsbc_matched_subset_xgb.py
../venv/bin/python cursor_hsbc_pcqrc_fd_ablation.py
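
The same three simulator runs can be driven from Python if a reviewer prefers a single entry point that stops at the first failure. This is a hedged sketch, not a script shipped with the pilot: build_commands and run_all are hypothetical names, and the interpreter path is derived from the workspace layout described above.

```python
import subprocess
from pathlib import Path

# Script names and order taken from the run commands above.
SCRIPTS = (
    "cursor_hsbc_pcqrc_fd_tuning.py",
    "cursor_hsbc_matched_subset_xgb.py",
    "cursor_hsbc_pcqrc_fd_ablation.py",
)


def build_commands(cursor_dir):
    """Return one argv list per script, using the shared venv interpreter."""
    cwd = Path(cursor_dir).expanduser()
    # Keep the venv path unresolved so the interpreter still detects its venv.
    python = cwd.parent / "venv" / "bin" / "python"
    return [[str(python), script] for script in SCRIPTS]


def run_all(cursor_dir):
    """Run the scripts in order from cursor_dir; raise on the first failure."""
    cwd = Path(cursor_dir).expanduser()
    for cmd in build_commands(cursor_dir):
        subprocess.run(cmd, cwd=cwd, check=True)
```

Usage would be run_all("~/Desktop/Competition 2/Cursor"); with check=True a non-zero exit raises subprocess.CalledProcessError, so later scripts do not run after a failed one.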

IBM hardware pilot (archived May 2026: test ROC-AUC 0.7717, PR-AUC 0.1750 on capped train/val/test slices of 48/12/48; ibm_fez in max-qubits mode, 156q/12c/d6, readout z+ctxpool):

unset QISKIT_IBM_INSTANCE
export QISKIT_IBM_INSTANCE=open-instance
cd ~/Desktop/Competition\ 2/Cursor
../venv/bin/python cursor_hsbc_pcqrc_fd_ibm.py \
  --mode ibm --backend ibm_fez \
  --max-qubits-mode \
  --depth 6 \
  --seed 1000 --seed-sweep 1 \
  --readout z+ctxpool \
  --runs-per-batch 1 --use-qpc-noise-reducer \
  --batch-size 4 --shots 2048 \
  --cap-train 48 --cap-val 12 --cap-test 48

Where a command is shown abbreviated, do not type a literal ... on the command line. For a cheaper smoke test, lower the caps and shots; --ibm-fast-defaults is an alternative preset.

IBM dependency: ../venv/bin/pip install -U 'qiskit-ibm-runtime>=0.46'

IBM credentials: quantum token in QISKIT_IBM_TOKEN or ~/.ibm_quantum_token; IAM key optional on some accounts via QISKIT_IBM_IAM_API_KEY. Instance routing is documented in README_CURSOR_WORKSPACE.md (open-instance, CRN file precedence, aggregate fallback).
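
The token lookup described above can be sketched as follows. This is an illustration only: the function name resolve_ibm_token is hypothetical, the precedence (environment variable first, then file) is an assumption rather than documented behavior of the pilot script, and the token file is assumed to contain nothing but the token.

```python
import os
from pathlib import Path


def resolve_ibm_token(env=None, token_file="~/.ibm_quantum_token"):
    """Return the token from QISKIT_IBM_TOKEN, else from token_file, else None."""
    env = os.environ if env is None else env
    token = env.get("QISKIT_IBM_TOKEN", "").strip()
    if token:
        return token  # assumed precedence: environment variable wins
    path = Path(token_file).expanduser()
    if path.is_file():
        # Assumes the file holds only the token, possibly with a trailing newline.
        return path.read_text(encoding="utf-8").strip() or None
    return None
```

A None result means neither source is configured, which is worth checking before submitting any hardware job.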

Artifact: cursor_hsbc_pcqrc_fd_ibm.json contains metrics and full job-id arrays (ibm_job_ids_all).
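
To pull the job IDs out of the artifact for cross-checking against the IBM dashboard, something like the following may suffice. It assumes ibm_job_ids_all is a top-level key holding a list, which this page implies but does not state outright; load_job_ids is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_job_ids(artifact_path):
    """Return the list stored under 'ibm_job_ids_all' (empty if the key is absent)."""
    text = Path(artifact_path).expanduser().read_text(encoding="utf-8")
    data = json.loads(text)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object at the top level")
    job_ids = data.get("ibm_job_ids_all", [])  # assumed top-level key
    if not isinstance(job_ids, list):
        raise TypeError("expected 'ibm_job_ids_all' to be a list")
    return job_ids
```

For example, len(load_job_ids("cursor_hsbc_pcqrc_fd_ibm.json")) gives the number of recorded jobs.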

156-qubit / noise-reducer stack: --max-qubits-mode matches Fez logical width; readout mode z+ctxpool keeps feature vectors manageable. qpc_noise_reducer.py is an optional in-repo Python helper loaded when you pass --use-qpc-noise-reducer; it supports count aggregation across --runs-per-batch repeats (readout-error mitigation inside it is only practical for smaller widths).
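
Count aggregation across repeats amounts to summing the per-bitstring counts from each run. The sketch below illustrates that idea in isolation; it is not the implementation in qpc_noise_reducer.py, and the function name and the dict-of-counts input format are assumptions.

```python
from collections import Counter


def aggregate_counts(runs):
    """Sum bitstring counts over repeated runs of the same circuit.

    runs: iterable of dicts mapping bitstring -> shot count (assumed format).
    """
    total = Counter()
    for counts in runs:
        total.update(counts)  # Counter.update adds counts rather than replacing
    return dict(total)


example = aggregate_counts([{"00": 3, "11": 1}, {"00": 2, "01": 4}])
print(example)  # {'00': 5, '11': 1, '01': 4}
```

With --runs-per-batch 1, as in the archived command, there is only one run per batch and the aggregation leaves the counts unchanged.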

IBM Platform Pitfalls (Honest Run Log)

Teams often lose hours here until routing is stable:

The successful archived pilot used open-instance after quirks with pinned-instance discovery, and executed 27 Sampler batches at max logical width; job IDs are in cursor_hsbc_pcqrc_fd_ibm.json.
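
The figure of 27 batches is consistent with the archived command's caps and batch size. A small worked check, assuming each split is batched separately into groups of at most --batch-size samples and that --runs-per-batch 1 yields one Sampler batch per group (the function name is hypothetical):

```python
import math


def sampler_batches(cap_train, cap_val, cap_test, batch_size):
    """Number of batches when each split is chunked into groups of batch_size."""
    return sum(math.ceil(n / batch_size) for n in (cap_train, cap_val, cap_test))


# Caps 48/12/48 with batch size 4: 12 + 3 + 12 batches.
print(sampler_batches(48, 12, 48, 4))  # 27
```

Under the same assumptions, lowering the caps or raising --batch-size reduces the number of hardware submissions proportionally.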

Protocol Notes for Reviewers

Interpretation boundary: capped IBM metrics are evidence of hardware execution and workflow, not direct evidence of production ROI or superiority vs full-data classical systems.
