How We Hacked BCG's Data Warehouse — 3.17 Trillion Rows, Zero Authentication
Boston Consulting Group — one of the "Big Three" management consultancies, 32,000 employees, $12B+ in annual revenue. BCG has invested heavily in AI through its GAMMA division: hundreds of data scientists building AI-powered analytics tools, processing workforce intelligence, running M&A due diligence, and powering competitive analysis for the firm's consulting engagements.
After we published our McKinsey Lilli research, we fed the response data and engagement analytics back into our research agent and asked it a simple question: who's likely to have the same problems?
The reasoning it came back with was straightforward. McKinsey, Bain, and BCG are building the same type of offerings at roughly the same time. Internal AI platforms, proprietary data warehouses, client-facing analytics tools, etc. They're competing for the same talent, using the same technical stacks, and generally moving at the same pace. The threat model doesn't change just because the logo does.
Our research agent flagged BCG as a high-priority target. We agreed with the assessment and turned it loose under BCG's responsible disclosure guidelines.
One line from the agent's report stopped us:
The agent recalled based on its previous engagements and assessments. In terms of sheer size, it was right.
Mapping the Surface
Using only the name "Boston Consulting Group, Inc", the agent ran deep reconnaissance against BCG's external IT infrastructure, pulling back thousands of subdomains, API endpoints, and application services. Most resolved to login pages or locked-down portals.
But one caught the agent's attention: x.bcg.com.
The BCG X Portal ("Where BCG's tools, data, and AI converge") is dedicated to analytics, data science, and artificial intelligence. But what was inside?
Walking Through the Front Door
The agent started probing the platform's API. Most endpoints were properly locked down — authentication required, access denied. But the platform also exposed its complete API documentation to the public: 372 endpoints with every route and parameter laid out like a blueprint.
The agent worked through them one by one. Most were protected. But a few of them weren't.
One endpoint accepted raw database queries via SQL and returned results directly — no authentication, no API key, no session token. An internal tool, built for analysts, sitting on the public internet with nothing between it and anyone who found it.
If you've read our McKinsey research, this pattern should be familiar: a major consulting firm builds an AI-powered data platform, exposes an API endpoint that executes database queries, and forgets to put authentication in front of it. Different firm, different platform, same fundamental mistake.
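A defender can look for this class of exposure by auditing the published API description for operations that declare no authentication requirement. The sketch below assumes the documentation is an OpenAPI 3 document already loaded into a dictionary. The source does not say what format BCG's documentation used, and the sample `spec`, its paths, and the `unauthenticated_operations` helper are all hypothetical.

```python
# Sketch: list operations in an OpenAPI 3 document that declare no security.
# Assumption: the API docs are OpenAPI 3; the sample spec below is invented.

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def unauthenticated_operations(spec: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs whose effective security requirement is empty."""
    default_security = spec.get("security", [])
    findings = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            # An operation-level "security" overrides the top-level default;
            # an explicit empty list removes the requirement entirely.
            effective = operation.get("security", default_security)
            # An empty requirement object ({}) makes authentication optional.
            if not effective or any(req == {} for req in effective):
                findings.append((method.upper(), path))
    return sorted(findings)


spec = {
    "openapi": "3.0.3",
    "security": [{"bearerAuth": []}],
    "paths": {
        "/reports": {"get": {"summary": "List reports"}},
        "/query": {"post": {"summary": "Run SQL", "security": []}},
        "/health": {"get": {"summary": "Liveness", "security": [{}]}},
    },
}

for method, path in unauthenticated_operations(spec):
    print(method, path)
```

A declared security requirement only shows intent, not enforcement, so a spec audit like this complements live testing of each endpoint rather than replacing it.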
3.17 Trillion Rows, 131.2 Terabytes of Data
That's not a typo. If you read one record per second, you'd still be reading 100,000 years from now.
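The arithmetic behind that figure is easy to check. The snippet below is a back-of-envelope calculation, assuming a 365.25-day year and taking the row count from the heading above.

```python
# Back-of-envelope check of the "one record per second" claim.
ROWS = 3.17e12                             # row count reported in this post
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: 365.25-day year

years = ROWS / SECONDS_PER_YEAR
print(f"{years:,.0f} years")
```

The result comes out a little over 100,000 years, which matches the claim.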
What the agent had access to was BCG's Workforce Analytics (WFA) data warehouse — containing commercially licensed datasets from third-party vendors behind zero authentication:
- 553 million individual position histories with full-text job descriptions, total compensation, and seniority levels.
- 8.7 billion employee joiner/leaver records tracking who joined and left which company, in which role, in which country, as monthly snapshots going back years.
- 12.8 billion individual employee skills records mapped to specific users, companies, and job roles.
- 7.8 billion compensation benchmarks: median base pay and total compensation by job title, company, location, and year.
This is individual-level employment data on hundreds of millions of real people, spanning millions of companies worldwide.
But the workforce data was only one part of what the endpoint exposed.
Full Scope of Exposure
The same unauthenticated endpoint provided access to additional database schemas containing further categories of commercially licensed data:
- M&A data: 201 billion rows of workforce data aggregated for acquisition target analysis, alongside a GenAI proof of concept using GPT-4o-mini to research and classify 26,000+ companies.
- Consumer transaction data: 3 billion purchase receipts with individual order-level detail, plus 1.87 billion rows of per-company cloud spending data (individual companies' AWS, Azure, and GCP costs broken down by service and region).
- Employment data: 252 million compensation records (including BCG Senior Partner compensation), along with employee reviews and sentiment analysis.
- Orphaned cloud storage: a storage integration still referenced an AWS S3 bucket that had been deleted. An attacker could recreate that bucket and silently intercept any data exports flowing through the integration.
Also accessible was BCG employee data — real names, access control logs with login timestamps, and the identities of 399 BCG GAMMA employees mapped to the specific consulting cases they work on.
The service account behind the endpoint also held full write privileges — meaning an attacker wouldn't just be reading the data, they could silently alter it. Changing compensation data. Modifying M&A intelligence. And poisoning the inputs that feed BCG's client advice.
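The orphaned storage integration listed above is the kind of issue a simple inventory check can surface. The sketch below compares the bucket named in each integration's target against a set of buckets the organisation still owns. The integration names, bucket names, and the `dangling_integrations` helper are hypothetical; in practice both inputs would come from the warehouse's metadata and the cloud account's bucket listing.

```python
# Sketch: flag storage integrations that point at buckets no longer owned.
# Assumption: integration names, URLs, and the bucket inventory are invented.
from urllib.parse import urlparse


def dangling_integrations(integrations: dict[str, str], owned_buckets: set[str]) -> list[str]:
    """Return names of integrations whose s3:// target bucket is not in owned_buckets."""
    dangling = []
    for name, location in integrations.items():
        parsed = urlparse(location)
        if parsed.scheme != "s3":
            continue  # only S3 locations are checked in this sketch
        if parsed.netloc not in owned_buckets:
            dangling.append(name)
    return sorted(dangling)


integrations = {
    "export_daily": "s3://example-exports-live/daily/",
    "export_legacy": "s3://example-exports-old/archive/",
}
owned_buckets = {"example-exports-live"}

print(dangling_integrations(integrations, owned_buckets))
```

Here only `export_legacy` is flagged, because its bucket is missing from the owned set. Anything flagged this way should be removed or re-pointed before someone else can claim the bucket name.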
The Bigger Picture
The vulnerability we found here was not sophisticated. It was basic — an unauthenticated API endpoint with full read-write database access. The kind of issue that should get caught in a routine security review. The problem is that routine security reviews aren't keeping pace with how fast organisations are moving.
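The missing control here is simple to state: reject every request that does not present a valid credential before the handler runs. The following is a minimal, framework-free sketch of that deny-by-default pattern, not a description of BCG's platform. The `require_token` decorator, the `run_query` handler, the bearer-token header format, and the token value are all assumptions made for illustration.

```python
# Sketch: deny-by-default gate in front of a query handler.
# Assumption: handler names, header format, and token store are hypothetical.
import hmac
from typing import Callable, Mapping

Handler = Callable[[Mapping[str, str], str], tuple[int, str]]


def require_token(valid_tokens: set[str]) -> Callable[[Handler], Handler]:
    """Wrap a handler so requests without a known bearer token get a 401."""
    def decorator(handler: Handler) -> Handler:
        def wrapped(headers: Mapping[str, str], body: str) -> tuple[int, str]:
            auth = headers.get("Authorization", "")
            scheme, _, token = auth.partition(" ")
            # Constant-time comparison against each known token.
            ok = scheme == "Bearer" and any(
                hmac.compare_digest(token.encode(), t.encode()) for t in valid_tokens
            )
            if not ok:
                return 401, "authentication required"
            return handler(headers, body)
        return wrapped
    return decorator


@require_token({"example-token"})
def run_query(headers: Mapping[str, str], body: str) -> tuple[int, str]:
    # Placeholder: a real handler would also run with a read-only, scoped role.
    return 200, f"accepted query of {len(body)} characters"


print(run_query({}, "SELECT 1"))
print(run_query({"Authorization": "Bearer example-token"}, "SELECT 1"))
```

The first call is rejected with a 401 and the second is accepted. Authentication alone would not have addressed the write access described earlier; pairing a gate like this with a least-privilege, read-only database role covers both halves of the finding.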
Every company's attack surface is growing whether they're building software or not. New SaaS platforms, cloud migrations, third-party integrations, AI tools adopted by internal teams, vendor portals, APIs connecting partners and clients. You don't need to be a software company to have a software company's attack surface. You just need to operate like a modern business in the fast-moving AI era.
And the pace is accelerating. AI-assisted development tools mean the teams building these platforms are deploying code faster than ever. Features that once took quarters now take weeks. That's exciting, but it's widening the gap between how quickly things are built and how thoroughly they're tested.
Traditional pentests can't close that gap. They're slow, expensive, and periodic. A manual engagement might run once or twice a year, take weeks to scope and execute, and deliver a PDF that's partially out of date by the time it lands. Meanwhile, the attack surface changes every time someone pushes a deployment, adds an integration, or spins up a new environment.
This isn't a BCG-specific problem. It applies to organisations in every industry and at every scale. To BCG's credit, they remediated the vulnerability within 48 hours of disclosure. That kind of responsiveness matters.
The harder question is the one every organisation should be asking right now: how often is our attack surface changing — and is our security posture keeping up?
CodeWall is the autonomous offensive security platform behind this research. We're currently in early preview and looking for design partners: organisations that want continuous, AI-driven security testing against their real attack surface. If that sounds like you, get in touch.
Responsible Disclosure Timeline
- 2026-03-12 — Autonomous agent begins reconnaissance of BCG's external attack surface and discovers unauthenticated SQL execution endpoint on x.bcg.com.
- 2026-03-12 — Notification sent to BCG's security team via their responsible disclosure policy.
- 2026-03-14 — Their security team confirm receipt and request further details.
- 2026-03-14 — Full evidence pack shared with their team for remediation.
- 2026-03-14 — BCG remediates the vulnerability.
- 2026-03-31 — Publication under responsible disclosure.
Pre-publication Review
Ahead of publication we shared multiple drafts of this post with the BCG team over the course of a week. Feedback provided by BCG has been incorporated throughout. BCG declined to provide a comment for inclusion.
This research was conducted in accordance with responsible disclosure principles and industry-standard security research methodology. All testing was verification-only: access was limited to the minimum necessary to confirm each vulnerability, determine the scope and impact, and document evidence for remediation purposes. No disruption was caused to any production service. All findings were disclosed to BCG's security team prior to publication, and all issues were confirmed remediated before this article was published.

