The Data Paradox: Why More Isn’t Better in Modern Clinical Research

By Janice Chang, CEO of TransCelerate BioPharma
June 22, 2026

The clinical research ecosystem stands at a precarious crossroads. We are living in an era defined by unprecedented technological capability, yet the industry is simultaneously suffocating under the weight of its own ambition. As the CEO of TransCelerate BioPharma, I have spent years working alongside the heads of R&D from 18 of the world’s largest pharmaceutical companies. Through these collaborations, a sobering realization has emerged: the industry’s long-standing philosophy that “more is better” when it comes to data collection is not merely inefficient—it is fundamentally undermining the integrity and feasibility of the clinical trials that are supposed to deliver life-saving therapies.

The State of the Ecosystem: A Crisis of Complexity

In December 2025, a landmark study conducted by the Tufts Center for the Study of Drug Development in partnership with the TransCelerate team provided empirical weight to a suspicion long held by trial investigators: we are asking too much and gaining too little.

The research revealed a startling inefficiency: nearly 30% of the data collected in modern clinical trials does not directly inform key regulatory or scientific decisions. Despite this, patients—often those facing severe illness—are still subjected to the burden of providing it. This is not just a logistical hurdle; it is a profound ethical concern. When we require patients to undergo unnecessary procedures, provide extra blood samples, or fill out redundant electronic patient-reported outcome (ePRO) forms that provide no actionable insight, we jeopardize trial retention and place an undue toll on the people we are trying to help.

A Chronology of Escalation: How We Got Here

To understand the current state of clinical research, one must look at the rapid trajectory of protocol design over the last two decades. The "more is better" mindset did not take hold overnight; it evolved through a combination of risk aversion, technological capability, and the desire to leave no stone unturned in the search for efficacy and safety signals.

  • 2005–2010: The Digital Awakening. As electronic data capture (EDC) systems became the industry standard, the friction of collecting data decreased. If it was easier to collect, researchers reasoned, why not collect everything?
  • 2010–2015: The Rise of Endpoints. Driven by competitive pressures to differentiate drug candidates, pharmaceutical companies began adding more secondary and exploratory endpoints. The goal was to build a comprehensive "value dossier" for payers and regulators.
  • 2015–2020: The Complexity Explosion. Wearables, sensors, and real-world evidence (RWE) integration began to take hold. While these tools offered promising new data streams, they were often added to existing, already bloated protocols rather than replacing outdated legacy procedures.
  • 2020–2025: The Efficiency Gap. The COVID-19 pandemic forced a brief period of trial simplification (decentralized models, remote monitoring). However, as the industry settled into a "new normal," the old habits of maximalist data collection came back with a vengeance.

The data confirms this escalation. Since 2005, the number of procedures per protocol has surged by nearly 140%. Even more alarming, the number of endpoints has increased by more than 200%, and the total volume of data points collected has ballooned by over 600%. We are building ever-larger clinical trials on a collect-everything logic that suited a world where data was scarce, not one where it is abundant.
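For readers translating these percentages into multiples, a minimal Python sketch follows. The function name growth_multiplier is illustrative, and the inputs are the rounded figures quoted above ("nearly 140%", "more than 200%", "over 600%"), so the outputs are approximations rather than exact study results.

```python
def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a multiple of the baseline.

    An increase of X% means the new value is (1 + X/100) times the old one.
    """
    return 1 + percent_increase / 100


# Rounded figures from the text; the source states them only as approximate bounds.
quoted_increases = {
    "procedures per protocol": 140,
    "endpoints": 200,
    "data points collected": 600,
}

for measure, pct in quoted_increases.items():
    print(f"{measure}: about {growth_multiplier(pct):.1f}x the 2005 level")
```

In other words, a 600% increase means roughly seven times the 2005 volume, not six.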

Supporting Data: The Cost of "Data Obesity"

The "data obesity" plaguing the pharmaceutical industry is quantifiable, and its symptoms are visible across the entire drug development lifecycle.

The Patient Burden

The primary casualty of this complexity is the patient. Protocol complexity is the single largest driver of site burden and patient dropout rates. When a trial protocol requires a patient to spend six hours at a clinic for procedures that provide no relevant data, the patient experience suffers. This leads to attrition, which in turn necessitates longer recruitment periods and larger sample sizes to maintain statistical power, creating a vicious cycle of cost and time.
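The arithmetic behind that cycle can be sketched in a few lines of Python. This uses the common planning adjustment of dividing the required number of evaluable patients by the expected completion rate. The function name and the example figures (400 evaluable patients, 10% versus 25% dropout) are hypothetical and are not drawn from the Tufts study.

```python
import math


def enrollment_needed(evaluable_patients: int, dropout_rate: float) -> int:
    """Patients to enroll so that enough remain evaluable after dropout.

    Assumes dropout is a simple fixed fraction of enrolled patients,
    which is a planning simplification, not a model of real attrition.
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(evaluable_patients / (1 - dropout_rate))


# Hypothetical trial: 400 evaluable patients required for statistical power.
required = 400
lean_protocol = enrollment_needed(required, 0.10)      # assumed 10% dropout
burdensome_protocol = enrollment_needed(required, 0.25)  # assumed 25% dropout

print(f"Enroll {lean_protocol} patients at 10% dropout")
print(f"Enroll {burdensome_protocol} patients at 25% dropout")
print(f"Extra patients to recruit: {burdensome_protocol - lean_protocol}")
```

Under these assumed figures, raising dropout from 10% to 25% increases required enrollment from 445 to 534 patients, and every additional patient extends recruitment time and cost.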

The Operational Drag

The industry spends billions annually on data management, cleaning, and reconciliation. When nearly 30% of that data is effectively "dark data" (collected but never used), we are burning capital that could be better spent on diversifying patient populations, accelerating trial timelines, or investing in next-generation therapeutic modalities.

The Quality Trade-off

There is a fundamental psychological and operational limit to human focus. By demanding that site staff collect over 600% more data points than they did in 2005, we dilute the quality of the data that actually matters. When a nurse or investigator is overwhelmed by the sheer volume of data entry, the risk of human error increases, potentially compromising the validity of primary endpoints.

Official Responses and Industry Perspectives

The internal dialogue among global R&D leaders has shifted significantly. In my discussions with executives, there is a growing consensus that the status quo is unsustainable.

"We have been measuring our success by the volume of data we generate rather than the clarity of the insights we produce," noted one R&D head during a recent TransCelerate summit. That sentiment is becoming common, and thinking is shifting toward "Fit-for-Purpose" data collection. This philosophy advocates for a rigorous review of every protocol: If we cannot articulate exactly how a data point will be used to support a regulatory filing or clinical decision, it should not be collected.

Regulators, including the FDA and the EMA, have also signaled a willingness to embrace lean protocols. Initiatives like the FDA’s "Complex Innovative Trial Design" program encourage the use of fewer, more meaningful endpoints, provided the scientific rigor is maintained. The challenge lies in the industry’s own risk-averse culture, where legal and clinical teams often fear that omitting a variable could lead to a regulatory rejection—even if that variable has never once tipped the scales in a drug approval.

Implications: The Path Toward "Lean Research"

The future of drug development must be defined by intentionality rather than accumulation. To reverse the trend of the last two decades, the pharmaceutical industry must undergo a systemic transformation.

1. The Protocol Pruning Initiative

Companies must implement "protocol pruning" as a standard stage-gate in the R&D process. This involves questioning every secondary endpoint and procedural requirement. Is this test necessary to ensure patient safety? Is it required for a primary regulatory endpoint? If the answer is no, it must be removed.
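As a rough illustration of how such a stage-gate check might be expressed, the Python sketch below applies the two questions above to a list of procedures. The Procedure class, its field names, and the example procedures are all hypothetical; a real review would rest on clinical and regulatory judgment, not two boolean flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Procedure:
    """A protocol procedure with the two pruning questions answered (hypothetical schema)."""
    name: str
    needed_for_safety: bool          # Is it necessary to ensure patient safety?
    supports_primary_endpoint: bool  # Is it required for a primary regulatory endpoint?


def prune_protocol(procedures):
    """Split procedures into (kept, removed).

    A procedure is kept if at least one question is answered yes;
    it is removed only when both answers are no.
    """
    kept, removed = [], []
    for proc in procedures:
        if proc.needed_for_safety or proc.supports_primary_endpoint:
            kept.append(proc)
        else:
            removed.append(proc)
    return kept, removed


# Illustrative protocol; the names and flags are invented for this example.
draft_protocol = [
    Procedure("ECG at each visit", needed_for_safety=True, supports_primary_endpoint=False),
    Procedure("Tumor imaging", needed_for_safety=False, supports_primary_endpoint=True),
    Procedure("Exploratory biomarker panel", needed_for_safety=False, supports_primary_endpoint=False),
]

kept, removed = prune_protocol(draft_protocol)
print("Keep:", [p.name for p in kept])
print("Remove:", [p.name for p in removed])
```

Recording the answers per procedure, rather than only the final decision, leaves an audit trail that a team could later share with regulators when justifying a leaner protocol.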

2. Prioritizing Patient-Centricity

We must shift the design process to include patient advocates early. Patients are uniquely positioned to identify which procedures are burdensome and which data collection methods are invasive. By centering the trial design on the patient’s reality, we improve recruitment and retention, which are the two most critical drivers of trial success.

3. Embracing Digital Minimalism

The advent of AI and machine learning allows us to synthesize more information from fewer data points. We no longer need to measure everything if we can use predictive modeling to identify the key markers of success. Digital tools should be used to reduce the frequency of site visits and the volume of manual data entry, not to add more layers of digital busywork to an already burdened system.

4. A New Regulatory Partnership

Industry and regulators must work in closer tandem to establish clear expectations regarding what constitutes "sufficient" data. A shared framework for what can be safely omitted would give R&D teams the confidence to design leaner, faster, and more efficient trials.

Conclusion: Quality Over Quantity

The clinical research ecosystem is currently a testament to the dangers of "more." We have optimized for volume at the expense of velocity and human experience. The research published in late 2025 by the Tufts Center is a wake-up call. We have the technology to make clinical trials faster, safer, and more inclusive, but we must first have the courage to simplify.

As we look toward the next decade of pharmaceutical innovation, our success will not be measured by the terabytes of data we archive in our databases. It will be measured by how quickly we can get safe, effective treatments into the hands of the patients who need them. To achieve that, we must stop asking for everything and start asking for what matters. The era of "data obesity" must end; the era of "data clarity" must begin.
