Data Collection for Cell and Gene Therapies: Perspectives from the Front Lines


Workshop Summary



We are in the early days of the adoption of cell and gene therapies. With four products approved in the US, seven approved in Europe, and dozens more expected in the coming years, health-care systems are just beginning to grapple with how to ensure patient access to these novel therapies.

Robust long-term data collection is central to discussions about patient access to cell and gene therapies, as there is still much to learn about the safety and efficacy of these therapies once they are approved. Currently, in the US, the Food and Drug Administration (FDA) requires cell and gene therapy developers to monitor patients for up to 15 years. While collecting data on safety and the patient experience during this extended period can help address many uncertainties about cell and gene therapies, the data collection infrastructure is not yet in place to support such activities on a broad scale.

FasterCures held two workshops in 2020 to explore challenges and opportunities related to data collection for long-term follow-up. These workshops brought an array of stakeholders to the table, including patient groups, FDA, health plans, and industry, among others. The first workshop, held in person in Washington, DC, in February, brought approximately 40 stakeholders together to discuss an ideal system for long-term data collection for cell and gene therapies. Through presentations and panel discussions, the group delineated the data needs of different stakeholders and discussed the strengths and limitations of current models for data collection. FasterCures published a report summarizing the findings and learnings that emerged from this work.

In July, FasterCures hosted a second, virtual workshop focusing on the perspectives of those directly involved in the collection of long-term data, also with approximately 40 attendees representing a broad range of stakeholder groups. Through presentations, facilitated discussions, and breakout sessions, attendees discussed current practices for data collection and identified outstanding questions and challenges. Some of the issues that the group explored included minimizing the burden of data collection, keeping patients engaged for an extended time, and regulations or guidance that would be necessary to advance the field. The workshop highlighted that there is an urgent need to coordinate data collection activities, as long-term data collection is currently not a priority for many stakeholders, nor is it widely incentivized in the US health-care system.

Framing the Challenge

Discussions at the February and July workshops, supplemented with interviews and secondary research, illuminated many challenges related to long-term data collection for cell and gene therapies. These include a lack of clarity around who is ultimately responsible for creating data collection platforms, uncertainties related to the operational logistics of collecting the data, and needs for resources and funding to support these efforts. Table 1 below presents a list of the challenges that surfaced during the workshops.

Table 1: Overarching Challenges of Data Collection for Long-Term Follow-Up




It is unclear whose responsibility it is to create data platforms that lay out the necessary data elements to be collected. While product developers must often collect data to satisfy the FDA’s requirements for approval or contractual agreements with payers, the information generated through these and other data collection activities should also be relevant to a greater breadth of stakeholders in order to advance knowledge and inform care decisions. There is a desire to have a neutral third party coordinate data collection activities, but determining the “right” entity to do so can be a challenge in and of itself.

Data Quality

Collecting accurate and complete data efficiently (that is, without duplication across different registries and electronic health records) is resource- and time-intensive.


Tracking a patient over 15 years, as required by FDA for some therapies, poses logistical challenges for providers. Patients may move and receive treatment at other centers. Individual providers, nurses, and other clinical staff may also change positions, relocate, or retire over this time.


Providers are not reimbursed for the time spent on data collection activities. As approvals increase in the coming years, this cost to providers may become unsustainable.


There is no continuous feedback loop that enables data collected in clinical settings to inform future decision-making and efficiently disseminates results.



The recommendations by the workshop attendees focused on five key areas: coordination, incentives, core data elements, funding, and information technology (IT).


Coordination

A common thread throughout the workshop discussion was a need for greater coordination across patients, providers, industry, payers, and regulators to build a data collection system that could address multiple information needs. Current efforts were described as too siloed and narrowly focused on satisfying FDA’s requirements. Rather, attendees articulated a need for a more holistic effort that could answer questions of interest to not only regulators but also patients, providers, and payers.

Disease and patient groups were viewed as the best positioned to act in a coordinating role to bring together the different stakeholders, as they already serve as a bridge between stakeholders. There was also a sense that placing this responsibility in the hands of such organizations would facilitate the creation of disease-specific, rather than therapy-specific, databases, which would minimize duplication in data collection. At the same time, however, because this work is not core to the mission of disease organizations or patient groups, the ability to secure sufficient and sustainable funding could be a limiting factor. The Center for International Blood and Marrow Transplant Research and the World Federation of Hemophilia have adapted existing data collection systems for cell and gene therapies that could serve as models (see our July report for a full discussion of these activities). CureSMA is another example of a data collection effort led by a patient organization; the SMA Clinical Data Registry, created by CureSMA, collects data about the care and treatments received by individuals with spinal muscular atrophy from 18 partner neuromuscular disease centers. Data are collected directly from the centers’ electronic medical record systems, which minimizes the administrative burden on providers and reduces duplicative data entry. In addition, CureSMA analyzes the collected data quarterly and meets with clinical teams at the partnering centers to share its findings.


Incentives

The workshop attendees viewed a strong incentive structure as a key element in facilitating long-term follow-up data collection.

Product developers. For product developers, the incentives for data collection are built into the conditions for FDA approval; that is, long-term data collection is a commitment manufacturers must make to secure market approval. In addition, to the extent that a manufacturer is engaged in outcomes-based contracts, there are incentives to collect data on the outcomes and metrics that underpin those contracts. However, workshop attendees observed there should be stronger incentives for developers to share data generated from their data collection activities with stakeholders beyond regulators and the payers with which they have contractual arrangements, as those data could be used to improve standards of care.

Patients. Long-term data collection activities also rest significantly on patients’ willingness to participate in these activities well after treatment. The durable and potentially curative effects of cell and gene therapies mean that several years after treatment, a patient’s engagement in follow-up activities is likely to wane. This issue may be particularly challenging among patients who receive a therapy as children and do not come to connect to or identify with a patient community in the same way as patients who have lived with a condition for years.

Patient organizations at the workshop described how they seek to provide value back to their communities to promote ongoing engagement. To engage patients, disease organizations and care centers are sharing tools and information with their patient community to illustrate their progress, visualize their data, or demonstrate how they compare to other patients with similar characteristics. In another example, in the COVID-19 context, one organization sent patients small care packages in return for completing surveys, and it relayed survey results through regular community webinars. Another potential approach is to mandate that patients report their data, a requirement that is in place for recipients of bone marrow transplants.

Providers. Attendees recognized that providers bear a tremendous burden in learning new data platforms, integrating data collection into their workflows, and following up with patients through the years. Providers are expected to take on these activities as part of care delivery, but it was agreed that reimbursement is needed to help defray the costs of data collection. In addition, creating a feedback loop that gives providers access to aggregated and analyzed data could help generate valuable information for the provider to inform current practice or future decision-making.

Core Data Elements

Many databases and registries collect information on patient care and outcomes. But, too often, these disparate efforts are not interoperable, collect duplicative data, and do not answer the research questions that matter most to stakeholders. Workshop participants agreed on the need for core data elements that can capture the performance of a cell or gene therapy from the perspective of patients, providers, regulators, payers, and manufacturers. Because different diseases warrant different core data elements, this should be done on a disease-specific basis and should involve the full disease community, ideally on an international scale. An example of such an approach is the development of a core outcome set with clear definitions by the coreHEM multi-stakeholder project for hemophilia.

To ensure that the patient perspective is seamlessly incorporated, it will be useful to build tools and frameworks that allow for the incorporation of patient-reported outcomes (PROs) into registries. The National Institutes of Health has a standardized set of measures for PROs called the Patient-Reported Outcomes Measurement Information System (PROMIS), and patient organizations are exploring the extent to which these measures could be implemented in a cell or gene therapy environment.

Appropriate Resources and Funding

Significant time and resources are required to stand up a long-term follow-up data collection effort. Providers report that the available resources can drop off significantly once the clinical trial phase is complete and a therapy is commercially available. For example, during clinical trials, financial assistance is available for patient travel, as well as funding, training, and support for data entry staff, and registries are pre-established. But once the therapy is on the market, this support diminishes considerably. This issue often translates to increased burden on providers, clinical staff, and data managers who face unclear processes and expectations.

A best practice that emerged among providers is to transition the clinical research coordinator or data manager who has experience with the investigational trial to oversee the post-market data collection effort. This individual would be familiar with the patient population and potential toxicities that would likely be encountered, and this experience would facilitate a smooth transition into the commercial space. The staff involved during the clinical trial phase should also be engaged in how databases are built for the commercial space, especially because the data can often be nuanced. Even if these staff cannot transition themselves, at a minimum, they can help train and share learnings with others involved in the process as needed. The roles and responsibilities of each person involved in the data entry piece of patient care should also be made clear, as minimizing the number of people involved in data collection helps to reduce burden overall.

Finally, some providers are able to secure funding to support data collection, enabling them to make deeper investments in their capabilities and to provide support to staff fulfilling this important role. CureSMA and the American Thrombosis and Hemostasis Network provide grants to their partnering providers to support data collection activities.

IT Issues

Beyond achieving consensus on the specific data elements to collect, automating data transfer between centers and registries will increase efficiency. This process is a significant challenge that many groups have tried to address and that each hospital or care center might deal with differently. In particular, IT teams’ focus on protecting and securing data can result in different thresholds for what they are willing to automate and release to external databases. Patient organizations with experience working with IT departments report that it takes significant back-and-forth to agree upon acceptable ways to release data. As a result, these organizations contract separately with many different centers to ensure that their data transfer practices adhere to the security requirements on the care center’s end. Organizations with experience in doing this would be a useful resource for others in navigating these arrangements.

Table 2: Summary of Recommendations

To operationalize robust data collection systems, stakeholders will need to work together to devise mutually beneficial solutions. The recommendations below are intended to be implemented through collaborative efforts by multiple stakeholder groups, including product developers, providers, payers, and patient organizations.




Coordination

  • Support patient organizations in playing a coordinating role to build a data collection system that could address multiple information needs of a disease community

Incentives

  • Create incentives for product developers to share information gleaned from data collection activities more broadly (i.e., beyond the regulators and any payers with which they have contractual arrangements)

  • Create resources, tools, and communication strategies to help patients understand the value of their data and the role it plays in advancing knowledge that can improve care for all patients

  • Explore ways to offset the costs of data collection for providers

Core Data Elements

  • Develop core data elements on a disease-specific basis, with input from the full disease community, ideally on an international scale

Resources and Funding

  • Ensure the continuation of adequate resourcing and funding of data collection activities in the post-market phase

  • Facilitate the sharing of best practices among providers

IT

  • Develop standards for data transfer practices that balance the need to protect and secure data with the need to drive greater efficiency



Having a strong data collection infrastructure is fundamental to ensuring patient access to cell and gene therapies, as data provide the information needed to manage the uncertainties about the benefits and limitations of the new therapies. Because there are only a handful of cell and gene therapies on the market today, with many more to arrive in the coming years, there is an opportunity to design data collection systems in advance and collectively to address the known challenges being confronted in current data efforts. This effort cannot be accomplished by individual stakeholder groups working independently. Rather, it must be achieved through a coalition of disease organizations, researchers, providers, payers, regulators, and other groups working together, beginning at the clinical trial stage and continuing through approval to the post-market phase. A neutral party, such as FDA or a patient organization, should convene these groups to lay out a plan for data collection and to articulate explicit roles and responsibilities for each stakeholder group.

All stakeholder groups can contribute to a paradigm shift that emphasizes coordination as well as the importance of engaging patients to contribute their data on an ongoing basis. Improving coordination, integrating appropriate incentives, and properly resourcing data collection for the long term are steps toward ensuring a data collection infrastructure will be ready for the cell and gene therapies of the future.

About Cures for Life

In 2019, FasterCures launched the Cures for Life project to elevate the patient voice in the new and rapidly evolving landscape of cell and gene therapies. We conducted interviews and workshops with a variety of stakeholders, with an emphasis on the participation of patient organizations, to identify common challenges to ensuring that patients have access to life-saving cell and gene therapies. Throughout 2020, FasterCures is hosting a series of issue-specific workshops to discuss emerging areas of focus where we can help amplify the patient perspective. We released a report on the topic of data collection for long-term follow-up in July.


Published November 11, 2020