Chapter 1 Overview

1.1 Dataset

Extraction

The data contained in Enroll-HD PDS6 were extracted from the Enroll-HD electronic data capture (EDC) database on November 4, 2022, at 10:00 UTC.


Data sources

The PDS6 dataset contains data exclusively from Enroll-HD participants, collected from three sources: the Enroll-HD study, the REGISTRY study, and clinical data collected at ad hoc visits outside of these two studies.

Enroll-HD is an observational cohort study and global clinical research platform designed to facilitate Huntington’s disease (HD) clinical research. It includes participants from North America, Europe, Australasia, and Latin America. The study started in 2012, remains active, and is still recruiting.

REGISTRY is an observational cohort study of HD conducted in Europe. The study started in 2004 and concluded in 2015. As Enroll-HD began, REGISTRY sites and participants began to transition into Enroll-HD. Enroll-HD dataset releases include individuals who initially participated in REGISTRY, subsequently enrolled in Enroll-HD, and consented to the migration of their REGISTRY data into the Enroll-HD dataset. REGISTRY data are available for a subset of Enroll-HD participants.

Clinical data from additional sources (Ad Hoc data) are available for a subset of Enroll-HD participants. These data were collected at routine clinical visits outside of the Enroll-HD and REGISTRY studies, and comprise HD assessment data (e.g., UHDRS Motor). The collection dates of these data typically pre-date a participant’s enrolment into REGISTRY or Enroll-HD.

Study-specific protocols, annotated eCRFs, and data collection guidelines are housed here.


Participant inclusion

To be included in the PDS6 release, a participant’s data had to meet several requirements. Figure 1.1 shows the number of participants whose data met each of the predefined inclusion requirements and how the final sample size of PDS6 was determined.

Figure 1.1: Participant flow chart for inclusion in PDS6.

Due to data exclusion requirements, not all participants enrolled in Enroll-HD at the time of the PDS6 data cut are included in PDS6. Similarly, not all participants included in PDS5 are included in PDS6: data from 200 participants who were included in PDS5 are not included in the current PDS release. Enroll-HD is an active, longitudinal study, so a participant eligible for inclusion in one release may be ineligible for the next (e.g., because the participant's data have been quarantined). Data for PDS5 participants not included in PDS6 may be available through a specified dataset (SPS) request.

1.2 Sample Size

PDS6 contains data on 25,550 Enroll-HD participants. Sample size by PDS release is presented in Figure 1.2.

Figure 1.2: Enroll-HD sample size by PDS release.

1.3 Visits

PDS6 contains data from 95,040 visits (baseline and follow-up visits only; all sources). Of these, 78,730 were Enroll-HD visits. The remainder are from REGISTRY (N = 15,292) and Ad Hoc sources (N = 1,018). A breakdown of visits by data source is provided in Table 1.1. The number of Enroll-HD visits by PDS release is illustrated in Figure 1.3.

Table 1.1: Number of visits in PDS6 by constituent data source. Participant number indicates the number of Enroll-HD participants with visit data available for the indicated data source (maximum N = 25,550).
Data source    Participants    Visits
Enroll-HD      25,550          78,730
Registry 3      4,337          10,114
Registry 2      2,153           5,178
Ad Hoc            316           1,018
Total                          95,040
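
The visit totals in Table 1.1 can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: the dictionary and variable names are hypothetical and are not part of the PDS files; the counts are copied from the table.

```python
# Visit counts copied from Table 1.1 (names here are illustrative only).
visits_by_source = {
    "Enroll-HD": 78730,
    "Registry 3": 10114,
    "Registry 2": 5178,
    "Ad Hoc": 1018,
}

# REGISTRY visits are the sum of the two Registry rows.
registry_total = visits_by_source["Registry 3"] + visits_by_source["Registry 2"]

# The overall total is the sum over all four data sources.
total = sum(visits_by_source.values())

print(registry_total)  # 15292
print(total)           # 95040
```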

Figure 1.3: Enroll-HD visits only (baseline and follow-up only) by PDS release.

Considering baseline and follow-up visits from Enroll-HD only, the total number of visits per participant in PDS6 ranges from 1 to 11. Figure 1.4 shows participant counts by maximum number of Enroll-HD visits. Each participant is represented once, in the bar corresponding to their maximum number of visits. Figure 1.5 shows maximum participant counts for a specific number of visits. This plot is cumulative: its purpose is to show the largest available sample size for a given number of visits. For example, the participant with 11 visits is represented in visit bars 1 through 11, the participants with 10 visits are represented in visit bars 1 through 10, and so on.
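
The difference between the two plots can be made concrete with a small example. The following Python sketch uses made-up participant IDs and a toy visit list; it does not use real PDS data or variable names, and is only one possible way to derive the two sets of counts.

```python
from collections import Counter

# Hypothetical toy data: one entry per baseline/follow-up visit, labelled
# with a made-up participant ID. Real PDS column names are not assumed.
visits = ["P1", "P1", "P1", "P2", "P2", "P3"]

# Visits per participant: P1 has 3, P2 has 2, P3 has 1.
visits_per_participant = Counter(visits)

# Figure 1.4 style: each participant is counted once, in the bar for
# their total number of visits.
by_max_visits = Counter(visits_per_participant.values())

# Figure 1.5 style (cumulative): a participant with n visits is counted
# in every bar from 1 through n.
max_n = max(by_max_visits)
at_least = {
    k: sum(count for n, count in by_max_visits.items() if n >= k)
    for k in range(1, max_n + 1)
}

print(dict(by_max_visits))  # {3: 1, 2: 1, 1: 1}
print(at_least)             # {1: 3, 2: 2, 3: 1}
```

In the cumulative counts, the first bar always equals the full sample size, because every included participant has at least one visit.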

Figure 1.4: Participant counts by maximum number of Enroll-HD visits (baseline and follow-up visits only, unscheduled visits and phone contacts excluded). Full sample represented (N = 25,550, Missing N = 0).


Figure 1.5: Maximum participant counts for a specific number of Enroll-HD visits (baseline and follow-up visits only; unscheduled visits and phone contacts excluded). Cumulative plot. Full sample represented (N = 25,550; Missing N = 0).

Considering baseline and follow-up visits from all data sources (Enroll-HD, REGISTRY, and Ad Hoc visits), the total number of visits per participant ranges from 1 to more than 20. Maximum participant counts by visit number are presented in Figure 1.6.

Figure 1.6: Maximum participant counts for a specific number of visits; Enroll-HD, REGISTRY, and Ad Hoc (baseline and follow-up visits only; unscheduled visits and phone contacts excluded). Cumulative plot. Full sample represented (N = 25,550; Missing N = 0).

1.4 Sample Characteristics

The PDS6 sample is characterized below with respect to participant category, sociodemographic variables, and clinical characteristics (Figures 1.7 to 1.19).

Figure 1.7: Participant category at baseline Enroll-HD visit (hdcat_0). Full sample represented (N = 25,550; Missing N = 0)


Figure 1.8: Participant category at latest Enroll-HD visit (hdcat_l). Full sample represented (N = 25,550; Missing N = 0)


Figure 1.9: Geographical region (region). Full sample represented (N = 25,550; Missing N = 0)


Figure 1.10: Sex (sex). Full sample represented (N = 25,550; Missing N = 0)


Figure 1.11: ISCED (isced) at baseline Enroll-HD visit. Full sample represented (N = 25,435; Missing N = 115)


Figure 1.12: HD integrated staging system (HD-ISS) imputed stage (hdiss_stage_imp) at baseline Enroll-HD visit. Only individuals with a research CAG length ≥ 40 are represented (N = 17,594).