Research Design and Biostatistics
- Overview and Vision
- Organization
- Scope of Support
- Research Areas
- Biostatistical Tools
- Research Design Tools
- Integration with other CTSA Activities
Overview and Vision:
The Rockefeller University Hospital Department for Research Design and Biostatistics supports the Center's mission to provide the safest and highest quality research for advancing scientific knowledge to improve the care of future patients around the world. This mission is achieved through collaboration with both investigators and other Clinical Research Resources, as outlined in "New Statistical Paradigms Leading to Web-Based Tools for Clinical/Translational Science" presented at the May 23, 2005 NIH Roadmap meeting "Enhancing the Discipline of Clinical and Translational Research."
Organization:
The Department of Research Design and Biostatistics is headed by the Center's Biostatistician, Dr. Knut M. Wittkowski, who acts as a member of the Advisory Committee for Clinical and Translational Science (ACCTS), reviews protocols for the IRB, and assists investigators with experimental design and statistical analyses. Support is provided at four levels: (1) biostatistics consultation, (2) application development for individual investigators, (3) project assistance with data entry, graphics, and analyses, and (4) data management. The Center's Research Design and Biostatistics core includes a data manager and an application programmer (biostatistical support for Center investigators) as well as a project manager and a statistical programmer (development of knowledge-based Web tools). The Biomedical Informatics Core provides additional support, including (1) database design and data management, (2) data safety and security, (3) Web-based education modules, and (4) installation and support of statistical analysis software.
Scope of Support:
The services offered by the Research Design and Biostatistics core include:
- formalizing the research hypothesis,
- assessing the validity of published results,
- identifying the specific aims to be addressed,
- selecting primary and secondary outcome measures,
- evaluating different strategies to address multiplicity issues,
- discussing randomization and stratification strategies,
- fitting adequate statistical models to research questions,
- installing and supporting statistical analysis software,
- balancing sample or selection size vs. power,
- preparing submissions to funding agencies and the ACCTS/IRB,
- designing databases to facilitate subsequent analyses,
- monitoring data safety and security,
- implementing statistical analysis strategies,
- visualizing multivariate data, and
- interpreting and publishing results.
Current research (see below) aims at facilitating these support functions.
Research Areas:
A novel statistical approach to multivariate data based on u-statistics (µStat) differs from traditional approaches in that it is 'intrinsically valid', so that time-consuming steps for empirical validation can be skipped. Moreover, it allows one not only to search for correlated variables along individual pathways, but also to identify collaborating (parallel) pathways. These methods have been used in:
- genetics (haplotypes, epistatic interaction),
- genomics (expression pathways),
- safety (adverse event profiles),
- efficacy (clinical responses, quality of life), and
- overall benefit (side effects and effectiveness).
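As a minimal sketch of the idea behind scoring multivariate ordinal data with u-statistics (not the µStat implementation itself), the following Python code ranks subjects under a componentwise partial ordering: one subject is taken to be "better" than another if it is at least as high on every variable and strictly higher on at least one. Each subject's score is the number of subjects it dominates minus the number that dominate it. The function names `dominates` and `u_scores`, the assumption that higher values are better, and the example data are illustrative and not taken from the tools described here.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as high as b on every variable and higher on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def u_scores(profiles: List[Sequence[float]]) -> List[int]:
    """Score each profile as (# profiles it dominates) - (# profiles dominating it).

    Pairs that cannot be ordered (higher on one variable, lower on another)
    contribute nothing to either count.
    """
    scores = []
    for a in profiles:
        below = sum(dominates(a, b) for b in profiles)
        above = sum(dominates(b, a) for b in profiles)
        scores.append(below - above)
    return scores


# Hypothetical data: four subjects, two ordinal outcomes (higher = better).
subjects = [(1, 1), (2, 3), (3, 2), (3, 3)]
print(u_scores(subjects))  # [-3, 0, 0, 3]
```

Subjects (2, 3) and (3, 2) cannot be ordered against each other, so they receive the same score without any weighting of the two variables being assumed.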
Currently, research focuses on
- Phenotyping
- Microarray Quality control ("Harshlighting")
- Adverse Event profiling
- Longitudinal data
- Predictive Pathology / Toxicology (U.S. Patent 7,072,794)

To facilitate use of statistical methods in multivariate settings, we are developing knowledge-based user interfaces based on an ontology for describing not only the variables, but also the conditions under which data were collected. One possible ontology that can be used for this purpose was developed by Dr. Wittkowski as part of the PANOS project.
- knowledge-based support for clinical trials management
- knowledge-based support for statistical analyses

Biostatistical Tools:
Based on the emerging results from these research projects, several tools are being developed and integrated. The tools made available for traditional statistical analyses include:
- S-Plus, under a campus license for power calculations and analyses using traditional statistical methods.
- Specialized S-Plus components for genomics (S+ArrayAnalyzer), data mining (Insightful Miner), and text analysis (InFact).
In addition, several tools implementing recently developed statistical methods based on u-statistics for multivariate ordinal data (µStat) are currently available to assist clinical investigators in addressing the novel challenges posed by haplotypes, epistatic interaction, and genomic pathways. The µStat tools received the 2005 Insightful Impact Award and include:
- R / S-Plus scripts for a replacement of the transmission/disequilibrium test ("TDT") (Wittkowski, Liu 2002 Human Heredity)
- The R / Bioconductor package "Harshlight" for improving sensitivity of Affymetrix GeneChip Microarrays (Suarez-Farinas et al. 2005 BMC Bioinformatics)
- Spreadsheets for various partial orderings (Wittkowski et al. 2003 Computing Science and Statistics)
- Access to a grid server (hgrid) for the analysis of large data sets, e.g., in genetic and genomic screening (Wittkowski et al. 2004 Statistics in Medicine)
Research Design Tools:
WISDOM (Web-based Interface for Statistical Design, Organization, and Management) aims at providing investigators with integrated design, compliance, and biostatistics support during all stages of research:
- Design: WISDOM is designed to assist investigators before the onset of patient-oriented study management, e.g., in protocol writing, power calculations, and submission to clinicaltrials.gov. Once the study is funded and approved, all information relevant to the conduct of the study is handed to the Biomedical Informatics component. Transforming multi-factorial designs into sequential lists of visits allows commercially available study management systems to be used.
- Compliance: At predefined intervals, or upon request by a Data and Safety Monitoring Board (DSMB), WISDOM captures data from the study management system and alerts the IRB/DSMB of potential risks, if necessary. Applying the µStat approach to AE profiles increases sensitivity and reduces the risk of false positive results compared to univariate analyses.
- Biostatistics: For analyses, WISDOM restructures the data to reflect all design characteristics to be considered for the primary objective(s), performs the primary analyses, and assists with additional 'exploratory' analyses.
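The transformation of a multi-factorial design into a sequential list of visits, mentioned under "Design", can be illustrated with a short sketch. It assumes a fully crossed design in which the first factor varies slowest; the function name `design_to_visits`, the factor names, and the levels are hypothetical and do not describe WISDOM's actual data model.

```python
from itertools import product
from typing import Dict, List


def design_to_visits(factors: Dict[str, List[str]]) -> List[Dict[str, object]]:
    """Expand a crossed design into a numbered, sequential list of visits.

    Factors are crossed in the order given; the first factor varies slowest.
    """
    names = list(factors)
    visits: List[Dict[str, object]] = []
    for number, levels in enumerate(product(*factors.values()), start=1):
        visit: Dict[str, object] = {"visit": number}
        visit.update(zip(names, levels))
        visits.append(visit)
    return visits


# Hypothetical two-period crossover with three assessments per period.
design = {
    "period": ["1", "2"],
    "assessment": ["baseline", "week 2", "week 4"],
}
for v in design_to_visits(design):
    print(v)
```

A flat list of this kind is the form a visit-based study management system can typically consume, while the factor labels kept on each visit allow the original design to be recovered for analysis.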
WISDOM is based on an innovative metadata model that allows acquisition of information not only about the process of the study, but also about its background and purpose. WISDOM exerts its support by hosting a comprehensive knowledge base on the design and conduct of studies and by exchanging data and knowledge between various software packages through standard interfaces, thereby facilitating:
- database creation, by automatically setting up a database;
- sample size calculations, by allowing power to be simulated for complex designs;
- protocol writing, e.g., by generating pre-populated protocol templates for NIH's ProtoType system;
- review, by providing boards with standardized study descriptions;
- registry, by submitting XML files to authorities (clinicaltrials.gov);
- security and safety, by facilitating the use of centralized servers, thereby reducing the need for decentralized storage, backup, and auditing;
- study management, by supporting 'pathways' providing nurses with workflow information;
- data entry, by generating (paper or Web) case report form templates;
- monitoring, by allowing software to automatically screen for adverse event profiles and alerting DSMBs based on preset criteria;
- inspection, by giving investigators real-time information while protecting them from becoming accidentally unblinded;
- review, by providing the IRB with progress information;
- analysis, by storing primary and secondary objectives and automatically initiating the appropriate analyses according to the protocol; and
- data sharing, by providing standardized dictionaries, allowing data to be related across studies.
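Simulating power, as mentioned for sample size calculations above, can be sketched in a few lines. The example below is an assumption-laden illustration, not the method used by WISDOM: it draws two normally distributed groups, applies a large-sample two-sided z-type test to the difference in means, and reports the proportion of simulated studies in which the null hypothesis is rejected. The function name `simulated_power` and all parameter values are hypothetical. For a complex design, only the data-generating step and the test statistic would need to be replaced.

```python
import random
from statistics import NormalDist, mean, variance


def simulated_power(n_per_group: int, effect: float, sd: float = 1.0,
                    alpha: float = 0.05, n_sim: int = 2000, seed: int = 1) -> float:
    """Estimate the power of a two-sided, two-sample z-type test by simulation."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(effect, sd) for _ in range(n_per_group)]
        # Standard error of the difference in means (equal group sizes).
        se = ((variance(control) + variance(treated)) / n_per_group) ** 0.5
        z = (mean(treated) - mean(control)) / se
        rejections += abs(z) > critical
    return rejections / n_sim


# Estimated power for a hypothetical effect of half a standard deviation.
for n in (20, 40, 80):
    print(n, simulated_power(n, effect=0.5))
```

Because the normal critical value is used in place of a t quantile, the estimate is slightly optimistic for small groups; the sketch is meant to show the simulation loop rather than to replace an exact calculation.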
Integration with other CTSA Activities:
Together, these integrated innovative approaches in method development, knowledge acquisition, teaching, programming, and data management enable several CTSA activities (outlined in the Sections indicated):
- new phenotyping methods that are more objective and quantifiable
- biostatistical methods for longitudinal studies
- predictive toxicology in human populations
- support in utilizing preliminary data for submission of a research grant application
- improved clinical design, biostatistics, clinical research ethics, informatics, and regulatory pathways
- new technologies for pilot and collaborative translational and clinical studies
- Biomedical Informatics interoperability, security, workflow, usability, and standards
- integrated tools for protocol and informed consent authoring
- integrated tools for adverse event (AE) reporting, safety, and regulatory management and compliance
- statistical methods for data from the application of advanced technologies
- shorter training in clinical research design, biostatistics, and biomedical informatics
References:
WITTKOWSKI, KM; LIU, X (2002) A statistically valid alternative to the TDT. Hum Hered 54: 157-64. (22513746)
WITTKOWSKI, KM; LEE, E; et al. (2004) Combining several ordinal measures in clinical studies. Stat Med 23: 1579-92
SUAREZ-FARINAS, M; PELLEGRINO, M; et al. (2005) Harshlight: a "corrective make-up" program for microarray chips. BMC Bioinformatics 6: 294
