Data management and sharing plans
The Proteomics Resource Center (PRC) houses a variety of liquid chromatography (LC) systems and mass spectrometers (MS). Available MS systems include Q-Exactive Plus, Q-Exactive HF, Lumos Fusion, Lumos Ascend, and IQ-X Orbitrap mass spectrometers.
Various software packages and search algorithms are available for data analysis, including Proteome Discoverer, Mascot, Sequest, MaxQuant, Byonic, PEAKS, Spectronaut, Skyline, Compound Discoverer, LipidSearch, and Perseus.
Samples submitted for analysis at the PRC are logged and labeled with a tracking number. All primary data are labeled with the same tracking number. Internal users can track the status of submitted requests at https://plims.rockefeller.edu/status/. Experimental details related to sample preparation, when available, are stored in the logging system.
The PRC provides data storage via Network-Attached Storage (NAS) systems with a capacity of 37 TB. An additional 100 TB (RAID 5) is available locally at the Center. Each primary data file generated by the LC-MS systems (‘.RAW’) contains information related to the acquisition. Primary data, along with generated PowerPoint, Excel, .txt, and similar result files, are stored for a minimum of 5 years. Proteome Discoverer and Compound Discoverer processing files (‘.msf’, ‘.cdResult’, ‘.pdResult’, and similar) are stored for only approximately 3 years, because these files can be regenerated by reanalyzing the primary data.
For approximately 3 years after data have been generated, the local storage serves as a backup of the data stored on the NAS. After this period, primary data, along with PowerPoint, Excel, .txt, and similar files, are stored locally as a single copy.
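The retention rules above can be summarized in a short Python sketch. It is an illustration only and not a tool used by the Center: the function names, the returned keys, and the example file names are hypothetical, and "approximately 3 years" is treated as exactly 3 years for simplicity.

```python
import os
from datetime import date

# Periods taken from the text above. Treating "approximately 3 years"
# as exactly 3 years is a simplifying assumption.
PRIMARY_MIN_YEARS = 5   # primary data and result files (minimum)
PROCESSING_YEARS = 3    # Proteome/Compound Discoverer processing files
BACKUP_YEARS = 3        # period during which local storage mirrors the NAS

# Extensions named in the text, lower-cased for comparison.
PROCESSING_EXTENSIONS = {".msf", ".cdresult", ".pdresult"}


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 maps to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def storage_status(filename: str, generated: date, today: date) -> dict:
    """Classify a file and report where it stands under the stated policy."""
    ext = os.path.splitext(filename)[1].lower()
    is_processing = ext in PROCESSING_EXTENSIONS
    keep_years = PROCESSING_YEARS if is_processing else PRIMARY_MIN_YEARS
    return {
        "category": "processing" if is_processing else "primary_or_result",
        # True while the file is still inside its stated retention period.
        "within_guaranteed_retention": today < add_years(generated, keep_years),
        # True while both the NAS copy and the local backup are expected.
        "two_copies": today < add_years(generated, BACKUP_YEARS),
    }


if __name__ == "__main__":
    # Hypothetical file name and dates, for illustration only.
    print(storage_status("MS000000_example.raw", date(2020, 1, 15), date(2024, 1, 15)))
```

In this example, a primary data file generated four years earlier is still within the 5-year minimum but is expected to exist as a single local copy.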
Making primary data publicly available is required by many journals as well as by the NIH Data Management and Sharing Policy. Data generated by the Center (see also The Rockefeller University’s guidelines for acknowledgment) should be made available to reviewers upon submission of a manuscript and made public after the manuscript has been accepted. A commonly used repository is the PRoteomics IDEntifications (PRIDE) database. A PRIDE account is needed. We recommend uploading with the ‘Partial Submission’ option because it does not require tedious conversion of primary data. A PRIDE upload requires the following minimum information:
- Project Title (title of manuscript/report).
- Keywords.
- Project Description (abstract).
- Sample Processing Protocol for the steps performed before the sample preparation carried out by the Center (often found in the email reports).
For the Center to make the data available for upload, please provide us with relevant project identifier(s) (MSXXXXXX).
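Before contacting the Center, it may help to confirm that every item above is in hand. The following Python sketch shows one way to do that. The field names are hypothetical labels chosen for this example; they are not PRIDE's own field names, and the code does not interact with PRIDE in any way.

```python
# Items listed in the text above, plus the project identifier(s) requested
# by the Center. The keys are hypothetical labels, not PRIDE field names.
REQUIRED_FIELDS = (
    "project_title",
    "keywords",
    "project_description",
    "sample_processing_protocol",
    "project_identifiers",
)


def missing_fields(submission: dict) -> list:
    """Return the required fields that are absent or empty."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = submission.get(field)
        if isinstance(value, str):
            value = value.strip()  # whitespace-only text counts as empty
        if not value:
            missing.append(field)
    return missing


if __name__ == "__main__":
    # Placeholder values for illustration only.
    draft = {
        "project_title": "Title of manuscript",
        "keywords": ["proteomics"],
        "project_description": "",
        "sample_processing_protocol": "Steps performed before submission.",
        "project_identifiers": [],
    }
    print(missing_fields(draft))
```

An empty list means all items are present; otherwise the returned names indicate what still needs to be collected.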
It can take a few business days after upload to receive a reviewer-only link. Please provide this link to the reviewers of your manuscript along with a statement on the availability of the data, which the Center will provide. Example: ‘Data have been uploaded to PRIDE. Project accession: [identification number], Username: reviewer_X@ebi.ac.uk, Password: [password].’ Upon manuscript acceptance, PRIDE should be contacted to make the dataset public.
Additional notes regarding PRIDE upload can be found here.