MGH   
CCIB
 
Full-Length Amplicon Sequencing: Data Retrieval

Upon completion of a sequencing run, the data enters our analysis pipeline. Researchers will receive an automated email notification as soon as their data can be accessed through our website.

Please note that your data will be available for only three months after release! We strongly encourage users to download their data as soon as it becomes available. Our data server is a temporary storage site and does not support long-term archiving of sequencing data. All Full-Length Amplicon Sequencing data generated at our core facility is subject to deletion without notice after three months.

Accessing Your Data:

To access our data server, please log in to your account and click the My Results button. You have the following options for downloading your data files:

  • Download an uncompressed file (*.seq): a single concatenated text file (in FASTA format) of all nucleotide sequences generated for the corresponding order.

  • Download a compressed file (*.sit) containing, for each sample of the corresponding order, one FASTA format sequence file (*.seq), three Excel files, and one FASTQ format file (containing the raw nanopore data). A detailed explanation of the delivered files is provided below.

Decompression Software:
Compressed *.sit files are actually .zip files. To open them on your computer, you can use StuffIt Expander (free) or WinZip (trial version). To download either program, please select the appropriate link below. On Linux, you can also use the unzip command.

StuffIt Expander
WinZip for Windows
WinZip for Mac
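
Because *.sit files are .zip files under a different extension, they can also be unpacked with a short script. The following is a minimal sketch using Python's standard zipfile module; the function name extract_sit and the example file names are hypothetical and chosen only for illustration.

```python
import zipfile
from pathlib import Path


def extract_sit(archive_path, dest_dir):
    """Extract a *.sit archive (a ZIP file under a different extension).

    Returns the sorted list of file names stored in the archive.
    """
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    # Fail early with a clear message if the file is not a ZIP archive.
    if not zipfile.is_zipfile(archive_path):
        raise ValueError(f"{archive_path} is not a ZIP archive")
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())
```

For example, extract_sit("my_order.sit", "my_order") would unpack a downloaded archive named my_order.sit (a placeholder name) into a folder called my_order and return the names of the extracted files.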


Please note:
*.seq files are plain text files containing your sequence in FASTA format. They can be opened with any software capable of viewing plain text or FASTA format files (for example, Notepad or Word). You may also change the file extension from *.seq to *.txt.
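
Since *.seq files are plain FASTA text, they can also be read programmatically. The sketch below is one simple way to do this in Python without additional libraries; the function name read_fasta is hypothetical, and the code assumes that each header line starts with ">" and that header names within a file are unique (a repeated header would overwrite the earlier record).

```python
def read_fasta(path):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            if line.startswith(">"):
                header = line[1:]
                records[header] = []
            elif header is not None:
                # Sequence lines may be wrapped; collect and join them later.
                records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}
```

Applied to the concatenated *.seq file for an order, this returns one entry per consensus sequence, so the length of each sequence can be checked with len().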

Explanation of Delivered Data Files

Five data files will be generated per sample. These files are prefixed with the Tube_ID provided in the sample submission form.

  • FASTA File (.seq) - Final consensus sequence(s) generated.

  • FASTQ File - Raw nanopore data for each sample.

  • Coverage.xlsx - Provides per-base depth and coverage metrics calculated from reads mapped to the final consensus sequence.

    This file includes a Sequence Coverage Plot showing the total number of reads supporting each base position across the amplicon, allowing assessment of coverage uniformity, identification of low-coverage regions, and confirmation of full-length amplicon representation.

  • Read_Length_Distribution.xlsx - This file includes a Read Length Plot showing the distribution of sequenced read lengths generated for each sample.

    A dominant peak at the expected amplicon size indicates that most of the reads span the full-length amplicon.

  • Ref.coverage.xlsx - Provides depth and coverage information for each base of the sample, calculated by mapping the sequenced reads to the reference amplicon sequence.

    A Sequence Coverage Plot is generated to visualize the total number of reads supporting each base position across the amplicon, which is useful for assessing the sequence quality at that position. A descriptive example of a ref.coverage file can be downloaded here.
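
A read length distribution like the one in the Read Length Plot can also be approximated directly from the delivered FASTQ file. The sketch below is an illustration only, not the method used by our analysis pipeline. It assumes an uncompressed FASTQ file with standard four-line records; the function names and the bin_size parameter are hypothetical.

```python
from collections import Counter


def read_lengths(fastq_path):
    """Yield the length of each read, assuming four-line FASTQ records."""
    with open(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            # Lines 2, 6, 10, ... (index 1 modulo 4) hold the sequences.
            if line_number % 4 == 1:
                yield len(line.strip())


def length_histogram(fastq_path, bin_size=100):
    """Count reads per length bin; each key is the lower bound of its bin."""
    return Counter(
        (length // bin_size) * bin_size for length in read_lengths(fastq_path)
    )
```

With bin_size=100, a key of 1500 counts reads of 1500 to 1599 bases. The bin with the highest count should sit at the expected amplicon size if most reads span the full-length amplicon.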