CloudSeq Platform

CloudSeq is an innovative platform that makes Next-Generation Sequencing (NGS) projects easy to run and manage without clients needing to purchase expensive sequencing instruments or computer hardware. With absolute transparency, security and confidentiality, CloudSeq strives to be your trusted research collaborator for all your sequencing projects.

Initially, we offer sequencing technologies from both Illumina and Pacific Biosciences. A third option using Ion Torrent from Life Technologies will be made available shortly. All our services come with comprehensive bioinformatics support that includes data management and data analysis.

We offer a one-stop solution for all NGS services in Singapore and Asia.

How does CloudSeq differ from other NGS service providers?

  • CloudSeq is a novel platform that combines the affordability of current NGS technologies with cloud computing.
  • We offer researchers seeking NGS services high-throughput sequencing without the need to purchase and maintain costly sequencers to generate voluminous sequence data, or data centres to manage and analyse vast and disparate data.
  • All sequencing platforms and data analysis pipelines available through CloudSeq are offered on a pay-per-use, monthly subscription or per-project basis, and in modular form, to suit every budget and requirement.
  • CloudSeq will take care of all the laboratory and computing resources necessary to process your samples, and will deliver your data securely over the internet in a timely manner.
  • CloudSeq’s web client will enable you to follow, in real time, the progress of your project, the methodology and algorithms being used, and the parameter details.

Key Advantages of CloudSeq

  • Rapid implementation of various sequencing-related projects.
  • Choose from a wide repository of pre-configured packages of sequencing technologies and data analysis pipelines or customise your own solution.
  • Access to rapidly evolving sequencing technologies and the latest algorithms for sequence assembly, annotation and analysis.
  • No need to set up an expensive sequencing laboratory with supporting bioinformatics, or to worry about equipment maintenance and personnel training.
  • You don’t have to be tied down to any one sequencing technology just because you bought that instrument. Utilise what is best for your current project and change when your needs change or when there is a better technology.
  • Secure, transparent and confidential, with no conflict of interest at any level.
  • Track project progress from anywhere on the web, using a robust XML and Java/J2EE environment for data management.
  • Cloud-based, customisable workflows for sequence assembly, annotation and analysis using the latest and most relevant algorithms for your project.
  • User-friendly interface built with Web 2.0 technology/.NET.
  • Cost-effective and affordable. We offer pricing that is highly competitive globally while keeping within the budgets of regional biotech industries and academia.
  • All our processors are compliant with several international standards.

Some Common NGS Applications

Chromatin Immunoprecipitation Sequencing

Chromatin immunoprecipitation sequencing (ChIP-seq) is a method that combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify and map genomic binding sites for DNA-binding proteins. ChIP-seq is used primarily to determine how transcription factors and other chromatin-associated proteins influence phenotype-affecting mechanisms.

CLIP-Seq (cross-linking immunoprecipitation sequencing) is a related method used to screen for RNA sequences that interact with RNA-binding proteins. CLIP-Seq is also known as HITS-CLIP (high-throughput sequencing of RNA isolated by cross-linking immunoprecipitation). RIP-Seq (RNA immunoprecipitation sequencing) is a closely related approach that immunoprecipitates RNA-protein complexes, typically without the UV cross-linking step.

De novo Sequencing

De novo sequencing refers to the sequencing and construction of a previously unknown genome (or transcriptome). Assembling the short fragments, or reads, from this type of sequencing data is complicated because there is no reference sequence to which the reads can be mapped. Long, overlapping reads are therefore essential for robust sequence assembly.

Building the assembly from as few, and as large, contigs as possible greatly eases reconstruction of the genome. Such contigs are best achieved with sequencing technologies that generate long reads, as well as those that produce paired-end and mate-pair reads. High-coverage read data increases the amount of overlapping sequence and thus increases confidence in the final sequence assembly.
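As a minimal sketch of why overlapping reads matter for assembly, the Python snippet below joins two reads when the end of one matches the start of the other. The function names `overlap_length` and `merge_reads` and the `min_overlap` threshold are illustrative assumptions, not part of any CloudSeq pipeline; real assemblers also have to handle sequencing errors, reverse complements and repeats.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Return the length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when no overlap of at least `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first and shrink until one matches.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3):
    """Merge two reads into one longer sequence, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep `left` whole and append only the non-overlapping tail of `right`.
    return left + right[size:]


if __name__ == "__main__":
    # The two hypothetical reads share the four bases "GTAC".
    print(merge_reads("ATGCGTAC", "GTACCTGA"))  # ATGCGTACCTGA
```

Longer reads make long, unambiguous overlaps more likely, which is why the threshold in a sketch like this can be raised as read length grows.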

Current NGS platforms optimise for either long reads or high coverage with shorter, paired-end reads. Long-read platforms tend to provide relatively low coverage of an uncharacterised genome, while short reads alone are insufficient because they cannot span long blocks of repetitive sequence. The use of multiple sequencing technologies is therefore essential for de novo sequencing projects: the longer reads provide a scaffold to which the higher-coverage shorter reads can be mapped.

Methylation Analysis

NGS allows the study of the entire methylome instead of just a few genes or small regions within a genome. Current methods for monitoring the methylation status of a genome rely either on bisulfite conversion or on some form of methylated DNA enrichment. In general, methylation sequencing applications are best suited to platforms that generate a large amount of sequence per run.
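The principle behind bisulfite-based methods can be sketched in a few lines of Python. Bisulfite treatment converts unmethylated cytosines so that they are read as thymine, while methylated cytosines remain cytosine; comparing a converted read with the reference then reveals the methylation status. The function names and the toy sequence below are illustrative assumptions, and the sketch ignores strand handling, incomplete conversion and alignment.

```python
from typing import Dict, Set


def bisulfite_convert(sequence: str, methylated_positions: Set[int]) -> str:
    """Simulate bisulfite treatment: unmethylated C is read as T, methylated C stays C."""
    converted = []
    for index, base in enumerate(sequence):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference: str, read: str) -> Dict[int, bool]:
    """For every C in the reference, report True if the read still shows C (methylated)."""
    calls = {}
    for index, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base == "C":
            calls[index] = read_base == "C"
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"  # hypothetical reference fragment
    read = bisulfite_convert(reference, methylated_positions={1, 5})
    print(read)                               # ACGTTCGA
    print(call_methylation(reference, read))  # {1: True, 4: False, 5: True}
```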

Whole genome bisulfite sequencing (WGBS) provides the most complete picture of the methylome at this time but costs much more than standard whole genome sequencing.

Approaches that reduce this cost include:

Reduced representation bisulfite sequencing (RRBS) uses restriction enzymes and fragment size selection to reduce the overall complexity of the genome while enriching for regions of high CpG density.

Methylated DNA immunoprecipitation sequencing (MeDIP-seq) selects for methylated DNA before the sequencing step. As in ChIP-seq, this is done by first immunoprecipitating methylated DNA with an antibody.

Methyl-CpG immunoprecipitation (MCIp) is a similar method that uses a methyl-CpG binding domain (MBD) protein to isolate the methylated regions of the genome.

Transcriptome Sequencing

Transcriptome sequencing (RNA-seq), also called whole transcriptome shotgun sequencing (WTSS), refers to the use of NGS technology to sequence cDNA for a complete RNA profile. Unlike array-based platforms, no prior knowledge of the transcript sequence is necessary, so RNA-seq can reveal information that array-based studies may have missed. It can be used for discovery applications (rare genes, splice junctions, gene fusions), for studying RNA editing and allele-specific expression, and for novel or poorly studied organisms for which there are no good standard microarrays.

mRNA-seq targets all polyadenylated mRNA transcripts, that is, the coding portion of the transcriptome. It offers deep coverage within the transcriptome to seek out new genes that may have been undetectable due to their low level of expression. The increased depth and reduced cost of NGS (versus arrays) also mean that gene expression can now be profiled while differentiating between isoforms of the same gene via paired-end reads. This same depth of coverage also allows novel microRNA (miRNA) gene discovery and expression profiling.

Whole Genome Resequencing

Whole genome resequencing, especially with human or microbial samples, is often used to determine the genomic variations of a sample in relation to a common reference sequence. The generated sequences are aligned and mapped onto a known reference sequence. The aligned genome is then mined for single-nucleotide polymorphisms (SNPs) and copy number variations (CNVs), as well as structural variations such as insertions, deletions, inversions and translocations.
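To illustrate how an aligned genome can be mined for SNPs, the sketch below piles up the bases of already-aligned reads at each reference position and reports positions where a well-supported base differs from the reference. The function name `call_snps`, the `(start, read)` input format and the thresholds `min_depth` and `min_fraction` are assumptions made for this example; production variant callers also model base quality, mapping quality, indels and diploid genotypes.

```python
from collections import Counter, defaultdict
from typing import List, Tuple


def call_snps(
    reference: str,
    aligned_reads: List[Tuple[int, str]],
    min_depth: int = 3,
    min_fraction: float = 0.8,
) -> List[Tuple[int, str, str]]:
    """Return (position, reference_base, observed_base) for each candidate SNP.

    `aligned_reads` holds (start, sequence) pairs, where `start` is the 0-based
    reference position of the first base of an ungapped alignment.
    """
    pileup = defaultdict(Counter)
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pileup[start + offset][base] += 1

    snps = []
    for position in sorted(pileup):
        counts = pileup[position]
        depth = sum(counts.values())
        base, support = counts.most_common(1)[0]
        # Report only well-covered positions where the dominant base is not the reference base.
        if depth >= min_depth and base != reference[position] and support / depth >= min_fraction:
            snps.append((position, reference[position], base))
    return snps


if __name__ == "__main__":
    reference = "ACGTACGTAC"  # hypothetical reference
    reads = [(0, "ACGTTC"), (1, "CGTTCG"), (2, "GTTCGT"), (3, "TTCGTA")]
    print(call_snps(reference, reads))  # [(4, 'A', 'T')]
```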

Resequencing studies benefit primarily from generating as much sequence data as possible at the lowest possible cost. While longer reads can be beneficial, the total sequence output (in terms of the number of bases covered) matters more because a reference genome is already known.
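The relationship between sequence output and coverage is simple arithmetic: mean coverage is the number of reads multiplied by the read length, divided by the genome size. The helper names and the example figures below are hypothetical and only illustrate the calculation; they are not CloudSeq specifications.

```python
import math


def mean_coverage(read_count: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base of the genome is sequenced."""
    return read_count * read_length / genome_size


def reads_required(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical 5 Mb bacterial genome sequenced with 100-base reads.
    print(mean_coverage(1_000_000, 100, 5_000_000))  # 20.0
    print(reads_required(30, 100, 5_000_000))        # 1500000
```

Note that this is an average: actual coverage varies along the genome, so projects usually plan for more output than the bare minimum.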

Targeted/Exome Resequencing

Targeted resequencing is a variation of resequencing in which only a small, isolated subset of a genome (e.g. the exome, a chromosome, a set of genes or the mitochondrial genome) is sequenced and mapped onto a known reference sequence. Alternative methods of sample preparation are used to produce libraries that represent the desired subset of the genome.

Sequencing only a subset of a genome also reduces costs. By focusing all of the sequencing on just a small region of the genome, it becomes possible to detect low levels of variation that might otherwise have been missed. In genome-wide association studies (GWAS), targeted resequencing is better suited for detecting rare alleles than traditional arrays.

The exome of an individual is often resequenced in medical/gene-related research to identify known genetic variants that could promote a disease phenotype. Rare variants can also be found from the exomes of multiple patients and further analysis on the functional consequences of the exon mutation can be done. Exome resequencing is highly scalable and generally less expensive with better coverage uniformity than arrays especially when large numbers of samples are involved.

Metagenomics

Metagenomics involves analysing the multiple microbial genomes found in an environmental sample together, in direct contrast to isolating and cultivating each individual species before sequencing its genome. This allows for the discovery and study of unique microbial genomes which would otherwise be intractable due to cultivation difficulties. There can be hundreds to thousands of unique microbial species in a single gramme of an environmental sample (e.g. soil, seawater, gut contents, etc.).

Pooled DNA isolated from the sample is used to prepare standard sequencing libraries that give as broad coverage as possible across the entire ‘metagenome’ present in the sample. It is also possible to focus on certain genes only (e.g. 16S rDNA and 18S rDNA) to get a more accurate picture of which species are present in the sample (at the expense of the more comprehensive view of the genomic sequence given by the standard method).

CloudSeq Sequencing Technologies Offered

Illumina HiSeq & MiSeq

Illumina’s Sequencing by Synthesis (SBS) technology uses a cyclic reversible termination (CRT) method based on fluorescent, reversible 3’-blocked terminators. DNA molecules (from the target genome) and primers are first attached to single-use “flowcells” and bridge amplified with polymerase so that dense local clonal “DNA colonies” are formed.

To determine the sequence, the four types of labeled reversible terminator bases, primers and polymerase are added to the flowcell, extending each DNA chain by a single labeled complementary nucleotide. Each labeled nucleotide also acts as a terminator of polymerisation, so after each incorporation the fluorescent dye is imaged to identify the base and then enzymatically cleaved to allow incorporation of the next nucleotide.

After each cycle, non-incorporated nucleotides are then washed away and the cycle repeats. The number of cycles determines the read length. Decoupling the enzymatic reaction and the image capture allows for optimal throughput and theoretically unlimited sequencing capacity.
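The cycle structure described above can be sketched as a toy simulation in which each cycle adds exactly one complementary base, so the read length equals the number of cycles. The function name `sequence_by_synthesis` is hypothetical, and the sketch assumes an error-free reaction with the template given in the order in which the polymerase traverses it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Simulate cyclic reversible termination on a single template strand.

    Each cycle incorporates one blocked, labeled nucleotide, "images" it,
    then removes the block so that the next cycle can extend the chain.
    """
    read = []
    for cycle in range(min(cycles, len(template))):
        # The terminator guarantees that only one base is added per cycle.
        incorporated = COMPLEMENT[template[cycle]]
        # Imaging step: the dye colour identifies the incorporated base.
        read.append(incorporated)
        # Cleavage step: dye and block are removed (implicit here), enabling the next cycle.
    return "".join(read)


if __name__ == "__main__":
    print(sequence_by_synthesis("TACGGA", cycles=4))  # ATGC
```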

The Illumina HiSeq is the current industry-standard NGS platform and is designed for large-scale high-throughput experiments. Outfitted with two 8-lane flow cells, it can quickly generate very large amounts of data (600 Gb per run) at a very low cost per base.

The smaller-capacity Illumina MiSeq is ideal for performing smaller sequencing experiments and pilot studies, including test runs to evaluate samples before performing a more extensive sequencing project. It also offers flexibility that is essential in clinical applications such as HLA allele typing, de novo genome sequencing and assembly, and T-cell receptor and immunoglobulin repertoire clone typing.

Your samples can be run either on the high-throughput HiSeq or on the faster MiSeq. Both paired-end and single-read sequencing modes are available. In paired-end mode, both the forward and reverse template strands of each cluster are extended and read; the resulting read pairs carry long-range positional information that allows highly precise alignment of reads.

PacBio RS

Pacific Biosciences’ single molecule real time (SMRT) sequencing technology allows direct observation of DNA synthesis by a DNA polymerase. This is done through a tiny hole called a zero-mode waveguide (ZMW), which creates an illuminated observation volume small enough to observe a single nucleotide being incorporated. Sequencing is done in real time in single-use cells containing ZMW wells, without any template or signal amplification.

At the start of each run, a single DNA polymerase enzyme is affixed at the bottom of each ZMW with a single molecule of template DNA. While laser light is shone through the ZMW, the four types of fluorescently labeled nucleotides flood in, but the DNA polymerase incorporates only the next complementary nucleotide. Only fluorescence from this incorporated labeled nucleotide at the bottom of the ZMW can be detected, and a base call is made according to the corresponding fluorescence of the dye. The fluorescent label is cleaved off upon incorporation and diffuses out of the observation area of the ZMW, where its fluorescence is no longer observable. Incorporation takes only milliseconds and continues until the run time is complete.

SMRT is well suited for applications such as de novo sequencing of bacterial genomes, targeted sequencing, detection of base modifications, and finishing large genome sequencing. Three sequencing modes are possible on this instrument: standard sequencing to generate long continuous reads, circular consensus sequencing (CCS) for higher accuracy, and strobe sequencing for physical coverage and read length.
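The idea behind circular consensus sequencing, reading the same molecule several times and combining the passes, can be sketched as a per-position majority vote. The function name `circular_consensus` is hypothetical, and the sketch assumes that the passes are already aligned and of equal length; real consensus callers must first align the passes, since single-molecule errors include insertions and deletions.

```python
from collections import Counter
from typing import List


def circular_consensus(passes: List[str]) -> str:
    """Combine repeated reads of one molecule by majority vote at each position."""
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(single_pass) != length for single_pass in passes):
        raise ValueError("this sketch assumes pre-aligned passes of equal length")
    # zip(*passes) yields one column of bases per position across all passes.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))


if __name__ == "__main__":
    # Five hypothetical passes, each with at most one random error.
    passes = ["ACGTAC", "ACCTAC", "ACGTAC", "AGGTAC", "ACGTTC"]
    print(circular_consensus(passes))  # ACGTAC
```

Because independent errors rarely hit the same position in most passes, the consensus is more accurate than any single pass, at the cost of a shorter effective read.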

SMRT technology offers four advantages when compared with other platforms:

  • Observation of structural and cell type variation such as methylation is possible;
  • Sequencing cost per run is much lower although cost per base is higher;
  • Provides extremely long reads and unbiased sequences (i.e. balanced coverage and minimal GC-bias);
  • Very fast runs.

Life Technologies Ion Torrent PGM & Proton

Ion Torrent uses ion semiconductor sequencing technology, which detects the hydrogen ions released during DNA polymerisation. Sequencing is very fast as it occurs in real time. Although this technology uses a sequencing-by-synthesis method and emulsion PCR (emPCR) like some other platforms, it differs in that it needs no modified nucleotides or optics: no fluorescence or chemiluminescence is involved. Samples are amplified via emulsion PCR and loaded into microwells on single-use chips.

Each microwell contains a template DNA strand and a DNA polymerase enzyme, and at each step of a run the wells are flooded with one of the four types of nucleotide. If the introduced nucleotide is complementary to the leading template nucleotide, it is incorporated into the growing complementary strand. Incorporation releases a hydrogen ion that triggers an ion-sensitive field-effect transistor (ISFET) sensor, indicating that a reaction has occurred. If homopolymer repeats are present in the template sequence, multiple nucleotide molecules are incorporated in a single cycle, which leads to a corresponding number of released hydrogen ions and a proportionally higher electronic signal. The unattached dNTP molecules are washed out before the next cycle, when a different nucleotide is introduced.
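The flow-based scheme above, including the proportional signal for homopolymers, can be sketched as follows. The function names, the repeating flow order "TACG" and the idealised signal values are assumptions made for this illustration; the input to `flow_signals` is the strand being synthesised, and real base callers must correct for noise and signal decay.

```python
from typing import List


def flow_signals(sequence: str, flow_order: str = "TACG", max_flows: int = 400) -> List[int]:
    """Simulate ideal signals: each flow reports how many bases were incorporated."""
    signals = []
    position = 0
    flow = 0
    while position < len(sequence) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        run_length = 0
        # A homopolymer run is incorporated entirely within one flow.
        while position < len(sequence) and sequence[position] == nucleotide:
            run_length += 1
            position += 1
        signals.append(run_length)
        flow += 1
    return signals


def call_bases(signals: List[float], flow_order: str = "TACG") -> str:
    """Decode flow signals: round each signal to a base count for that flow's nucleotide."""
    return "".join(
        flow_order[index % len(flow_order)] * round(signal)
        for index, signal in enumerate(signals)
    )


if __name__ == "__main__":
    print(flow_signals("TTAGGGC"))                           # [2, 1, 0, 3, 0, 0, 1]
    print(call_bases([2.1, 0.9, 0.1, 2.8, 0.0, 0.1, 1.2]))  # TTAGGGC
```

Rounding becomes less reliable as the signal grows, which is one way to see why long homopolymer runs are harder to call accurately with this kind of chemistry.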

The Ion Personal Genome Machine (PGM) is targeted towards smaller genomes and targeted sequencing. It uses disposable chips which come in three varieties of increasing output. The newer Ion Proton allows for larger chips with higher densities, which are needed for exome and whole genome resequencing.