About CAP3

The CAP program uses a dynamic programming algorithm to compute the maximal-scoring overlapping alignment between two fragments.
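
The idea of an overlapping alignment can be illustrated with a small dynamic programming routine. This is only a minimal sketch, not the CAP implementation: the function name overlap_score and the match, mismatch and gap values are assumptions chosen for illustration. Gaps at the ends of either fragment are free, so the score rewards a suffix of one fragment aligning with a prefix of the other.

```python
def overlap_score(a, b, match=2, mismatch=-5, gap=-6):
    """Best score of an end-gap-free (overlap) alignment of a and b.

    The scoring values are illustrative assumptions, not CAP's parameters.
    """
    # Row 0 is all zeros: skipping a prefix of b costs nothing.
    prev = [0] * (len(b) + 1)
    best = 0  # an empty overlap scores 0
    for i in range(1, len(a) + 1):
        # Column 0 is zero: skipping a prefix of a costs nothing.
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        # The alignment may stop once b is exhausted (rest of a is free).
        best = max(best, curr[-1])
        prev = curr
    # The alignment may also stop once a is exhausted (rest of b is free).
    return max(best, max(prev))


if __name__ == "__main__":
    # The suffix CGT of the first fragment overlaps the prefix CGT of the second.
    print(overlap_score("AAACGT", "CGTTTT"))
```

Only two rows of the table are kept in memory, which is enough when the score alone is needed; recovering the alignment itself would require keeping the full table or traceback pointers.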
Fragments in random orientations are assembled into contigs by a greedy approach, in decreasing order of overlap score. A multiple alignment of the fragments in each contig is then produced (Huang, X. A contig assembly program based on sensitive detection of fragment overlaps. Genomics 14, 18-25, 1992).
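
The greedy step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not CAP's algorithm: the helper names (reverse_complement, exact_overlap, pair_score, greedy_groups) and the min_score threshold are hypothetical, an exact suffix-prefix match length stands in for the alignment score, and the sketch only groups fragments into contigs without checking that the resulting layout is consistent.

```python
from itertools import combinations

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a sequence over the letters A, C, G, T, N."""
    return seq.translate(_COMPLEMENT)[::-1]


def exact_overlap(a, b):
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def pair_score(a, b):
    """Best exact overlap over both orders and both relative orientations."""
    rb = reverse_complement(b)
    return max(exact_overlap(a, b), exact_overlap(b, a),
               exact_overlap(a, rb), exact_overlap(rb, a))


def greedy_groups(frags, min_score=4):
    """Group fragment names into contigs, best-scoring pairs first."""
    parent = {name: name for name in frags}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    scored = sorted(
        ((pair_score(frags[p], frags[q]), p, q)
         for p, q in combinations(frags, 2)),
        reverse=True,
    )
    for score, p, q in scored:
        if score < min_score:
            break  # remaining pairs overlap too weakly to join
        root_p, root_q = find(p), find(q)
        if root_p != root_q:
            parent[root_q] = root_p  # merge the two contigs

    groups = {}
    for name in frags:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    frags = {
        "f1": "AAACGTAC",
        "f2": "GTACTTTT",
        "f3": reverse_complement("TTTTGGGG"),  # given in the opposite orientation
        "f4": "GAGAGAGA",
    }
    print(greedy_groups(frags))
```

Because the score is taken over both orientations of the second fragment, f3 joins the contig of f2 even though it was supplied as a reverse complement.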
The input to the CAP program is a list of fragments in FASTA format. The string after ">" on a header line is the name of the fragment that follows.
Fragment data may contain only the five letters A, C, G, T and N. Lowercase letters are automatically converted to uppercase. A file of input fragments looks like:
>G019uabh
ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
AATTAAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTG
ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCCATCAGTCTTAATC
AGTCTTGTTACGTTATGACTAATCTTTGGGGATTGTGCAGAATGTTATTTTAGATAAGCA
AAACGAGCAAAATGGGGAGTTACTTATATTTCTTTAAAGC
>G028uaah
CATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTGAATTAAAGACTTGTTTAAACACAAA
ATTTAGACTTTTACTCAACAAAAGTGATTGATTGATTGATTGATTGATTGATGGTTTACA
GTAGGACTTCATTCTAGTCATTATAGCTGCTGGCAGTATAACTGGCCAGCCTTTAATACA
TTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTTGGTATGATTTATCTTTTTGGTCTTCT
ATAGCCTCCTTCCCCATCCCATCAGTCT
>G022uabh
TATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGATTG
GGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATGCCC
ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
AATTAAAGACTTGTTTAAACACAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTGA
TTGATTGATTGATTGAT
>G023uabh
AATAAATACCAAAAAAATAGTATATCTACATAGAATTTCACATAAAATAAACTGTTTTCT
ATGTGAAAATTAACCTAAAAATATGCTTTGCTTATGTTTAAGATGTCATGCTTTTTATCA
GTTGAGGAGTTCAGCTTAATAATCCTCTACGATCTTAAACAAATAGGAAAAAAACTAAAA
GTAGAAAATGGAAATAAAATGTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACAT
GAAATGCTTTTTAAAAGAAAATATTAAAGTTAAACTCCCCTATTTTGCTCGTTTTTGCTT
ATCTAAAATACATTCTGCACAATCCCCAAAGATTGATCATACGTTAC
>G006uaah
ACATAAAATAAACTGTTTTCTATGTGAAAATTAACCTANNATATGCTTTGCTTATGTTTA
AGATGTCATGCTTTTTATCAGTTGAGGAGTTCAGCTTAATAATCCTCTAAGATCTTAAAC
AAATAGGAAAAAAACTAAAAGTAGAAAATGGAAATAAAATGTCAAAGCATTTCTACCACT
CAGAATTGATCTTATAACATGAAATGCTTTTTAAAAGAAAATATTAAAGTTAAACTCCCC
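
The input rules above can be checked before running an assembly. The following is a minimal sketch of such a check; the function name read_fragments and its error messages are assumptions for illustration and are not part of CAP.

```python
ALLOWED = set("ACGTN")


def read_fragments(lines):
    """Parse FASTA lines into {name: sequence}, enforcing the input rules.

    Lowercase letters are converted to uppercase; any letter other than
    A, C, G, T or N raises ValueError.
    """
    parts = {}
    name = None
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            if not name:
                raise ValueError(f"line {lineno}: empty fragment name")
            if name in parts:
                raise ValueError(f"line {lineno}: duplicate name {name!r}")
            parts[name] = []
        else:
            if name is None:
                raise ValueError(f"line {lineno}: sequence before any '>' line")
            seq = line.upper()
            bad = set(seq) - ALLOWED
            if bad:
                raise ValueError(
                    f"line {lineno}: characters not allowed: {sorted(bad)}")
            parts[name].append(seq)
    return {n: "".join(chunks) for n, chunks in parts.items()}


if __name__ == "__main__":
    sample = [">G006uaah", "acataaaataaactg", "TTTTCTNNATATGC"]
    print(read_fragments(sample))
```

To validate a file, the function can be given an open file object, for example read_fragments(open(path)) with a path of your own.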
	
The output consists of three parts: an overview of the contigs at the fragment level, a detailed display of the contigs at the nucleotide level, and the consensus sequences. The output of CAP on the sample input data looks like:
Number of segment pairs = 12; number of pairwise comparisons = 2
'+' means given segment; '-' means reverse complement

Overlaps            Containments  No. of Constraints Supporting Overlap

******************* Contig 1 ********************
G019UABH+
G028UAAH+
******************* Contig 2 ********************
G023UABH+
                    G006UAAH+ is in G023UABH+

DETAILED DISPLAY OF CONTIGS
******************* Contig 1 ********************
                          .    :    .    :    .    :    .    :    .    :    .    :
G019UABH+             ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
G028UAAH+                                      CATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG
                      ____________________________________________________________
consensus             ATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGCTTG

                          .    :    .    :    .    :    .    :    .    :    .    :
G019UABH+             AATTAAAGACTTGTTTAAACACAAAAATTTAGAGTTTTACTCAACAAAAGTGATTGATTG
G028UAAH+             AATTAAAGACTTGTTTAAACACAAAA-TTTAGACTTTTACTCAACAAAAGTGATTGATTG
                      ____________________________________________________________
consensus             AATTAAAGACTTGTTTAAACACAAAAATTTAGACTTTTACTCAACAAAAGTGATTGATTG

                          .    :    .    :    .    :    .    :    .    :    .    :
G019UABH+             ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
G028UAAH+             ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC
                      ____________________________________________________________
consensus             ATTGATTGATTGATTGATGGTTTACAGTAGGACTTCATTCTAGTCATTATAGCTGCTGGC

                          .    :    .    :    .    :    .    :    .    :    .    :
G019UABH+             AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
G028UAAH+             AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT
                      ____________________________________________________________
consensus             AGTATAACTGGCCAGCCTTTAATACATTGCTGCTTAGAGTCAAAGCATGTACTTAGAGTT

                          .    :    .    :    .    :    .    :    .    :    .    :
G019UABH+             GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCC              
G028UAAH+             GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCATCAGTCT>G022U
                      ____________________________________________________________
consensus             GGTATGATTTATCTTTTTGGTCTTCTATAGCCTCCTTCCCCATCCCATCAGTCTNGNNNN

                          .    :    .    :    .    :    .    :    .    :    .    :
G028UAAH+             ABHTATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGA
                      ____________________________________________________________
consensus             ANNTATTTTAGAGACCCAAGTTTTTGACCTTTTCCATGTTTACATCAATCCTGTAGGTGA

                          .    :    .    :    .    :    .    :    .    :    .    :
G028UAAH+             TTGGGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATG
                      ____________________________________________________________
consensus             TTGGGCAGCCATTTAAGTATTATTATAGACATTTTCACTATCCCATTAAAACCCTTTATG

                          .    :    .    :    .    :    .    :    .    :    .    :
G028UAAH+             CCCATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGC
                      ____________________________________________________________
consensus             CCCATACATCATAACACTACTTCCTACCCATAAGCTCCTTTTAACTTGTTAAAGTCTTGC

                          .    :    .    :    .    :    .    :    .    :    .    :
G028UAAH+             TTGAATTAAAGACTTGTTTAAACACAAAATTTAGACTTTTACTCAACAAAAGTGATTGAT
                      ____________________________________________________________
consensus             TTGAATTAAAGACTTGTTTAAACACAAAATTTAGACTTTTACTCAACAAAAGTGATTGAT

                          .    :    .    :    .    :    .    :    .    :    .    :
G028UAAH+             TGATTGATTGATTGATTGAT
                      ____________________________________________________________
consensus             TGATTGATTGATTGATTGAT

******************* Contig 2 ********************
                          .    :    .    :    .    :    .    :    .    :    .    :
G023UABH+             AATAAATACCAAAAAAATAGTATATCTACATAGAATTTCACATAAAATAAACTGTTTTCT
G006UAAH+                                                    ACATAAAATAAACTGTTTTCT
                      ____________________________________________________________
consensus             AATAAATACCAAAAAAATAGTATATCTACATAGAATTTCACATAAAATAAACTGTTTTCT

                          .    :    .    :    .    :    .    :    .    :    .    :
G023UABH+             ATGTGAAAATTAACCTAAAAATATGCTTTGCTTATGTTTAAGATGTCATGCTTTTTATCA
G006UAAH+             ATGTGAAAATTAACCTANNA-TATGCTTTGCTTATGTTTAAGATGTCATGCTTTTTATCA
                      ____________________________________________________________
consensus             ATGTGAAAATTAACCTAAAAATATGCTTTGCTTATGTTTAAGATGTCATGCTTTTTATCA

                          .    :    .    :    .    :    .    :    .    :    .    :
G023UABH+             GTTGAGGAGTTCAGCTTAATAATCCTCTACGATCTTAAACAAATAGGAAAAAAACTAAAA
G006UAAH+             GTTGAGGAGTTCAGCTTAATAATCCTCTAAGATCTTAAACAAATAGGAAAAAAACTAAAA
                      ____________________________________________________________
consensus             GTTGAGGAGTTCAGCTTAATAATCCTCTAAGATCTTAAACAAATAGGAAAAAAACTAAAA

                          .    :    .    :    .    :    .    :    .    :    .    :
G023UABH+             GTAGAAAATGGAAATAAAATGTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACAT
G006UAAH+             GTAGAAAATGGAAATAAAATGTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACAT
                      ____________________________________________________________
consensus             GTAGAAAATGGAAATAAAATGTCAAAGCATTTCTACCACTCAGAATTGATCTTATAACAT

                          .    :    .    :    .    :    .    :    .    :    .    :
G023UABH+             GAAATGCTTTTTAAAAGAAAATATTAAAGTTAAACTCCCCTATTTTGCTCGTTTTTGCTT
G006UAAH+             GAAATGCTTTTTAAAAGAAAATATTAAAGTTAAACTCCCC                    
                      ____________________________________________________________
consensus             GAAATGCTTTTTAAAAGAAAATATTAAAGTTAAACTCCCCTATTTTGCTCGTTTTTGCTT

                          .    :    .    :    .    :    .    :    .    :    .    :
G023UABH+             ATCTAAAATACATTCTGCACAATCCCCAAAGATTGATCATACGTTAC
                      ____________________________________________________________
consensus             ATCTAAAATACATTCTGCACAATCCCCAAAGATTGATCATACGTTAC
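
The consensus rows of the detailed display can be stitched back into one sequence per contig. The sketch below assumes the layout shown above (a starred "Contig N" banner followed by rows that begin with the word consensus); the function name consensus_by_contig is hypothetical, and the code is a reading aid for this report format rather than part of CAP.

```python
def consensus_by_contig(report_lines):
    """Concatenate the 'consensus' rows of each contig in a CAP-style report."""
    chunks = {}
    current = None
    for line in report_lines:
        text = line.strip()
        if text.startswith("*") and "Contig" in text:
            # Banner such as "******* Contig 1 *******" becomes "Contig 1".
            current = text.strip("* ")
            chunks.setdefault(current, [])
        elif text.startswith("consensus") and current is not None:
            fields = text.split()
            if len(fields) == 2:  # label followed by one block of sequence
                chunks[current].append(fields[1])
    return {name: "".join(parts) for name, parts in chunks.items()}


if __name__ == "__main__":
    report = [
        "******************* Contig 2 ********************",
        "G023UABH+             AATAAATACC",
        "consensus             AATAAATACC",
        "",
        "G023UABH+             ATCTAAAATA",
        "consensus             ATCTAAAATA",
    ]
    print(consensus_by_contig(report))
```

The banner for each contig appears in both the overview and the detailed display, so the sketch creates the entry on first sight and keeps appending to it afterwards.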