Bioperl Scripts of cDNA analysis tools (CLAT) [Biology] — Tank @ 7:48 pm May 25, 2007
Several weeks ago I wrote some Perl and BioPerl scripts to analyze large numbers of cDNA sequences or ESTs. I named the scripts CLAT, which is short for cDNA library analysis tools. I think they are useful, so I would like to share them here.
Given the thousands of sequences that usually come out of sequencing a cDNA library, CLAT first uses the EMBOSS vectorstrip program to automatically remove the vector sequence that your cloning vector carries. It then runs the NCBI blastall program to BLAST the sequences against the NCBI databases, and extracts the information of interest from the results into a well-organized table (an OpenOffice.org or Microsoft Office Excel spreadsheet) that facilitates downstream analysis. CLAT also sets up your own BLAST database, against which each sequence of the library is then BLASTed. Similar sequences are clustered together into one FASTA-format file that can easily be analyzed with Clustal or other phylogenetic analysis programs.
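To make the extraction and clustering steps more concrete, here is a minimal sketch in Python. It is not taken from CLAT, whose scripts are Perl and BioPerl, and it rests on several assumptions: that blastall was run with tabular output (`-m 8`), that "similar" means an e-value at or below a chosen cutoff, and that clusters are built by simple single-linkage grouping. The function names, the cutoff value, and the sample IDs are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Column order of blastall tabular output (-m 8); the file has no header row.
BLAST_COLUMNS = [
    "query", "subject", "pct_identity", "align_len", "mismatches", "gap_opens",
    "q_start", "q_end", "s_start", "s_end", "evalue", "bit_score",
]

# Made-up hits for illustration only, not real BLAST results.
SAMPLE = (
    "est1\test2\t98.5\t400\t6\t0\t1\t400\t1\t400\t1e-150\t700\n"
    "est2\test3\t97.0\t350\t10\t1\t1\t350\t5\t354\t1e-120\t600\n"
    "est4\test4\t100.0\t300\t0\t0\t1\t300\t1\t300\t1e-160\t590\n"
    "est1\test4\t80.0\t50\t10\t0\t1\t50\t1\t50\t0.5\t30\n"
)


def read_hits(handle):
    """Yield one dict per hit line of tab-separated BLAST output."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, line.split("\t")))
        evalue = hit["evalue"]
        # Defensive: treat a bare exponent such as "e-100" as "1e-100".
        hit["evalue"] = float("1" + evalue if evalue.startswith("e") else evalue)
        hit["bit_score"] = float(hit["bit_score"])
        yield hit


def best_hits(hits):
    """Keep the highest-scoring hit for each query sequence."""
    best = {}
    for hit in hits:
        current = best.get(hit["query"])
        if current is None or hit["bit_score"] > current["bit_score"]:
            best[hit["query"]] = hit
    return best


def write_summary(best, handle):
    """Write the best hits as CSV, which any spreadsheet program can open."""
    writer = csv.writer(handle)
    writer.writerow(["query", "best_subject", "pct_identity", "evalue", "bit_score"])
    for query in sorted(best):
        h = best[query]
        writer.writerow([query, h["subject"], h["pct_identity"], h["evalue"], h["bit_score"]])


def cluster(hits, max_evalue=1e-20):
    """Group sequence IDs by single linkage: any hit at or below the
    e-value cutoff joins the clusters of its query and subject."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for hit in hits:
        # Register both IDs even for weak hits so singletons are kept.
        a, b = find(hit["query"]), find(hit["subject"])
        if hit["evalue"] <= max_evalue and a != b:
            parent[b] = a

    groups = defaultdict(list)
    for seq_id in parent:
        groups[find(seq_id)].append(seq_id)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    hits = list(read_hits(io.StringIO(SAMPLE)))
    out = io.StringIO()
    write_summary(best_hits(hits), out)
    print(out.getvalue())
    for members in cluster(hits):
        print(" ".join(members))
```

Single linkage is the simplest possible grouping rule and can chain loosely related sequences into one cluster; a real pipeline might also require a minimum alignment length or percent identity before joining two sequences.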