FASTA/FASTQ Scripts

Scripts involving FASTA and FASTQ files.

dump_fasta_stats.py

This script checks a .fasta file for duplicate sequence IDs.

Input:

-i : .fasta file to check

Output:

If there are duplicate IDs, the output is “Id repeated: bad fasta file”.

If there are no duplicate IDs, the output is the number of sequences.

Usage:

python dump-fasta-stats.py --version

Shows the program’s version.

python dump-fasta-stats.py -h

Shows help information.

python dump-fasta-stats.py -i <filename.fasta>

Checks the .fasta file for duplicate sequence IDs.

dump_fasta_stats.count_sequences(file_to_check): Counts the number of sequences and informs the user if an ID repeats.

dump_fasta_stats.main(): Prints the number of sequences.
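The duplicate-ID check described above can be sketched as follows. This is a minimal illustration, not the script's actual source: the function name count_fasta_ids is hypothetical, the ID is assumed to be the first whitespace-delimited token after ">", and a ValueError stands in for however the real script reports the problem.

```python
def count_fasta_ids(file_to_check):
    """Count FASTA records and raise ValueError if any ID appears twice."""
    seen_ids = set()
    with open(file_to_check) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence line, not a header
            # Assumption: the ID is the first token after ">".
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if seq_id in seen_ids:
                raise ValueError("Id repeated: bad fasta file")
            seen_ids.add(seq_id)
    return len(seen_ids)
```

Keeping the IDs in a set makes each membership check constant time on average, so the file is read only once regardless of its size.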

fasta_parser.py

This script prints the sequence of the record with the specified ID to standard output.

Input:

-i : .fasta file to search for record

-v : ID to search for

Output:

Sequence of the record with the specified ID

Usage:

python fasta-parser.py --version

Shows the program’s version.

python fasta-parser.py -h

Shows help information.

python fasta-parser.py -i <filename.fasta> -v <ID>

Runs the program with the specified file and ID.

fasta_parser.main(): Finds the record with the specified ID and prints its sequence.
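Looking up one record by ID can be sketched as below. This is an illustrative assumption about the approach, not the script's own code: find_sequence is a hypothetical helper, the ID is taken to be the first token after ">", and sequence lines that wrap across several lines are joined.

```python
def find_sequence(fasta_path, wanted_id):
    """Return the sequence of the first record whose ID matches, else None."""
    in_record = False
    chunks = []
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if in_record:
                    break  # next record reached; the wanted one is complete
                # Assumption: the ID is the first token after ">".
                fields = line[1:].split()
                in_record = bool(fields) and fields[0] == wanted_id
            elif in_record:
                chunks.append(line)
    return "".join(chunks) if in_record else None
```

Returning None for a missing ID lets the caller decide how to report it, for example by printing a message instead of a sequence.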

fastq_parser.py

This script prints the read_id, read_seq and read_qual of the records in the input FASTQ file to standard output.

Input:

-i : fastq file to print values from

Output:

read_id, read_seq and read_qual

Usage:

python fastq-parser.py --version

Shows the program’s version.

python fastq-parser.py -h

Shows help information.

python fastq-parser.py -i <filename.fastq>

Runs the program with the specified FASTQ file.

fastq_parser.main(): Prints read_id, read_seq and read_qual.
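Extracting read_id, read_seq and read_qual can be sketched as below. The sketch assumes the common four-line FASTQ layout (header, sequence, "+" separator, quality) with no line wrapping; read_fastq is a hypothetical name, and the real script may treat the header differently.

```python
def read_fastq(fastq_path):
    """Yield (read_id, read_seq, read_qual) for each four-line FASTQ record."""
    with open(fastq_path) as handle:
        while True:
            header = handle.readline().rstrip("\r\n")
            if not header:
                break  # end of file (or a blank line)
            read_seq = handle.readline().rstrip("\r\n")
            handle.readline()  # "+" separator line, ignored
            read_qual = handle.readline().rstrip("\r\n")
            if not header.startswith("@") or len(read_seq) != len(read_qual):
                raise ValueError("malformed FASTQ record: " + header)
            # Assumption: read_id is the whole header line without the "@".
            yield header[1:], read_seq, read_qual
```

Because the function is a generator, records are produced one at a time and large files do not need to fit in memory.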

parse_big_fasta.py

Parses RVDB-formatted FASTA headers so they can be interpreted by HIVE-hexagon’s tablequery.

Input:

-i : input FASTA file to reformat

-o : specified output file

Output:

Reformatted FASTA file

Usage:

python parse_big_fasta.py --version

Shows the program’s version.

python parse_big_fasta.py -h

Shows help information.

python parse_big_fasta.py -i <filename.fasta> -o <output_file>

Runs the program with the specified FASTA file and output file.

parse_big_fasta.create_arg_parser(): Creates and returns the ArgumentParser object.

parse_big_fasta.format_header(parsed_args): Parses the RVDB-formatted FASTA headers and rewrites them in the desired format.

parse_big_fasta.main(): Writes the reformatted .fasta to the specified file.
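A parser offering the -i, -o, --version and -h options listed above could be built roughly as follows. This is a sketch under assumptions rather than the script's real code: the dest names, the choice to make both options required, the description text and the version string are all placeholders.

```python
import argparse


def create_arg_parser():
    """Build a parser with -i, -o and --version (-h is added by argparse)."""
    parser = argparse.ArgumentParser(
        description="Reformat RVDB-style FASTA headers."
    )
    # Assumption: both options are mandatory and use these dest names.
    parser.add_argument("-i", dest="input_file", required=True,
                        help="input FASTA file to reformat")
    parser.add_argument("-o", dest="output_file", required=True,
                        help="output file for the reformatted FASTA")
    # Assumption: placeholder version string, not the script's real version.
    parser.add_argument("--version", action="version",
                        version="%(prog)s 0.0.0")
    return parser
```

Calling create_arg_parser().parse_args() on the command line then yields an object whose input_file and output_file attributes can be passed on to the header-formatting step.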