Bioinformatics Made Easy

Discover the future of bioinformatics with our cutting-edge HTML-based tools designed to enhance research efficiency and accelerate scientific breakthroughs.
Coming Soon

SequoraDNA is Under Construction

The tools on this page are still being tested. The end goal is a program that can easily perform phylogenomic alignments and analyses. However, a lot of work remains to ensure the accuracy of the current toolset, so please use the following tools at your own risk. I look forward to hearing your feedback!
Sequora dna

Part 1-2 - FASTA Splitting

Parts 1 and 2 have been combined into one step. Start with one FASTA file containing several small genomes (such as prokaryotic or mitochondrial genomes). Upload this file as the Query FASTA, then upload it again as the Reference FASTA.
This part of the program divides each genome into "chunks" that can be analyzed later in the pipeline. By default, the Query file is divided into 1,000-base-pair chunks, and the Reference file is divided into 10,000-base-pair chunks.
Reference chunks are then concatenated, with the default option combining the first, second-to-last, and last chunks to account for circular genomes. Reverse-complement chunks are handled in the same way so that complementary sequences are not overlooked.
When I first designed this tool, I used one FASTA file with 6 bacterial genomes as both the Query and Reference file. This meant that there were 6 FASTA headers in each file. I suggest using the same file for both the Query and Reference; however, this is by no means mandatory.
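The chunking step can be pictured with a short Python sketch. This is not the tool's own code: the function names are made up for illustration, and the order in which the three chunks are joined (second-to-last, last, then first) is an assumption, since only which chunks are combined is stated above.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def split_into_chunks(seq, size):
    """Cut a sequence into consecutive chunks; the last one may be shorter."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def circular_join(chunks):
    """Join second-to-last + last + first chunk to span the genome's ends.

    The join order is an assumption made for this sketch.
    """
    if len(chunks) < 3:
        return "".join(chunks)
    return chunks[-2] + chunks[-1] + chunks[0]


genome = "ACGTACGTAC"
query_chunks = split_into_chunks(genome, 4)       # 1,000 bp by default in the tool
reference_chunks = split_into_chunks(genome, 4)   # 10,000 bp by default in the tool
wrap_chunk = circular_join(reference_chunks)
reverse_chunks = [reverse_complement(c) for c in reference_chunks]
```

The tiny chunk size of 4 is only there to keep the example readable.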
FASTA Processing and CSV Export Tool

Download Processed FASTA and CSVs:

Part 3 - Genome Alignment

(December 01, 2024) Over the last few days, I've noticed some bugs in Part 3. 🐛🐛🐛 Please be patient as I work these out and continue testing...
Part 3 is the Advanced DNA Alignment Tool. This is where the majority of the actual work happens.
To use this part of the pipeline, upload the Query and Reference files that were generated in Parts 1-2.
Click Run Alignment.
Then download the resulting CSV file.
This section of the program uses k-mer matches from the Query chunks to filter the Reference chunks. Query/Reference pairs that pass this filter are then aligned using a local alignment method. The final results are reported in the generated CSV grid.
This grid can then be used in Part 4.
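As a rough illustration of the filter-then-align idea, here is a minimal Python sketch. It is not the tool's implementation: the k-mer size, the scoring values, and every function name are assumptions, and a plain Smith-Waterman score stands in for whatever local alignment method the tool actually uses.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def filter_and_align(queries, references, k=8):
    """Align only the Query/Reference pairs that share at least one k-mer."""
    rows, aligned, skipped = [], 0, 0
    for q_name, q_seq in queries.items():
        q_kmers = kmers(q_seq, k)
        for r_name, r_seq in references.items():
            if q_kmers & kmers(r_seq, k):
                rows.append((q_name, r_name, local_alignment_score(q_seq, r_seq)))
                aligned += 1
            else:
                skipped += 1  # no shared k-mer, so no alignment is attempted
    return rows, aligned, skipped


rows, aligned, skipped = filter_and_align(
    {"q1": "ACGTACGT"}, {"r1": "TTACGTAA", "r2": "CCCCCCCC"}, k=4
)
```

Skipping pairs with no shared k-mer is what keeps the number of expensive alignments manageable, at the cost of missing very divergent matches.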


Advanced DNA Alignment Tool

Scoring Parameters

Worker Settings

K-mer Filter Settings

File Input


Results:

Part 4 - Query Highest Scores

In Part 4, you upload the file that was generated in Part 3, and the tool finds the highest "Score" for each Query chunk against all of the Reference chunks from each original genome.
So if I am comparing 6 bacterial genomes, Chunk 1 from Bacterium A will return its highest score against each of Bacterium A, Bacterium B, Bacterium C, Bacterium D, Bacterium E and Bacterium F, for a total of 6 columns. Where a Query chunk did not have a k-mer filtered match against a Reference genome, there will be an empty cell. (These unique sequences are noted later.)
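The "highest score per genome" step might look something like the Python sketch below. The row layout (Query chunk, Reference genome, score) and the function names are assumptions for illustration; the real Part 3 CSV columns may differ.

```python
def best_scores(rows):
    """Keep the highest score for each (query chunk, reference genome) pair."""
    best = {}
    for query_chunk, ref_genome, score in rows:
        key = (query_chunk, ref_genome)
        if key not in best or score > best[key]:
            best[key] = score
    return best


def to_grid(best, genomes):
    """One row per Query chunk, one column per genome; None marks an empty cell."""
    chunks = sorted({query_chunk for query_chunk, _ in best})
    return {q: [best.get((q, g)) for g in genomes] for q in chunks}


rows = [
    ("A_chunk1", "A", 2000),
    ("A_chunk1", "B", 1200),
    ("A_chunk1", "B", 1500),  # a second Reference chunk from genome B
    ("A_chunk2", "A", 1900),  # no k-mer filtered match against genome B
]
grid = to_grid(best_scores(rows), ["A", "B"])
```

Here `grid["A_chunk2"]` has no value in the genome B column, which corresponds to the empty cell described above.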
CSV Header Cleanup Processor - Part 4

Part 5 - Genome Scoring Grid

In Part 5, the user uploads the Filtered CSV that was generated in Part 4. Part 5 then produces a small, convenient grid, with download options, summarizing the average scores between each Query and Reference genome.
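A minimal Python sketch of the averaging step follows. The input layout and the function name are hypothetical, and skipping empty cells rather than counting them as zero is an assumption of this sketch, not a statement about how Part 5 behaves.

```python
from collections import defaultdict


def average_grid(rows):
    """Average the scores for each (query genome, reference genome) pair.

    rows holds (query_genome, reference_genome, score) tuples; a score of
    None stands for an empty cell and is skipped (an assumption).
    """
    totals = defaultdict(lambda: [0.0, 0])  # pair -> [sum of scores, count]
    for query_genome, ref_genome, score in rows:
        if score is None:
            continue
        entry = totals[(query_genome, ref_genome)]
        entry[0] += score
        entry[1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}


rows = [
    ("A", "A", 2000),
    ("A", "A", 1900),
    ("A", "B", 1500),
    ("A", "B", None),  # empty cell from Part 4
]
averages = average_grid(rows)
```

With 6 genomes, the result would fill a 6 x 6 grid of Query-versus-Reference averages.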
Alignment Average and Summary Tool - Part 5

Part 6 - Phylogenomics Tree Generation

In Part 6, the user uploads the CSV file from Part 5 (if one of the two does not work, try the other). Two distance matrices, one for the distances with gaps and one without gaps, are generated. Two phylogenomic trees are then generated as well. The images for these trees can be copy/pasted into another document!
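To give a feel for how a score grid can become a tree, here is a small Python sketch. Both choices in it are assumptions made purely for illustration: the distance formula (1 minus the shared scores divided by the self scores) and the use of UPGMA clustering. How this page actually derives its distances and builds its trees is not described above, and the function names are hypothetical.

```python
def scores_to_distances(names, scores):
    """Turn an average-score grid into pairwise distances (assumed formula)."""
    dist = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = scores[a][b] + scores[b][a]
            self_total = scores[a][a] + scores[b][b]
            dist[frozenset((a, b))] = 1 - shared / self_total
    return dist


def upgma(names, dist):
    """Cluster with UPGMA and return the topology as a Newick string.

    Branch lengths are left out to keep the sketch short.
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of genomes
    d = dict(dist)
    while len(sizes) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(
            ((x, y) for x in sizes for y in sizes if x < y),
            key=lambda pair: d[frozenset(pair)],
        )
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = f"({a},{b})"
        # Distance to the merged cluster is the size-weighted average.
        for other in sizes:
            d[frozenset((merged, other))] = (
                d[frozenset((a, other))] * size_a
                + d[frozenset((b, other))] * size_b
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes)) + ";"


names = ["A", "B", "C"]
scores = {
    "A": {"A": 10, "B": 8, "C": 2},
    "B": {"A": 8, "B": 10, "C": 2},
    "C": {"A": 2, "B": 2, "C": 10},
}
dist = scores_to_distances(names, scores)
tree = upgma(names, dist)
```

The resulting Newick string can be pasted into most tree viewers to check the grouping by eye.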
Phylogenetic Trees (Auto Resize for Labels) - Part 6

With Gaps

Without Gaps

Unique/Aligned

Unleash Your Genetic Discovery Potential

Explore our HTML-based bioinformatics tools to advance your research and unravel the mysteries within your genetic data today.