{"author_name":"Jennifer Hillman-Jackson, Garima Singh","author_url":"https://galaxy.training","description":"- Run **FastQC** on your data to make sure the format/content is what you expect. Run more QA as needed.      - Search [GTN](https://training.galaxyproject.org/) tutorials with the keyword \u201cqa-qc\u201d for examples.     - Search [Galaxy Help](https://help.galaxyproject.org/) with the keywords \u201cqa-qc\u201d and \u201cfasta\u201d for more help. - Assembly result?     - Consider filtering by length to remove reads that did not assemble.     - Formatting criteria:         - All sequence identifiers must be unique.         - Some tools will require that there is no description line content, only identifiers, in the fasta title line (\u201c>\u201d line). Use **NormalizeFasta** to remove the description (all content after the first whitespace) and wrap the sequences to 80 bases. - [Custom genome]({% link faqs/galaxy/analysis_add_custom_build.md %}), transcriptome, or exome?     - Only appropriate for smaller genomes (bacterial, viral, most insects).     - Not appropriate for any mammalian genomes, or some plants/fungi.     ...","height":400,"html":"<iframe width=\"560\" height=\"400\" scrolling=\"yes\" sandbox=\"allow-same-origin allow-scripts\" title=\"FAQ: Working with very large fasta datasets\" src=\"https://training.galaxyproject.org/training-material/faqs/galaxy/datasets_working_with_fasta.html?utm_source=galaxy-help&utm_medium=oembed&utm_campaign=oembed\" frameborder=\"0\" allowfullscreen></iframe>","provider_name":"Galaxy Training Network (GTN)","provider_url":"https://galaxy.training","thumbnail_height":400,"thumbnail_url":"https://training.galaxyproject.org/training-material/assets/images/GTNLogo1000.png","thumbnail_width":560,"title":"FAQ: Working with very large fasta datasets","type":"video","version":"1.0","width":560}