FASTA Converter - Oligonucleotide Sequence Format Converter

Convert oligonucleotide sequences between FASTA, CSV, TSV, and plain text formats for vendor ordering (IDT, Twist Bioscience, GenScript). Features automatic format detection, IUPAC nucleotide validation, reverse complement for primer design, deduplication, and batch processing. Process thousands of oligo pool sequences instantly with privacy-first browser-based conversion.

Generic sequence conversion before vendor-specific export

Example input: FASTA records, CSV rows, TSV rows, or plain sequence lists that need cleanup.

Use this for: FASTA to CSV, CSV to FASTA, deduplication, reverse complement, uppercase, and IUPAC validation.

Before ordering: run Batch QC or open Vendor Format Adapter for order-file columns.

Why Sequence Format Conversion Matters

Bioinformatics sequence data often moves across multiple file formats. FASTA is the standard format for sequence databases and analysis tools, while CSV and TSV are used for spreadsheet review, vendor ordering, and data management. The Format Converter handles bidirectional conversion between these formats with automatic format detection.

FASTA format stores each sequence as a header line (starting with ">") followed by the sequence on one or more subsequent lines. CSV (comma-separated values) and TSV (tab-separated values) store sequences in tabular form, with columns for name, sequence, and optional metadata. Converting between these formats by hand is error-prone, especially for large datasets; the converter processes thousands of sequences in a single batch and validates each one.
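As an illustration of the FASTA-to-CSV direction described above, here is a minimal Python sketch. It is not the converter's own code: the function names `parse_fasta` and `fasta_to_csv` and the `name,sequence` column headers are assumptions chosen for the example.

```python
import csv
import io


def parse_fasta(text):
    """Yield (name, sequence) pairs, joining wrapped sequence lines."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def fasta_to_csv(text):
    """Return CSV text with a header row (hypothetical column names)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "sequence"])
    writer.writerows(parse_fasta(text))
    return out.getvalue()


if __name__ == "__main__":
    print(fasta_to_csv(">Primer_1\nATCG\nATCG\n>Primer_2\nGGCC\n"))
```

Using the `csv` module rather than joining on commas means names that contain commas or quotes are escaped correctly.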

Use this page for FASTA to CSV, CSV to FASTA, TSV cleanup, deduplication, reverse complement, uppercase, and IUPAC validation. When an order needs vendor-specific columns, open Vendor Format Adapter after the generic conversion is clean.

Additional features include reverse complement generation (essential for designing antisense primers), sequence deduplication (removing identical sequences that waste synthesis resources), and IUPAC code validation (catching invalid characters before vendor submission). All processing happens in your browser — no data is transmitted to any server.
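Reverse complement can be sketched in a few lines of Python. This is an illustrative version, not the tool's implementation; the name `reverse_complement` is hypothetical, and the table assumes DNA with IUPAC ambiguity codes, where each code maps to the code for the complementary base set (for example R, meaning A or G, becomes Y, meaning T or C).

```python
# Complement table for DNA plus IUPAC ambiguity codes, in both cases.
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVNacgtryswkmbdhvn",
    "TGCAYRSWMKVHDBNtgcayrswmkvhdbn",
)


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATCGATCGATCG"))  # CGATCGATCGAT
```

Characters outside the table pass through unchanged, so validation is best run before this step.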

How to Use the Format Converter

  1. Paste your sequences in the input field or upload a file (FASTA, CSV, or TSV). The converter auto-detects the input format.
  2. Select the desired output format: FASTA, CSV, or TSV.
  3. Enable optional processing: reverse complement, deduplication, uppercase conversion.
  4. Click "Convert" to process all sequences with automatic validation.
  5. Review any warnings (invalid characters, duplicate sequences) before downloading.
  6. Download the converted file or copy the output directly from the text area.
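
The page does not say how auto-detection in step 1 works. One plausible heuristic, shown here purely as an assumption, is to inspect the first non-empty line: a leading ">" suggests FASTA, a tab suggests TSV, a comma suggests CSV, and anything else is treated as a plain sequence list. The function name `detect_format` and its return labels are hypothetical.

```python
def detect_format(text):
    """Guess the input format from the first non-empty line (heuristic)."""
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore leading blank lines
        if line.lstrip().startswith(">"):
            return "fasta"
        if "\t" in line:
            return "tsv"
        if "," in line:
            return "csv"
        return "plain"
    return "empty"


if __name__ == "__main__":
    for sample in (">Primer_1\nATCG", "Primer_1\tATCG", "Primer_1,ATCG", "ATCG"):
        print(detect_format(sample))
```

A heuristic like this can misclassify edge cases, such as a TSV file whose names contain commas, which is why the tab check comes before the comma check.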

Frequently Asked Questions

What is FASTA format?
FASTA format is a text-based format for representing nucleotide or protein sequences. Each entry starts with a ">" symbol followed by a description (the header line), with the sequence itself on the following lines. For example, the header line ">Primer_1" is followed by the sequence line "ATCGATCGATCG". The format was originally developed for the FASTA alignment tool and is now one of the most widely used sequence formats in bioinformatics. Our converter handles both single-line and multi-line FASTA sequences.
How do I prepare sequences for IDT or Twist orders?
IDT requires sequences in plate-map format (CSV or Excel with specific column names: Name, Sequence, Scale, Purification). Twist Bioscience accepts CSV with Name and Sequence columns. Our Vendor Format Adapter tool generates these vendor-specific formats directly. If your sequences are in FASTA format and you need a quick CSV for vendor ordering, this Format Converter is the first step: convert to CSV here, then use the Vendor Format Adapter for final formatting.
What are IUPAC ambiguity codes?
IUPAC codes extend the standard DNA alphabet (A, T, C, G) with ambiguity codes that each represent a set of possible bases: R = A or G (purine), Y = C or T (pyrimidine), S = G or C (strong), W = A or T (weak), K = G or T (keto), M = A or C (amino), B = C, G, or T (not A), D = A, G, or T (not C), H = A, C, or T (not G), V = A, C, or G (not T), N = any base. Our converter validates against the full IUPAC alphabet and flags non-standard characters.
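A validation pass of this kind can be sketched as follows. The table restates the codes listed above; the name `find_invalid` and the choice to report 1-based positions are assumptions for illustration, as is treating the RNA base U as invalid in a DNA context.

```python
# IUPAC DNA codes and the bases each one stands for.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def find_invalid(seq):
    """Return (position, character) pairs for non-IUPAC characters.

    Positions are 1-based; the check is case-insensitive.
    """
    return [
        (pos, ch)
        for pos, ch in enumerate(seq, start=1)
        if ch.upper() not in IUPAC_DNA
    ]


if __name__ == "__main__":
    print(find_invalid("ATCGNRY"))  # []
    print(find_invalid("ATXG-U"))   # [(3, 'X'), (5, '-'), (6, 'U')]
```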
Can I convert multi-line FASTA to single-line?
Yes. Multi-line FASTA (where long sequences are wrapped at 60 or 80 characters per line) is common in genomic databases. Our converter automatically joins multi-line sequences into single-line format during conversion. When outputting FASTA, you can choose between single-line (compact) and multi-line (wrapped at 80 characters) format.
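The two output styles differ only in how the sequence is split across lines. A minimal sketch, assuming a hypothetical helper `format_fasta_record` in which `width=None` selects single-line output:

```python
def format_fasta_record(name, seq, width=80):
    """Format one FASTA record, wrapping the sequence at `width` characters.

    Pass width=None to keep the whole sequence on a single line.
    """
    if width is None:
        lines = [seq]
    else:
        lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_fasta_record("Primer_1", "ATCG" * 30, width=80), end="")
    print(format_fasta_record("Primer_1", "ATCG" * 30, width=None), end="")
```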
How does deduplication work?
Deduplication identifies and removes sequences that appear more than once in your dataset. Comparison is case-insensitive (ATCG = atcg = AtCg). When duplicates are found, the first occurrence is kept and subsequent copies are removed. The tool reports the number of duplicates found. This is particularly useful for oligo pools where duplicate sequences waste synthesis resources without adding experimental value.
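The rule described above (case-insensitive comparison, first occurrence kept, duplicate count reported) can be sketched like this. The function name `deduplicate` and the `(name, sequence)` record shape are assumptions for the example.

```python
def deduplicate(records):
    """Drop records whose sequence was already seen, ignoring case.

    `records` is an iterable of (name, sequence) pairs. Returns the
    kept records in their original order plus the number removed.
    """
    seen = set()
    unique = []
    removed = 0
    for name, seq in records:
        key = seq.upper()  # ATCG, atcg, and AtCg compare equal
        if key in seen:
            removed += 1
            continue
        seen.add(key)
        unique.append((name, seq))
    return unique, removed


if __name__ == "__main__":
    pool = [("p1", "ATCG"), ("p2", "atcg"), ("p3", "GGCC"), ("p4", "AtCg")]
    print(deduplicate(pool))  # ([('p1', 'ATCG'), ('p3', 'GGCC')], 2)
```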
