Split .loci file into separate fasta files for desired loci

I have a large .loci file consisting of a single column of text, with many rows of sequence data covering multiple loci. Each locus occupies a block of n rows (n varies between loci), and each row contains the name of an individual, a space, and then that individual's nucleotide sequence for the locus. Each locus block is followed by a row that starts with "//", continues with a string of spaces, "*"s and "-"s, and ends with the locus number in []. These separator rows are not consistent: they can contain any combination of those characters.

Example:

Ind_1 ACTGACTGACTGACTGACTG
Ind_2 ACTGACTGACTGACTGACTG
//    *   -     *        [1]
Ind_1 ACTGACTGACTGACTGACTG
Ind_3 ACTGACTGACTGACTGACTG
Ind_6 ACTGACTGACTGACTGACTG
//      -     *      -   [2]
Ind_2 ACTGACTGACTGACTGACTG
Ind_4 ACTGACTGACTGACTGACTG
//        *     -     -  [3]

I would like to write a separate .fasta file for each locus in a vector of desired loci, for example locus [2] and locus [3].

How can I do this in R?
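One possible approach in base R, given as a minimal sketch rather than a definitive solution: read the file with `readLines()`, treat every row starting with "//" as the end of a locus block, pull the locus number out of the trailing [n], and write the rows of each wanted locus as FASTA records. The function name `split_loci`, its arguments, and the output naming scheme `locus_<n>.fasta` are hypothetical choices made for this illustration. It also assumes that names contain no whitespace and that every separator row ends with the locus number in square brackets, as in the example above.

```r
# Hypothetical helper: write one FASTA file per requested locus.
split_loci <- function(loci_file, wanted, out_dir = ".") {
  lines <- readLines(loci_file)
  lines <- lines[nzchar(trimws(lines))]            # drop blank lines

  is_sep <- startsWith(lines, "//")                # separator rows
  # Locus number taken from the trailing [n] of each separator row
  locus_id <- as.integer(
    sub(".*\\[([0-9]+)\\][[:space:]]*$", "\\1", lines[is_sep])
  )

  # A separator closes the block above it, so a sequence row belongs to
  # block number (separators seen so far + 1).
  block <- (cumsum(is_sep) + 1L)[!is_sep]
  seq_rows <- lines[!is_sep]
  keep <- block <= length(locus_id)                # ignore rows after the last separator
  seq_rows <- seq_rows[keep]
  seq_locus <- locus_id[block[keep]]

  written <- character(0)
  for (loc in wanted) {
    rows <- seq_rows[seq_locus == loc]
    if (length(rows) == 0) next                    # locus not present in the file
    ind <- sub("[[:space:]].*$", "", rows)         # text before the first space
    nuc <- trimws(sub("^[^[:space:]]+[[:space:]]+", "", rows))
    fasta <- as.vector(rbind(paste0(">", ind), nuc))  # interleave header and sequence
    out <- file.path(out_dir, paste0("locus_", loc, ".fasta"))
    writeLines(fasta, out)
    written <- c(written, out)
  }
  invisible(written)
}

# Usage on the example data from the question, written to a temporary file
example_file <- tempfile(fileext = ".loci")
writeLines(c(
  "Ind_1 ACTGACTGACTGACTGACTG",
  "Ind_2 ACTGACTGACTGACTGACTG",
  "//    *   -     *        [1]",
  "Ind_1 ACTGACTGACTGACTGACTG",
  "Ind_3 ACTGACTGACTGACTGACTG",
  "Ind_6 ACTGACTGACTGACTGACTG",
  "//      -     *      -   [2]",
  "Ind_2 ACTGACTGACTGACTGACTG",
  "Ind_4 ACTGACTGACTGACTGACTG",
  "//        *     -     -  [3]"
), example_file)

out_files <- split_loci(example_file, wanted = c(2, 3), out_dir = tempdir())
```

The locus number is read from the separator row itself rather than inferred from its position, so this should still work if the loci in the file are not numbered consecutively.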

r


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
