ERROR: Error when parsing FASTX file. Saw unexpected byte 'C' on line 1 #107

Closed
ljournot opened this issue Jul 12, 2023 · 3 comments

@ljournot

I am using FASTX.jl for the first time. I am trying to read a FASTA file "Test_txt.txt", which contains

>Acc1
GATC
>Acc2
TGAC

and get the error message mentioned in the title.
The problem is clearly in my file. However, it is readable by other software (BioEdit, APE...) that can handle FASTA files. I got the same error message with FASTA files downloaded from the NCBI and with test files saved as MS-DOS or Unicode files. I had a look at the throw_parser_error function in the FASTX.jl source file but it didn't help.
I would appreciate it if you could let me know what the "byte 'C' on line 1" might be and how to get rid of it.
Best wishes,
Laurent

julia> using FASTX

julia> validate_fasta(IOBuffer(">Acc1\nGATC\n>Acc2\nTGAC")) === nothing
true

julia> reader = FASTAReader(IOBuffer(">Acc1\nGATC\n>Acc2\nTGAC"))
FASTX.FASTA.Reader{TranscodingStreams.NoopStream{IOBuffer}}(TranscodingStreams.NoopStream{IOBuffer}(<mode=idle>), 1, 1, nothing, FASTX.FASTA.Record:
  description: ""
     sequence: "", true)

julia> collect(reader)
2-element Vector{FASTX.FASTA.Record}:
 FASTX.FASTA.Record:
  description: "Acc1"
     sequence: "GATC"
 FASTX.FASTA.Record:
  description: "Acc2"
     sequence: "TGAC"

julia> validate_fasta(IOBuffer("C:/Users/Laurent/Desktop/Test_txt.txt")) === nothing
false

julia> reader = FASTAReader(IOBuffer("C:/Users/Laurent/Desktop/Test_txt.txt"))
FASTX.FASTA.Reader{TranscodingStreams.NoopStream{IOBuffer}}(TranscodingStreams.NoopStream{IOBuffer}(<mode=idle>), 1, 1, nothing, FASTX.FASTA.Record:
  description: ""
     sequence: "", true)

julia> collect(reader)
ERROR: Error when parsing FASTX file. Saw unexpected byte 'C' on line 1
Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:35
  [2] throw_parser_error(data::Vector{UInt8}, p::Int64, line::Int64)
    @ FASTX C:\Users\Laurent\.julia\packages\FASTX\9Dngy\src\FASTX.jl:124
  [3] macro expansion
    @ C:\Users\Laurent\.julia\packages\FASTX\9Dngy\src\fasta\readrecord.jl:102 [inlined]
  [4] readrecord!(stream::TranscodingStreams.NoopStream{IOBuffer}, record::FASTX.FASTA.Record, state::Tuple{Int64, Int64})
    @ FASTX.FASTA C:\Users\Laurent\.julia\packages\Automa\5enCH\src\Stream.jl:124
  [5] _read!
    @ C:\Users\Laurent\.julia\packages\FASTX\9Dngy\src\fasta\reader.jl:104 [inlined]
  [6] iterate(rdr::FASTX.FASTA.Reader{TranscodingStreams.NoopStream{IOBuffer}}, state::Nothing)
    @ FASTX.FASTA C:\Users\Laurent\.julia\packages\FASTX\9Dngy\src\fasta\reader.jl:79
  [7] iterate
    @ C:\Users\Laurent\.julia\packages\FASTX\9Dngy\src\fasta\reader.jl:79 [inlined]
  [8] _collect(cont::UnitRange{Int64}, itr::FASTX.FASTA.Reader{TranscodingStreams.NoopStream{IOBuffer}}, #unused#::Base.HasEltype, isz::Base.SizeUnknown)
    @ Base .\array.jl:718
  [9] collect(itr::FASTX.FASTA.Reader{TranscodingStreams.NoopStream{IOBuffer}})
    @ Base .\array.jl:707
 [10] top-level scope
    @ REPL[14]:1

Your Environment

  • Package Version used: 2.1.2
  • Julia Version used: 1.9.0
  • Operating System and version (desktop or mobile): Windows 10 (64-bit)
@jakobnissen
Member

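The problem is that you are not reading the file; you are parsing the literal text of the path. `IOBuffer("C:/Users/...")` wraps the characters of the string itself, so the first byte the parser sees is the 'C' of "C:/", not the '>' that starts a FASTA record. Replace "IOBuffer" with "open".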
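
To make the difference concrete, here is a minimal sketch that uses only Julia's Base, so it runs without FASTX installed. The temporary file is an assumption for illustration and stands in for Test_txt.txt; the variable names are hypothetical.

```julia
# Write the two-record FASTA from the report to a temporary file
# (a stand-in for Test_txt.txt, assumed for illustration).
path, io = mktemp()
write(io, ">Acc1\nGATC\n>Acc2\nTGAC\n")
close(io)

# IOBuffer(path) wraps the characters of the path string itself,
# so reading from it gives back the path, not the file.
from_buffer = read(IOBuffer(path), String)

# open(path) returns a stream over the file's contents.
from_file = open(path) do f
    read(f, String)
end

println("IOBuffer yields the path text: ", from_buffer == path)
println("open yields the file, first character: ", first(from_file))
```

With a path such as "C:/Users/Laurent/Desktop/Test_txt.txt", the first byte of the buffer is 'C', which matches the byte named in the error message. Passing the stream from `open` to `FASTAReader` instead, as suggested above, gives the parser the actual file contents.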

@ljournot
Author

Dear Jakob,
Thanks a lot for your prompt answer and sorry for the question, which I now realize was trivial. I am relatively new to programming.
Best wishes,
Laurent

@jakobnissen
Member

Happy to help. For any other questions, you're welcome to open an issue, but I also recommend checking out the Julia language Slack or the Julia language Discourse, both of which have biology-related channels.

This issue was closed.