Copying a tagged or untagged file greater than 8 bytes results in content change and file tagging #85

Open
chrishodgins opened this issue Apr 4, 2024 · 2 comments

chrishodgins commented Apr 4, 2024

While using the zopen Perl port, an untagged file with EBCDIC contents was copied with File::Copy. When the file was 8 bytes or smaller, everything worked as expected. When the file was larger than 8 bytes, the file was copied, but its contents had been converted to ISO8859-1 and the copy had been tagged as IBM-1047. The same happens when the source file is already tagged as IBM-1047.

Neither exporting the __UNTAGGED_READ_MODE=ASCII environment variable nor tagging the files first resolved the problem. Files encoded as ISO8859-1 are copied without any additional tagging or conversion.
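
To narrow down the size threshold described above, a small probe along the following lines may help. It is only a sketch: `probe_copy` is a hypothetical helper name, the scratch directory layout is an assumption, and on z/OS the way the shell creates and tags the scratch file may itself depend on the local auto-conversion settings. The helper writes a given number of 0xC8 bytes (EBCDIC `H` in IBM-1047), copies the file with whatever command is passed in, and compares source and copy byte for byte.

```shell
#!/bin/sh
# probe_copy SIZE CMD [ARGS...]
# Writes SIZE bytes of 0xC8 to a scratch file, runs "CMD [ARGS...] SRC DST"
# to copy it, then reports whether the copy is byte-for-byte identical.
probe_copy() {
    size=$1
    shift
    dir="${TMPDIR:-/tmp}/probe_copy.$$.$size"   # scratch location (assumption)
    mkdir -p "$dir" || return 2
    src="$dir/src.bin"
    dst="$dir/dst.bin"
    : > "$src"
    i=0
    while [ "$i" -lt "$size" ]; do
        printf '\310' >> "$src"                 # octal 310 = hex C8
        i=$((i + 1))
    done
    if ! "$@" "$src" "$dst" 2>/dev/null; then
        echo "$size bytes: copy failed"
    elif cmp -s "$src" "$dst"; then
        echo "$size bytes: identical"
    else
        echo "$size bytes: CHANGED"
    fi
    rm -rf "$dir"
}

# Probe sizes on either side of the 8-byte boundary reported above.
for n in 8 9 13; do
    probe_copy "$n" perl -MFile::Copy -e 'copy($ARGV[0], $ARGV[1]) or die "copy: $!"'
done
```

Passing the copy command as arguments makes it possible to run the same probe with `cp` as a baseline and compare the results against File::Copy.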

$ perl --version

This is perl 5, version 39, subversion 8 (v5.39.8*) built for os390
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2024, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl".  If you have access to the
Internet, point your browser at https://www.perl.org/, the Perl Home Page.

### Simple perl script to copy each file
$ cat test.pl
use File::Copy;
copy("./ebcdic.txt", "./ebcdic_copy.txt");
copy("./ebcdic_untagged.txt", "./ebcdic_untagged_copy.txt");

### Files ready to be copied with the correct tags
$ ls -T *.txt
t IBM-1047    T=on  ebcdic.txt
- untagged    T=off ebcdic_untagged.txt
$ cat ebcdic.txt
Hello World!
$ cat ebcdic_untagged.txt
Hello World!

### Run the perl script to copy the files
$ perl test.pl

### Our copy of the untagged file has been given a tag
$ ls -T *.txt
t IBM-1047    T=on  ebcdic.txt
t IBM-1047    T=on  ebcdic_copy.txt
- untagged    T=off ebcdic_untagged.txt
t IBM-1047    T=on  ebcdic_untagged_copy.txt

### The original files are unharmed and remain in EBCDIC
$ od -xc -Ax ebcdic.txt
0000000000      C885    9393    9640    E696    9993    845A    1500
               H   e   l   l   o       W   o   r   l   d   !  \n
000000000D
$ od -xc -Ax ebcdic_untagged.txt
0000000000      C885    9393    9640    E696    9993    845A    1500
               H   e   l   l   o       W   o   r   l   d   !  \n
000000000D

### The copied files are now both ISO8859-1 encoded and both tagged as IBM-1047
$ od -xc -Ax ebcdic_copy.txt
0000000000      4865    6C6C    6F20    576F    726C    6421    0A00
             110 145   %   %   ? 040 127   ? 162   % 144 041 012
000000000D
$ od -xc -Ax ebcdic_untagged_copy.txt
0000000000      4865    6C6C    6F20    576F    726C    6421    0A00
             110 145   %   %   ? 040 127   ? 162   % 144 041 012
000000000D

### Now repeat with __UNTAGGED_READ_MODE=ASCII
$ rm *copy*
$ export __UNTAGGED_READ_MODE=ASCII
$ perl test.pl
$ ls -T *.txt
t IBM-1047    T=on  ebcdic.txt
t IBM-1047    T=on  ebcdic_copy.txt
- untagged    T=off ebcdic_untagged.txt
t IBM-1047    T=on  ebcdic_untagged_copy.txt
$ od -xc ebcdic_copy.txt
0000000000      4865    6C6C    6F20    576F    726C    6421    0A00
             110 145   %   %   ? 040 127   ? 162   % 144 041 012
0000000015
$ od -xc ebcdic_untagged_copy.txt
0000000000      4865    6C6C    6F20    576F    726C    6421    0A00
             110 145   %   %   ? 040 127   ? 162   % 144 041 012
0000000015
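
The od dumps above can also be checked mechanically. The following is a minimal sketch, assuming each file starts with the letter `H` as in the test files here; `guess_h_encoding` is a hypothetical helper name, and it only inspects the first byte (0xC8 for IBM-1047, 0x48 for ISO8859-1), so it is not a general encoding detector.

```shell
#!/bin/sh
# guess_h_encoding FILE
# Looks only at the first byte of FILE, which is assumed to be the letter 'H'.
guess_h_encoding() {
    # Dump one byte as hex, strip whitespace, and normalise to lowercase
    # because some od implementations print uppercase hex digits.
    first=$(od -An -tx1 -N1 "$1" | tr -d ' \t\n' | tr 'ABCDEF' 'abcdef')
    case "$first" in
        c8) echo "EBCDIC" ;;            # 'H' in IBM-1047
        48) echo "ASCII" ;;             # 'H' in ISO8859-1
        *)  echo "unknown ($first)" ;;
    esac
}

# Report on the files from the transcript above, skipping any that are absent.
for f in ebcdic.txt ebcdic_copy.txt ebcdic_untagged.txt ebcdic_untagged_copy.txt; do
    [ -f "$f" ] || continue
    printf '%s: %s\n' "$f" "$(guess_h_encoding "$f")"
done
```

With the files shown in this report, the originals would be expected to come out as EBCDIC and the copies as ASCII, which is the mismatch against their IBM-1047 tags.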

covener commented May 22, 2024

@IgorTodorovskiIBM was this meant to be closed? It is mentioned in the PR but not fully linked.

@IgorTodorovskiIBM
Collaborator

Yes, if you can help verify that would be great!
