Here is a really simple module for converting fixed-length Cobol data into a Python list … you can find this code and related modules at:
You can pipe the results of copybook2csv.py into this module to quickly parse the data into a list, or, once you already know the structure, call parse_data(struct_fmt, lines) directly. If the copybook field lengths and the actual record lengths don't match, it still parses the data, but it writes a warning to stderr indicating that records were truncated or padded to fit the field definitions.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__version__ = """COBOL Fixed-length Data Parser ver 0.2
Note: This version does not work with OCCURS in Copybook files,
but is a lot faster than the variable length data parser modules.
License: GPLv3, Copyright (C) 2010 Brian Peterson
This is free software. There is NO warranty;
not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
"""
USAGE = """copybook2list.py CopybookFile"""
import load
import csv, struct, sys
def parse_data(struct_fmt, lines):
    try:
        return [ struct.unpack(struct_fmt, i) for i in lines ]
    except struct.error:
        # Record length differs from the layout: warn, then pad or truncate
        sys.stderr.write('Record layout vs. record size mismatch\n')
        size = sum([ int(i) for i in struct_fmt.split('s')[:-1] ])
        return [ struct.unpack(struct_fmt, i.ljust(size)[:size])
                 for i in lines ]
def main(args):
    # Drop the first row of the copybook CSV; the third column holds the field length
    copybook = load.csv_(args.copybook.readlines(), strip_=True)[1:]
    field_lengths = [ int(i[2]) for i in copybook ]
    struct_fmt = 's'.join([ str(i) for i in field_lengths ]) + 's'
    if args.struct:
        print struct_fmt
    else:
        for record in parse_data(struct_fmt, load.lines(args.datafile)):
            print record
if __name__ == '__main__':
    from cmd_line_args import Args
    args = Args(USAGE, __version__)
    args.allow_stdin()
    args.add_files('datafile', 'copybook')
    args.parser.add_argument('-s', '--struct', action='store_true',
                             help='show structure format')
    main(args.parse())
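To see the format string and the pad-or-truncate fallback in isolation, here is a minimal standalone sketch. It is not part of the module: the names build_struct_fmt and parse_record and the sample record are made up for illustration, and it is written so that it also runs on Python 3, where struct expects bytes rather than text.

```python
import struct

def build_struct_fmt(field_lengths):
    """Turn a list of field lengths such as [5, 10, 3] into '5s10s3s'."""
    return ''.join('%ds' % n for n in field_lengths)

def parse_record(struct_fmt, record):
    """Unpack one fixed-length record, padding or truncating it to fit."""
    size = struct.calcsize(struct_fmt)
    # ljust pads short records with spaces; the slice trims long ones.
    return struct.unpack(struct_fmt, record.ljust(size)[:size])

# Hypothetical layout: 5-byte id, 10-byte name, 3-byte code
fmt = build_struct_fmt([5, 10, 3])
fields = parse_record(fmt, b'00042JOHN SMITHXYZ')
short = parse_record(fmt, b'00042JOHN')
```

The sketch uses struct.calcsize to get the record size instead of summing the pieces of the format string; for formats made only of 's' fields the two should agree. Unlike parse_data, it pads or truncates every record silently instead of warning once on a mismatch.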
December 1, 2011 at 11:48
Thank you very much for sharing this.
When I tried to read data, cobol2py did not recognize the BCD keyword (for packed decimal?) as supported. Can you please help with that?
December 6, 2011 at 01:37
I see that the copybook parser supports it, but it appears I never got around to coding it in the data parser module, probably because I don't have any BCD fields in my data to test with. If you send me some sample BCD data, I can update the code so that it supports it.
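In the meantime, a packed field could be decoded separately once its raw bytes have been sliced out of the record. The sketch below is only an assumption about what the BCD fields look like: it takes them to be IBM-style packed decimal (COMP-3), with two digits per byte and a sign in the final nibble. The function name unpack_comp3 and its scale parameter are hypothetical, not part of any of the modules here, and it has not been tested against real mainframe data.

```python
from decimal import Decimal

def unpack_comp3(raw, scale=0):
    """Decode a packed-decimal (COMP-3) field given as raw bytes.

    scale is the number of implied decimal places (e.g. 2 for PIC S9(3)V99).
    """
    nibbles = []
    for byte in bytearray(raw):
        nibbles.append(byte >> 4)      # high nibble
        nibbles.append(byte & 0x0F)    # low nibble
    sign = nibbles.pop()               # last nibble carries the sign
    if sign < 0x0A:
        raise ValueError('missing sign nibble: %r' % (raw,))
    value = 0
    for digit in nibbles:
        if digit > 9:
            raise ValueError('invalid digit nibble in %r' % (raw,))
        value = value * 10 + digit
    if sign in (0x0B, 0x0D):           # 0xD (and 0xB) mark a negative value
        value = -value
    # Return a plain int when there are no implied decimals
    return Decimal(value).scaleb(-scale) if scale else value

positive = unpack_comp3(b'\x12\x34\x5C')
negative = unpack_comp3(b'\x12\x34\x5D')
scaled = unpack_comp3(b'\x12\x34\x5C', scale=2)
```

Decimal is used for scaled values so that implied decimal places are not subject to binary floating-point rounding.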