@dpapathanasiou
Last active April 21, 2022 22:48
A simple python script to reduce or reformat a csv file, producing a csv with a specific sub-set of columns, stripping out any undesired characters from the individual row values
'''
A simple python script to reduce or reformat a csv file, producing a
csv with a specific sub-set of columns, stripping out any undesired
characters from the individual row values.
'''
import csv
import re
import sys
# Define the column headers from the original csv to be preserved in the reformatted csv
headers = [
    'Account Name/Number',
    'Symbol',
    'Description',
    'Quantity',
    'Last Price',
    'Current Value',
    'Total Gain/Loss Dollar',
    'Total Gain/Loss Percent',
    'Cost Basis Per Share',
    'Cost Basis Total'
]
# Define a regular expression matching the characters and tokens to remove from individual cell values
unneeded = re.compile(r'(\+|\$|%|--|n/a)')
if __name__ == "__main__":
    # Command-line entry point: expects an input path and an output path
    if len(sys.argv) < 3:
        # Print the usage
        print(sys.argv[0], '[input csv file] [output csv file]')
    else:
        # newline='' lets the csv module handle line endings itself
        with open(sys.argv[1], 'r', newline='') as csv_input:
            with open(sys.argv[2], 'w', newline='') as csv_output:
                writer = csv.DictWriter(csv_output, fieldnames=headers)
                for row in csv.DictReader(csv_input):
                    # ignore any row which does not contain all the required columns
                    # (row.get avoids a KeyError when a column is absent from the input)
                    if None not in [row.get(x) for x in headers]:
                        writer.writerow({x: unneeded.sub('', row[x]) for x in headers})
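To see what the filter and the regex do to actual rows, here is a minimal in-memory sketch. The `reduce_csv` helper, the three-column `HEADERS` list, and the `SAMPLE` data are all hypothetical, made up for illustration; only the regex and the skip-if-`None` rule come from the script above.

```python
import csv
import io
import re

# Assumed, shortened column list; the script above keeps ten columns.
HEADERS = ['Symbol', 'Quantity', 'Total Gain/Loss Percent']
# Same pattern as the script: strips +, $, %, "--" and "n/a"
UNNEEDED = re.compile(r'(\+|\$|%|--|n/a)')


def reduce_csv(text, headers=HEADERS, pattern=UNNEEDED):
    """Hypothetical helper: return cleaned rows for the chosen columns."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # DictReader fills missing trailing fields with None, so a short
        # row (such as a footer line) is skipped here
        if None in [row.get(x) for x in headers]:
            continue
        cleaned.append({x: pattern.sub('', row[x]) for x in headers})
    return cleaned


# Made-up sample input: two data rows and a one-field footer line
SAMPLE = (
    "Symbol,Description,Quantity,Total Gain/Loss Percent\n"
    "ABC,Example Corp,10,+12.5%\n"
    "XYZ,Cash,n/a,--\n"
    "Footer line with a single field\n"
)

rows = reduce_csv(SAMPLE)
for r in rows:
    print(r)
```

The footer line is dropped because it has no value for `Quantity`, while the `n/a` and `--` placeholders become empty strings. Note that the pattern matches anywhere in a value, so a description that happens to contain `n/a` or `%` would be altered too. The script also writes no header row; if one is wanted, `writer.writeheader()` can be called right after the `DictWriter` is created.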