I spend most of my day writing C# and SQL code. Because I do a lot of database data manipulation, I keep running into column length problems when importing data from .csv files into database tables. I recently started playing around with Python for some projects I have been working on at home.
I have found Python extremely useful and easy to use. I have not had a chance to use an interpreted language in many years.
So I decided to write a simple Python script to check the maximum length of the data in a particular column of a .csv file.
The script takes three arguments: the .csv file name, the delimiter that the .csv file uses, and the zero-based number of the column that you want to parse.
When the script runs, it parses that column for every row in the .csv file and returns the length of the longest value found in that column.
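To make the idea concrete, here is a minimal sketch of the same check run on a few in-memory lines instead of a file. The function name `longest_in_column` and the sample rows are made up for illustration; they are not part of my script.

```python
def longest_in_column(lines, delimiter, column):
    """Return the length of the longest value found in one column."""
    longest = 0
    for line in lines:
        # Strip the line ending so it is not counted as part of the last column
        fields = line.rstrip("\r\n").split(delimiter)
        if column >= len(fields):
            raise IndexError("column %d not found in line: %r" % (column, line))
        longest = max(longest, len(fields[column]))
    return longest


# Hypothetical sample data: three rows, semicolon delimited
sample = ["1;Joe;Sacramento\n", "2;Alexandra;LA\n", "3;Bo;San Francisco\n"]
print(longest_in_column(sample, ";", 1))  # "Alexandra" -> 9
print(longest_in_column(sample, ";", 2))  # "San Francisco" -> 13
```

The number that comes back is what you would compare against the width of the target database column before importing.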
The script runs very fast and does what it is supposed to do nicely.
Python is very cool and something that I will keep on learning.
Here is the code that I wrote to parse .csv files:
#===============================================================================
# Program....: ParseCSV.py
# Author.....: Joe Pitz
# Date.......: 09/13/2011
# Description: Pass .csv file, delimiter and which column to parse;
#              program will return longest length of column, zero based
#===============================================================================
#
import sys

args = sys.argv[1:]

if len(args) != 3:
    print("ParseCSV takes three arguments")
    print("<.csv file> <delimiter> <column to parse, zero based>")
    # Quote the delimiter so the shell does not treat ; as a command separator
    print('Ex: ParseCSV file.csv ";" 4')
    sys.exit(1)

try:
    file = args[0]
    delimiter = args[1]
    column = int(args[2])   # raises ValueError if this is not a number
    chrCnt = 0

    # "with" closes the file for us, even if an error occurs part-way through
    with open(file) as f:
        for line in f:
            # Strip the line ending so it is not counted in the last column
            llist = line.rstrip("\r\n").split(delimiter)

            # Check for number of columns (column is zero based)
            numCols = len(llist)
            if column >= numCols:
                print("ERROR -> column argument is greater than number of columns in file <- ERROR")
                sys.exit(1)

            colLen = len(llist[column])
            if colLen > chrCnt:
                chrCnt = colLen

    print("Longest column is " + str(chrCnt))

except (TypeError, ValueError):
    print("ERROR -> Check your parameters <- ERROR")
except IOError:
    # Must come before the generic handler below, or it would never be reached
    print("ERROR -> Error finding or opening .csv file <- ERROR")
except Exception:
    print("ERROR -> Error Parsing .csv File <- ERROR")
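One limitation of splitting each line on the delimiter is that a quoted value containing the delimiter itself, such as "Smith; John", gets counted as two columns. A possible variation, which is not what my script does, is to let the csv module from Python's standard library do the splitting. This is only a sketch; the function name `max_field_length` and the sample data are hypothetical.

```python
import csv
import io


def max_field_length(stream, delimiter, column):
    """Return the longest value length in one column of an open CSV stream."""
    longest = 0
    for row in csv.reader(stream, delimiter=delimiter):
        if not row:
            continue  # csv.reader yields an empty list for blank lines
        if column >= len(row):
            raise IndexError("column %d not found in row %r" % (column, row))
        longest = max(longest, len(row[column]))
    return longest


# Hypothetical sample: the quoted name contains the delimiter itself
data = io.StringIO('1;"Smith; John";CA\n2;Lee;NV\n')
print(max_field_length(data, ";", 1))  # "Smith; John" -> 11
```

With a real file you would pass an open file object instead of the StringIO, opened with newline="" as the csv module documentation recommends.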