Open large CSV file
Posted on Friday December 05, 2014 by Eric Potvin
Opening large CSV files can sometimes be hard. Most programming languages can import or read CSV files, but doing so might require some coding or a huge memory allocation.
This is why I love Python! I had to parse a very badly formatted 5.6 GB CSV file. For this article I will not post the 5.6 GB file; instead, here's what we will work with.
first_name,last_name,phone,address,city,zip,first_appearance
Clark,Kent,2195550001,"344 Clinton Street, Apartment 3D",Metropolis,11111,"April 18, 1938"
Peter Benjamin,Parker,5185550002,"137 Chrystie Street","New York",22222,"1962"
Bruce,Wayne,2125550003,"1007 Mountain Drive","Gotham City",10001,"May 1939"
...
To import or read CSV files, we first need to import the csv module.
import csv
This will allow you to use the reader function, which will parse the CSV file properly.
data = csv.reader(open("myfile.csv", "rb"))
Then, we simply need to loop through the data to read it one row at a time.
for row in data:
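Because the reader hands back one row at a time, the whole file never has to sit in memory, which is what makes this approach workable for very large files. Here is a minimal sketch of that idea, written for Python 3. The count_rows helper and the inline sample standing in for myfile.csv are assumptions made for illustration, not part of the original script.

```python
import csv
import io

def count_rows(csvfile):
    """Count data rows by streaming; only one row is held at a time."""
    reader = csv.reader(csvfile)
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for _ in reader)

# Hypothetical stand-in for open("myfile.csv", newline="")
sample = io.StringIO(
    'first_name,last_name,phone,address,city,zip,first_appearance\n'
    'Clark,Kent,2195550001,"344 Clinton Street, Apartment 3D",Metropolis,11111,"April 18, 1938"\n'
    'Bruce,Wayne,2125550003,"1007 Mountain Drive","Gotham City",10001,"May 1939"\n'
)
print(count_rows(sample))
```

With a real file, the same function can be given an open file object instead of the in-memory sample.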
Let's say we need to import the first name, the address and the zip code. We will need to use the index of the array corresponding to each field in the CSV file: 0 for first_name, 3 for address and 5 for zip.
Here's the full script:
import csv

data = csv.reader(open("myfile.csv", "rb"))
for row in data:
    # first_name is column 0, address is column 3, zip is column 5
    print row[0],
    print ",",
    print row[3],
    print ",",
    print row[5]
This will output:
first_name , address , zip
Clark , 344 Clinton Street, Apartment 3D , 11111
Peter Benjamin , 137 Chrystie Street , 22222
Bruce , 1007 Mountain Drive , 10001
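Numeric indexes work, but they are easy to get wrong when a file has many columns. As an alternative, here is a small Python 3 sketch that looks fields up by their header name using csv.DictReader. The pick_fields helper and the inline sample data are hypothetical names introduced for this example only.

```python
import csv
import io

def pick_fields(csvfile, fields):
    """Yield the requested columns from each row, looked up by header name."""
    # DictReader consumes the first line as the header and streams the rest
    for record in csv.DictReader(csvfile):
        yield [record[name] for name in fields]

# Hypothetical stand-in for open("myfile.csv", newline="")
sample = io.StringIO(
    'first_name,last_name,phone,address,city,zip,first_appearance\n'
    'Clark,Kent,2195550001,"344 Clinton Street, Apartment 3D",Metropolis,11111,"April 18, 1938"\n'
    'Bruce,Wayne,2125550003,"1007 Mountain Drive","Gotham City",10001,"May 1939"\n'
)
for values in pick_fields(sample, ["first_name", "address", "zip"]):
    print(" , ".join(values))
```

Since DictReader treats the first line as the header, that line is not returned as a data row.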
Now, to write this output to another file, simply use this command:
python myScript.py > anotherfile.csv
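One caveat with redirecting printed output: Clark Kent's address contains a comma, so the redirected file will not parse back into three clean columns. If the result needs to be a valid CSV file, a possible approach is to let csv.writer handle the quoting. This is only a sketch for Python 3; the copy_columns helper and the in-memory files are assumptions used to keep the example self-contained.

```python
import csv
import io

def copy_columns(src, dst, indexes):
    """Stream rows from src to dst, keeping only the given column indexes."""
    writer = csv.writer(dst)  # quotes fields that contain commas
    for row in csv.reader(src):
        writer.writerow([row[i] for i in indexes])

# Hypothetical stand-ins for the input and output files
src = io.StringIO(
    'first_name,last_name,phone,address,city,zip,first_appearance\n'
    'Clark,Kent,2195550001,"344 Clinton Street, Apartment 3D",Metropolis,11111,"April 18, 1938"\n'
)
dst = io.StringIO()
copy_columns(src, dst, [0, 3, 5])
print(dst.getvalue())
```

With real files, the csv module documentation recommends opening both the input and the output with newline="" so that line endings are handled by the module itself.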