Learn Python Series (#29) - Handling CSV
Repository
https://github.com/python/cpython
What will I learn?
- You will learn what a CSV file containing tabular data looks like, both in a spreadsheet program, inside a CSV file using a code editor, and using a Python interpreter.
- how to use the
csv
module for reading, parsing and writing CSV files, - how to handle both CSV list and dictionary elements,
- how to change the standard CSV field delimiter
Requirements
- A working modern computer running macOS, Windows or Ubuntu;
- An installed Python 3(.6) distribution, such as (for example) the Anaconda Distribution;
- The ambition to learn Python programming.
Difficulty
- Beginner
Curriculum (of the Learn Python Series
):
- Learn Python Series - Intro
- Learn Python Series (#2) - Handling Strings Part 1
- Learn Python Series (#3) - Handling Strings Part 2
- Learn Python Series (#4) - Round-Up #1
- Learn Python Series (#5) - Handling Lists Part 1
- Learn Python Series (#6) - Handling Lists Part 2
- Learn Python Series (#7) - Handling Dictionaries
- Learn Python Series (#8) - Handling Tuples
- Learn Python Series (#9) - Using Import
- Learn Python Series (#10) - Matplotlib Part 1
- Learn Python Series (#11) - NumPy Part 1
- Learn Python Series (#12) - Handling Files
- Learn Python Series (#13) - Mini Project - Developing a Web Crawler Part 1
- Learn Python Series (#14) - Mini Project - Developing a Web Crawler Part 2
- Learn Python Series (#15) - Handling JSON
- Learn Python Series (#16) - Mini Project - Developing a Web Crawler Part 3
- Learn Python Series (#17) - Roundup #2 - Combining and analyzing any-to-any multi-currency historical data
- Learn Python Series (#18) - PyMongo Part 1
- Learn Python Series (#19) - PyMongo Part 2
- Learn Python Series (#20) - PyMongo Part 3
- Learn Python Series (#21) - Handling Dates and Time Part 1
- Learn Python Series (#22) - Handling Dates and Time Part 2
- Learn Python Series (#23) - Handling Regular Expressions Part 1
- Learn Python Series (#24) - Handling Regular Expressions Part 2
- Learn Python Series (#25) - Handling Regular Expressions Part 3
- Learn Python Series (#26) - pipenv & Visual Studio Code
- Learn Python Series (#27) - Handling Strings Part 3 (F-Strings)
- Learn Python Series (#28) - Using Pickle and Shelve
Additional sample code files
The full - and working! - iPython tutorial sample code file is included for you to download and run for yourself right here:
https://github.com/realScipio/learn-python-series/blob/master/csv-tut01.ipynb
GitHub Account
Learn Python Series (#29) - Handling CSV
Welcome to already episode #29 of the Learn Python Series
! We've already discussed handling files in general, handling JSON, in the previous epispde we talked about pickling, so we must also briefly touch upon a well-known and widely used data format: CSV
.
When working with data, then CSV (Comma Separated Values) is a very commonly used format to import and export tabular data to and from spreadsheets and also databases.
However, because a well-defined CSV standard is missing, a uniform way to implement field delimiters (such as commas) is absent. These subtle differences in CSV formats make self-parsing of CSV files a bit cumbersome. Luckily most Python distributions (such as Anaconda) come with a bundled csv
module help a lot with reading and writing "out of the box".
Let's find out how the csv
module works!
What does a CSV file look like?
When importing / opening a CSV-file in a spreadsheet program (such as MS Excel, OpenOffice Calc), the data contained in the CSV is displayed in a tabular (table) format, like this:
ID | Name | City |
---|---|---|
1 | Jack | New York |
2 | Paula | Dublin |
3 | Marly | Melbourne |
4 | Tony | London |
5 | Andrea | Rome |
6 | Julie | Paris |
7 | Bernhard | Berlin |
8 | Frank | San Francisco |
9 | Johan | Amsterdam |
10 | Maria | Barcelona |
The underlying data structure of the above displayed CSV file, is structured in a human-readable form. It's actually just like a regular text file (you could open a CSV file with your favorite code editor as well), and each field is simply separated by a comma (hence the name CSV), like so:
ID,Name,City
1,Jack,New York
2,Paula,Dublin
3,Marly,Melbourne
4,Tony,London
5,Andrea,Rome
6,Julie,Paris
7,Bernhard,Berlin
8,Frank,San Francisco
9,Johan,Amsterdam
10,Maria,Barcelona
Reading CSV files
Let's suppose we have a file named names.csv
containing the exact data as displayed above. Using Python's csv
module, and the with
keyword we've been using in the previous episode(s), reading a .csv
file is done like so:
First import the csv
module:
import csv
Then we'll open our file names.csv
in read-mode, using flag r
, and we'll assign it to file object f. Then we need to use the csv.reader()
method and pass in our file object. Let's assign it as data
:
with open('names.csv', 'r') as f:
data = csv.reader(f)
print(type(data))
<class '_csv.reader'>
As you can see, the CSV file data is now extracted into variable data
which is a reader
object, which is an iterable. So let's now first create an empty list content
, then iterate over the reader
object, append each data row as a list to the content
list and then print it:
import pprint
with open('names.csv', 'r') as f:
data = csv.reader(f)
content = []
for data_row in data:
content.append(data_row)
pprint.pprint(content)
[['ID', 'Name', 'City'],
['1', 'Jack', 'New York'],
['2', 'Paula', 'Dublin'],
['3', 'Marly', 'Melbourne'],
['4', 'Tony', 'London'],
['5', 'Andrea', 'Rome'],
['6', 'Julie', 'Paris'],
['7', 'Bernhard', 'Berlin'],
['8', 'Frank', 'San Francisco'],
['9', 'Johan', 'Amsterdam'],
['10', 'Maria', 'Barcelona']]
As you can see, each returned data_row
is a list of 3 items.
Writing CSV files
We can apply the same technique to write data to a persistent .csv
file on disk, this time using the csv.writer()
method.
Let's create a list of lists containing some data, again use the with
keyword and this time open a new file in write-mode (using the w
, for write, flag), againi assign f
as a file object, and the apply the csv.writer()
method to create a writer
object.
Then we'll use that writer
object's method writerows()
to actually write each item in the cryptos
list to file, by passing in cryptos
as writerows()
's argument, like so:
cryptos = [
['ID', 'Name', 'Abbreviation'],
['1', 'Bitcoin','BTC'],
['2', 'Litecoin','LTC'],
['3', 'Steem','STEEM'],
['4', 'Steem Backed Dollar','SBD'],
['5', 'IoTeX','IOTX'],
['6', 'Stellar','XLM'],
['7', 'EOS','EOS'],
['8', 'Ethereum','ETH'],
['9', 'Cardano','ADA'],
['10', 'Dash','DASH']
]
with open('cryptos.csv', 'w') as f:
writer = csv.writer(f)
print(type(writer))
writer.writerows(cryptos)
<class '_csv.writer'>
At this point, a valid CSV file named cryptos.csv
is written to disk, and to check if it's contents are valid, we can read it back in like we did before:
import pprint
with open('cryptos.csv', 'r') as f:
data = csv.reader(f)
content = []
for data_row in data:
content.append(data_row)
pprint.pprint(content)
[['ID', 'Name', 'Abbreviation'],
['1', 'Bitcoin', 'BTC'],
['2', 'Litecoin', 'LTC'],
['3', 'Steem', 'STEEM'],
['4', 'Steem Backed Dollar', 'SBD'],
['5', 'IoTeX', 'IOTX'],
['6', 'Stellar', 'XLM'],
['7', 'EOS', 'EOS'],
['8', 'Ethereum', 'ETH'],
['9', 'Cardano', 'ADA'],
['10', 'Dash', 'DASH']]
Works like a charm!
csv
's classes DictReader and DictWriter
Thus far we've been reading and writing lists with strings inside them to and from CSV files. But the csv
module also contains the DictReader and DictWriter classes using Python dictionaries instead of lists containing string items.
Reading with DictReader
DictReader
creates an object and maps it to a dictionary. The dictionary keys are either set using the optional fieldnames
parameter, or - when not passed-in as an argument - read from the first line of the CSV file (and now you know why I have been consistently adding fieldnames in the first row of each CSV file ;-) ).
Let's now use the DictReader
class to read in the names.csv
file again, print a simple multiline F-String (see episode #27 on F-Strings) and please observe that I now use the same dictionary keys as contained in the first line of the names.csv
file for reference.
import csv
content = []
with open('names.csv') as f:
data = csv.DictReader(f)
for data_row in data:
content.append(data_row)
print(
f"Person #{data_row['ID']}, "
f"{data_row['Name']}, "
f"lives in {data_row['City']}"
)
Person #1, Jack, lives in New York
Person #2, Paula, lives in Dublin
Person #3, Marly, lives in Melbourne
Person #4, Tony, lives in London
Person #5, Andrea, lives in Rome
Person #6, Julie, lives in Paris
Person #7, Bernhard, lives in Berlin
Person #8, Frank, lives in San Francisco
Person #9, Johan, lives in Amsterdam
Person #10, Maria, lives in Barcelona
Writing with DictWriter
Let's now try to write a CSV file using the DictWriter
class, where we begin with creating a list of objects utopian_contributors
(instead of a list of lists, or a list of strings). Because we're dealing with tabular data, every object inside the list has the same data structure.
We also create a list containing the fieldnames:
fieldnames = ['name', 'category']
When writing the CSV file, we first call the method writeheader()
, in order to write the first row containing the fieldnames; as you may have noticed, this time I did not include the fieldnames as the first list item in the utopian_contributors
data list.
And finally, we'll call the method writerows()
and pass in the utopian_contributors
data list.
Nota bene: it's also possible to use the method writerow()
, for example in a for
loop, to write each data row individually.
import csv
utopian_contributors = [
{'name': 'scipio', 'category': 'tutorials'},
{'name': 'rosatravels', 'category': 'tutorials'},
{'name': 'holger80', 'category': 'development'},
{'name': 'fabiyamada', 'category': 'graphics'}
]
with open('contributors.csv', 'w') as f:
fieldnames = ['name', 'category']
data = csv.DictWriter(f, fieldnames=fieldnames)
data.writeheader()
data.writerows(utopian_contributors)
Changing delimiters (other than the default comma ,
)
It's also possible to set an alternative field delimiter
prior to executing a reader or writer method. For example you don't want to use the standard comma field delimiter, but a semi-colon, or a space.
This works as follows:
with open('contributors_semicolon.csv', 'w') as f:
fieldnames = ['name', 'category']
data = csv.DictWriter(f, fieldnames=fieldnames, delimiter=';')
data.writeheader()
data.writerows(utopian_contributors)
Please observe the ;
delimiter being applied!
This technique applies to both reading and writing CSV data.
Thank you, scipio. Upvoted and resteemed!
@ArtTurtle is an upvote bot run by @Artopium dedicated to upvoting your art, music, fashion, video and books. Find out how you can get an upvote for every creative post you make by visitng @ArtTurtle and reading the latest report.
thanks! ;-)
Excellent Blog! Your account is actually the reason I stopped doing my python tutorials! You're already so far ahead :P
Really solid info here, as well as the rest of your posts. Will be keeping an eye out to see how far you go with this.
Thank you for the kind words, appreciated!
However, please don't think about stopping your own Python tutorials just because I already posted some! The Python universe is so big, there are plenty of topics to cover still. And as far as I'm concerned, feel free to even cover the same topics if you want to explain them differently; I always do my best to explain things as easy as I can, but I'm sure there are other ways as well.
Enjoy Python! :-)
Excellent Scipio.. useful blog..I really appreciate your move.
Thank you for the kind words! Here, let me get you upward from that 25 reputation level ;-)
Thanks Scipio for your kind cooperation..
Thank you for your contribution.
Your contribution has been evaluated according to Utopian policies and guidelines, as well as a predefined set of questions pertaining to the category.
To view those questions and the relevant answers related to your post, click here.
Need help? Write a ticket on https://support.utopian.io/.
Chat with us on Discord.
[utopian-moderator]
Hey @scipio
Thanks for contributing on Utopian.
We’re already looking forward to your next contribution!
Want to chat? Join us on Discord https://discord.gg/h52nFrV.
Vote for Utopian Witness!
Congratulations @scipio! You have completed the following achievement on Steemit and have been rewarded with new badge(s) :
Award for the number of upvotes
Click on the badge to view your Board of Honor.
If you no longer want to receive notifications, reply to this comment with the word
STOP
Do not miss the last post from @steemitboard:
SteemitBoard and the Veterans on Steemit - The First Community Badge.
Congratulations @scipio! You have completed the following achievement on Steemit and have been rewarded with new badge(s) :
Award for the number of comments received
Click on the badge to view your Board of Honor.
If you no longer want to receive notifications, reply to this comment with the word
STOP
Really cool tutorial! CSV files come really handy sometimes. For example, if you want to store some data but you don't want to create a database. Even though I haven't really used CSV files myself, it looks pretty straight-forward and simple to do so. There is one concern I have regarding this tutorial and that is that this tutorial doesn't follow multiple rules.
You can check out here the documentation for Python 3.6+ and here for Python 2.7. You can see that this topic is well documented elsewhere. The other rule is that.
To be honest, you have just explained some basics functions of this module. Anyways, thanks for making this future. It is really helpful.
You are @umais , @anoniam, @anonfiles, @masterbank, @masterbot, @driplo, and multiple others.
You stole money from people, I tried to stop you, and now you're trying to take it out on me from revealing what you did and for trying to stop you.
Your comment itself is utter nonsense: this episode is part of my entire
Learn Python Series
, and Utopian doesn't have "rules" anymore but guidelines.You created this account a few days go and you go straight to @utopian-io and @steemhunt to get more upvotes with exactly the same topics you covered before and that you learned from me.
I told you to stay away from me and to not harm others either.
Leave
I am not any of those accounts. You are typing that because I told the truth. I don't even know who any of those are. I am not taking any revenge, I just commented this in tipping session. @maverickinvictus told me about Utopian-io and Steemhunt and you can even ask him. Maybe I am wrong about the rules things (I am still new and can make mistakes). I have not harmed anyone. Now I know what happens when you criticize anyone or tell the truth. This really proves how much of a crybaby you are. Get over it.
"Proves" me being a crybaby? I have never interacted with the account @sneakin before.
Why would a newcomer immediately contribute to @utopian-io and @steemhunt ? You just want to grab money form others.
You are @umais , @anoniam, @anonfiles, @masterbank, @masterbot, @driplo, and multiple others. Of course you're not going to admit that, but you are.
I told you to leave me alone and to stop harming others.
LEAVE!
You can downvote all you want. You are literally crying over the comment that I made, is that a mature act?
Why the hell will I admit something that is wrong? Yes, I am not those accounts. Leave you alone? You started this in a first place, DMing and harassing me in the direct messages. I will leave when I want to, I have freedom of speech on what to talk and when to talk. You can't tell me what to do. If you want I will not interact with your account again. You are doing nothing but decreasing your reputation in eyes of others with all this. Have a nice day!
P.S As I said before, the community of SteemGigs told me about SteemHunt and Utopian.