You can save a dataframe into different formats.
Consider the column ‘Released’ in the following dataframe:
Index | Artist | Album | Released | Length | Genres | Millions sold |
---|---|---|---|---|---|---|
0 | Michael Jackson | Thriller | 1982 | 0:42:19 | pop, rock, R&B | 46 |
1 | AC/DC | Back in Black | 1980 | 0:42:11 | hard rock | 26.1 |
2 | Pink Floyd | The Dark Side of the Moon | 1973 | 0:42:49 | progressive rock | 24.2 |
3 | Whitney Houston | The Bodyguard | 1992 | 0:57:44 | R&B, soul, pop | 27.4 |
4 | Meat Loaf | Bat Out of Hell | 1977 | 0:46:33 | hard rock, progressive rock | 20.6 |
5 | Eagles | Their Greatest Hits (1971-1975) | 1976 | 0:43:08 | rock, soft rock, folk rock | 32.2 |
6 | Bee Gees | Saturday Night Fever | 1977 | 1:15:54 | disco | 20.6 |
7 | Fleetwood Mac | Rumours | 1977 | 0:40:01 | soft rock | 27.9 |
To find the unique years in the column ‘Released’, we call the unique() method on that column:
year_list = df['Released'].unique()
print(year_list)
--> [1982 1980 1973 1992 1977 1976]
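As a self-contained sketch of the call above, the snippet below rebuilds only the ‘Released’ column of the table (the other columns are left out for brevity, which is an assumption made for this example). Note that unique() is a method and needs its parentheses; it returns a NumPy array of the distinct values in order of first appearance.

```python
import pandas as pd

# Only the 'Released' column of the example table is rebuilt here.
df = pd.DataFrame({"Released": [1982, 1980, 1973, 1992, 1977, 1976, 1977, 1977]})

# unique() returns the distinct values in the order they first appear.
year_list = df["Released"].unique()

# tolist() converts the NumPy array to a plain Python list for display.
print(year_list.tolist())  # [1982, 1980, 1973, 1992, 1977, 1976]
```

The year 1977 occurs three times in the column but only once in the result.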
Say we want to return only the rows for albums released in 1980 or later. This can be written in one line of code, but it is easier to understand in two steps.
First --> df['Released']>=1980
returns a new column of booleans (a pandas Series), one per row, holding the result of the inequality for that row.
Index | Released |
---|---|
0 | True |
1 | True |
2 | False |
3 | True |
4 | False |
5 | False |
6 | False |
7 | False |
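The first step can be tried on its own. The sketch below again rebuilds only the ‘Released’ column; the name `mask` is chosen here for illustration and does not come from the text.

```python
import pandas as pd

df = pd.DataFrame({"Released": [1982, 1980, 1973, 1992, 1977, 1976, 1977, 1977]})

# Comparing a column with a number is applied row by row
# and yields a boolean Series with the same index as df.
mask = df["Released"] >= 1980

print(mask.tolist())    # [True, True, False, True, False, False, False, False]

# True counts as 1, so summing the mask counts the matching rows.
print(int(mask.sum()))  # 3
```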
If we then want to create a new dataframe that only has the rows for which the inequality is True, we use the following syntax:
df1 = df[df['Released']>=1980]
Now df1 is a dataframe that looks like this (the rows keep their original index labels):
Index | Artist | Album | Released | Length | Genres | Millions sold |
---|---|---|---|---|---|---|
0 | Michael Jackson | Thriller | 1982 | 0:42:19 | pop, rock, R&B | 46 |
1 | AC/DC | Back in Black | 1980 | 0:42:11 | hard rock | 26.1 |
3 | Whitney Houston | The Bodyguard | 1992 | 0:57:44 | R&B, soul, pop | 27.4 |
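The filtering step can be sketched end to end as follows. Only three of the table's columns are rebuilt, to keep the example short. Boolean filtering keeps each row's original index label; if a fresh 0, 1, 2 numbering is wanted, reset_index(drop=True) is one way to get it. The name `df1_renumbered` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Artist": ["Michael Jackson", "AC/DC", "Pink Floyd", "Whitney Houston",
               "Meat Loaf", "Eagles", "Bee Gees", "Fleetwood Mac"],
    "Album": ["Thriller", "Back in Black", "The Dark Side of the Moon",
              "The Bodyguard", "Bat Out of Hell",
              "Their Greatest Hits (1971-1975)", "Saturday Night Fever", "Rumours"],
    "Released": [1982, 1980, 1973, 1992, 1977, 1976, 1977, 1977],
})

# Keep only the rows where the boolean mask is True.
df1 = df[df["Released"] >= 1980]
print(df1["Album"].tolist())  # ['Thriller', 'Back in Black', 'The Bodyguard']
print(df1.index.tolist())     # [0, 1, 3]

# Optional: discard the old labels and renumber the rows from zero.
df1_renumbered = df1.reset_index(drop=True)
print(df1_renumbered.index.tolist())  # [0, 1, 2]
```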
To save the new df1 dataframe to a CSV file, we use the to_csv() method.
df1.to_csv('new_songs.csv')
This will create a new csv file in the working directory with our data stored in it.
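By default to_csv() also writes the index as the first column of the file. The sketch below passes index=False to leave it out and reads the file back with read_csv() to check the round trip. The temporary directory is an assumption made for this example so that it leaves no files behind; in practice a plain file name such as 'new_songs.csv' works as shown above.

```python
import os
import tempfile

import pandas as pd

# A small stand-in for df1 with three of its columns.
df1 = pd.DataFrame(
    {
        "Artist": ["Michael Jackson", "AC/DC", "Whitney Houston"],
        "Album": ["Thriller", "Back in Black", "The Bodyguard"],
        "Released": [1982, 1980, 1992],
    },
    index=[0, 1, 3],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "new_songs.csv")
    # index=False keeps the index labels out of the file.
    df1.to_csv(path, index=False)
    # Read the file back into a new dataframe.
    restored = pd.read_csv(path)

print(restored.columns.tolist())      # ['Artist', 'Album', 'Released']
print(restored["Released"].tolist())  # [1982, 1980, 1992]
```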
There are many other file formats into which you can save your data using Pandas.
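As one illustration, the sketch below serializes a small stand-in for df1 to JSON with to_json(). Called without a path, the method returns the JSON text instead of writing a file. Other writers such as to_excel() and to_parquet() exist as well, but they rely on optional packages being installed, so they appear here only in a comment.

```python
import json

import pandas as pd

df1 = pd.DataFrame({"Album": ["Thriller", "Back in Black"], "Released": [1982, 1980]})

# orient="records" produces one JSON object per row.
json_text = df1.to_json(orient="records")

# Parse the text back with the standard library to inspect it.
records = json.loads(json_text)
print(records[0]["Album"])     # Thriller
print(records[1]["Released"])  # 1980

# Passing a file name instead, e.g. df1.to_json("new_songs.json", orient="records"),
# writes to disk. to_excel() and to_parquet() work similarly but need extra packages.
```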