How to know (and display) all columns of a DataFrame in Python
Prerequisites
We will once again use the popular pandas library.
To install it, use one of the commands below:
For Python:
pip install pandas
For Python 3:
pip3 install pandas
You only need to install the library once per environment or machine.
You then need to load the library in every script or session, like this:
import pandas as pd
Display columns using pandas in Python
To display the columns of a DataFrame in Python, you have two solutions:
Solution 1
for col in df.columns:
    print(col)
Where the result would be:
col_A1
col_B2
col_C3
col_D4
Solution 2
print(df.columns)
Where the result would be:
Index(['col_A1', 'col_B2', 'col_C3', 'col_D4'], dtype='object')
Solution 1 loops over each column (col) in df.columns and displays it with print().
Solution 2 prints df.columns directly. This is clearly my favourite solution: the easiest and most straightforward.
Both solutions are good; which one to use depends on the output you want.
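If the output you want is a plain Python list rather than an Index object, the following minimal sketch may help. It rebuilds the sample data used in this article; the variable names column_names and same_names are only illustrative.

```python
import pandas as pd

# Same sample data as used in this article
df = pd.DataFrame({'col_A1': [1, 2, 3], 'col_B2': [4, 5, 6],
                   'col_C3': [7, 8, 9], 'col_D4': [10, 11, 12]})

# Convert the Index object to a plain Python list
column_names = df.columns.tolist()
print(column_names)

# Iterating over a DataFrame yields its column labels,
# so list(df) produces the same list
same_names = list(df)
print(same_names)
```

Both calls return the column labels in their original order, which is handy when you want to loop over them or pass them to another function.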
Let's look at it in detail.
Create a dataframe or import your data
In this case, I will create a sample DataFrame for this article:
data = {'col_A1': [1, 2, 3], 'col_B2': [4, 5, 6], 'col_C3': [7, 8, 9], 'col_D4': [10, 11, 12]}
df = pd.DataFrame(data)
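If you import your data instead of creating it, the column names come from the file header. Here is a minimal sketch, assuming a CSV source; the CSV text and the name df_from_csv are hypothetical, and an in-memory buffer stands in for a real file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk
csv_text = "col_A1,col_B2,col_C3,col_D4\n1,4,7,10\n2,5,8,11\n3,6,9,12\n"

# pd.read_csv accepts a file path or any file-like object
df_from_csv = pd.read_csv(io.StringIO(csv_text))

# The header row of the CSV becomes the column labels
print(df_from_csv.columns.tolist())
```

With a real file, you would pass its path to pd.read_csv in place of the buffer.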
How to make sure all columns are displayed?
To display all columns, I would use the pandas set_option function with display.max_columns: pd.set_option('display.max_columns', None).
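The option matters most for wide DataFrames, where pandas may otherwise shorten the printed output with "...". As a minimal sketch, the example below uses a hypothetical 50-column DataFrame (wide_df) and pd.option_context, which applies the same display.max_columns setting only temporarily.

```python
import pandas as pd

# Hypothetical wide DataFrame: one row, 50 columns named col_0 ... col_49
wide_df = pd.DataFrame([list(range(50))],
                       columns=[f'col_{i}' for i in range(50)])

# Lift the column limit only inside this block; the previous setting
# is restored automatically when the block ends
with pd.option_context('display.max_columns', None):
    all_columns_text = repr(wide_df)

# Every column now appears in the printed output (wrapped over several lines)
print(all_columns_text)
```

Using pd.set_option changes the setting for the rest of the session, while the with-block keeps the change local; pd.reset_option('display.max_columns') brings back the default.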
Would df.info() work in this case? Let's see it together!
Display columns with df.info()
The result of df.info() would be:
RangeIndex: 3 entries, 0 to 2
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   col_A1  3 non-null      int64
 1   col_B2  3 non-null      int64
 2   col_C3  3 non-null      int64
 3   col_D4  3 non-null      int64
dtypes: int64(4)
This gives you the answer, with additional context such as the data type of each column.
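For reference, here is a minimal sketch of the call itself, using the article's sample data. df.info() prints its summary and returns None, so the sketch also shows how to capture the text through the buf parameter; the names buffer and info_text are only illustrative.

```python
import io
import pandas as pd

# Same sample data as used in this article
df = pd.DataFrame({'col_A1': [1, 2, 3], 'col_B2': [4, 5, 6],
                   'col_C3': [7, 8, 9], 'col_D4': [10, 11, 12]})

# Prints the summary straight to the console and returns None
df.info()

# To keep the summary as a string, write it to a text buffer instead
buffer = io.StringIO()
df.info(buf=buffer)
info_text = buffer.getvalue()
```

Capturing the text can be useful if you want to log the summary or save it to a file.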
After checking df.info(), let's look at df.describe().
Display columns with df.describe()
And if you try df.describe(), does it answer the question?
In this case the result is more complex, but it does give you the column names:
       col_A1  col_B2  col_C3  col_D4
count     3.0     3.0     3.0     3.0
mean      2.0     5.0     8.0    11.0
std       1.0     1.0     1.0     1.0
min       1.0     4.0     7.0    10.0
25%       1.5     4.5     7.5    10.5
50%       2.0     5.0     8.0    11.0
75%       2.5     5.5     8.5    11.5
max       3.0     6.0     9.0    12.0
In my opinion, this is a bit overkill if all you need is the column names. Don't you think?
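There is one more reason to be careful here: by default, df.describe() only summarises numeric columns, so it may not list every column. A minimal sketch, assuming a hypothetical DataFrame (mixed_df) that mixes a numeric column and a text column:

```python
import pandas as pd

# Hypothetical DataFrame mixing a numeric column and a text column
mixed_df = pd.DataFrame({'col_A1': [1, 2, 3], 'name': ['x', 'y', 'z']})

# describe() keeps only the numeric column by default
described_columns = mixed_df.describe().columns.tolist()
print(described_columns)

# df.columns always lists every column
all_columns = mixed_df.columns.tolist()
print(all_columns)
```

In this sketch the text column is missing from the describe() result, which is why df.columns remains the safer choice for listing column names.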