Sorting is one of the most essential operations in data analysis. Whether you're ranking customers by sales, listing products by price, or ordering dates chronologically, Pandas makes it simple and powerful.
In this guide, you'll learn:
- ✅ How to sort data in Pandas DataFrames and Series
- ✅ Sorting by one or multiple columns
- ✅ Sorting by index
- ✅ Sorting with custom order and options
- ✅ Full working examples
- ✅ Tips and common pitfalls
What is Sorting in Pandas?
In Pandas, sorting refers to reordering data by column values or row/index labels, using:
- sort_values(): sorts by the values in one or more columns
- sort_index(): sorts by index labels
Step 1: Import Pandas and Create Sample Data
import pandas as pd
# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95]
}
df = pd.DataFrame(data)
print(df)
Output:
      Name  Age  Score
0    Alice   25     88
1      Bob   32     92
2  Charlie   37     85
3    David   29     95
Step 2: Sort by a Single Column
Sort by Score (ascending):
df_sorted = df.sort_values(by='Score')
Sort by Score (descending):
df_sorted_desc = df.sort_values(by='Score', ascending=False)
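To make the effect of both calls visible, here is a minimal sketch that rebuilds the sample DataFrame from Step 1 and prints the resulting row order. The variable names asc and desc are only illustrative. Note that sort_values() returns a new DataFrame, and each row keeps its original index label.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

asc = df.sort_values(by='Score')                    # lowest score first
desc = df.sort_values(by='Score', ascending=False)  # highest score first

print(asc['Name'].tolist())   # ['Charlie', 'Alice', 'Bob', 'David']
print(desc['Name'].tolist())  # ['David', 'Bob', 'Alice', 'Charlie']
print(asc.index.tolist())     # [2, 0, 1, 3] (labels travel with their rows)
print(df['Name'].tolist())    # ['Alice', 'Bob', 'Charlie', 'David'] (df is unchanged)
```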
Step 3: Sort by Multiple Columns
Sometimes you want to sort by one column and break ties with another.
Example: Sort by Age, then by Score
df_multi = df.sort_values(by=['Age', 'Score'])
You can also sort in mixed order:
df_mixed = df.sort_values(by=['Age', 'Score'], ascending=[True, False])
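Every age in the sample data is unique, so the second column never actually breaks a tie there. The sketch below uses a small hypothetical DataFrame (df_ties, invented for this illustration) with repeated ages, so the effect of the second key and of the mixed ascending list shows up.

```python
import pandas as pd

# Hypothetical data with repeated ages, made up for this illustration
df_ties = pd.DataFrame({
    'Name': ['Eve', 'Frank', 'Grace', 'Heidi'],
    'Age': [30, 25, 30, 25],
    'Score': [70, 90, 85, 60],
})

# Age ascending, then Score ascending within each age
both_asc = df_ties.sort_values(by=['Age', 'Score'])

# Age ascending, then Score descending within each age
mixed = df_ties.sort_values(by=['Age', 'Score'], ascending=[True, False])

print(both_asc['Name'].tolist())  # ['Heidi', 'Frank', 'Eve', 'Grace']
print(mixed['Name'].tolist())     # ['Frank', 'Heidi', 'Grace', 'Eve']
```

The ascending list must have one entry per column in by, in the same order.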
Step 4: Sort by Index
If you want to sort based on the index (row labels):
df_index_sorted = df.sort_index()
Reverse index order:
df_index_desc = df.sort_index(ascending=False)
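The sample DataFrame is already in index order, so sort_index() on it changes nothing. A more typical use, sketched below with illustrative variable names, is restoring the original row order after a value sort. The last line uses axis=1, which sorts the column labels instead of the row labels.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

by_score = df.sort_values(by='Score')         # rows are now out of index order
restored = by_score.sort_index()              # back to 0, 1, 2, 3
reversed_rows = df.sort_index(ascending=False)
by_column_name = df.sort_index(axis=1)        # sort column labels alphabetically

print(by_score.index.tolist())       # [2, 0, 1, 3]
print(restored.index.tolist())       # [0, 1, 2, 3]
print(reversed_rows.index.tolist())  # [3, 2, 1, 0]
print(list(by_column_name.columns))  # ['Age', 'Name', 'Score']
```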
Step 5: Sort a Series
scores = pd.Series([88, 92, 85, 95], name='Scores')
scores_sorted = scores.sort_values()
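A Series has only one set of values, so sort_values() takes no by argument here. The short sketch below (variable names are illustrative) shows the ascending and descending results along with the index labels that come with them.

```python
import pandas as pd

scores = pd.Series([88, 92, 85, 95], name='Scores')

asc = scores.sort_values()                  # lowest first
desc = scores.sort_values(ascending=False)  # highest first

print(asc.tolist())         # [85, 88, 92, 95]
print(asc.index.tolist())   # [2, 0, 1, 3]
print(desc.tolist())        # [95, 92, 88, 85]
print(desc.index.tolist())  # [3, 1, 0, 2]
```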
Additional Options in sort_values()
Option | Description |
---|---|
by | Column(s) to sort by |
ascending | True (default) or False; a list of booleans when sorting by several columns |
inplace | If True, modify the original DataFrame and return None (default False) |
na_position | 'last' (default) or 'first'; controls placement of NaNs |
ignore_index | If True, reset the index in the result (Pandas 1.0+) |
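For the custom order mentioned in the introduction, two approaches are sketched below. The first relies on the key argument of sort_values(), which assumes pandas 1.1 or later. The second converts the column to an ordered categorical. The DataFrame df_items, the size labels, and the helper names are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical data: sizes should sort small < medium < large, not alphabetically
df_items = pd.DataFrame({
    'Item': ['Tent', 'Mug', 'Lamp'],
    'Size': ['large', 'small', 'medium'],
})
size_order = ['small', 'medium', 'large']

# Plain sort is alphabetical: large, medium, small
alphabetical = df_items.sort_values(by='Size')

# Approach 1: key maps each label to its rank before sorting (pandas 1.1+)
rank = {label: i for i, label in enumerate(size_order)}
by_key = df_items.sort_values(by='Size', key=lambda col: col.map(rank))

# Approach 2: an ordered categorical carries the order with the column
sizes = pd.Categorical(df_items['Size'], categories=size_order, ordered=True)
by_category = df_items.assign(Size=sizes).sort_values(by='Size')

print(alphabetical['Item'].tolist())  # ['Tent', 'Lamp', 'Mug']
print(by_key['Item'].tolist())        # ['Mug', 'Lamp', 'Tent']
print(by_category['Item'].tolist())   # ['Mug', 'Lamp', 'Tent']
```

The key function receives the whole column as a Series and must return a Series of the same length. The categorical approach is a reasonable choice when the same order is needed in several places.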
Example with NaN values:
df_nan = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, None, 85, 95]
})
# Sort placing NaNs at the top
df_nan_sorted = df_nan.sort_values(by='Score', na_position='first')
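The sketch below compares the row order for the same data under different settings. It shows that NaNs go last by default in both ascending and descending sorts, so na_position is independent of ascending. The variable names are illustrative.

```python
import pandas as pd

df_nan = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, None, 85, 95],  # None becomes NaN in a numeric column
})

nan_first = df_nan.sort_values(by='Score', na_position='first')
nan_last = df_nan.sort_values(by='Score')                  # default: 'last'
desc = df_nan.sort_values(by='Score', ascending=False)     # NaN still last

print(nan_first['Name'].tolist())  # ['Bob', 'Charlie', 'Alice', 'David']
print(nan_last['Name'].tolist())   # ['Charlie', 'Alice', 'David', 'Bob']
print(desc['Name'].tolist())       # ['David', 'Alice', 'Charlie', 'Bob']
```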
✅ Full Working Example
import pandas as pd
# Create sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95]
}
df = pd.DataFrame(data)
# Sort by Score descending
sorted_df = df.sort_values(by='Score', ascending=False)
# Sort by Age and Score
multi_sorted_df = df.sort_values(by=['Age', 'Score'], ascending=[True, False])
# Sort by index
index_sorted_df = df.sort_index()
# Display results
print("Sorted by Score Descending:")
print(sorted_df)
print("\nSorted by Age and Score:")
print(multi_sorted_df)
print("\nSorted by Index:")
print(index_sorted_df)
Tips & Best Practices
- Use inplace=True only when you don't need the original DataFrame. The call then returns None, so don't assign its result.
- Always preview data after sorting with .head() or .tail().
- Use ignore_index=True to reset row numbers after sorting, especially for clean exports.
- For large datasets, avoid unnecessary sorts to save time and memory.
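The first three tips can be sketched together on the sample data from Step 1. The variable names below (ranked, work, returned) are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

# ignore_index=True renumbers the rows 0..n-1 in the sorted result
ranked = df.sort_values(by='Score', ascending=False, ignore_index=True)
print(ranked.head(2))          # preview the top two rows
print(ranked.index.tolist())   # [0, 1, 2, 3]

# inplace=True mutates the DataFrame and returns None
work = df.copy()               # copy so the original stays available
returned = work.sort_values(by='Score', inplace=True)
print(returned)                # None
print(work['Name'].tolist())   # ['Charlie', 'Alice', 'Bob', 'David']
```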
⚠️ Common Pitfalls
Problem | Fix |
---|---|
Sorting raises a KeyError | Double-check column names (they're case-sensitive) |
NaNs not placed as expected | Use na_position='first' or 'last' |
DataFrame not updated | Remember to assign result or use inplace=True |
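Two of these pitfalls are easy to reproduce. In the sketch below, a misspelled column name raises a KeyError, and a sort whose result is not assigned leaves the DataFrame unchanged. The flag key_error_raised is only there to make the outcome checkable.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 92, 85, 95],
})

# Pitfall: column names are case-sensitive, 'score' is not 'Score'
key_error_raised = False
try:
    df.sort_values(by='score')
except KeyError as exc:
    key_error_raised = True
    print('KeyError:', exc)

# Pitfall: the sorted result is discarded, df keeps its original order
df.sort_values(by='Score')
print(df['Name'].tolist())  # ['Alice', 'Bob', 'Charlie', 'David']

# Fix: assign the result back
df = df.sort_values(by='Score')
print(df['Name'].tolist())  # ['Charlie', 'Alice', 'Bob', 'David']
```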
Summary
Sorting is a fundamental part of analyzing and presenting data. With just a few lines of code, Pandas makes it effortless to sort your data by values or index.
Key Takeaways:
- Use sort_values() for column-based sorting
- Use sort_index() for row/index-based sorting
- Combine sorting with filtering, grouping, and other operations for powerful analysis