Sorting Data in Python Pandas: A Complete Guide


Tags: Python Pandas

Sorting is one of the most common operations in data analysis. Whether you're ranking customers by sales, listing products by price, or ordering dates chronologically, Pandas usually handles it with a single method call.

In this guide, you'll learn:

  • ✅ How to sort data in Pandas DataFrames and Series

  • ✅ Sorting by one or multiple columns

  • ✅ Sorting by index

  • ✅ Sorting with custom order and options

  • ✅ Full working examples

  • ✅ Tips and common pitfalls


What is Sorting in Pandas?

In Pandas, sorting refers to reordering data by column values or row/index labels, using:

  • sort_values() – Sorts by column(s)

  • sort_index() – Sorts by index labels


Step 1: Import Pandas and Create Sample Data

import pandas as pd

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95]
}

df = pd.DataFrame(data)
print(df)

Output:

      Name  Age  Score
0    Alice   25     88
1      Bob   32     92
2  Charlie   37     85
3    David   29     95

Step 2: Sort by a Single Column

Sort by Score (ascending):

df_sorted = df.sort_values(by='Score')

Sort by Score (descending):

df_sorted_desc = df.sort_values(by='Score', ascending=False)
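To make the effect of the two calls above visible, here is a minimal sketch that rebuilds the sample DataFrame from Step 1 and prints both results. The variables asc_names and desc_names are only illustrative helpers, not part of Pandas.

```python
import pandas as pd

# Same sample data as in Step 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

# sort_values() returns a new DataFrame; df itself is left untouched
df_sorted = df.sort_values(by='Score')
df_sorted_desc = df.sort_values(by='Score', ascending=False)

print(df_sorted)
print(df_sorted_desc)

# Illustrative helpers: the row order expressed as a list of names
asc_names = df_sorted['Name'].tolist()        # Charlie, Alice, Bob, David
desc_names = df_sorted_desc['Name'].tolist()  # David, Bob, Alice, Charlie
```

Note that each row keeps its original index label, so the ascending result carries the index 2, 0, 1, 3 rather than 0, 1, 2, 3.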

Step 3: Sort by Multiple Columns

Sometimes you want to sort by one column and break ties with another.

Example: Sort by Age, then by Score

df_multi = df.sort_values(by=['Age', 'Score'])

You can also sort in mixed order:

df_mixed = df.sort_values(by=['Age', 'Score'], ascending=[True, False])
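The sample data has no repeated ages, so the second column never actually breaks a tie there. The sketch below uses a hypothetical variant of the data (df_ties, with ages changed so that two pairs share an age) to show how the tie-break and the mixed order behave.

```python
import pandas as pd

# Hypothetical data: ages are altered from the sample so that ties exist
df_ties = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [29, 25, 29, 25],
    'Score': [88, 92, 85, 95],
})

# Age ascending, ties broken by Score ascending
by_age_then_score = df_ties.sort_values(by=['Age', 'Score'])

# Age ascending, ties broken by Score descending
mixed = df_ties.sort_values(by=['Age', 'Score'], ascending=[True, False])

print(by_age_then_score)  # Bob, David, Charlie, Alice
print(mixed)              # David, Bob, Alice, Charlie
```

The list passed to ascending needs one entry per column named in by.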

Step 4: Sort by Index

To sort by the index (row labels) instead of by column values:

df_index_sorted = df.sort_index()

Reverse index order:

df_index_desc = df.sort_index(ascending=False)
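On the sample DataFrame the index is already 0 to 3, so sort_index() appears to do nothing. A minimal sketch of a case where it matters: sort by value first, then use sort_index() to get back to the original row order. The names by_score, restored, and by_name_desc are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

# After a value sort the index labels are out of order: 2, 0, 1, 3
by_score = df.sort_values(by='Score')

# sort_index() puts the rows back in label order: 0, 1, 2, 3
restored = by_score.sort_index()

# sort_index() also works on string labels, here in reverse alphabetical order
by_name_desc = df.set_index('Name').sort_index(ascending=False)

print(restored)
print(by_name_desc)
```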

Step 5: Sort a Series

scores = pd.Series([88, 92, 85, 95], name='Scores')
scores_sorted = scores.sort_values()
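A Series has only one column of values, so sort_values() takes no by argument. The sketch below assumes a Series labeled with the names from the sample data (the labels are an addition for illustration) and shows value sorting in both directions plus sorting by label.

```python
import pandas as pd

# Labels are added here for illustration; the original Series uses 0..3
scores = pd.Series(
    [88, 92, 85, 95],
    index=['Alice', 'Bob', 'Charlie', 'David'],
    name='Scores',
)

ascending_scores = scores.sort_values()                  # lowest score first
descending_scores = scores.sort_values(ascending=False)  # highest score first
by_label = scores.sort_index(ascending=False)            # labels Z to A

print(ascending_scores)
print(descending_scores)
print(by_label)
```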

Additional Options in sort_values()

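Option        Description
by            Column name, or list of column names, to sort by
ascending     True (default) or False; pass a list to set the order per column
inplace       If True, sort the original DataFrame instead of returning a new one
na_position   'last' (default) or 'first'; controls where NaNs are placed
ignore_index  If True, relabel the result's index 0, 1, ..., n-1 (Pandas 1.0+)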
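As a rough illustration of two of these options, the sketch below applies ignore_index and inplace to the sample data. The variable names renumbered, df_copy, and result are illustrative; the copy is made only so the original df stays intact.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

# ignore_index=True discards the old labels and numbers the rows 0..n-1
renumbered = df.sort_values(by='Score', ignore_index=True)
print(renumbered)

# inplace=True reorders the object itself and returns None
df_copy = df.copy()
result = df_copy.sort_values(by='Score', inplace=True)
print(result)   # None
print(df_copy)  # now sorted by Score, with the original labels 2, 0, 1, 3
```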

Example with NaN values:

df_nan = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, None, 85, 95]
})

# Sort placing NaNs at the top
df_nan_sorted = df_nan.sort_values(by='Score', na_position='first')
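To see where the missing value actually lands, this sketch reruns the NaN example and compares it with a descending sort that leaves na_position at its default. Note that the None in the list is stored as NaN, which makes the Score column float64.

```python
import pandas as pd

df_nan = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, None, 85, 95],
})

# NaN row first, then the remaining scores in ascending order
nan_first = df_nan.sort_values(by='Score', na_position='first')

# Default na_position='last' applies even when sorting in descending order
nan_last_desc = df_nan.sort_values(by='Score', ascending=False)

print(nan_first)      # Bob (NaN), Charlie, Alice, David
print(nan_last_desc)  # David, Alice, Charlie, Bob (NaN)
```

In other words, NaN is not treated as the smallest or largest value; its position is controlled only by na_position.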

✅ Full Working Example

import pandas as pd

# Create sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95]
}
df = pd.DataFrame(data)

# Sort by Score descending
sorted_df = df.sort_values(by='Score', ascending=False)

# Sort by Age and Score
multi_sorted_df = df.sort_values(by=['Age', 'Score'], ascending=[True, False])

# Sort by index
index_sorted_df = df.sort_index()

# Display results
print("Sorted by Score Descending:")
print(sorted_df)

print("\nSorted by Age and Score:")
print(multi_sorted_df)

print("\nSorted by Index:")
print(index_sorted_df)

Tips & Best Practices

  • Use inplace=True only when you don't need the original row order. The call then returns None, so don't assign its result.

  • Always preview data after sorting with .head() or .tail().

  • Use ignore_index=True to reset row numbers after sorting, especially for clean exports.

  • For large datasets, avoid unnecessary sorts to save time and memory.
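One way to avoid a full sort, when only the top few rows are needed, is nlargest() (or nsmallest()). The sketch below compares it with sorting and taking head() on the sample data; whether it is noticeably faster depends on the size of your data, so treat this as an illustration of the idea rather than a benchmark.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

# Full sort, then keep the first two rows
top2_sorted = df.sort_values(by='Score', ascending=False).head(2)

# Same two rows without sorting the whole DataFrame
top2 = df.nlargest(2, 'Score')

print(top2_sorted)
print(top2)
```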


⚠️ Common Pitfalls

Problem                       Fix
KeyError when sorting         Double-check column names (they're case-sensitive)
NaNs not placed as expected   Use na_position='first' or 'last'
DataFrame not updated         Assign the result back, or use inplace=True
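Two of these pitfalls are easy to reproduce. The sketch below, using the sample data, shows that a misspelled column name raises a KeyError, and that calling sort_values() without assigning the result leaves the DataFrame as it was. The names error_name and unchanged are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 32, 37, 29],
    'Score': [88, 92, 85, 95],
})

# Pitfall 1: column names are case-sensitive, so 'score' is not 'Score'
error_name = None
try:
    df.sort_values(by='score')
except KeyError as exc:
    error_name = type(exc).__name__
    print(f"{error_name}: {exc}")

# Pitfall 2: the sorted copy is discarded, so df keeps its original order
df.sort_values(by='Score')
unchanged = df['Name'].tolist()
print(unchanged)

# Fix: assign the result back to the variable
df = df.sort_values(by='Score')
print(df['Name'].tolist())
```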

Summary

Sorting is a fundamental part of analyzing and presenting data. With just a few lines of code, Pandas makes it effortless to sort your data by values or index.

Key Takeaways:

  • Use sort_values() for column-based sorting

  • Use sort_index() for row/index-based sorting

  • Combine sorting with filtering, grouping, and other operations for powerful analysis