How to Read JSON Files in Python Using Pandas

Last updated 3 weeks, 6 days ago | 89 views 75     5

Tags:- Python Pandas

JSON (JavaScript Object Notation) is a popular format for exchanging data — commonly used in APIs, web services, and configuration files. In Python, working with JSON is simple and efficient using the Pandas library.

In this guide, you’ll learn:

  • ✅ What JSON is and why it matters

  • ✅ How to read JSON files with pandas.read_json()

  • ✅ Handling nested JSON

  • ✅ Common parameters and use cases

  • ✅ Full working code examples

  • ✅ Tips and common pitfalls


What is JSON?

JSON represents data in a hierarchical format using key-value pairs. For example:

{
  "name": "Alice",
  "age": 25,
  "skills": ["Python", "SQL"]
}

This format is readable by both humans and machines and is widely used in web APIs and configuration files.


Why Use Pandas to Read JSON?

Pandas offers powerful methods to convert JSON directly into a DataFrame — making it easier to analyze, filter, and manipulate structured data.

pandas.read_json() reads JSON strings or files and returns a DataFrame or Series.


Basic Syntax

import pandas as pd

df = pd.read_json('data.json')
print(df.head())
  • If data.json is a local file, Pandas will read it and convert it into a DataFrame.

  • If the JSON is properly formatted, you'll get a table with columns and rows ready to work with.


Reading JSON from Different Sources

1️⃣ Reading from a Local File

df = pd.read_json('employees.json')

2️⃣ Reading from a JSON String

import json

json_data = '''
[
  {"name": "Alice", "age": 25},
  {"name": "Bob", "age": 30}
]
'''

df = pd.read_json(json_data)
print(df)

File Structure Matters

Example 1: JSON Array (Common Case)

[
  {"id": 1, "name": "Alice"},
  {"id": 2, "name": "Bob"}
]

✅ Works directly with pd.read_json().

Example 2: Nested JSON

{
  "employees": [
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 30}
  ]
}

❌ This won’t work directly — use json_normalize():

import json
from pandas import json_normalize

with open('nested.json') as f:
    data = json.load(f)

df = json_normalize(data['employees'])
print(df)

Common Parameters in read_json()

Parameter Description
path_or_buf File path, URL, or string
orient Format of JSON string (records, split, index, columns, values, table)
typ Return type: frame (default) or series
convert_dates Try to parse datetimes
lines Set to True for JSON lines format
encoding Encoding for the input (e.g., 'utf-8')

Reading JSON Lines (.jsonl or .ndjson)

Some large files are stored as JSON Lines — one record per line.

{"id":1,"name":"Alice"}
{"id":2,"name":"Bob"}

Use:

df = pd.read_json('data.jsonl', lines=True)

Full Working Example

Let’s say you have a JSON file students.json:

[
  {"name": "Alice", "scores": {"math": 90, "english": 85}},
  {"name": "Bob", "scores": {"math": 78, "english": 82}}
]

Step-by-step Code

import json
import pandas as pd
from pandas import json_normalize

# Load the JSON data from file
with open('students.json') as file:
    data = json.load(file)

# Normalize nested fields
df = json_normalize(data, sep='_')
print(df)

Output

    name  scores_math  scores_english
0  Alice           90              85
1    Bob           78              82

You can now perform any Pandas operations on this DataFrame!


⚠️ Common Pitfalls

Problem Solution
ValueError: Expected object or value File might be empty or invalid JSON
❌ Nested JSON isn’t flattened Use json_normalize() to flatten
❌ JSON lines format not working Set lines=True in read_json()
❌ Wrong orientation Try changing orient='records' or 'index'

Pro Tips

  • Use json_normalize() for nested structures and hierarchical JSON.

  • Combine with requests to load JSON from web APIs:

    import requests
    
    url = 'https://api.example.com/users'
    response = requests.get(url)
    df = pd.json_normalize(response.json())
    
  • Always inspect your DataFrame with .info() and .head() before doing analysis.


Summary

Reading JSON with Pandas is a fast and efficient way to load and analyze structured data. Whether it's a flat file or nested structure, Pandas provides all the tools you need.

✅ Key Takeaways:

  • Use pd.read_json() for flat JSON

  • Use json_normalize() for nested JSON

  • Use lines=True for .jsonl or newline-delimited JSON

  • Handle API data or web sources using requests