Python Pandas - Reading and Writing JSON Files



JSON (JavaScript Object Notation) is a lightweight, human-readable data-interchange format used for data storage and transfer, most commonly for transmitting data between a server and a web application. Python's Pandas library reads and writes JSON through the read_json() function and the to_json() method.

A JSON file stores data in a structured format that resembles a Python dictionary or list, and conventionally has the .json extension. The following shows what the data in a JSON file looks like −

[
    {
        "Name": "Braund",
        "Gender": "Male",
        "Age": 30
    },
    {
        "Name": "Cumings",
        "Gender": "Female",
        "Age": 25
    },
    {
        "Name": "Heikkinen",
        "Gender": "Female",
        "Age": 35
    }
]
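To make the resemblance to Python lists and dictionaries concrete, here is a minimal sketch that parses a shortened version of the data above with the standard json module. The variable names text and records are illustrative only.

```python
import json

# A shortened version of the JSON data shown above
text = """[
    {"Name": "Braund", "Gender": "Male", "Age": 30},
    {"Name": "Cumings", "Gender": "Female", "Age": 25}
]"""

# json.loads turns the JSON array into a Python list of dictionaries
records = json.loads(text)

print(type(records).__name__, type(records[0]).__name__)
print(records[0]["Name"])
```

Each JSON object becomes a dict and the enclosing array becomes a list, which is the same records-style layout that Pandas turns into DataFrame rows.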

In this tutorial, we will learn the basics of working with JSON files using Pandas, including reading and writing JSON files and some common configuration options.

Reading JSON Files with Pandas

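The pandas.read_json() function reads JSON data into a Pandas DataFrame. It accepts a file path, a URL, or a file-like object. Passing a raw JSON string directly is deprecated in recent Pandas versions, so a string should be wrapped in StringIO first.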

Example

The following example demonstrates how to read JSON data using the pandas.read_json() function. Here we use StringIO to wrap the JSON string in a file-like object.

import pandas as pd
from io import StringIO

# Create a string representing JSON data
data = """[
    {"Name": "Braund", "Gender": "Male", "Age": 30},
    {"Name": "Cumings", "Gender": "Female", "Age": 25},
    {"Name": "Heikkinen", "Gender": "Female", "Age": 35}
]"""

# Use StringIO to convert the JSON formatted string data into a file-like object
obj = StringIO(data)

# Read JSON into a Pandas DataFrame
df = pd.read_json(obj)

print(df)

Following is the output of the above code −


        Name  Gender  Age
0     Braund    Male   30
1    Cumings  Female   25
2  Heikkinen  Female   35
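Since read_json() also accepts a file path, the same data can be loaded from disk. The sketch below is illustrative rather than definitive: it writes a small JSON file into a temporary directory (the file name people.json is hypothetical) and reads it back by path.

```python
import os
import tempfile

import pandas as pd

# Sample JSON text; in practice the file would already exist on disk
json_text = '[{"Name": "Braund", "Age": 30}, {"Name": "Cumings", "Age": 25}]'

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name used only for this demonstration
    path = os.path.join(tmp_dir, "people.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(json_text)

    # Read the JSON file by passing its path
    df_from_file = pd.read_json(path)

print(df_from_file.shape)
print(list(df_from_file.columns))
```

The temporary directory is used only so the sketch cleans up after itself; with a real file, passing its path to pd.read_json() is enough.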

Writing JSON Files with Pandas

Pandas provides the to_json() method to export the data in a DataFrame or Series object as JSON. Called without a path, it returns the JSON as a string; called with a path, it writes the JSON to that file. It offers several options for customizing the output, such as orient and lines.

Example: Writing a DataFrame to a JSON file

Here is an example demonstrating how to write a Pandas DataFrame to a JSON file.

import pandas as pd

# Create a DataFrame from a dictionary of columns
df = pd.DataFrame({
    "Name": ["Braund", "Cumings", "Heikkinen"],
    "Gender": ["Male", "Female", "Female"],
    "Age": [30, 25, 35],
})

print("Original DataFrame:\n", df)

# Write the DataFrame to a JSON file, one JSON object per line
df.to_json("output_written_json_file.json", orient='records', lines=True)

print("The output JSON file has been written successfully.")

Following is the output of the above code −

Original DataFrame:
         Name  Gender  Age
0     Braund    Male   30
1    Cumings  Female   25
2  Heikkinen  Female   35
The output JSON file has been written successfully.

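After executing the above code, you can find the created JSON file named output_written_json_file.json in your working directory. Because lines=True is set, the file is in JSON Lines format: each row is written as a separate JSON object on its own line.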

Example: Writing a JSON file using the split orientation

The following example writes a simple DataFrame object into JSON using the split orientation.

import pandas as pd
from json import loads, dumps

# Create a DataFrame
df = pd.DataFrame(
    [["x", "y"], ["z", "w"]],
    index=["row_1", "row_2"],
    columns=["col_1", "col_2"],
)

# Convert DataFrame to JSON with 'split' orientation
result = df.to_json(orient="split")
parsed = loads(result)

# Display the JSON output
print("JSON Output (split orientation):")
print(dumps(parsed, indent=4))

Following is the output of the above code −

JSON Output (split orientation):
{
    "columns": [
        "col_1",
        "col_2"
    ],
    "index": [
        "row_1",
        "row_2"
    ],
    "data": [
        [
            "x",
            "y"
        ],
        [
            "z",
            "w"
        ]
    ]
}
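JSON written with the split orientation can be loaded back by passing the same orient value to read_json(). The following is a minimal round-trip sketch using the same DataFrame; the variable names split_json and restored are illustrative.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame(
    [["x", "y"], ["z", "w"]],
    index=["row_1", "row_2"],
    columns=["col_1", "col_2"],
)

# Serialize with the 'split' orientation
split_json = df.to_json(orient="split")

# Read it back, telling read_json() which layout to expect
restored = pd.read_json(StringIO(split_json), orient="split")

print(list(restored.index))
print(list(restored.columns))
print(restored.loc["row_2", "col_1"])
```

Because the split layout stores the index and column labels separately from the data, the row labels survive the round trip.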