Jonathan Hsu
Building a Command Line JSON/CSV Converter 2019-01-02

Building a Command Line JSON/CSV Converter

I’m not a developer. I have some programming background but I’d say I’m better at following tutorials than I am at programming. I started this project because I was tired of Googling online converters, hoping they weren’t giving me Chrome AIDS, and running into file limitations. The project is available at https://github.com/jhsu98/json-csv-converter, but if you’d like to learn how I built it then continue on.


Introduction

Before we get started, I’m going to assume you’ve installed Python 3.x and know that to run the script you’ll want to type python3 json-csv-converter.py

Getting Started

We’re going to need to import four libraries: os, csv, json, and OrderedDict. One thing that slows me down when following tutorials is not knowing what each import is used for, so here’s a quick rundown:

  • os: manipulating the filename and checking to see if a file exists
  • csv: importing a CSV file and writing a converted CSV file
  • json: importing a JSON file and writing a converted JSON file
  • OrderedDict: creating an Ordered Dictionary, which tracks the order properties are added to an object. This is from the module collections.

To import these four libraries we’ll use the import keyword followed by the module name.

import os
import csv
import json
from collections import OrderedDict

Next, let’s add some descriptive text welcoming the user to our script.

print("Welcome to the JSON-CSV Converter.")
print("This script will convert a JSON file to CSV or a CSV file to JSON")

Request and Load File

From here the next step is to ask for the file to load. We could do a simple input; however, what happens if the file doesn’t exist? The rest of our script won’t work. So we’re going to use a try/except/else statement to control the flow of our script. Here’s the outlined structure:

try:
   # ask for the filename
   # load data
except Exception as e:
   # print error message, exit script
   print("Error opening file ... exiting:",e)
   exit()
else:
   # convert the file and save the output

If you’re still unsure about how the code block works, the try section will execute and if anything goes wrong the except section will take place. Assuming nothing goes wrong, the else section will execute once the try is completed.

Now let’s flesh out asking for the filename and loading the file. We use the input function to ask the user for the file name and determine the extension by taking the last section when split based on a period (.). Afterwards, we’ll use the appropriate method to read in the file—csv.reader() or json.load()—while printing an error message and exiting if the file does not have the appropriate extension.

try:
    print("Which file do you want to convert?")
    filename = input("Filename: ")
    extension = filename.split(".")[-1].lower()
    
    f = open(filename)
    if extension == "csv":
        # load csv file
        data = list(csv.reader(f))
        print("CSV file loaded")
    elif extension == "json":
        # load json file
        data = json.load(f,object_pairs_hook=OrderedDict)
        print("JSON file loaded")
    else:
        print("unsupported file type ... exiting")
        exit()

The argument object_pairs_hook=OrderedDict is critical here. Without this setting, the JSON key-value pairs would be orderless, meaning each object’s properties will show in various arrangements. The OrderedDict setting will ensure that the properties stay in the order that they are read.

Convert CSV to JSON

We’ll handle the CSV to JSON first. Since a CSV file is tabular, the first row will have the names of each column. These column names will become the keys in JSON’s key-value-pair structure. The key is the identifier for the value within the object. Knowing that the first row are the keys, that means we now need to loop through the remaining rows for the data. After creating an empty list for all the converted data, the following steps will happen for each of the data-containing rows:

  1. create a new empty Ordered Dictionary
  2. loop through each value (what would be each cell within a spreadsheet)
  3. add the key-value pair to the Ordered Dictionary where the key is the text from the first row and the value is either the data if present or None if it’s empty.
  4. once all cells for the row have been added to the Ordered Dictionary, add the object to the larger array.
# CONVERT CSV TO JSON
if extension == "csv":
    keys = data[0]
    converted = []
    for i in range(1, len(data)):
        obj = OrderedDict()
        for j in range(0,len(keys)):
            if len(data[i][j]) > 0:
                obj[keys[j]] = data[i][j]
            else:
                obj[keys[j]] = None
        converted.append(obj)

Make sure you read the code carefully. We’re nesting loops so the outer for loop uses the variable i while the inner loop uses the variable j. Also notice the outer loop starts at index 1 for the range because the first row contains the keys and not data.

Convert JSON to CSV

Now to write the section for converting JSON to CSV. The general structure of the block is similar, but the sections are a bit more complex. This is because in a CSV file all your keys are in the first row, one location…easy. In a JSON file, each object is independent of the others meaning it may have keys other objects do not…hard. What does this mean? We need to loop through the entire data set to find all unique keys.

# CONVERT JSON TO CSV
if extension == "json":
    # get all keys in json objects
    keys = []
    for i in range(0,len(data)):
        for j in data[i]:
            if j not in keys:
                keys.append(j)

Now that we have all our keys, let’s make our array to hold the converted data and add the keys as the first row. Next, we’ll loop through our data set a second time. The trick here is that the nested loop is not the data, but the array of keys we created above. We have to make sure the order of the data matches the order of the keys. For every key—what will be a column in our CSV file—if the key exists in the current object then append the respective value, otherwise the None value.

# map data in each row to key index
converted = []
converted.append(keys)
for i in range(0,len(data)):
    row = []
    for j in range(0,len(keys)):
        if keys[j] in data[i]:
            row.append(data[i][keys[j]])
        else:
            row.append(None)
    converted.append(row)

Create Converted File

Now we’re ready to take our converted data and save it to a new file. First, let’s prepare the basename and extension for our soon-to-be converted file. Inside the else block of the try/except/else we’ll split the basename from what was entered earlier and swap the extension. We also need to make sure our soon-to-be file’s name does not exist. If it does exist, we’ll add a numeric iterator to the file name.

converted_file_basename = os.path.basename(filename).split(".")[0]
converted_file_extension = ".json" if extension == "csv" else ".csv"
if(os.path.isfile(converted_file_basename + converted_file_extension)):
    counter = 1
    while os.path.isfile(converted_file_basename + " (" + str(counter) + ")" + converted_file_extension):
        counter += 1
    converted_file_basename = converted_file_basename + " (" + str(counter) + ")"

If you’re not familiar with the ternary operator, it’s a shorthand way of writing an if/else statement. Python handles the order a little differently than other languages which have the expression first followed by the positive and negative case. In Python you start with the positive case, followed by the expression, and finally the else value.

Our final step is to use either the json.dump() or csv.writer() methods to write the data in our newly created file. If all goes well, a success message will be printed. Otherwise, a failure message will be printed.

try:
    if converted_file_extension == ".json":
        with open(converted_file_basename + converted_file_extension, 'w') as outfile:
            json.dump(converted, outfile)
    elif converted_file_extension == ".csv":
        with open(converted_file_basename + converted_file_extension, 'w') as outfile:
            writer = csv.writer(outfile)
            writer.writerows(converted)
except:
    print("Error creating file ... exiting")
else:
    print("File created:",converted_file_basename + converted_file_extension)

Conclusion

I hope you learned something from this tutorial. There’s nothing earth shattering here, it’s actually a really simple script but it filled a need and I wanted to share what I learned with the world. I’m fully aware that my code could probably use some work, again you can head over to GitHub and view the repo. Clone it if you didn’t follow along above, submit issues for improvements, and I’m happy to take any and all constructive criticism.

View the repository at: https://github.com/jhsu98/json-csv-converter