pandas raises ValueError: Trailing data when pd.read_json() is given a file that holds more than one JSON document, most commonly a file with one JSON object per line separated by \n or \r\n. In this article we will see code examples that resolve this error.
Causes for this error
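There are two common reasons behind this error:
- The file contains several JSON objects, one per line, separated by \n or \r\n. These are LF and CRLF line endings, not file encodings. The parser reads the first object and reports everything after it as trailing data.
- Wrong file path. If your JSON file is in a different directory, pd.read_json() cannot open it. Depending on the pandas version, it may then try to parse the path string itself as JSON and raise a ValueError, or raise FileNotFoundError.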
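The first cause can be reproduced without pandas. The sketch below uses Python's standard json module, which reports the same condition as "Extra data" where the pandas parser says "Trailing data". The sample string is made up for illustration.

```python
import json

# Two JSON objects separated by a newline: valid JSON Lines,
# but not a single JSON document.
text = '{"a": "ironman", "b": "Tony"}\n{"a": "captain", "b": "Steve"}\n'

try:
    json.loads(text)
    error_message = None
except json.JSONDecodeError as exc:
    # The parser finished the first object and found more content after it.
    error_message = exc.msg

# Parsing one line at a time works, because each line is a complete document.
records = [json.loads(line) for line in text.splitlines() if line.strip()]
```

Here error_message holds the parser's complaint about the second object, while records contains one dict per line.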
Code Example
Error Code – Let’s reproduce this error first –
{"a": "ironman", "b": "Tony"}
{"a": "captain", "b": "Steve"}
{"a": "hulk", "b": "Bruce"}
{"a": "spiderman", "b": "Peter"}
This is a JSON file of superheroes, with one object per line. Now we will read it using pd.read_json():
import pandas as pd
df = pd.read_json("superhero.json")
It will throw ValueError: Trailing data. Output:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 198, in read_json
    date_unit).parse()
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 266, in parse
    self._parse_no_numpy()
  File "/Users/d/anaconda/lib/python2.7/site-packages/pandas/io/json.py", line 483, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Trailing data
Solution
1. Use the lines=True parameter in read_json:
import pandas as pd
data = pd.read_json('superhero.json', lines=True)
This reads the file line by line and treats each line as a separate JSON object.
2. Open the file yourself and parse it line by line:
import json
import pandas as pd

with open('superhero.json', encoding="utf8") as f:
    # convert each non-empty line from a JSON string to a dict
    data = [json.loads(line) for line in f if line.strip()]

df = pd.DataFrame(data)  # load the list of dicts into a DataFrame
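3. If stray \n or \r\n characters or blank lines are causing issues, strip them before parsing:
import io
import pandas as pd

with open('superhero.json', 'r') as f:
    # strip line endings and skip blank lines
    lines = [line.strip() for line in f if line.strip()]

# read_json expects a path or file-like object, so wrap the cleaned text
df = pd.read_json(io.StringIO("\n".join(lines)), lines=True)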
4. If you want to put all JSON objects in a single array, use this code:
import io
import pandas as pd

with open('superhero.json', 'r') as f:
    # strip line endings and skip blank lines
    lines = [line.strip() for line in f if line.strip()]

# join the objects into one JSON array, which is a single valid document
data = "[" + ",".join(lines) + "]"
df = pd.read_json(io.StringIO(data))
5. Check that the location of the JSON file is correct. Sometimes the file is in the parent folder while the path is written relative to the current location of the Python script. Depending on the pandas version, a path that cannot be opened may surface as a ValueError rather than a missing-file error.
import pandas as pd
data = pd.read_json('../superhero.json', lines=True)
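The path check and the line-by-line parsing can also be combined in one small helper. The sketch below uses only the standard library. load_json_lines is a hypothetical name chosen for illustration, not a pandas function, and the sketch assumes the file is UTF-8 encoded with one JSON object per line.

```python
import json
from pathlib import Path


def load_json_lines(path):
    """Hypothetical helper: return a list of dicts from a JSON Lines file."""
    file_path = Path(path)
    if not file_path.is_file():
        # Fail early with a clear message instead of a confusing parser error.
        raise FileNotFoundError(f"JSON file not found: {file_path.resolve()}")

    records = []
    with file_path.open(encoding="utf8") as f:
        for line in f:
            line = line.strip()  # drops \n, \r\n and surrounding spaces
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

The returned list can be passed straight to pd.DataFrame(records) to build the DataFrame.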