
I am pretty new to polars and have been experimenting with it to get acquainted.

import polars as pl

filename_data = 'endomondoHR.json'
pl.read_json(filename_data)

Data: a Kaggle dataset (endomondoHR.json).

The error I am getting is:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
c:\Users\shiv\Desktop\test\test.ipynb Cell 13 in <cell line: 4>()
      1 import polars as pl
      2 filename_data = 'endomondoHR.json'
----> 4 pl.read_json(filename_data)

File c:\Users\shiv\miniconda3\lib\site-packages\polars\io.py:985, in read_json(file)
    971 def read_json(file: str | Path | IOBase) -> DataFrame:
    972     """
    973     Read into a DataFrame from a JSON file.
    974
   (...)
    983
    984     """
--> 985     return DataFrame._read_json(file)

File c:\Users\shiv\miniconda3\lib\site-packages\polars\internals\dataframe\frame.py:959, in DataFrame._read_json(cls, file)
    956 file = normalise_filepath(file)
    958 self = cls.__new__(cls)
--> 959 self._df = PyDataFrame.read_json(file, False)
    960 return self

RuntimeError: BindingsError: "ComputeError(Owned("InvalidEOF"))"

Useful information:

  • Python version: 3.9.13
  • OS: Windows 11
  • Hardware: 16 GB RAM, Intel Core i5-11400H, NVIDIA GeForce GTX 1650 with 4 GB GDDR6 dedicated VRAM
  • Data size: 6.6 GB

What could be causing this error? Can anybody help? Thanks.

shivanshu dhawan

2 Answers


The file you are trying to read does not appear to be valid JSON, for two reasons:

  1. The file uses single quotes instead of double quotes.
  2. The file contains multiple JSON objects concatenated by the newline character (\n).

To read the data as JSON, you therefore need to first replace the single quotes with double quotes, and then parse the JSON objects in the file one at a time rather than all at once.
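The two steps above can be sketched as follows. This is only a minimal sketch under an assumption the question does not confirm: that each line of the file is a Python-style dict literal. Instead of a plain text replacement of quotes (which can break on values that contain apostrophes), it swaps in ast.literal_eval from the standard library to parse each line and then re-serializes it with json.dumps. The function name and file paths are hypothetical.

```python
import ast
import json


def python_literals_to_ndjson(src_path, dst_path):
    """Rewrite a file holding one Python-style dict per line as NDJSON.

    Assumes every non-empty line is a valid Python literal (hypothetical
    helper, not part of polars). Returns the number of records written.
    """
    count = 0
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # literal_eval accepts single-quoted strings and does not
            # execute arbitrary code, unlike eval
            record = ast.literal_eval(line)
            # json.dumps emits valid JSON with double quotes
            dst.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Because the file is processed line by line, only one record is held in memory at a time, which matters for a 6.6 GB input. The output file then contains one valid JSON object per line.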

Oxbowerce

The JSON file appears to use newlines as separators, so you can try this:

import polars as pl
json_file = 'endomondoHR.json'  # path to the newline-delimited JSON file
df = pl.read_ndjson(json_file)

Polars provides read_ndjson for reading newline-delimited JSON files.
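Before calling read_ndjson on a large file, it may help to check that the first few lines really are standalone JSON values. The snippet below is a rough heuristic using only the standard library; the function name and the sample_lines parameter are made up for illustration and are not part of polars.

```python
import json
from itertools import islice


def looks_like_ndjson(path, sample_lines=5):
    """Return True if each sampled non-empty line parses as JSON on its own.

    Hypothetical helper: it only inspects the first `sample_lines` lines,
    so it is a quick sanity check, not a full validation.
    """
    with open(path, "r", encoding="utf-8") as f:
        sample = [line for line in islice(f, sample_lines) if line.strip()]
    if not sample:
        return False  # nothing to inspect
    try:
        for line in sample:
            json.loads(line)
    except json.JSONDecodeError:
        return False
    return True
```

If this returns False because the lines use single quotes, the file would presumably need to be converted to valid JSON first.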

usct01