I recently ran into a nasty gotcha: pandas.read_json silently alters the digits when it reads long integers.
Here is the code that reproduces it:
    import json
    import pandas as pd

    data = {
        "id1": "3661430294729648121",
        "id2": "1298519559306190850",
        "id3": "9999999999999999",
    }
    df = pd.read_json(json.dumps(data), orient='index')
    print(df)
In the output, the IDs no longer match the values that went in.
After digging for quite a while, I found the answer in https://github.com/pandas-dev/pandas/issues/20608 and https://github.com/pandas-dev/pandas/issues/33766.
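The main cause is that, by default, pandas converts the integers to float and then back to int. A 64-bit float cannot represent every integer above 2^53 exactly, so the low digits get lost. The effect is similar to evaluating int(float(x)) on each value.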
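To make that concrete, here is a minimal sketch that uses plain Python only, with no pandas involved. It round-trips the three IDs from the example through a 64-bit float. The helper name `roundtrip` is made up for illustration, and this only imitates the float-then-int effect described above; it is not pandas' actual code path.

```python
# Illustration only: imitate the float -> int round trip on the example IDs.
ids = {
    "id1": 3661430294729648121,
    "id2": 1298519559306190850,
    "id3": 9999999999999999,
}


def roundtrip(n: int) -> int:
    """Convert an int to a 64-bit float and back (hypothetical helper)."""
    return int(float(n))


changed = {key: roundtrip(value) for key, value in ids.items()}

for key, value in ids.items():
    # The last column is True only when the value survived unchanged.
    print(key, value, changed[key], value == changed[key])
```

All three values come back different from what went in. The smallest one, 9999999999999999, is already above 2^53 and turns into 10000000000000000.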
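The fix is to pass dtype={} when reading the data, which keeps pandas from inferring a numeric dtype for these values. The code then looks like this: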
    import json
    import pandas as pd

    data = {
        "id1": "3661430294729648121",
        "id2": "1298519559306190850",
        "id3": "9999999999999999",
    }
    df = pd.read_json(json.dumps(data), orient='index', dtype={})
    print(df)
With that change, the output is correct.