RecursionError: maximum recursion depth exceeded in comparison #16
Comments
When I create LMDB data, do I need to modify the config file as follows:
This is caused by a data preprocessing failure. Please check which preprocessing step goes wrong at this line in ratio_dataset.py: outs = transform(data, self.ops[:-1])
if __name__ == '__main__':
    data_dir = './Union14M-L/'
    label_file_list = [
        './Union14M-L/train_annos/filter_jsonl_mmocr0.x/filter_train_challenging.jsonl.txt',
        './Union14M-L/train_annos/filter_jsonl_mmocr0.x/filter_train_easy.jsonl.txt',
        './Union14M-L/train_annos/filter_jsonl_mmocr0.x/filter_train_hard.jsonl.txt',
        './Union14M-L/train_annos/filter_jsonl_mmocr0.x/filter_train_medium.jsonl.txt',
        './Union14M-L/train_annos/filter_jsonl_mmocr0.x/filter_train_normal.jsonl.txt'
    ]
    save_path_root = './Union14M-L-LMDB-Filtered/'
    for data_list in label_file_list:
        # One output folder per label file, named after the file's stem.
        save_path = save_path_root + data_list.split('/')[-1].split(
            '.')[0] + '/'
        os.makedirs(save_path, exist_ok=True)
        print(save_path)
        train_data_list = get_datalist(data_dir, data_list, 800)
        createDataset(train_data_list, save_path)
You need to modify data_dir, label_file_list, and save_path_root in tools/create_lmdb_dataset.py. Each line of label_file.txt should have the form img_dir_name + \t + label, for example:
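Before building the LMDB, it may help to confirm that every line of a label file really follows the tab-separated form described above. The following is only a minimal sketch: check_label_file is a hypothetical helper, not part of the repository, and it assumes exactly one tab per line and that the image path is relative to data_dir.

```python
import os


def check_label_file(label_path, data_dir=None):
    """Return (line_number, reason) pairs for lines that are not 'path<TAB>label'.

    Hypothetical helper for illustration; not part of create_lmdb_dataset.py.
    """
    problems = []
    with open(label_path, encoding='utf-8') as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip('\r\n')
            if not line:
                continue  # skip blank lines
            parts = line.split('\t')
            if len(parts) != 2:
                problems.append((line_no, 'expected exactly one tab'))
                continue
            img_name, label = parts
            if not label:
                problems.append((line_no, 'empty label'))
            elif data_dir is not None and not os.path.isfile(
                    os.path.join(data_dir, img_name)):
                # Assumption: image paths are relative to data_dir.
                problems.append((line_no, 'image not found'))
    return problems
```

Calling check_label_file('label_file.txt', data_dir='./Union14M-L/') and printing the result should point at the lines worth inspecting by hand.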
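This happens because data preprocessing is failing. You can print the current op whenever data comes back as None:
def transform(data, ops=None):
    """transform."""
    if ops is None:
        ops = []
    for op in ops:
        data = op(data)
        if data is None:
            # Report which op rejected the sample.
            print(op)
            return None
    return data
Then look at that op's code and try to pin down why it returns None. Usually either the label fails a check (for example, its length exceeds max_text_length) or the image could not be read.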
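To see which op rejects samples most often, the same idea can be run over many samples and tallied. This is a rough sketch under stated assumptions: DecodeImage and CheckLabelLength here are hypothetical stand-ins written for the example, not the project's real ops, and samples are plain dicts with 'image' and 'label' keys.

```python
from collections import Counter


def find_failing_op(data, ops):
    """Run ops in order; return (result, name_of_failing_op or None)."""
    for op in ops:
        data = op(data)
        if data is None:
            return None, type(op).__name__
    return data, None


# Hypothetical stand-ins for real preprocessing ops (illustration only).
class DecodeImage:
    def __call__(self, data):
        # Pretend decoding fails when the image bytes are empty.
        return data if data.get('image') else None


class CheckLabelLength:
    def __init__(self, max_text_length=25):
        self.max_text_length = max_text_length

    def __call__(self, data):
        # Reject labels longer than max_text_length.
        if len(data['label']) > self.max_text_length:
            return None
        return data


def tally_failures(samples, ops):
    """Count how many samples each op rejects."""
    counts = Counter()
    for sample in samples:
        _, failed = find_failing_op(dict(sample), ops)
        if failed is not None:
            counts[failed] += 1
    return counts
```

A tally dominated by one op suggests where to look first, for instance a max_text_length that is too small for the dataset.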
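The test code above runs fine, but the recognition accuracy is very low. I want to train on my own synthetic address dataset, generating data while training. Do the images and labels have to be converted to LMDB format before training?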
RatioDataSet only supports loading data in LMDB format. You can also modify it to load data in a custom format.
Analyzing the dataset, for example its text length and aspect ratio distributions, helps in choosing the best parameters.
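As a rough illustration of that kind of analysis, the sketch below builds two histograms from (width, height, label) tuples. The function name, the input format, and the choice to round the aspect ratio to the nearest integer (with a floor of 1) are assumptions made for the example, not the project's actual bucketing rule.

```python
from collections import Counter


def summarize(samples):
    """Build label-length and aspect-ratio histograms.

    samples: iterable of (width, height, label) tuples (assumed format).
    """
    length_hist = Counter()
    ratio_hist = Counter()
    for width, height, label in samples:
        length_hist[len(label)] += 1
        # Illustrative bucketing: nearest integer ratio, at least 1.
        ratio_hist[max(1, round(width / height))] += 1
    return length_hist, ratio_hist
```

Buckets that contain very few images, or labels longer than the configured max_text_length, stand out quickly in these counts.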
@Topdu @hp0716
I suspect this happens because the last iteration contains only one image (the log line 0 1 912.0 6 means GPU 0, 1 image of width 912, and a preset batch size of 6). When get_item hits an error, it randomly picks another image of the same size. Only one image has width 912, so it keeps picking the same one until it exceeds the recursion limit and raises the error.
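If that suspicion is right, a retry that recurses into the same single-image bucket can never succeed. One possible way to make the failure explicit is a bounded, iterative fallback. This is a sketch only: load_with_fallback, load, and bucket are hypothetical names, and the project's real loading logic may differ.

```python
import random


def load_with_fallback(load, idx, bucket, max_tries=10):
    """Try idx, then other indices from the same bucket, without recursion.

    load(i) is assumed to return a sample, or None when loading fails.
    bucket is the list of indices that share the same image width.
    """
    failed = set()
    current = idx
    for _ in range(max_tries):
        sample = load(current)
        if sample is not None:
            return sample
        failed.add(current)
        remaining = [i for i in bucket if i not in failed]
        if not remaining:
            break  # nothing left to try, e.g. a one-image bucket
        current = random.choice(remaining)
    raise RuntimeError(
        f'no loadable sample in bucket (failed indices: {sorted(failed)})')
```

With a bucket of one broken image this raises a RuntimeError naming the index instead of a RecursionError, which makes the bad sample easier to find.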
Hello, what should I pay attention to if I use create_lmdb_dataset.py to construct the dataset? When I run eval_rec_all_ch.py after generating the mdb file, I get an error