pagexml.helper package
Submodules
- pagexml.helper.file_helper module
- pagexml.helper.pagexml_helper module
  - LineIterable
  - combine_adjacent_lines()
  - elements_overlap()
  - get_custom_tags()
  - horizontal_group_lines()
  - horizontally_merge_lines()
  - line_ends_with_word_break()
  - make_line_range()
  - make_line_text()
  - make_text_region_text()
  - merge_lines()
  - merge_sets()
  - merge_textregions()
  - pagexml_to_line_format()
  - pretty_print_textregion()
  - print_textregion_stats()
  - read_line_format_file()
  - sort_lines_in_column_reading_order()
  - sort_lines_in_reading_direction()
  - sort_lines_in_reading_order()
  - sort_lines_in_row_reading_order()
  - sort_regions_in_reading_order()
  - write_pagexml_to_line_format()
- pagexml.helper.text_helper module
  - LineReader
  - find_term_in_context()
  - get_bbox()
  - get_line_format_json()
  - get_line_format_tsv()
  - get_line_words()
  - get_page_lines_words()
  - make_line_format_file()
  - make_list()
  - make_skipgram_similarity_dict()
  - read_lines_from_line_files()
  - read_pagexml_docs_from_line_file()
  - remove_hyphen()
  - remove_word_break_chars()
  - split_line_words()
  - transform_box_to_coords()