pagexml-tools

pagexml.helper package

Submodules

  • pagexml.helper.file_helper module
    • Extractor
    • get_archive_functions()
    • get_archived_file_names()
    • get_archived_files_infos()
    • get_archiver_mode()
    • parse_archived_filename()
    • read_7z_handle()
    • read_inner_archive()
    • read_page_7z_file()
    • read_page_archive_file()
    • read_page_archive_files()
    • read_tar_handle()
    • read_zip_handle()
  • pagexml.helper.pagexml_helper module
    • LineIterable
    • combine_adjacent_lines()
    • elements_overlap()
    • get_custom_tags()
    • horizontal_group_lines()
    • horizontally_merge_lines()
    • line_ends_with_word_break()
    • make_line_range()
    • make_line_text()
    • make_text_region_text()
    • merge_lines()
    • merge_sets()
    • merge_textregions()
    • pagexml_to_line_format()
    • pretty_print_textregion()
    • print_textregion_stats()
    • read_line_format_file()
    • sort_lines_in_column_reading_order()
    • sort_lines_in_reading_direction()
    • sort_lines_in_reading_order()
    • sort_lines_in_row_reading_order()
    • sort_regions_in_reading_order()
    • write_pagexml_to_line_format()
  • pagexml.helper.text_helper module
    • LineReader
    • find_term_in_context()
    • get_bbox()
    • get_line_format_json()
    • get_line_format_tsv()
    • get_line_words()
    • get_page_lines_words()
    • make_line_format_file()
    • make_list()
    • make_skipgram_similarity_dict()
    • read_lines_from_line_files()
    • read_pagexml_docs_from_line_file()
    • remove_hyphen()
    • remove_word_break_chars()
    • split_line_words()
    • transform_box_to_coords()

© Copyright 2023, KNAW-HuC. Revision cb0efad6.