FileUtils
Utility class for performing file-related operations, including buffer-size detection, file type inference from binary headers, and UTF-8 detection.
This class is implemented as a singleton to ensure consistent and efficient use across the system, especially for repetitive header inspections and file classification logic.
Constructor: __init__()
Initialize the internal UTF-8 and UTF-16 byte-order mark (BOM) tables used for text-encoding detection.
This constructor is intentionally lightweight because the class is a singleton; instances are reused across the system to minimize repeated header-inspection setup.
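The singleton behaviour and BOM tables described above might look roughly like the following. This is a minimal sketch, not the actual FileUtils source: the class name `BomTables`, the `_boms` attribute, and the `detect_bom` method are hypothetical names introduced only for illustration.

```python
import codecs


class BomTables:
    """Hypothetical stand-in for FileUtils; shows the singleton idea only."""

    _instance = None

    def __new__(cls):
        # Create the shared instance on first use, then hand back the same one.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self):
        # Lightweight setup: a small BOM -> encoding-name table (assumed layout).
        # Re-running this on every BomTables() call is cheap and harmless.
        self._boms = {
            codecs.BOM_UTF8: "utf-8",
            codecs.BOM_UTF16_LE: "utf-16-le",
            codecs.BOM_UTF16_BE: "utf-16-be",
        }

    def detect_bom(self, header: bytes):
        """Return the encoding named by a leading BOM, or None if there is none."""
        for bom, name in self._boms.items():
            if header.startswith(bom):
                return name
        return None
```

Because `__new__` always returns the same object, every caller shares one set of tables, which is the reuse the constructor notes describe.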
Methods
get_buffer_size(file_obj) staticmethod
Determine the size (in bytes) of the given file-like object.
Supports:
- Standard file objects with a fileno()
- In-memory io.BytesIO buffers
Parameters
file_obj A file-like object. Must either:
- Support fileno()
- Be an instance of io.BytesIO
Returns
int - Total number of bytes in the file or buffer.
Raises
ValueError - If the object type is unsupported.
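One way the documented behaviour could be implemented is sketched below. It is written as a free function for brevity and is an assumption about the approach, not the library's actual code; in particular, using `os.fstat` and `getbuffer()` is an illustrative choice.

```python
import io
import os


def get_buffer_size(file_obj):
    """Return the total size in bytes of a BytesIO buffer or a real file."""
    # Check BytesIO first: it inherits a fileno() method, but calling it
    # raises io.UnsupportedOperation because there is no OS-level descriptor.
    if isinstance(file_obj, io.BytesIO):
        return file_obj.getbuffer().nbytes
    if hasattr(file_obj, "fileno"):
        # Ask the OS for the size of the underlying file descriptor.
        return os.fstat(file_obj.fileno()).st_size
    raise ValueError(f"Unsupported file object type: {type(file_obj).__name__}")
```

Both branches report the total size regardless of the current read position, which matches the "total number of bytes" wording in the Returns section.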
get_file_type(file_path)
Infer the file type by examining the first 1024 bytes of the file.
Parameters
file_path (str) Path to the file to inspect.
Returns
tuple[str, str, str]
- Data modality ("Audio", "Image", "Video", "Text", "File")
- File extension (e.g., "mp3", "png", "utf8", "bin")
- MIME type (e.g., "audio/mpeg", "image/png", "application/octet-stream")