aito.utils.data_frame_handler.DataFrameHandler

class aito.utils.data_frame_handler.DataFrameHandler

Bases: object

A handler for reading, writing, and converting a pandas DataFrame in accordance with an Aito table schema

Methods

convert_df_using_aito_table_schema(df, …)

Convert a pandas DataFrame to match a given Aito table schema

convert_file(read_input, write_output, …)

Convert an input file to the expected format, generating an Aito table schema or using one if specified

df_to_format(df, out_format, write_output[, …])

Write a pandas DataFrame to the output in the given format

read_file_to_df(read_input, in_format[, …])

Read input into a pandas DataFrame

Attributes

allowed_format

static convert_df_using_aito_table_schema(df: pandas.DataFrame, table_schema: Union[aito.schema.AitoTableSchema, Dict]) → pandas.DataFrame

Convert a pandas DataFrame to match a given Aito table schema

Parameters
  • df (pd.DataFrame) – input pandas DataFrame

  • table_schema (an AitoTableSchema object or a Dict) – input table schema

Raises
  • ValueError – input table schema is invalid

  • e – the conversion failed

Returns

converted DataFrame

Return type

pd.DataFrame
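
As a rough illustration of what schema-driven conversion involves, the sketch below casts DataFrame columns with plain pandas. It is not the library's implementation: the schema layout, the Aito type names, the dtype mapping, and the helper cast_to_schema are all assumptions made for this example.

```python
import pandas as pd

# Assumed mapping from Aito column types to pandas/NumPy dtypes (illustrative only).
AITO_TO_PANDAS_DTYPE = {
    "Int": "int64",
    "Decimal": "float64",
    "Boolean": "bool",
    "String": str,
    "Text": str,
}


def cast_to_schema(df: pd.DataFrame, table_schema: dict) -> pd.DataFrame:
    """Hypothetical helper: cast each schema column to the dtype its type implies."""
    columns = table_schema.get("columns")
    if not isinstance(columns, dict):
        raise ValueError("table schema must contain a 'columns' mapping")
    missing = sorted(set(columns) - set(df.columns))
    if missing:
        raise ValueError(f"DataFrame lacks schema columns: {missing}")
    converted = df.copy()  # leave the caller's DataFrame untouched
    for name, spec in columns.items():
        converted[name] = converted[name].astype(AITO_TO_PANDAS_DTYPE[spec["type"]])
    return converted


# Assumed schema layout: a "columns" mapping of column name to {"type": ...}.
schema = {
    "type": "table",
    "columns": {
        "id": {"type": "Int"},
        "name": {"type": "String"},
        "price": {"type": "Decimal"},
    },
}
raw = pd.DataFrame({"id": [1.0, 2.0], "name": ["apple", "pear"], "price": [1, 2]})
converted = cast_to_schema(raw, schema)
print(converted.dtypes)
```

With the real class, the equivalent call would pass the DataFrame and the schema (an AitoTableSchema object or a Dict) to convert_df_using_aito_table_schema, which raises ValueError for an invalid schema.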

convert_file(read_input: Union[str, pathlib.Path, IO], write_output: Union[str, pathlib.Path, IO], in_format: str, out_format: str, read_options: Dict = None, convert_options: Dict = None, apply_functions: List[Callable[[…], pandas.DataFrame]] = None, use_table_schema: Union[aito.schema.AitoTableSchema, Dict] = None) → pandas.DataFrame

Convert an input file to the expected format, generating an Aito table schema or using one if specified

Parameters
  • read_input (any valid string path, path-like object, or file-like object (objects with a read() method)) – input to read from

  • write_output (any valid string path, path-like object, or file-like object (objects with a write() method)) – output to write to

  • in_format (str) – input format

  • out_format (str) – output format

  • read_options (Dict, optional) – dictionary of arguments for the pandas read function, defaults to None

  • convert_options (Dict, optional) – dictionary of arguments for the pandas write function, defaults to None

  • apply_functions (List[Callable[..., pd.DataFrame]], optional) – list of partial functions that will be applied to the loaded pd.DataFrame, defaults to None

  • use_table_schema (an AitoTableSchema object or a Dict, optional) – an Aito table schema that dictates the data types used to convert the data, defaults to None

Returns

converted DataFrame

Return type

pd.DataFrame
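
The apply_functions parameter expects partial functions that each take and return a DataFrame. The sketch below shows one way such a list could be built with functools.partial and applied to a loaded DataFrame. The helpers drop_columns, rename_columns, and apply_all are hypothetical, and applying the functions in list order is an assumption about the behavior, not something this page confirms.

```python
from functools import partial, reduce

import pandas as pd


def drop_columns(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Hypothetical transformation: remove the named columns."""
    return df.drop(columns=columns)


def rename_columns(df: pd.DataFrame, mapping: dict) -> pd.DataFrame:
    """Hypothetical transformation: rename columns according to a mapping."""
    return df.rename(columns=mapping)


# Each partial fixes every argument except the DataFrame itself.
apply_functions = [
    partial(drop_columns, columns=["internal_note"]),
    partial(rename_columns, mapping={"Product Name": "name"}),
]


def apply_all(df: pd.DataFrame, functions: list) -> pd.DataFrame:
    # Assumption: functions run in list order, each receiving the previous result.
    return reduce(lambda current, fn: fn(current), functions, df)


loaded = pd.DataFrame(
    {
        "Product Name": ["apple", "pear"],
        "price": [0.5, 1.25],
        "internal_note": ["a", "b"],
    }
)
result = apply_all(loaded, apply_functions)
print(list(result.columns))
```

A list like apply_functions above is the kind of value that would be passed as the apply_functions argument of convert_file.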

df_to_format(df: pandas.DataFrame, out_format: str, write_output: Union[str, pathlib.Path, IO], convert_options: Dict = None)

Write a pandas DataFrame to the output in the given format

Parameters
  • df (pd.DataFrame) – input DataFrame

  • out_format (str) – output format

  • write_output (any valid string path, path-like object, or file-like object (objects with a write() method)) – output to write to

  • convert_options (Dict, optional) – dictionary of arguments for the pandas write function, defaults to None
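
To make the role of convert_options and a file-like write_output concrete, here is a minimal sketch using pandas directly. It assumes that, for CSV output, the options are forwarded as keyword arguments to DataFrame.to_csv; that forwarding and the option values are assumptions for illustration, not confirmed behavior of df_to_format.

```python
import io

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["apple", "pear"]})

# Assumed to reach the pandas writer unchanged as keyword arguments.
convert_options = {"index": False, "sep": ";"}

# Any object with a write() method can act as the output target.
buffer = io.StringIO()
df.to_csv(buffer, **convert_options)

csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```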

read_file_to_df(read_input: Union[str, pathlib.Path, IO], in_format: str, read_options: Dict = None) → pandas.DataFrame

Read input into a pandas DataFrame

Parameters
  • read_input (any valid string path, path-like object, or file-like object (objects with a read() method)) – input to read from

  • in_format (str) – input format

  • read_options (Dict, optional) – dictionary of arguments for the pandas read function, defaults to None

Returns

the DataFrame read from the input

Return type

pd.DataFrame
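
The sketch below shows how a read_options dictionary and a file-like read_input fit together, using pandas directly. It assumes that, for CSV input, the options are forwarded as keyword arguments to pandas.read_csv; that forwarding and the sample data are assumptions for illustration only.

```python
import io

import pandas as pd

csv_text = "id;name;price\n1;apple;0.5\n2;pear;1.25\n"

# Assumed to reach the pandas reader unchanged as keyword arguments.
read_options = {"sep": ";", "dtype": {"id": "int64", "price": "float64"}}

# Any object with a read() method can act as the input source.
df = pd.read_csv(io.StringIO(csv_text), **read_options)
print(df.dtypes)
```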