Quickstart¶
This section explains how to upload data to Aito and send your first query with either the CLI or the Python SDK.
Essentially, uploading data into Aito can be broken down into the following steps:

1. Infer a table schema from the data
2. Change the schema if needed
3. Create a table
4. Convert the data
5. Upload the data

Note

Skip steps 1, 2, and 3 if you upload data to an existing table. Skip step 4 if the data is already in the appropriate format for uploading or already matches the table schema.
If you don’t have a data file, you can download our example file and follow the guide.
Upload data and send your first query with the CLI¶
Setup Aito credentials¶
The easiest way to set up the credentials is by using the configure command:
$ aito configure
Note
You can use the Quick Add Table Operation instead of doing the upload step by step if you want to upload to a new table and don't expect to adjust the inferred schema.
The CLI supports all steps needed to upload data:
Infer a Table Schema¶
For example, infer a table schema from a CSV file:
$ aito infer-table-schema csv < path/to/myCSVFile.csv > path/to/inferredSchema.json
Change the Schema¶
You might want to change the ColumnType, e.g., the id column should be of type String instead of Int, or you might want to add an Analyzer to a Text column. In that case, just make the changes in the inferred schema JSON file.
The example below uses jq to change the id column type:
$ jq '.columns.id.type = "String"' < path/to/schemaFile.json > path/to/updatedSchemaFile.json
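If jq is not available, the same edit can be done with Python's standard json module. A minimal sketch; the set_column_type helper and the schema fragment below are illustrative, not part of the Aito CLI or SDK:

```python
import json

def set_column_type(schema: dict, column: str, new_type: str) -> dict:
    """Return a copy of a table schema with one column's type changed."""
    updated = json.loads(json.dumps(schema))  # cheap deep copy via a JSON round-trip
    updated["columns"][column]["type"] = new_type
    return updated

# Minimal schema fragment in the format produced by `aito infer-table-schema`.
schema = {"type": "table", "columns": {"id": {"type": "Int", "nullable": False}}}
updated = set_column_type(schema, "id", "String")
print(updated["columns"]["id"]["type"])  # String
```

Reading the schema file with json.load, applying the edit, and writing it back with json.dump achieves the same result as the jq one-liner.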
Create a Table¶
You need a table name and a table schema to create a table:
$ aito database create-table tableName path/to/tableSchema.json
Convert the Data¶
If you made changes to the inferred schema or have an existing schema, use the schema with the -s flag to make sure that the converted data matches the schema:
$ aito convert csv -s path/to/updatedSchema.json path/to/myCSVFile.csv > path/to/myConvertedFile.ndjson
You can convert the data to either:

A list of entries in JSON format for Batch Upload:

$ aito convert csv --json path/to/myCSVFile.csv > path/to/myConvertedFile.json

An NDJSON file for File Upload:

$ aito convert csv < path/to/myFile.csv > path/to/myConvertedFile.ndjson

Remember to gzip the NDJSON file:

$ gzip path/to/myConvertedFile.ndjson
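For context, the NDJSON format the CLI produces is simply one JSON object per line. A minimal pure-Python sketch of the same conversion and gzip step, using made-up sample entries:

```python
import gzip
import json

# Hypothetical sample entries; in practice these come from your CSV file.
entries = [
    {"id": 1, "name": "rye bread"},
    {"id": 2, "name": "sourdough"},
]

# NDJSON: one JSON object per line.
ndjson = "\n".join(json.dumps(entry) for entry in entries) + "\n"

# Gzip the result, as the CLI's gzip step does, ready for file upload.
with gzip.open("myConvertedFile.ndjson.gz", "wt", encoding="utf-8") as f:
    f.write(ndjson)
```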
Upload the Data¶
You can upload the data by either:

Batch Upload:

$ aito upload-entries tableName < tableEntries.json

File Upload:

$ aito upload-file tableName tableEntries.ndjson.gz
Send your first query¶
You can send a query to an Aito endpoint by:
$ aito <endpoint> <query>
For example:
$ aito search '{"from": "products"}'
$ aito predict '{"from": "products", "where": {"name": {"$match": "rye bread"}}, "predict": "tags"}'
Upload data and send your first query with the SDK¶
The Aito Python SDK uses Pandas DataFrame for multiple operations.
The example below shows how you can load a CSV file into a DataFrame; please read the official pandas guide for further instructions.
You can download an example CSV file reddit_sample.csv here and run the code below:
import pandas
reddit_df = pandas.read_csv("reddit_sample.csv")
Infer a table schema¶
You can infer an AitoTableSchema from a Pandas DataFrame:
from aito.schema import AitoTableSchema
from pprint import pprint
reddit_schema = AitoTableSchema.infer_from_pandas_data_frame(reddit_df)
print(reddit_schema.to_json_string(indent=2, sort_keys=True))
{
"columns": {
"author": {
"nullable": false,
"type": "String"
},
"comment": {
"analyzer": {
"customKeyWords": [],
"customStopWords": [],
"language": "english",
"type": "language",
"useDefaultStopWords": false
},
"nullable": false,
"type": "Text"
},
"created_utc": {
"analyzer": {
"delimiter": ":",
"trimWhitespace": true,
"type": "delimiter"
},
"nullable": false,
"type": "Text"
},
"date": {
"analyzer": {
"delimiter": "-",
"trimWhitespace": true,
"type": "delimiter"
},
"nullable": false,
"type": "Text"
},
"downs": {
"nullable": false,
"type": "Int"
},
"label": {
"nullable": false,
"type": "Int"
},
"parent_comment": {
"analyzer": {
"customKeyWords": [],
"customStopWords": [],
"language": "english",
"type": "language",
"useDefaultStopWords": false
},
"nullable": false,
"type": "Text"
},
"score": {
"nullable": false,
"type": "Int"
},
"subreddit": {
"nullable": false,
"type": "String"
},
"ups": {
"nullable": false,
"type": "Int"
}
},
"type": "table"
}
Change the Schema¶
You might want to change the ColumnType, e.g., the id column should be of type String instead of Int, or you might want to add an Analyzer to a Text column.
You can access and update the column schema by using the column name as the key:
from aito.schema import AitoStringType, AitoTokenNgramAnalyzerSchema, AitoAliasAnalyzerSchema
# Change the label type to String instead of Int
reddit_schema['label'].data_type = AitoStringType()
# Change the analyzer of the `comments` column
reddit_schema['comment'].analyzer = AitoTokenNgramAnalyzerSchema(
source=AitoAliasAnalyzerSchema('en'),
min_gram=1,
max_gram=3
)
Create a table¶
You can create_table() using an AitoClient by specifying the table name and the table schema.
Note
The example is not directly copy-pastable. Please use your own Aito environment credentials.
from aito.client import AitoClient
from aito.api import create_table
aito_client = AitoClient(instance_url=YOUR_AITO_INSTANCE_URL, api_key=YOUR_AITO_INSTANCE_API_KEY)
create_table(client=aito_client, table_name='reddit', schema=reddit_schema)
Convert the Data¶
The DataFrameHandler can convert a DataFrame to match an existing schema:
from aito.utils.data_frame_handler import DataFrameHandler
data_frame_handler = DataFrameHandler()
converted_reddit_df = data_frame_handler.convert_df_using_aito_table_schema(
df=reddit_df,
table_schema=reddit_schema
)
A DataFrame can be converted to:

A list of entries in JSON format for Batch Upload:

reddit_entries = converted_reddit_df.to_dict(orient="records")

A gzipped NDJSON file for File Upload using the DataFrameHandler:

data_frame_handler.df_to_format(
    df=converted_reddit_df,
    out_format='ndjson',
    write_output='reddit_sample.ndjson.gz',
    convert_options={'compression': 'gzip'}
)
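Conceptually, converting a DataFrame to match a schema is similar to casting each column to the type the schema declares. A simplified pandas sketch of that idea, not the SDK's actual implementation; the toy frame and dtype map below are illustrative:

```python
import pandas as pd

# Toy frame where `label` was read as int, but the schema declares String.
df = pd.DataFrame({"label": [0, 1, 1], "score": [10, 3, 7]})

# Hypothetical map of column -> pandas dtype derived from an Aito schema.
schema_dtypes = {"label": "str", "score": "int64"}

# Cast each column to the dtype the schema expects.
converted = df.astype(schema_dtypes)
print(converted["label"].tolist())  # ['0', '1', '1']
```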
Upload the Data¶
You can upload_entries() using an AitoClient.
Batch Upload:

from aito.api import upload_entries

upload_entries(aito_client, table_name='reddit', entries=reddit_entries)

File Upload:

from pathlib import Path
from aito.api import upload_file, get_table_size

upload_file(aito_client, table_name='reddit', file_path=Path('reddit_sample.ndjson.gz'))
# Check that the data has been uploaded
print(get_table_size(aito_client, 'reddit'))

10000
The Batch Upload can also be done using a generator:
def entries_generator(start, end):
    for idx in range(start, end):
        entry = {'id': idx}
        yield entry

upload_entries(
    aito_client,
    table_name="table_name",
    entries=entries_generator(start=0, end=4),
    batch_size=2,
    optimize_on_finished=False
)
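With a batch_size, the client consumes the generator in chunks instead of materializing all entries at once. The chunking idea can be sketched with itertools; this illustrates the concept and is not the SDK's internal code:

```python
from itertools import islice

def batches(entries, batch_size):
    """Yield lists of up to `batch_size` entries from any iterable."""
    it = iter(entries)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def entries_generator(start, end):
    for idx in range(start, end):
        yield {"id": idx}

for batch in batches(entries_generator(0, 4), batch_size=2):
    print(batch)
# [{'id': 0}, {'id': 1}]
# [{'id': 2}, {'id': 3}]
```

Because only one batch is held in memory at a time, this pattern lets you upload datasets far larger than available RAM.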
Send your first query¶
You can send a query to an Aito endpoint by using an AitoClient with the corresponding API function:
from aito.client import AitoClient
from aito.api import search, predict

aito_client = AitoClient(instance_url=INSTANCE_URL, api_key=INSTANCE_API_KEY)

search(client=aito_client, query={
    "from": "products",
    "where": {"name": {"$match": "rye bread"}}
})

predict(client=aito_client, query={
    "from": "products",
    "where": {"name": "rye bread"},
    "predict": "tags"
})