🧩 Training Files ​

In the Training Files section, you can upload documents or images that will be used to automatically create AI training datasets.
You can access this section here: https://ainisa.com/business/training-files

Training Files list

📁 Supported File Types

  • PDF documents
  • Image files: .png, .jpg, .jpeg

👉 Both text-based PDFs and scanned image PDFs can be read and parsed automatically by Ainisa.
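If you prepare many files at once, it can help to check their extensions locally before uploading. The snippet below is a minimal sketch of such a check, not part of Ainisa itself: the helper names `is_supported` and `split_by_support` are hypothetical, and treating extensions as case-insensitive is an assumption, since this page only lists the lowercase forms.

```python
from pathlib import Path

# Extensions listed under "Supported File Types" on this page.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported types.

    Assumption: extensions are compared case-insensitively.
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Partition file names into (supported, unsupported) lists."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported


if __name__ == "__main__":
    ok, rejected = split_by_support(["agentic-rag.jpeg", "notes.docx"])
    print("Ready to upload:", ok)
    print("Not supported:", rejected)
```

Running a check like this first avoids uploading files that the Training Files section would not accept.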


🪜 Steps to Create Datasets from Files

  1. Upload your file
    Click Upload Files (top-right button) and select a PDF or image. After uploading, the file will appear in the table.

  2. If the file is a PDF

    • After upload, Ainisa starts parsing the PDF automatically.
    • The “Number of Pages” column shows how many pages are ready.
    • When all pages are processed, click the ⋯ icon and choose Create Question-Answer Dataset.
  3. If the file is an image (jpg, jpeg, png)

    • Images are processed instantly.
    • Click the ⋯ icon → Create Question-Answer Dataset.

    Training Files actions menu

  4. Parsing progress

    • For PDFs, the process may take some time depending on file size.
    • You can monitor progress under the Parse Stage column (0% → 100%).
    • Once it reaches 100%, all pages have been converted into datasets.

Parsing to dataset progress


📊 Where do the datasets go?

After you click Create Question-Answer Dataset, Ainisa automatically generates Q&A items and saves them in the Datasets section:
https://ainisa.com/business/datasets

  • If the dataset was created from a file, you'll see the file name in the dataset list.
  • If it was created manually, no file name will appear.

✅ Example Table

| File Name | File Type | Number of Pages | Parse Status | Parse Stage |
|---|---|---|---|---|
| agentic-rag.jpeg | jpg | 1/1 | Parsed to image for reading | 0% |
| Ainisa — partnership.pdf | pdf | 8/8 | Parsed to image for reading | 0% |

💡 Notes

  • Use this section if you already have content (PDFs, scanned images, etc.) and want to auto-generate datasets instead of writing them manually.
  • This step only prepares data; actual model training happens later in the Fine-tuning section.
  • Processing each image or PDF consumes API tokens from your connected OpenAI account.
  • There are limits for reading pages:
    • Each image counts as 1 page.
    • A 10-page PDF counts as 10 pages.
    • Each subscription plan has its own monthly page limit for dataset creation.
    • If you reach the page limit, the reading/parsing process stops and the remaining pages fail.
    • Example: If the Pro Plan allows 100 pages/month and you have already used 70, uploading a 50-page PDF will only process 30 pages; the remaining 20 will fail.
    • To continue, upgrade to a higher plan or wait until your next subscription renewal date.
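
The page-limit rules above come down to simple arithmetic. The following is a rough sketch for estimating, before you upload, how many pages would be processed and how many would fail. The names `pages_in_file`, `estimate_quota`, and `QuotaResult` are hypothetical helpers for illustration, not an Ainisa API, and the 100-page limit is just the example figure used above.

```python
from dataclasses import dataclass


@dataclass
class QuotaResult:
    processed: int  # pages that fit within the remaining monthly quota
    failed: int     # pages beyond the quota, which would fail


def pages_in_file(file_type: str, pdf_page_count: int = 1) -> int:
    """Each image counts as 1 page; a PDF counts as its number of pages."""
    kind = file_type.lower()
    if kind in {"png", "jpg", "jpeg"}:
        return 1
    if kind == "pdf":
        return pdf_page_count
    raise ValueError(f"Unsupported file type: {file_type}")


def estimate_quota(monthly_limit: int, pages_used: int, pages_to_upload: int) -> QuotaResult:
    """Estimate how many uploaded pages fit in the remaining monthly quota."""
    remaining = max(monthly_limit - pages_used, 0)
    processed = min(pages_to_upload, remaining)
    return QuotaResult(processed=processed, failed=pages_to_upload - processed)


if __name__ == "__main__":
    # The example from the notes: 100 pages/month, 70 used, 50-page PDF.
    result = estimate_quota(100, 70, pages_in_file("pdf", 50))
    print(result)  # QuotaResult(processed=30, failed=20)
```

If the estimate shows failed pages, you may prefer to split the PDF or wait for the renewal date rather than start an upload that will only partly succeed.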