Indices & Schemas

Learn what an index is, what a schema is, and how to create them.

Two important concepts in Wuha are indices and schemas. An index is like a table in a database: it is where you store your data. A schema is like a database schema: it describes the format of the data in the index.

In Wuha, you create an index with a given schema. You then store documents that conform to that schema in the created index.

In the Getting Started guide, you created an index called my_demo which used the demo schema. All documents stored in that index needed to fit the demo schema to be accepted.
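To make the relationship concrete, here is a minimal in-memory sketch in Python. It does not use the Wuha API. The Schema and Index classes, the accepts and add methods, and the title and body fields are all hypothetical names chosen for illustration, and the real demo schema may define different fields.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    """Hypothetical schema: maps each field name to the type its value must have."""
    name: str
    fields: dict

    def accepts(self, document: dict) -> bool:
        # Conforming means: exactly the declared fields, each with the declared type.
        if set(document) != set(self.fields):
            return False
        return all(isinstance(document[key], kind) for key, kind in self.fields.items())


@dataclass
class Index:
    """Hypothetical index: stores only documents that fit its schema."""
    name: str
    schema: Schema
    documents: list = field(default_factory=list)

    def add(self, document: dict) -> None:
        if not self.schema.accepts(document):
            raise ValueError(f"document does not fit schema {self.schema.name!r}")
        self.documents.append(document)


# Assumed fields for illustration only; the real demo schema is not shown here.
demo = Schema("demo", {"title": str, "body": str})
my_demo = Index("my_demo", demo)

my_demo.add({"title": "Welcome", "body": "First document"})  # accepted

try:
    my_demo.add({"title": "Missing body"})  # rejected: "body" is absent
except ValueError as error:
    print(error)
```

The index holds a reference to its schema and checks every document against it before storing, which is the behaviour described above in its simplest form.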

An index has exactly one schema, but the same schema can be used by many indices. For example, suppose you have two servers full of documents: one for Human Resources documents and one for Sales documents. You might create two indices, human_resources and sales, that both use the same document schema.
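The one-schema-per-index, many-indices-per-schema relationship can be sketched as follows. This is an illustrative model only: the Schema and Index classes are hypothetical and are not part of the Wuha interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Hypothetical stand-in for a schema, identified by name only."""
    name: str


@dataclass
class Index:
    """Hypothetical index: has exactly one schema."""
    name: str
    schema: Schema


document_schema = Schema("document")

# Two separate indices that share one schema.
human_resources = Index("human_resources", document_schema)
sales = Index("sales", document_schema)

# Group index names by the schema they use.
indices_by_schema = {}
for index in (human_resources, sales):
    indices_by_schema.setdefault(index.schema.name, []).append(index.name)

print(indices_by_schema)
```

Each index still stores its own documents separately; only the description of the document format is shared.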

You cannot change the schema on an index once it has been created.
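One way to picture this rule is a record whose fields are fixed at creation time. The sketch below uses a frozen Python dataclass as an analogy. The Index class and the invoice schema name are hypothetical, and how Wuha itself enforces the rule is not described here.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Index:
    """Hypothetical index whose schema is fixed once the index exists."""
    name: str
    schema_name: str


sales = Index("sales", "document")

try:
    sales.schema_name = "invoice"  # attempt to swap the schema after creation
    changed = True
except FrozenInstanceError:
    changed = False

print(changed, sales.schema_name)
```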

Built-in Schemas vs. User Schemas

Wuha comes with some common schemas that may already be sufficient for your use case. We have identified some "common data types" and built schemas around them.

For example, we already have a schema for a document, which can be used for PDF files, Word files, Excel files, and so on.

However, if you have custom data, you may want to create your own schema instead of using a built-in one. To do this, create the schema in JSON format and upload it through the Wuha interface.
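As a rough illustration of what preparing a schema in JSON format might involve, the Python sketch below builds a schema description and serializes it. The layout is an assumption: the keys name, fields, and type, and the invoice example itself, are hypothetical, and the structure Wuha actually expects is not documented on this page.

```python
import json

# Hypothetical layout: these keys are assumptions, not the documented Wuha format.
invoice_schema = {
    "name": "invoice",
    "fields": [
        {"name": "customer", "type": "string"},
        {"name": "amount", "type": "number"},
    ],
}

# Serialize to a JSON string that could be saved to a file for upload.
schema_json = json.dumps(invoice_schema, indent=2)
print(schema_json)
```

Parsing the string back with json.loads before uploading is a cheap way to confirm the file is valid JSON.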

For the moment, custom schema creation is an advanced topic. We recommend you get in touch with us so we can help you during the early stages of Wuha.
