Prerequisites
- A table in Snowflake that contains data that was generated by Unstructured. The target Snowflake table must have a column named EMBEDDINGS that will contain vector embeddings for the text in the table’s TEXT column. The following Streamlit example app assumes that the EMBEDDINGS column contains vector embeddings with 1,024 dimensions and has a data type of VECTOR(FLOAT, 1024).

  To create this table, you can create a custom Unstructured workflow that uses any supported source connector along with the Snowflake destination connector. Then run the workflow to generate the data and insert that generated data into the target Snowflake table.

  After the data is inserted into the target Snowflake table, you can run a Snowflake SQL statement that generates the 1,024-dimension vector embeddings for the text in the table’s TEXT column and then inserts those generated vector embeddings into the table’s EMBEDDINGS column. The model that this statement specifies for generating the vector embeddings must be the same one that is used by the Streamlit example app. To learn how to run Snowflake SQL statements, see for example Querying data using worksheets.
- You must have the appropriate privileges to create and use a Streamlit app in your Snowflake account. These privileges include ones for the target table’s parent database and schema as well as the Snowflake warehouse that runs the Streamlit app. For details, see Getting started with Streamlit in Snowflake.
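For the first prerequisite, the embedding-generation statement could look roughly like the following sketch, which builds the SQL text in Python so that the table and model names are easy to swap. The ELEMENTS table name is taken from later in this example. The model name snowflake-arctic-embed-l-v2.0, the WHERE clause that skips rows that already have embeddings, and the helper build_embedding_update_sql are assumptions made for illustration, not the statement from the original example.

```python
# Assumption: this model name is a placeholder. Use the same model that
# your Streamlit app passes to SNOWFLAKE.CORTEX.EMBED_TEXT_1024.
EMBEDDING_MODEL = "snowflake-arctic-embed-l-v2.0"


def build_embedding_update_sql(table: str = "ELEMENTS",
                               model: str = EMBEDDING_MODEL) -> str:
    """Return an UPDATE statement that fills EMBEDDINGS from TEXT.

    Hypothetical helper: the table and model names are interpolated as-is,
    so pass only trusted values.
    """
    return (
        f"UPDATE {table}\n"
        f"SET EMBEDDINGS = SNOWFLAKE.CORTEX.EMBED_TEXT_1024('{model}', TEXT)\n"
        # Assumption: only fill rows that do not have embeddings yet.
        "WHERE EMBEDDINGS IS NULL"
    )


# Print the statement so that it can be pasted into a Snowflake worksheet.
print(build_embedding_update_sql())
```

Because EMBED_TEXT_1024 returns a 1,024-dimension vector, the result fits a column declared as VECTOR(FLOAT, 1024).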
Create and run the example app
1
Create the Streamlit app
- In Snowsight for your Snowflake account, on the sidebar, click Projects > Streamlit.
- Click + Streamlit App.
- For App title, enter a name for your app, such as Unstructured Demo Streamlit App.
- For App location, choose the target database and schema to store the app in.
- For App warehouse, choose the warehouse that you want to use to run your app and execute its queries.
- Click Create.
2
Add code to the Streamlit app
In this step, you add Python code to the Streamlit app that you created in the previous step. This step explains each part of the code as you add it. If you want to skip past these explanations, add the code in the complete code example all at once, and then skip ahead to the next step, “Run the Streamlit app.”
- Import Python dependencies that get the current connection to the Snowflake database and schema and that provide Streamlit functions and features.
- Get the current connection to the Snowflake database and schema.
- Display the title of the app in the Streamlit UI, and get the user’s search query from the Streamlit UI.
- Get the user’s search query and display a progress indicator in the UI.
- Use the user’s search query to get the top result from the ELEMENTS table. The ELEMENTS table contains the data that was generated by Unstructured. The code uses the SNOWFLAKE.CORTEX.EMBED_TEXT_1024 function to generate vector embeddings for the user’s search query and the VECTOR_COSINE_SIMILARITY function to get the similarity between the vector embeddings for the user’s search query and the vector embeddings for the TEXT column for each row in the ELEMENTS table. The code then orders the results by similarity and limits the results to the row with the greatest similarity between the search query and the target text.
- Get the TEXT column from the top result and use it as context for the user’s search query.
- Use the user’s search query and the context from the top result to get a response from Snowflake Cortex Search for RAG. The code uses the SNOWFLAKE.CORTEX.COMPLETE function to generate a response to the user’s search query based on the context from the top result.
- Display the generated response in the Streamlit UI.
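The parts described above can be sketched together as follows. This is a minimal illustration, not the original example’s code: both model names, the page title, the prompt wording, and the helpers find_context, build_prompt, answer, and run_app are assumptions. The sketch also assumes a Snowpark version whose Session.sql accepts a params argument, so that the user’s text is bound to the ? placeholder instead of being pasted into the SQL string. The st and session objects are passed in as arguments so that the logic can be exercised without a Snowflake connection.

```python
# Assumptions: both model names are placeholders. The embedding model must
# match the one that was used to fill the EMBEDDINGS column.
EMBEDDING_MODEL = "snowflake-arctic-embed-l-v2.0"
COMPLETION_MODEL = "mistral-large2"

# Rank every ELEMENTS row by similarity to the query and keep the best one.
SEARCH_SQL = f"""
SELECT TEXT,
       VECTOR_COSINE_SIMILARITY(
           EMBEDDINGS,
           SNOWFLAKE.CORTEX.EMBED_TEXT_1024('{EMBEDDING_MODEL}', ?)
       ) AS SIMILARITY
FROM ELEMENTS
ORDER BY SIMILARITY DESC
LIMIT 1
"""

COMPLETE_SQL = (
    f"SELECT SNOWFLAKE.CORTEX.COMPLETE('{COMPLETION_MODEL}', ?) AS RESPONSE"
)


def find_context(session, query):
    """Return the TEXT of the most similar row, or None if the table is empty."""
    rows = session.sql(SEARCH_SQL, params=[query]).collect()
    return rows[0]["TEXT"] if rows else None


def build_prompt(query, context):
    """Combine the retrieved context and the question into one prompt."""
    return (
        "Answer the question using only the context.\n\n"
        f"Context: {context}\n\nQuestion: {query}"
    )


def answer(session, query):
    """Retrieve the top-matching text and ask the model to answer from it."""
    context = find_context(session, query)
    if context is None:
        return "No matching text was found."
    prompt = build_prompt(query, context)
    rows = session.sql(COMPLETE_SQL, params=[prompt]).collect()
    return rows[0]["RESPONSE"]


def run_app(st, session):
    """Render the UI: title, search box, progress indicator, and response."""
    st.title("Unstructured Demo Streamlit App")
    query = st.text_input("Enter your search query")
    if query:
        with st.spinner("Searching..."):
            response = answer(session, query)
        st.write(response)


# In the Streamlit in Snowflake editor, the wiring would look like this:
#   import streamlit as st
#   from snowflake.snowpark.context import get_active_session
#   run_app(st, get_active_session())
```

Keeping the SQL in module-level constants makes it easy to check that the query embedding and the stored embeddings use the same model.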
3
Run the Streamlit app
- In the upper right corner, click Run.
- For Enter your search query, enter a natural-language question about the TEXT column in the table.
- Press Enter.

