Tutorial

Beginner
6 Steps
15 min

Tutorial-specific formatting

Special page attributes

Set the following attributes in the page header to display the corresponding details (skill level, time commitment, and Colab link) at the top of the page.

:page-skill-level: Beginner
:page-time-commitment: 15 min
:page-colab-link: https://colab.research.google.com/

Automatic steps

Add the .step role to a section title to prefix it with an auto-incrementing step number. The number does not appear in the auto-generated anchor link or the table of contents.

[.step]
=== Install the Python SDK and open a Python REPL.
...
[.step]
=== Connect to Astra and create a database.

Step reset

Add the .step-reset role to a section title to reset the step number.

[.step-reset]
=== This title will now be prefixed with "1." regardless of order.
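
The numbering behavior of the two roles can be modeled in a few lines of Python. This is only an illustrative sketch, not the site generator's implementation: the function name `number_steps` and the `(role, title)` input format are hypothetical, and it assumes that numbering continues from 2 after a `.step-reset` title.

```python
def number_steps(sections):
    """Prefix section titles with step numbers.

    `sections` is a list of (role, title) pairs, where role is
    'step', 'step-reset', or None for an ordinary section.
    Hypothetical model of the .step / .step-reset roles.
    """
    counter = 0
    rendered = []
    for role, title in sections:
        if role == 'step-reset':
            counter = 1          # always restart at 1
        elif role == 'step':
            counter += 1         # auto-increment
        else:
            rendered.append(title)  # ordinary titles get no prefix
            continue
        rendered.append(f"{counter}. {title}")
    return rendered


titles = number_steps([
    ('step', 'Install the Python SDK and open a Python REPL'),
    ('step', 'Connect to Astra and create a database'),
    (None, 'Core steps'),
    ('step-reset', 'Prepare and ingest data'),
    ('step', 'Show the results'),
])
for title in titles:
    print(title)
```

Only the displayed title changes in this model; the anchor link and table-of-contents entry would still be derived from the unnumbered title.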

Objective

Learn how to create a new database, connect to your database, load a set of vector embeddings, and perform a similarity search to find vectors that are close to the one in your query.

Tutorial overview

Prerequisites

To get started, ensure you have an active Astra account with the requisite permissions.

Install the Python SDK and open a Python REPL

pip install astrapy
python

Additional clients, such as Node.js and JSON API, are available.

Connect to Astra and create a database

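# NOTE: The previous step installs the astrapy package, but this example
# imports astra_vector. Treat the module name and the calls below as
# placeholders and check the SDK reference for the actual API.
import astra_vector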

# Authenticate to the SaaS database
api_key = 'your_api_key'
client = astra_vector.Client(api_key)

# Create a new database
database_name = 'my_vector_database'
client.create_database(database_name)

# Connect to the database
db = client.connect(database_name)

# Create a new table for vectors
table_name = 'vector_data'
db.create_table(table_name)

Core steps

Prepare and ingest data

# Load sample vector data
sample_vectors = [
    {'id': 1, 'vector': [0.1, 0.2, 0.3]},
    {'id': 2, 'vector': [0.4, 0.5, 0.7]}
]

for data in sample_vectors:
    db.insert_record(table_name, data)

# Run a similarity search
query_vector = [0.2, 0.3, 0.4]
results = db.similarity_search(table_name, query_vector, k=5)

Show the results

# Similarity search results
for result in results:
    print(f"ID: {result['id']}, Similarity Score: {result['score']}")
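
To build intuition for the scores printed above, the search can be approximated locally in plain Python. The sketch below assumes cosine similarity as the metric; the tutorial does not state which metric the database uses, and the helper names `cosine_similarity` and `local_similarity_search` are hypothetical, not part of any SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def local_similarity_search(records, query, k=5):
    """Return up to k records, most similar to the query first."""
    scored = [
        {'id': r['id'], 'score': cosine_similarity(r['vector'], query)}
        for r in records
    ]
    scored.sort(key=lambda r: r['score'], reverse=True)
    return scored[:k]


# Same sample data and query vector as in the tutorial steps
sample_vectors = [
    {'id': 1, 'vector': [0.1, 0.2, 0.3]},
    {'id': 2, 'vector': [0.4, 0.5, 0.7]},
]
query_vector = [0.2, 0.3, 0.4]

local_results = local_similarity_search(sample_vectors, query_vector, k=5)
for result in local_results:
    print(f"ID: {result['id']}, Similarity Score: {result['score']:.4f}")
```

Because only two vectors are stored, asking for `k=5` returns both of them, ordered by score.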

Cleanup

Delete all resources

# Delete the table
db.delete_table(table_name)
print(f"Table '{table_name}' deleted.")

# Delete the database
client.delete_database(database_name)
print(f"Database '{database_name}' deleted.")

Conclusion

In this tutorial, you learned how to:

  • Create a new database

  • Connect to your database

  • Load a set of vector embeddings

  • Perform a similarity search to find vectors that are close to the one in your query

You’re well on your way to becoming an Astra Vector expert!

Next steps

  • Grasp the basics (Tutorial)

    Before diving deep, ensure a solid understanding of foundational concepts surrounding vector databases. Delve into embeddings, the nature of high-dimensional data, and their profound impact on machine learning processes.

  • Installation (Guide)

  • Ingest and store vector data (Tutorial)

© 2024 DataStax | Privacy policy | Terms of use
