Documents reference
📝 NOTE: This content was pulled from PR 308
Documents represent a single row or record of data in Astra DB Serverless.
Use the Collection class to work with documents.
If you haven’t already, consult the Collections reference topic for details on how to get a Collection object.
Working with dates
-
Python
-
TypeScript
-
Java
-
cURL
Date and datetime objects, which are instances of the Python standard library datetime.datetime and datetime.date classes, can be used anywhere in documents.
Read operations from a collection always return the datetime class regardless of whether a date or a datetime was provided in the insertion.
import datetime
from astrapy import DataAPIClient
from astrapy.ids import ObjectId, uuid8, UUID
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"when": datetime.datetime.now()})
collection.insert_one({"date_of_birth": datetime.date(2000, 1, 1)})
collection.update_one(
    {"registered_at": datetime.date(1999, 11, 14)},
    {"$set": {"message": "happy Sunday!"}},
)
print(
    collection.find_one(
        {"date_of_birth": {"$lt": datetime.date(2001, 1, 1)}},
        projection={"_id": False},
    )
)
# will print:
# {'date_of_birth': datetime.datetime(2000, 1, 1, 0, 0)}
As shown in the example, read operations from a collection always return the datetime class.
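The date-to-datetime behavior described above can be mirrored locally with the standard library alone. This is a minimal sketch: as_stored_datetime is a hypothetical helper for illustration, not an AstraPy function, and it assumes that a plain date is read back as midnight of the same day, as the printed output in the example suggests.

```python
import datetime


def as_stored_datetime(value):
    """Return the datetime a read would be expected to yield for `value`.

    Hypothetical helper, not part of AstraPy: it assumes a plain date
    comes back as midnight of the same day.
    """
    # datetime is a subclass of date, so it must be tested first.
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, datetime.date):
        return datetime.datetime.combine(value, datetime.time())
    raise TypeError(f"not a date or datetime: {value!r}")


print(repr(as_stored_datetime(datetime.date(2000, 1, 1))))
```

A helper like this can be handy when comparing a value you inserted as a date with the value returned by a later read.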
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Working with document IDs
Documents in a collection are always identified by an ID that is unique within the collection.
The ID can be any of several types, such as a string, integer, or datetime. However, the uuid or ObjectId types are recommended.
The Data API supports uuid identifiers up to version 8, as well as ObjectId identifiers as provided by the bson library.
These can appear anywhere in the document, not only in its _id field. Moreover, different types of identifier can appear in different parts of the same document, and these identifiers can be part of filtering clauses and update/replace directives just like any other data type.
One of the optional settings of a collection is the "default ID type": that is, you can specify what kind of identifiers the server should supply for documents without an explicit _id field. (For details, see the create_collection method and Data API createCollection command in the Collections reference.) Regardless of the defaultId setting, however, identifiers of any type can be explicitly provided for documents at any time, for example during insertions, and are honored by the insertion process.
-
Python
-
TypeScript
-
Java
-
cURL
from astrapy.ids import (
    ObjectId,
    uuid1,
    uuid3,
    uuid4,
    uuid5,
    uuid6,
    uuid7,
    uuid8,
    UUID,
)
AstraPy recognizes uuid versions 1 through 8 (with the exception of 2) as provided by the uuid and uuid6 Python libraries, as well as the ObjectId from the bson package. For convenience, these same utilities are exposed as shown in the example above.
You can then generate new identifiers with statements such as new_id = uuid8() or new_obj_id = ObjectId().
Keep in mind that all uuid versions are instances of the same class (UUID), which exposes a version property, should you need to access it.
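To illustrate the version property, the following sketch uses the standard library uuid.UUID class as a stand-in, on the assumption that the UUID exposed by astrapy.ids behaves the same way. The two literal identifiers are the ones used elsewhere in this section.

```python
import uuid

# Identifiers that appear in the examples of this section.
id_v8 = uuid.UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5")
id_v6 = uuid.UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")
# A freshly generated random (version 4) identifier.
id_v4 = uuid.uuid4()

# Every value is an instance of the same class; only `version` differs.
versions = [u.version for u in (id_v8, id_v6, id_v4)]
print(versions)  # [8, 6, 4]
```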
Here is a short example of the concepts:
from astrapy import DataAPIClient
from astrapy.ids import ObjectId, uuid8, UUID
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"_id": uuid8(), "tag": "new_id_v_8"})
collection.insert_one(
    {"_id": UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5"), "tag": "id_v_8"}
)
collection.insert_one({"_id": ObjectId(), "tag": "new_obj_id"})
collection.insert_one(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88"), "tag": "obj_id"}
)
collection.find_one_and_update(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88")},
    {"$set": {"item_inventory_id": UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")}},
)
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Insert a single document
Insert a single document into a collection.
-
Python
-
TypeScript
-
Java
-
cURL
insert_result = collection.insert_one({"name": "Jane Doe"})
Insert a document with an associated vector.
insert_result = collection.insert_one(
    {"name": "Jane Doe"},
    vector=[.08, .68, .30],
)
Parameters:
Name | Type | Description |
---|---|---|
document | | The dictionary expressing the document to insert. The |
vector | | A vector (a list of numbers appropriate for the collection) for the document. Passing this parameter is equivalent to providing the vector in the "$vector" field of the document itself; however, the two are mutually exclusive. |
max_time_ms | | A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
# Insert a document with a specific ID
response1 = collection.insert_one(
    {
        "_id": 101,
        "name": "John Doe",
    },
    vector=[.12, .52, .32],
)
# Insert a document without specifying an ID
# so that _id is generated automatically
response2 = collection.insert_one(
    {
        "name": "Jane Doe",
    },
    vector=[.08, .68, .30],
)
Collection<Schema>.insertOne(
document: MaybeId<Schema>,
options?: InsertOneOptions,
): Promise<InsertOneResult<Schema>>
Parameters:
Name | Type | Description |
---|---|---|
document |
The document to insert. If the document does not have an |
|
options? |
The options for this operation. |
Options (InsertOneOptions
):
Name | Type | Description |
---|---|---|
|
The vector for the document. Equivalent to providing the vector in the |
|
|
The maximum time in milliseconds that the client should wait for the operation to complete. |
Returns:
Promise<InsertOneResult<Schema>>
- A promise that resolves to the inserted ID.
Example:
import { DataApiClient } from '@datastax/astra-db-ts';
// Reference an untyped collection
const db = new DataApiClient('TOKEN').db('API_ENDPOINT');
const collection = db.collection('my_collection');
(async () => {
// Insert a document with a specific ID
await collection.insertOne({ _id: '1', name: 'John Doe' });
// Insert a document with an autogenerated ID
await collection.insertOne({ name: 'Jane Doe' });
// Insert a document with a vector
await collection.insertOne({ name: 'Jane Doe' }, { vector: [.12, .52, .32] });
await collection.insertOne({ name: 'Jane Doe', $vector: [.12, .52, .32] });
})();
TBD
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"insertOne": {
"document": {
"_id": "1",
"purchase_type": "Online",
"$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
"customer": {
"name": "Jim A.",
"phone": "123-456-1111",
"age": 51,
"credit_score": 782,
"address": {
"address_line": "1234 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1690045891},
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [
{
"car" : "BMW 330i Sedan",
"color": "Silver"
},
"Extended warranty - 5 years"
],
"amount": 47601,
"status" : "active",
"preferred_customer" : true
}
}
}' | json_pp
Properties:
Name | Type | Description |
---|---|---|
insertOne | command | Data API designation that a single document is inserted. |
Response
{
"status": {
"insertedIds": [
"1"
]
}
}
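The same request can be assembled in Python with the standard library. This is a sketch only: it builds the request object without sending it, build_insert_one_request is a hypothetical helper, and the endpoint, keyspace, and collection values are placeholders. The URL path and headers follow the cURL command above.

```python
import json
import urllib.request


def build_insert_one_request(endpoint, keyspace, collection, token, document):
    """Build (but do not send) the POST request for an insertOne command."""
    url = f"{endpoint}/api/json/v1/{keyspace}/{collection}"
    body = json.dumps({"insertOne": {"document": document}}).encode("utf-8")
    headers = {
        "Token": token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_insert_one_request(
    "https://example.invalid",  # placeholder endpoint
    "my_keyspace",              # placeholder keyspace
    "my_collection",            # placeholder collection
    "TOKEN",
    {"_id": "1", "purchase_type": "Online", "$vector": [0.25] * 5},
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without a database.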
Insert many documents
Insert multiple documents into a collection.
-
Python
-
TypeScript
-
Java
-
cURL
response = collection.insert_many(
    [
        {
            "_id": 101,
            "name": "John Doe",
        },
        {
            # ID is generated automatically
            "name": "Jane Doe",
        },
    ],
    vectors=[
        [.12, .52, .32],
        [.08, .68, .30],
    ],
    ordered=True,
)
Returns:
InsertManyResult
- An object representing the response from the database after the insert operation. It includes information about the success of the operation and details of the inserted documents.
Example response
InsertManyResult(raw_results=[{'status': {'insertedIds': [101, '81077d86-05dc-43ca-877d-8605dce3ca4d']}}], inserted_ids=[101, '81077d86-05dc-43ca-877d-8605dce3ca4d'])
Parameters:
Name | Type | Description |
---|---|---|
documents | | An iterable of dictionaries, each a document to insert. Documents may specify their |
vectors | | An optional list of vectors (as many vectors as the provided documents) to associate with the documents when inserting. Each vector is added to the corresponding document before insertion into the database. The list can be a mixture of None and vectors, in which case some documents will not have a vector, unless it is specified in their "$vector" field already. Passing vectors this way is equivalent to providing them in the "$vector" field of the documents; however, the two are mutually exclusive. |
ordered | | If True (default), the insertions are processed sequentially. If False, they can occur in arbitrary order and possibly concurrently. |
chunk_size | | How many documents to include in a single API request. Exceeding the server maximum allowed value results in an error. Leave it unspecified (recommended) to use the system default. |
concurrency | | Maximum number of concurrent requests to the API at a given time. It cannot be more than one for ordered insertions. |
max_time_ms | | A timeout, in milliseconds, for the operation. |
Unless there are specific reasons not to, it is recommended to prefer |
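The interplay of ordered, chunk_size, and concurrency can be pictured with a small local sketch. It does not reflect how AstraPy is implemented: chunked and insert_in_chunks are hypothetical helpers, send_chunk stands in for one API request, and the chunk size of 20 is an assumption borrowed from the insertMany limit mentioned in this topic.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(documents, chunk_size):
    """Split a list of documents into consecutive chunks."""
    return [documents[i:i + chunk_size] for i in range(0, len(documents), chunk_size)]


def insert_in_chunks(send_chunk, documents, chunk_size=20, ordered=True, concurrency=1):
    """Send chunks one by one (ordered) or through a thread pool (unordered)."""
    if ordered and concurrency > 1:
        raise ValueError("ordered insertions cannot use concurrency > 1")
    chunks = chunked(documents, chunk_size)
    if ordered:
        return [send_chunk(chunk) for chunk in chunks]
    with ThreadPoolExecutor(max_workers=concurrency) as executor:
        # map() yields results in chunk order even though the calls
        # themselves may complete in any order.
        return list(executor.map(send_chunk, chunks))


# Stand-in for one API request: it only reports how many documents it received.
results = insert_in_chunks(len, [{"seq": i} for i in range(50)], ordered=False, concurrency=5)
print(results)  # [20, 20, 10]
```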
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"a": 10}, {"a": 5}, {"b": [True, False, False]}])
collection.insert_many(
    [{"seq": i} for i in range(50)],
    ordered=False,
    concurrency=5,
)
# The following are three equivalent statements:
collection.insert_many(
    [{"tag": "a"}, {"tag": "b"}],
    vectors=[[1, 2], [3, 4]],
)
collection.insert_many(
    [{"tag": "a", "$vector": [1, 2]}, {"tag": "b"}],
    vectors=[None, [3, 4]],
)
collection.insert_many(
    [
        {"tag": "a", "$vector": [1, 2]},
        {"tag": "b", "$vector": [3, 4]},
    ]
)
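Why the three statements are equivalent can be checked locally. The sketch below is an illustration under stated assumptions, not AstraPy code: merge_vectors is a hypothetical helper that places each entry of vectors into the "$vector" field of the matching document and rejects a document that supplies a vector both ways.

```python
def merge_vectors(documents, vectors=None):
    """Return copies of `documents` with each vector stored under "$vector"."""
    if vectors is None:
        return [dict(doc) for doc in documents]
    if len(vectors) != len(documents):
        raise ValueError("vectors must have as many entries as documents")
    merged = []
    for doc, vector in zip(documents, vectors):
        if vector is None:
            # No vector passed for this document: keep it unchanged.
            merged.append(dict(doc))
        elif "$vector" in doc:
            raise ValueError("vector given both in the document and in `vectors`")
        else:
            merged.append({**doc, "$vector": vector})
    return merged


first = merge_vectors([{"tag": "a"}, {"tag": "b"}], [[1, 2], [3, 4]])
second = merge_vectors([{"tag": "a", "$vector": [1, 2]}, {"tag": "b"}], [None, [3, 4]])
third = merge_vectors([{"tag": "a", "$vector": [1, 2]}, {"tag": "b", "$vector": [3, 4]}])
print(first == second == third)  # True
```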
Collection<Schema>.insertMany(
documents: MaybeId<Schema>[],
options?: InsertManyOptions,
): Promise<InsertManyResult<Schema>>
Parameters:
Name | Type | Description |
---|---|---|
documents |
The documents to insert. If any document does not have an |
|
options? |
The options for this operation. |
Options (InsertManyOptions
):
Name | Type | Description |
---|---|---|
|
You may set the |
|
|
You can set the |
|
|
Control how many documents are sent each network request. Defaults to |
|
|
An array of vectors to associate with each document. If a vector is |
Unless there are specific reasons not to, it is recommended to prefer to leave ordered |
Returns:
Promise<InsertManyResult<Schema>>
- A promise that resolves to the inserted IDs.
Example:
import { DataApiClient, InsertManyError } from '@datastax/astra-db-ts';
// Reference an untyped collection
const db = new DataApiClient('TOKEN').db('API_ENDPOINT');
const collection = db.collection('my_collection');
(async () => {
try {
// Insert many documents
await collection.insertMany([
{ _id: '1', name: 'John Doe' },
{ name: 'Jane Doe' }, // Will autogen ID
], { ordered: true });
// Insert many with vectors
await collection.insertMany([
{ name: 'John Doe', $vector: [.12, .52, .32] },
{ name: 'Jane Doe' },
{ name: 'Jane Doe', $vector: [.32, .52, .12] },
]);
await collection.insertMany([
{ name: 'John Doe' },
{ name: 'Jane Doe' },
{ name: 'Dane Joe' },
], {
vectors: [
[.12, .52, .32],
null,
[.32, .52, .12],
],
ordered: true,
});
} catch (e) {
if (e instanceof InsertManyError) {
console.log(e.insertedIds);
}
}
})();
TBD
The following Data API insertMany command adds 20 documents to a collection.
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"insertMany": {
"documents": [
{
"_id": "2",
"purchase_type": "Online",
"$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
"customer": {
"name": "Jack B.",
"phone": "123-456-2222",
"age": 34,
"credit_score": 700,
"address": {
"address_line": "888 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1690391491},
"seller": {
"name": "Tammy S.",
"location": "Staten Island NYC"
},
"items": [
{
"car" : "Tesla Model 3",
"color": "White"
},
"Extended warranty - 10 years",
"Service - 5 years"
],
"amount": 53990,
"status" : "active"
},
{
"_id": "3",
"purchase_type": "Online",
"$vector": [0.15, 0.1, 0.1, 0.35, 0.55],
"customer": {
"name": "Jill D.",
"phone": "123-456-3333",
"age": 30,
"credit_score": 742,
"address": {
"address_line": "12345 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1690564291},
"seller": {
"name": "Jasmine S.",
"location": "Brooklyn NYC"
},
"items": "Extended warranty - 10 years",
"amount": 4600,
"status" : "active"
},
{
"_id": "4",
"purchase_type": "In Person",
"$vector": [0.25, 0.25, 0.25, 0.25, 0.26],
"customer": {
"name": "Lester M.",
"phone": "123-456-4444",
"age": 40,
"credit_score": 802,
"address": {
"address_line": "12346 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1690909891},
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [
{
"car" : "BMW 330i Sedan",
"color": "Red"
},
"Extended warranty - 5 years",
"Service - 5 years"
],
"amount": 48510,
"status" : "active"
},
{
"_id": "5",
"purchase_type": "Online",
"$vector": [0.25, 0.045, 0.38, 0.31, 0.67],
"customer": {
"name": "David C.",
"phone": "123-456-5555",
"age": 50,
"credit_score": 800,
"address": {
"address_line": "32345 Main Ave",
"city": "Jersey City",
"state": "NJ"
}
},
"purchase_date": {"$date": 1690996291},
"seller": {
"name": "Jim A.",
"location": "Jersey City NJ"
},
"items": [
{
"car" : "Tesla Model S",
"color": "Red"
},
"Extended warranty - 5 years"
],
"amount": 94990,
"status" : "active"
},
{
"_id": "6",
"purchase_type": "In Person",
"$vector": [0.11, 0.02, 0.78, 0.10, 0.27],
"customer": {
"name": "Chris E.",
"phone": "123-456-6666",
"age": 43,
"credit_score": 764,
"address": {
"address_line": "32346 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1691860291},
"seller": {
"name": "Jim A.",
"location": "Jersey City NJ"
},
"items": [
{
"car" : "Tesla Model X",
"color": "Blue"
}
],
"amount": 109990,
"status" : "active"
},
{
"_id": "7",
"purchase_type": "Online",
"$vector": [0.21, 0.22, 0.33, 0.44, 0.53],
"customer": {
"name": "Jeff G.",
"phone": "123-456-7777",
"age": 66,
"credit_score": 802,
"address": {
"address_line": "22999 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1692119491},
"seller": {
"name": "Jasmine S.",
"location": "Brooklyn NYC"
},
"items": [{
"car" : "BMW M440i Gran Coupe",
"color": "Black"
},
"Extended warranty - 5 years"],
"amount": 61050,
"status" : "active"
},
{
"_id": "8",
"purchase_type": "In Person",
"$vector": [0.3, 0.23, 0.15, 0.17, 0.4],
"customer": {
"name": "Harold S.",
"phone": "123-456-8888",
"age": 29,
"credit_score": 710,
"address": {
"address_line": "1234 Main St",
"city": "Orange",
"state": "NJ"
}
},
"purchase_date": {"$date": 1693329091},
"seller": {
"name": "Tammy S.",
"location": "Staten Island NYC"
},
"items": [{
"car" : "BMW X3 SUV",
"color": "Black"
},
"Extended warranty - 5 years"
],
"amount": 46900,
"status" : "active"
},
{
"_id": "9",
"purchase_type": "Online",
"$vector": [0.1, 0.15, 0.3, 0.12, 0.06],
"customer": {
"name": "Richard Z.",
"phone": "123-456-9999",
"age": 22,
"credit_score": 690,
"address": {
"address_line": "22345 Broadway",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1693588291},
"seller": {
"name": "Jasmine S.",
"location": "Brooklyn NYC"
},
"items": [{
"car" : "Tesla Model 3",
"color": "White"
},
"Extended warranty - 5 years"
],
"amount": 53990,
"status" : "active"
},
{
"_id": "10",
"purchase_type": "In Person",
"$vector": [0.25, 0.045, 0.38, 0.31, 0.68],
"customer": {
"name": "Eric B.",
"phone": null,
"age": 54,
"credit_score": 780,
"address": {
"address_line": "9999 River Rd",
"city": "Fair Haven",
"state": "NJ"
}
},
"purchase_date": {"$date": 1694797891},
"seller": {
"name": "Jim A.",
"location": "Jersey City NJ"
},
"items": [{
"car" : "Tesla Model S",
"color": "Black"
}
],
"amount": 93800,
"status" : "active"
},
{
"_id": "11",
"purchase_type": "Online",
"$vector": [0.44, 0.11, 0.33, 0.22, 0.88],
"customer": {
"name": "Ann J.",
"phone": "123-456-1112",
"age": 47,
"credit_score": 660,
"address": {
"address_line": "99 Elm St",
"city": "Fair Lawn",
"state": "NJ"
}
},
"purchase_date": {"$date": 1695921091},
"seller": {
"name": "Jim A.",
"location": "Jersey City NJ"
},
"items": [{
"car" : "Tesla Model Y",
"color": "White"
},
"Extended warranty - 5 years"
],
"amount": 57500,
"status" : "active"
},
{
"_id": "12",
"purchase_type": "In Person",
"$vector": [0.33, 0.44, 0.55, 0.77, 0.66],
"customer": {
"name": "John T.",
"phone": "123-456-1123",
"age": 55,
"credit_score": 786,
"address": {
"address_line": "23 Main Blvd",
"city": "Staten Island",
"state": "NY"
}
},
"purchase_date": {"$date": 1696180291},
"seller": {
"name": "Tammy S.",
"location": "Staten Island NYC"
},
"items": [{
"car" : "BMW 540i xDrive Sedan",
"color": "Black"
},
"Extended warranty - 5 years"
],
"amount": 64900,
"status" : "active"
},
{
"_id": "13",
"purchase_type": "Online",
"$vector": [0.1, 0.15, 0.3, 0.12, 0.07],
"customer": {
"name": "Aaron W.",
"phone": "123-456-1133",
"age": 60,
"credit_score": 702,
"address": {
"address_line": "1234 4th Ave",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1697389891},
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [{
"car" : "Tesla Model 3",
"color": "White"
},
"Extended warranty - 5 years"
],
"amount": 55000,
"status" : "active"
},
{
"_id": "14",
"purchase_type": "In Person",
"$vector": [0.11, 0.02, 0.78, 0.21, 0.27],
"customer": {
"name": "Kris S.",
"phone": "123-456-1144",
"age": 44,
"credit_score": 702,
"address": {
"address_line": "1414 14th Pl",
"city": "Brooklyn",
"state": "NY"
}
},
"purchase_date": {"$date": 1698513091},
"seller": {
"name": "Jasmine S.",
"location": "Brooklyn NYC"
},
"items": [{
"car" : "Tesla Model X",
"color": "White"
}
],
"amount": 110400,
"status" : "active"
},
{
"_id": "15",
"purchase_type": "Online",
"$vector": [0.1, 0.15, 0.3, 0.12, 0.08],
"customer": {
"name": "Maddy O.",
"phone": "123-456-1155",
"age": 41,
"credit_score": 782,
"address": {
"address_line": "1234 Maple Ave",
"city": "West New York",
"state": "NJ"
}
},
"purchase_date": {"$date": 1701191491},
"seller": {
"name": "Jim A.",
"location": "Jersey City NJ"
},
"items": {
"car" : "Tesla Model 3",
"color": "White"
},
"amount": 52990,
"status" : "active"
},
{
"_id": "16",
"purchase_type": "In Person",
"$vector": [0.44, 0.11, 0.33, 0.22, 0.88],
"customer": {
"name": "Tim C.",
"phone": "123-456-1166",
"age": 38,
"credit_score": 700,
"address": {
"address_line": "1234 Main St",
"city": "Staten Island",
"state": "NY"
}
},
"purchase_date": {"$date": 1701450691},
"seller": {
"name": "Tammy S.",
"location": "Staten Island NYC"
},
"items": [{
"car" : "Tesla Model Y",
"color": "White"
},
"Extended warranty - 5 years"
],
"amount": 58990,
"status" : "active"
},
{
"_id": "17",
"purchase_type": "Online",
"$vector": [0.1, 0.15, 0.3, 0.12, 0.09],
"customer": {
"name": "Yolanda Z.",
"phone": "123-456-1177",
"age": 61,
"credit_score": 694,
"address": {
"address_line": "1234 Main St",
"city": "Hoboken",
"state": "NJ"
}
},
"purchase_date": {"$date": 1702660291},
"seller": {
"name": "Jim A.",
"location": "Jersey City NJ"
},
"items": [{
"car" : "Tesla Model 3",
"color": "Blue"
},
"Extended warranty - 5 years"
],
"amount": 54900,
"status" : "active"
},
{
"_id": "18",
"purchase_type": "Online",
"$vector": [0.15, 0.17, 0.15, 0.43, 0.55],
"customer": {
"name": "Thomas D.",
"phone": "123-456-1188",
"age": 45,
"credit_score": 724,
"address": {
"address_line": "98980 20th St",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1703092291},
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [{
"car" : "BMW 750e xDrive Sedan",
"color": "Black"
},
"Extended warranty - 5 years"
],
"amount": 106900,
"status" : "active"
},
{
"_id": "19",
"purchase_type": "Online",
"$vector": [0.25, 0.25, 0.25, 0.25, 0.27],
"customer": {
"name": "Vivian W.",
"phone": "123-456-1199",
"age": 20,
"credit_score": 698,
"address": {
"address_line": "5678 Elm St",
"city": "Hartford",
"state": "CT"
}
},
"purchase_date": {"$date": 1704215491},
"seller": {
"name": "Jasmine S.",
"location": "Brooklyn NYC"
},
"items": [{
"car" : "BMW 330i Sedan",
"color": "White"
},
"Extended warranty - 5 years"
],
"amount": 46980,
"status" : "active"
},
{
"_id": "20",
"purchase_type": "In Person",
"$vector": [0.44, 0.11, 0.33, 0.22, 0.88],
"customer": {
"name": "Leslie E.",
"phone": null,
"age": 44,
"credit_score": 782,
"address": {
"address_line": "1234 Main St",
"city": "Newark",
"state": "NJ"
}
},
"purchase_date": {"$date": 1705338691},
"seller": {
"name": "Jim A.",
"location": "Jersey City NJ"
},
"items": [{
"car" : "Tesla Model Y",
"color": "Black"
},
"Extended warranty - 5 years"
],
"amount": 59800,
"status" : "active"
},
{
"_id": "21",
"purchase_type": "In Person",
"$vector": [0.21, 0.22, 0.33, 0.44, 0.53],
"customer": {
"name": "Rachel I.",
"phone": null,
"age": 62,
"credit_score": 786,
"address": {
"address_line": "1234 Park Ave",
"city": "New York",
"state": "NY"
}
},
"purchase_date": {"$date": 1706202691},
"seller": {
"name": "Jon B.",
"location": "Manhattan NYC"
},
"items": [{
"car" : "BMW M440i Gran Coupe",
"color": "Silver"
},
"Extended warranty - 5 years",
"Gap Insurance - 5 years"
],
"amount": 65250,
"status" : "active"
}
],
"options": {
"ordered": false
}
}
}' | json_pp
Response
{
"status" : {
"insertedIds" : [
"4",
"7",
"10",
"13",
"16",
"19",
"21",
"18",
"6",
"12",
"15",
"9",
"3",
"11",
"2",
"17",
"14",
"8",
"20",
"5"
]
}
}
Properties:
Name | Type | Description |
---|---|---|
insertMany | command | Data API designation that many documents (up to 20 at a time) are being inserted. |
Find a document
Retrieve a single document from a collection using various options.
-
Python
-
TypeScript
-
Java
-
cURL
Retrieve a single document from a collection by its _id.
document = collection.find_one({"_id": 101})
Retrieve a single document from a collection by any attribute, as long as it is covered by the collection’s indexing configuration.
As noted in The Indexing option in the Collections reference topic, any field that is part of a subsequent filter or sort operation must be an indexed field. If you chose not to index some or all fields when you created the collection, you cannot reference those fields in a filter or sort query.
document = collection.find_one({"location": "warehouse_C"})
Retrieve a single document from a collection by an arbitrary filtering clause.
document = collection.find_one({"tag": {"$exists": True}})
Retrieve the most similar document to a given vector.
result = collection.find_one({}, vector=[.12, .52, .32])
Retrieve only specific fields from a document.
result = collection.find_one({"_id": 101}, projection={"name": True})
Returns:
Union[Dict[str, Any], None]
- Either the found document as a dictionary or None if no matching document is found.
Example response
{'_id': 101, 'name': 'John Doe', '$vector': [0.12, 0.52, 0.32]}
Parameters:
Name | Type | Description |
---|---|---|
filter | | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
projection | | Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents. |
vector | | A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search. That is, Approximate Nearest Neighbors (ANN) search, extracting the most similar document in the collection matching the filter. This parameter cannot be used together with |
include_similarity | | A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in the returned document. Can only be used for vector ANN search, i.e. when either |
sort | | With this dictionary parameter one can control the order in which the documents are returned. See the Note about sorting for details. |
max_time_ms | | A timeout, in milliseconds, for the underlying HTTP request. |
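The three accepted forms of projection can be compared with a local emulation. This is a rough sketch for top-level fields only: apply_projection is a hypothetical helper, not an AstraPy function, and it assumes that _id is returned unless explicitly excluded, which matches the example outputs in this topic.

```python
def apply_projection(document, projection=None):
    """Emulate a top-level projection on a single document (a dict)."""
    if not projection:
        return dict(document)
    # An iterable of field names is shorthand for {name: True, ...}.
    if not isinstance(projection, dict):
        projection = {name: True for name in projection}
    included = {name for name, flag in projection.items() if flag}
    excluded = {name for name, flag in projection.items() if not flag}
    if included:
        # Assumption: "_id" is kept unless it is explicitly excluded.
        wanted = included | ({"_id"} - excluded)
        return {k: v for k, v in document.items() if k in wanted}
    return {k: v for k, v in document.items() if k not in excluded}


doc = {"_id": 101, "name": "John Doe", "$vector": [0.12, 0.52, 0.32]}
print(apply_projection(doc, ["name"]))
print(apply_projection(doc, {"name": True}))
print(apply_projection(doc, {"$vector": False}))
```

With this sample document, all three calls select the same two fields, _id and name.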
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.find_one({})
# prints: {'_id': '68d1e515-...', 'seq': 37}
collection.find_one({"seq": 10})
# prints: {'_id': 'd560e217-...', 'seq': 10}
collection.find_one({"seq": 1011})
# (returns None for no matches)
collection.find_one({}, projection={"seq": False})
# prints: {'_id': '68d1e515-...'}
collection.find_one(
    {},
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: {'_id': '97e85f81-...', 'seq': 69}
collection.find_one({}, vector=[1, 0])
# prints: {'_id': '...', 'tag': 'D', '$vector': [4.0, 1.0]}
TBD
TBD
This Data API findOne command retrieves a document based on a filter using a specific _id value.
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"findOne": {
"filter": {"_id" : "14"}
}
}' | json_pp
Result:
{
"data" : {
"document" : {
"$vector" : [
0.11,
0.02,
0.78,
0.21,
0.27
],
"_id" : "14",
"amount" : 110400,
"customer" : {
"address" : {
"address_line" : "1414 14th Pl",
"city" : "Brooklyn",
"state" : "NY"
},
"age" : 44,
"credit_score" : 702,
"name" : "Kris S.",
"phone" : "123-456-1144"
},
"items" : [
{
"car" : "Tesla Model X",
"color" : "White"
}
],
"purchase_date" : {
"$date" : 1698513091
},
"purchase_type" : "In Person",
"seller" : {
"location" : "Brooklyn NYC",
"name" : "Jasmine S."
},
"status" : "active"
}
}
}
Find documents using filtering options
Iterate over documents in a collection matching a given filter.
-
Python
-
TypeScript
-
Java
-
cURL
doc_iterator = collection.find({"category": "house_appliance"}, limit=10)
Iterate over the documents most similar to a given query vector.
doc_iterator = collection.find({}, vector=[0.55, -0.40, 0.08], limit=5)
Returns:
Cursor
- A cursor for iterating over documents. An AstraPy cursor can be used in a for loop, and provides a few additional features.
Example response
Cursor("vector_collection", new, retrieved: 0)
Parameters:
Name | Type | Description |
---|---|---|
filter | | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
projection | | Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents. |
skip | | With this integer parameter, what would be the first |
limit | | This (integer) parameter sets a limit over how many documents are returned. Once |
vector | | A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search; that is, Approximate Nearest Neighbors (ANN) search. When running similarity search on a collection, no other sorting criteria can be specified. Moreover, there is an upper bound to the number of documents that can be returned. For details, see the Data API Limits. |
include_similarity | | A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in each returned document. Can only be used for vector ANN search, i.e. when either |
sort | | With this dictionary parameter one can control the order in which the documents are returned. See the Note about sorting, as well as the one about upper bounds, for details. |
max_time_ms | | A timeout, in milliseconds, for each single one of the underlying HTTP requests used to fetch documents as the cursor is iterated over. |
TBD
TBD
TBD
Example values for sort operations
-
Python
-
TypeScript
-
Java
-
cURL
When no particular order is required:
sort={} # (default when parameter not provided)
When sorting by a certain value in ascending/descending order:
from astrapy.constants import SortDocuments
sort={"field": SortDocuments.ASCENDING}
sort={"field": SortDocuments.DESCENDING}
When sorting first by "field" and then by "subfield" (while modern Python versions preserve the order of dictionaries, it is suggested for clarity to employ a collections.OrderedDict in these cases):
sort={
"field": SortDocuments.ASCENDING,
"subfield": SortDocuments.ASCENDING,
}
When running a vector similarity (ANN) search:
sort={"$vector": [0.4, 0.15, -0.5]}
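The effect of a multi-key sort dictionary, where key order matters, can be emulated on a plain list. This is a sketch under assumptions: sort_documents is a hypothetical helper, and the values 1 and -1 are assumed stand-ins for SortDocuments.ASCENDING and SortDocuments.DESCENDING.

```python
# Assumed values of SortDocuments.ASCENDING / SortDocuments.DESCENDING.
ASCENDING, DESCENDING = 1, -1


def sort_documents(documents, sort):
    """Order documents by several keys; earlier keys take precedence."""
    ordered = list(documents)
    # Apply the keys from last to first: because sorted() is stable,
    # the first key ends up as the primary ordering criterion.
    for field, direction in reversed(list(sort.items())):
        ordered = sorted(
            ordered,
            key=lambda d: d[field],
            reverse=(direction == DESCENDING),
        )
    return ordered


docs = [
    {"field": 2, "subfield": "b"},
    {"field": 1, "subfield": "z"},
    {"field": 2, "subfield": "a"},
]
result = sort_documents(docs, {"field": ASCENDING, "subfield": DESCENDING})
print([(d["field"], d["subfield"]) for d in result])
# [(1, 'z'), (2, 'b'), (2, 'a')]
```

Swapping the two keys in the dictionary changes the result, which is why the insertion order of the sort dictionary is significant.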
Some combinations of arguments impose an implicit upper bound on the number of documents that are returned by the Data API. More specifically:
Keep in mind these provisions even when subsequently running a command such as |
When not specifying sorting criteria at all (by vector or otherwise), the cursor can scroll through an arbitrary number of documents as the Data API and the client periodically exchange new chunks of documents. The behavior of the cursor — in the case that documents have been added/removed after the |
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
filter = {"seq": {"$exists": True}}
for doc in collection.find(filter, projection={"seq": True}, limit=5):
    print(doc["seq"])
# will print e.g.:
# 37
# 35
# 10
# 36
# 27
cursor1 = collection.find(
    {},
    limit=4,
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
[doc["_id"] for doc in cursor1]
# prints: ['97e85f81-...', '1581efe4-...', '...', '...']
cursor2 = collection.find({}, limit=3)
cursor2.distinct("seq")
# prints: [37, 35, 10]
collection.insert_many([
    {"tag": "A", "$vector": [4, 5]},
    {"tag": "B", "$vector": [3, 4]},
    {"tag": "C", "$vector": [3, 2]},
    {"tag": "D", "$vector": [4, 1]},
    {"tag": "E", "$vector": [2, 5]},
])
ann_tags = [
    document["tag"]
    for document in collection.find(
        {},
        limit=3,
        vector=[3, 3],
    )
]
ann_tags
# prints: ['A', 'B', 'C']
# (assuming the collection has metric VectorMetric.COSINE)
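The ranking in the last example can be reproduced offline with an exact cosine-similarity computation. Note that this sketch performs a brute-force comparison rather than the approximate (ANN) search the Data API runs, and that cosine_similarity and most_similar_tags are illustrative helpers, not AstraPy functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# The same five documents inserted in the example above.
documents = [
    {"tag": "A", "$vector": [4, 5]},
    {"tag": "B", "$vector": [3, 4]},
    {"tag": "C", "$vector": [3, 2]},
    {"tag": "D", "$vector": [4, 1]},
    {"tag": "E", "$vector": [2, 5]},
]


def most_similar_tags(query, limit):
    """Rank all documents by similarity to `query`, most similar first."""
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(d["$vector"], query),
        reverse=True,
    )
    return [d["tag"] for d in ranked[:limit]]


print(most_similar_tags([3, 3], limit=3))  # ['A', 'B', 'C']
print(most_similar_tags([1, 0], limit=1))  # ['D']
```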
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Find and update a document
Locate a document matching a filter condition and apply changes to it, returning the document itself.
-
Python
-
TypeScript
-
Java
-
cURL
collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
)
Locate and update a document, returning the document itself, creating a new one if nothing is found.
collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
    upsert=True,
)
Returns:
Dict[str, Any]
- The document that was found, either before or after the update (or a projection thereof, as requested). If no matches are found, None is returned.
Example response
{'_id': 999, 'Marco': 'Polo'}
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
update |
|
The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are: |
projection |
|
Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents. |
vector |
|
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with |
sort |
|
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the updated one. See the |
upsert |
|
This parameter controls the behavior in absence of matches. If True, a new document (resulting from applying the |
return_document |
|
A flag controlling what document is returned: if set to |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.find_one_and_update(
{"Marco": {"$exists": True}},
{"$set": {"title": "Mr."}},
)
# prints: {'_id': 'a80106f2-...', 'Marco': 'Polo'}
collection.find_one_and_update(
{"title": "Mr."},
{"$inc": {"rank": 3}},
projection=["title", "rank"],
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'a80106f2-...', 'title': 'Mr.', 'rank': 3}
collection.find_one_and_update(
{"name": "Johnny"},
{"$set": {"rank": 0}},
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_update(
{"name": "Johnny"},
{"$set": {"rank": 0}},
upsert=True,
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'cb4ef2ab-...', 'name': 'Johnny', 'rank': 0}
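The difference between returning the document before or after the update can be illustrated without a database. The following is a minimal in-memory sketch, assuming only the `$set` and `$inc` operators used in the example above. The function name `find_one_and_update_local` and its `return_after` flag are hypothetical stand-ins, not part of astrapy, and the real method involves sorting, projection, and upsert logic that is omitted here.

```python
import copy
from typing import Any, Callable, Dict, List, Optional

Document = Dict[str, Any]


def find_one_and_update_local(
    docs: List[Document],
    predicate: Callable[[Document], bool],
    update: Document,
    return_after: bool = False,
) -> Optional[Document]:
    """Update the first matching document in place (hypothetical helper).

    Returns a copy of the document as it was before the update by default,
    or as it is after the update when return_after is True.
    Returns None when nothing matches.
    """
    for doc in docs:
        if predicate(doc):
            before = copy.deepcopy(doc)
            # $set overwrites (or creates) the listed fields.
            for key, value in update.get("$set", {}).items():
                doc[key] = value
            # $inc adds to a numeric field, treating a missing field as 0.
            for key, amount in update.get("$inc", {}).items():
                doc[key] = doc.get(key, 0) + amount
            return copy.deepcopy(doc) if return_after else before
    return None
```

In this sketch the stored document is always modified; the flag only changes which snapshot the caller receives.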
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Update a document
Update a single document in the collection as requested.
-
Python
-
TypeScript
-
Java
-
cURL
update_result = collection.update_one(
{"_id": 456},
{"$set": {"name": "John Smith"}},
)
Update a single document in the collection, inserting a new one if no match is found.
update_result = collection.update_one(
{"_id": 456},
{"$set": {"name": "John Smith"}},
upsert=True,
)
Returns:
UpdateResult
- An object representing the response from the database after the update operation. It includes information about the operation.
Example response
UpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'name': 'John Doe'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
update |
|
The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are: |
vector |
|
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with |
sort |
|
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the updated one. See the |
upsert |
|
This parameter controls the behavior in absence of matches. If True, a new document (resulting from applying the |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.update_one({"Marco": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.update_one({"Mirko": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_one(
{"Mirko": {"$exists": True}},
{"$inc": {"rank": 3}},
upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '2a45ff60-...'})
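The three `update_info` shapes shown in the example (a match, no match, and an upsert) can be told apart programmatically. The following is a minimal sketch that relies only on the dictionary keys visible in the printed output above (`n`, `updatedExisting`, `nModified`, `upserted`). The helper name `describe_update` is hypothetical and not part of astrapy.

```python
from typing import Any, Dict


def describe_update(update_info: Dict[str, Any]) -> str:
    """Classify an update outcome from the keys shown in the examples above."""
    if "upserted" in update_info:
        # No document matched and a new one was created by the upsert.
        return f"upserted new document {update_info['upserted']}"
    if update_info.get("updatedExisting"):
        # At least one existing document matched the filter.
        return f"matched {update_info['n']}, modified {update_info['nModified']}"
    # No match and no upsert requested.
    return "no match, nothing changed"
```

With a real result object, one would presumably pass its `update_info` attribute, as suggested by the printed representation of `UpdateResult`.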
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Update multiple documents
Update multiple documents in a collection.
-
Python
-
TypeScript
-
Java
-
cURL
results = collection.update_many(
{"name": {"$exists": False}},
{"$set": {"name": "unknown"}},
)
Update multiple documents in a collection, inserting a new one if no matches are found.
results = collection.update_many(
{"name": {"$exists": False}},
{"$set": {"name": "unknown"}},
upsert=True,
)
Returns:
UpdateResult
- An object representing the response from the database after the update operation. It includes information about the operation.
Example response
UpdateResult(raw_results=[{'status': {'matchedCount': 2, 'modifiedCount': 2}}], update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
update |
|
The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are: |
upsert |
|
This parameter controls the behavior in absence of matches. If True, a single new document (resulting from applying |
max_time_ms |
|
A timeout, in milliseconds, for the operation. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"c": "red"}, {"c": "green"}, {"c": "blue"}])
collection.update_many({"c": {"$ne": "green"}}, {"$set": {"nongreen": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})
collection.update_many({"c": "orange"}, {"$set": {"is_also_fruit": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_many(
{"c": "orange"},
{"$set": {"is_also_fruit": True}},
upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '46643050-...'})
Hello TypeScript world. TBD.
Hello Java world. TBD.
Use the Data API updateMany command to update multiple documents in a collection.
In this example, the JSON payload uses the $set
update operator to change a status to "inactive" for those documents that have an "active" status.
The updateMany
command supports pagination for cases where more documents matching the filter remain on a subsequent page. For details, see the pagination note after the cURL example.
Example:
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"updateMany": {
"filter": {"status" : "active" },
"update" : {"$set" : { "status" : "inactive"}}
}
}' | json_pp
Result:
{
"status" : {
"matchedCount" : 20,
"modifiedCount" : 20,
"moreData" : true
}
}
Name | Type | Description |
---|---|---|
updateMany |
command |
Updates multiple documents in the database’s collection. |
filter |
object |
Defines the criteria for selecting documents to which the command applies. The filter looks for documents where:
* |
update |
object |
Specifies the modifications to be applied to all documents that match the criteria set by the filter. |
$set |
operator |
An update operator indicating that the operation should overwrite the value of a property (or properties) in the selected documents. |
status |
String |
Specifies the property in the document to update. In this example, active or inactive will be set for all selected documents. In this context, it’s changing the |
In the
|
Issue a sequence of one or more updateMany
commands until all pages of documents matching the filter have had the update applied.
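The repeat-until-done pattern described above can be sketched as a small loop. This is an illustrative sketch only: `send` is a hypothetical callable that posts one JSON command and returns the parsed response, and the names `nextPageState` (in the response) and `options.pageState` (in the request) are assumptions about how the page marker is carried, since the sample result above shows only `moreData`.

```python
from typing import Any, Callable, Dict, Optional

Command = Dict[str, Any]


def update_many_all_pages(
    send: Callable[[Command], Dict[str, Any]],
    filter_: Dict[str, Any],
    update: Dict[str, Any],
    max_rounds: int = 1000,
) -> int:
    """Re-issue updateMany until the response no longer reports moreData.

    Returns the modifiedCount summed over all responses.
    """
    total_modified = 0
    page_state: Optional[str] = None
    for _ in range(max_rounds):
        body: Dict[str, Any] = {"filter": filter_, "update": update}
        if page_state is not None:
            # Assumption: the page marker is passed back under options.pageState.
            body["options"] = {"pageState": page_state}
        status = send({"updateMany": body}).get("status", {})
        total_modified += status.get("modifiedCount", 0)
        if not status.get("moreData"):
            return total_modified
        # Assumption: the response exposes the marker as nextPageState.
        page_state = status.get("nextPageState")
    raise RuntimeError("updateMany did not finish within max_rounds")
```

The `max_rounds` guard is a design choice of this sketch, so that an unexpected response cannot cause an endless loop.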
Find distinct values across documents
Get a list of the distinct values of a certain key in a collection.
-
Python
-
TypeScript
-
Java
-
cURL
collection.distinct("category")
Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
collection.distinct(
"food.allergies",
filter={"registered_for_dinner": True},
)
Returns:
List[Any]
- A list of the distinct values encountered. Documents that lack the requested key are ignored.
Example response
['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
Parameters:
Name | Type | Description |
---|---|---|
key |
|
The name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Example of acceptable |
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
max_time_ms |
|
A timeout, in milliseconds, for the operation. |
Keep in mind that |
For details on the behavior of "distinct" in conjunction with real-time changes in the collection contents, see the discussion in the Sort examples values section.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many(
[
{"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
{"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
]
)
collection.distinct("name")
# prints: ['Marco', 'Emma']
collection.distinct("city")
# prints: ['Helsinki']
collection.distinct("food")
# prints: ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
collection.distinct("food.1")
# prints: ['orange']
collection.distinct("food.allergies")
# prints: []
collection.distinct("food.likes_fruit")
# prints: [True]
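The outputs in the example above follow from how a dotted key is resolved against nested dictionaries and lists. The following is a simplified pure-Python sketch that reproduces those outputs on local data. The names `distinct_local` and `_extract` are hypothetical, and the rules encoded here (numeric segments index into lists, a list found at the end of the path is unrolled one level) are inferred from the example rather than taken from the astrapy source.

```python
from typing import Any, Dict, Iterator, List


def _extract(value: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reached by following the path segments."""
    if not path:
        # A list at the end of the path contributes its items individually.
        if isinstance(value, list):
            yield from value
        else:
            yield value
        return
    head, rest = path[0], path[1:]
    if isinstance(value, dict):
        if head in value:
            yield from _extract(value[head], rest)
    elif isinstance(value, list):
        if head.isdigit():
            # A numeric segment is treated as a list index.
            index = int(head)
            if index < len(value):
                yield from _extract(value[index], rest)
        else:
            # Otherwise, look for the key inside each list item.
            for item in value:
                yield from _extract(item, path)


def distinct_local(documents: List[Dict[str, Any]], key: str) -> List[Any]:
    """Collect distinct values for a dotted key, keeping first-seen order."""
    result: List[Any] = []
    for document in documents:
        for found in _extract(document, key.split(".")):
            # Equality-based check, since dictionaries are not hashable.
            if found not in result:
                result.append(found)
    return result
```

Documents that lack the requested key simply yield nothing, which matches the note that they are ignored.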
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Count documents in a collection
Get the count of all documents in a collection.
-
Python
-
TypeScript
-
Java
-
cURL
collection.count_documents({}, upper_bound=500)
Get the count of the documents in a collection matching a condition.
collection.count_documents({"seq": {"$gt": 15}}, upper_bound=50)
Returns:
int
- The exact count of the documents matching the request, unless it exceeds the caller-provided or API-set upper bound, in which case an exception is raised.
Example response
320
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
upper_bound |
|
A required ceiling on the result of the count operation. If the actual number of documents exceeds this value, an exception will be raised. Furthermore, if the actual number of documents exceeds the maximum count that the Data API can reach (regardless of upper_bound), an exception will be raised. |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"seq": i} for i in range(20)])
collection.count_documents({}, upper_bound=100)
# prints: 20
collection.count_documents({"seq": {"$gt": 15}}, upper_bound=100)
# prints: 4
collection.count_documents({}, upper_bound=10)
# Raises: astrapy.exceptions.TooManyDocumentsToCountException
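The `upper_bound` rule can be illustrated on local data. The following is a minimal sketch of the semantics only: `bounded_count` is a hypothetical helper and `TooManyDocumentsError` is a local stand-in for the exception mentioned above, not the astrapy class. It does not model the API-side maximum count.

```python
from typing import Any, Callable, Dict, Iterable

Document = Dict[str, Any]


class TooManyDocumentsError(Exception):
    """Local stand-in for the overflow exception; not the astrapy class."""


def bounded_count(
    documents: Iterable[Document],
    predicate: Callable[[Document], bool],
    upper_bound: int,
) -> int:
    """Count matching documents, raising as soon as the bound is exceeded."""
    count = 0
    for document in documents:
        if predicate(document):
            count += 1
            if count > upper_bound:
                raise TooManyDocumentsError(
                    f"more than {upper_bound} matching documents"
                )
    return count
```

A count exactly equal to `upper_bound` is returned normally in this sketch; only exceeding it raises.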
Hello TypeScript world. TBD.
Hello Java world. TBD.
Use the Data API estimatedDocumentCount
command to return the approximate number of documents in the collection.
In the |
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
"estimatedDocumentCount": {
}
}' | json_pp
Result:
{
"status": {
"count": 21
}
}
Properties:
Name | Type | Description |
---|---|---|
estimatedDocumentCount |
command |
Returns an estimated count of documents within the context of the specified collection. |
The command's object is empty ({ }), meaning there are no filters or options for this implementation of the estimatedDocumentCount
command.
Find and replace a document
Locate a document matching a filter condition and replace it with a new document, returning the document itself.
-
Python
-
TypeScript
-
Java
-
cURL
collection.find_one_and_replace(
{"_id": "rule1"},
{"text": "some animals are more equal!"},
)
Locate and replace a document, returning the document itself, additionally creating it if nothing is found.
collection.find_one_and_replace(
{"_id": "rule1"},
{"text": "some animals are more equal!"},
upsert=True,
)
Returns:
Dict[str, Any]
- The document that was found, either before or after the
replacement (or a projection thereof, as requested). If no matches are found,
None
is returned.
Example response
{'_id': 'rule1', 'text': 'all animals are equal'}
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
replacement |
|
The new document to write into the collection. |
projection |
|
Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole document. |
vector |
|
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with |
sort |
|
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the replaced one. See the |
upsert |
|
This parameter controls the behavior in absence of matches. If True, |
return_document |
|
A flag controlling what document is returned: if set to |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
import astrapy
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"_id": "rule1", "text": "all animals are equal"})
collection.find_one_and_replace(
{"_id": "rule1"},
{"text": "some animals are more equal!"},
)
# prints: {'_id': 'rule1', 'text': 'all animals are equal'}
collection.find_one_and_replace(
{"text": "some animals are more equal!"},
{"text": "and the pigs are the rulers"},
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'rule1', 'text': 'and the pigs are the rulers'}
collection.find_one_and_replace(
{"_id": "rule2"},
{"text": "F=ma^2"},
return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_replace(
{"_id": "rule2"},
{"text": "F=ma"},
upsert=True,
return_document=astrapy.constants.ReturnDocument.AFTER,
projection={"_id": False},
)
# prints: {'text': 'F=ma'}
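A replacement differs from an update in that the whole document body is swapped, with only the `_id` carried over (as the `'rule1'` outputs above show). The following is a minimal local sketch of that contrast; the function names `apply_replacement` and `apply_set` are hypothetical and the sketch does not cover the case where the replacement itself carries an `_id`.

```python
import copy
from typing import Any, Dict

Document = Dict[str, Any]


def apply_replacement(existing: Document, replacement: Document) -> Document:
    """Return the document after a replace: only _id survives from the old one."""
    new_document: Document = {"_id": existing["_id"]}
    new_document.update(copy.deepcopy(replacement))
    return new_document


def apply_set(existing: Document, fields: Document) -> Document:
    """Return the document after a $set-style update: other fields are kept."""
    new_document = copy.deepcopy(existing)
    new_document.update(copy.deepcopy(fields))
    return new_document
```

For instance, replacing a document that has an extra field drops that field, whereas a `$set`-style update on the same document keeps it.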
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Replace a document
Replace a document in the collection with a new one.
-
Python
-
TypeScript
-
Java
-
cURL
replace_result = collection.replace_one(
{"Marco": {"$exists": True}},
{"Buda": "Pest"},
)
Replace a document in the collection with a new one, creating a new one if no match is found.
replace_result = collection.replace_one(
{"Marco": {"$exists": True}},
{"Buda": "Pest"},
upsert=True,
)
Returns:
UpdateResult
- An object representing the response from the database after the replace operation. It includes information about the operation.
Example response
UpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'Marco': 'Polo'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
replacement |
|
The new document to write into the collection. |
vector |
|
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with |
sort |
|
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the replaced one. See the |
upsert |
|
This parameter controls the behavior in absence of matches. If True, |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.replace_one({"Marco": {"$exists": True}}, {"Buda": "Pest"})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.find_one({"Buda": "Pest"})
# prints: {'_id': '8424905a-...', 'Buda': 'Pest'}
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"}, upsert=True)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '931b47d6-...'})
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Find and delete a document
Locate a document matching a filter condition and delete it, returning the document itself.
-
Python
-
TypeScript
-
Java
-
cURL
collection.find_one_and_delete({"status": "stale_entry"})
Returns:
Dict[str, Any]
- The document that was just deleted (or a projection thereof, as requested). If no matches are
found, None
is returned.
Example response
{'_id': 199, 'status': 'stale_entry', 'request_id': 'A4431'}
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
projection |
|
Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole document. |
vector |
|
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search (that is, Approximate Nearest Neighbors, or ANN, search), extracting the most similar document in the collection matching the filter. This parameter cannot be used together with |
sort |
|
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the deleted one. See the |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many(
[
{"species": "swan", "class": "Aves"},
{"species": "frog", "class": "Amphibia"},
],
)
collection.find_one_and_delete(
{"species": {"$ne": "frog"}},
projection=["species"],
)
# prints: {'_id': '5997fb48-...', 'species': 'swan'}
collection.find_one_and_delete({"species": {"$ne": "frog"}})
# (returns None for no matches)
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Delete a document
Locate and delete a single document from a collection.
-
Python
-
TypeScript
-
Java
-
cURL
delete_result = collection.delete_one({"_id": "1"})
Locate and delete a single document from a collection by any attribute (as long as it is covered by the collection’s indexing configuration).
delete_result = collection.delete_one({"location": "warehouse_C"})
Locate and delete a single document from a collection by an arbitrary filtering clause.
delete_result = collection.delete_one({"tag": {"$exists": True}})
Delete the most similar document to a given vector.
result = collection.delete_one({}, vector=[.12, .52, .32])
Returns:
DeleteResult
- An object representing the response from the database after the delete operation. It includes information about the success of the operation.
Example response
DeleteResult(raw_results=[{'status': {'deletedCount': 1}}], deleted_count=1)
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
vector |
|
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with |
sort |
|
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the deleted one. See the |
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])
collection.delete_one({"seq": 1})
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0, 2]
collection.delete_one(
{"seq": {"$exists": True}},
sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0]
collection.delete_one({"seq": 2})
# prints: DeleteResult(raw_results=..., deleted_count=0)
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Delete documents
Delete multiple documents from a collection.
-
Python
-
TypeScript
-
Java
-
cURL
delete_result = collection.delete_many({"status": "processed"})
Returns:
DeleteResult
- An object representing the response from the database after the delete operation. It includes information about the success of the operation.
Example response
DeleteResult(raw_results=[{'status': {'deletedCount': 2}}], deleted_count=2)
Parameters:
Name | Type | Description |
---|---|---|
filter |
|
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are |
max_time_ms |
|
A timeout, in milliseconds, for the operation. |
This method does not accept an empty filter clause: use the |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])
collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=2)
collection.distinct("seq")
# prints: [2]
collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=0)
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Execute multiple write operations
Execute a (reusable) list of write operations on a collection with a single command.
-
Python
-
TypeScript
-
Java
-
cURL
bw_results = collection.bulk_write(
[
InsertMany([{"a": 1}, {"a": 2}]),
ReplaceOne(
{"z": 9},
replacement={"z": 9, "replaced": True},
upsert=True,
),
],
)
Returns:
BulkWriteResult
- A single object summarizing the whole list of requested operations. The keys in the map attributes of the result (when present) are the integer indices of the corresponding operation in the requests
iterable.
Example response
BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})
Parameters:
Name | Type | Description |
---|---|---|
requests |
|
An iterable over concrete subclasses of |
ordered |
|
Whether to launch the |
concurrency |
|
Maximum number of concurrent operations executing at a given time. It cannot be more than one for ordered bulk writes. |
max_time_ms |
|
A timeout, in milliseconds, for the whole bulk write. Remember that, if the method call times out, then there’s no guarantee about what portion of the bulk write has been received and successfully executed by the Data API. |
Example:
from astrapy import DataAPIClient
from astrapy.operations import (
InsertOne,
InsertMany,
UpdateOne,
UpdateMany,
ReplaceOne,
DeleteOne,
DeleteMany,
)
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
op1 = InsertMany([{"a": 1}, {"a": 2}])
op2 = ReplaceOne({"z": 9}, replacement={"z": 9, "replaced": True}, upsert=True)
collection.bulk_write([op1, op2])
# prints: BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})
collection.count_documents({}, upper_bound=100)
# prints: 3
collection.distinct("replaced")
# prints: [True]
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Delete all documents from a collection
Delete all documents in a collection.
-
Python
-
TypeScript
-
Java
-
cURL
result = collection.delete_all()
Returns:
Dict
- A dictionary in the form {"ok": 1}
if the method succeeds.
Example response
{'ok': 1}
Parameters:
Name | Type | Description |
---|---|---|
max_time_ms |
|
A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.distinct("seq")
# prints: [2, 1, 0]
collection.count_documents({}, upper_bound=100)
# prints: 4
collection.delete_all()
# prints: {'ok': 1}
collection.count_documents({}, upper_bound=100)
# prints: 0
Hello TypeScript world. TBD.
Hello Java world. TBD.
Hello cURL world. TBD.
Next steps
See the Administration reference topic.