Skip to content

Conversation

haydenhoang
Copy link
Contributor

@haydenhoang haydenhoang commented Sep 5, 2025

What's in this PR

Added /openapi-generator-template

We can freely modify this template which is used to generate typesense_codegen. This PR added some custom open-api attributes:

  • x-rust-return-type to specify the return type of a generated API method
  • x-rust-generic-parameter to specify the generic params of a generated API method or a model
  • x-rust-type to override the type of a generated model field
  • x-supports-plain-text: this is an existing attribute which indicates that the response will be text instead of JSON
  • x-rust-body-is-raw-text set this to true to avoid JSON serializing plain text request body
  • x-rust-builder: true will create builder for that model using bon

Examples

typesense-rust/openapi.yml

Lines 425 to 433 in 4d4d38e

/collections/{collectionName}/documents/search:
get:
tags:
- documents
summary: Search for documents in a collection
description: Search for documents in a collection that match the search criteria.
operationId: searchCollection
x-rust-generic-parameter: "<D: for<'de> serde::Deserialize<'de> + Serialize>"
x-rust-return-type: "models::SearchResult<D>"

typesense-rust/openapi.yml

Lines 742 to 753 in 4b811e2

/collections/{collectionName}/documents/import:
post:
tags:
- documents
summary: Import documents into a collection
description:
The documents to be imported must be formatted in a newline delimited
JSON structure. You can feed the output file from a Typesense export operation
directly as import.
operationId: importDocuments
x-rust-body-is-raw-text: true
x-supports-plain-text: true

typesense-rust/openapi.yml

Lines 2446 to 2448 in 4b811e2

SearchResult:
type: object
x-rust-generic-parameter: "<D>"

typesense-rust/openapi.yml

Lines 2476 to 2481 in 4b811e2

hits:
type: array
description: The documents that matched the search query
items:
$ref: "#/components/schemas/SearchResultHit"
x-rust-type: "Option<Vec<models::SearchResultHit<D>>>"

typesense-rust/openapi.yml

Lines 2163 to 2168 in 4d4d38e

CollectionSchema:
required:
- name
- fields
type: object
x-rust-builder: true

Multi-node client

All the generated API methods are wrapped in client.execute() which will provide retry & load-balance across all nodes.
For easier review, the client only has Collections, Documents, Keys and Multi search APIs, future PRs will add more APIs.

use typesense::{Client, ExponentialBackoff, models};

Client::builder()
    .nodes(vec!["http://localhost:8108"])
    .api_key("xyz")
    .healthcheck_interval(Duration::from_secs(5))
    .retry_policy(ExponentialBackoff::builder().build_with_max_retries(0))
    .connection_timeout(Duration::from_secs(3))
    .build()
    .expect("Failed to create Typesense client")

struct Product { name: String }

let typed_result: models::SearchResult<Product> = client.multi_search().perform_union(search_requests, common_params).await?;

Please see the integration tests for more examples.

Removed httpmock in CI

The mock data is no longer relevant, it should be removed to avoid overhead. I replaced it with an actual Typesense v30.0.rc10 server.

PR Checklist

@haydenhoang haydenhoang marked this pull request as ready for review September 7, 2025 15:50
@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 10, 2025

First of all, thank you for fixing the return types of the auto-generated code. And /openapi-generator-template looks powerful. But I don't understand where does xtask take an x-rust-return-type variable from?
Also I see that openapi.yml has been modified, and x-rust-return-type was added. I thought that this file is supposed to be an unmodified source copy, and preprocessed_openapi.yml is a modified one (which is confirmed by xtask/src/main.rs). How come is it modified?

I also have like a 100 questions to client/mod.rs optimizations. Mainly the position of code is irrational and it can be put outside of the execute() function, for example this can be preprocessed:

node_url
    .strip_suffix('/')
    .unwrap_or(node_url.as_str())
    .to_string()

You also can prebuild a reqwest::Client for each node, so it would not be recreated on each request (call to execute()).
Arc is unused, as Client is not Clone.
Also I don't see the need for a separate optional nearest_node variable (not the builder). As it has the same logic as all nodes. It can be just pushed at the end of the nodes Vec. And then nodes must be never empty. Which drys get_next_node() (it checks the nearest one first, as it was):

        let (nodes_len, mut index) = if self.has_nearest_node {
            let last_node_index = self.nodes.len() - 1;
            (last_node_index, last_node_index)
        } else {
            (self.nodes.len(), self.current_node_index.fetch_add(1, Ordering::Relaxed) % self.nodes.len())
        };
        for _ in 0..self.nodes.len() {
            let node_arc = self.nodes[index];
            let node = node_arc.lock().unwrap();
            let is_due_for_check = Instant::now().duration_since(node.last_access_timestamp)
                >= self.healthcheck_interval;

            if node.is_healthy || is_due_for_check {
                return Arc::clone(node_arc);
            }
            index = self.current_node_index.fetch_add(1, Ordering::Relaxed) % nodes_len;
        }
        ...

@haydenhoang
Copy link
Contributor Author

Also I see that openapi.yml has been modified, and x-rust-return-type was added. I thought that this file is supposed to be an unmodified source copy, and preprocessed_openapi.yml is a modified one (which is confirmed by xtask/src/main.rs). How come is it modified?

After this PR gets approved which means the /openapi-generator-template approach is good , I will create another PR to sync the new custom attributes to this repo https://github.com/typesense/typesense-api-spec/blob/master/openapi.yml. I modified the openapi.yml locally since I was just testing it out.

@haydenhoang
Copy link
Contributor Author

I also have like a 100 questions to client/mod.rs optimizations. Mainly the position of code is irrational and it can be put outside of the execute() function, for example this can be preprocessed:

Hey @RoDmitry, would you be open to collaborating on this PR with me? I have sent you an invitation to my repository, with that, you can directly push the changes to optimize the client/mod.rs

@haydenhoang
Copy link
Contributor Author

haydenhoang commented Sep 10, 2025

But I don't understand where does xtask take an x-rust-return-type variable from?

It's not xtask but the openapi generator. I modified the template, specifically this line

}}) -> Result<{{#vendorExtensions.x-rust-return-type}}{{{.}}}{{/vendorExtensions.x-rust-return-type}}{{^vendorExtensions.x-rust-return-type}}{{#isResponseFile}}{{#supportAsync}}reqwest::Response{{/supportAsync}}{{^supportAsync}}reqwest::blocking::Response{{/supportAsync}}{{/isResponseFile}}{{^isResponseFile}}{{#supportMultipleResponses}}ResponseContent<{{{operationIdCamelCase}}}Success>{{/supportMultipleResponses}}{{^supportMultipleResponses}}{{^returnType}}(){{/returnType}}{{{returnType}}}{{/supportMultipleResponses}}{{/isResponseFile}}{{/vendorExtensions.x-rust-return-type}}, Error<{{{operationIdCamelCase}}}Error>> {

Xtask is just a convenient way to run the open api generator

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 10, 2025

I will create another PR to sync the new custom attributes to this repo (typesense/typesense-api-spec)

I don't think they need the rust specific variables there. I thought you would preprocess it by xtask the way preprocessed_openapi.yml is generated.

@haydenhoang
Copy link
Contributor Author

haydenhoang commented Sep 10, 2025

I don't think they need the rust specific variables there. I thought you would preprocess it by xtask the way preprocessed_openapi.yml is generated.

Adding a few of rust specific attributes won't hurt. Plus, there are some Go specific attributes in the openapi spec too

      responses:
        '200':
          description: List of all collections
          content:
            application/json:
              schema:
                type: array
                x-go-type: "[]*CollectionResponse"
                items:
                  $ref: "#/components/schemas/CollectionResponse"

Preprocess it by xtask is possible but it's pretty tedious I think, but let's see if doing it through xtask is better...

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 10, 2025

Also it's better to have Rust specific variables in this repository, because it is part of the code which can be changed. And it's just easier to modify and keep it synchronized when it's in a single repo. That's why I would prefer adding it through xtask.

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 11, 2025

Just downloaded the project, no changes, and one test fails on my machine:

thread 'client_test::test_health_check_and_node_recovery' (260414) panicked at typesense/tests/client/client_test.rs:183:5:
assertion `left == right` failed
  left: 2
 right: 1

In code: assert_eq!(server1.received_requests().await.unwrap().len(), 1); // No new request

healthcheck_interval = 1000 fixes it. So looks like this test is very dependent on the CPU speed.

Each client::execute() takes a whole second. It's so slow...

P.S. How does it not fail in github tests? They are very slow.

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 11, 2025

Optimized client::execute(). Now it takes just microseconds (in dev unoptimized mode), which is a great result.

@RoDmitry
Copy link
Contributor

Looks good. No more points to improve from me.

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 15, 2025

While migrating my project to this version of typesense, caught myself on using the wrong collection() method. So I have changed the default collection() to require the type of trait Document. Also it takes the collection name from it, which can be changed using #[typesense(collection_name = "movies").
Can't stop fixing 😅

@RoDmitry RoDmitry force-pushed the multi_nodes_client branch 7 times, most recently from 0320f8b to dd9a723 Compare September 15, 2025 18:23
@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 16, 2025

@haydenhoang do you approve collection() methods rename?

@haydenhoang
Copy link
Contributor Author

So now we have three separate ways to access the single collection API: collection, collection_named and collection_schemaless?

This looks good! But are there any useful usecases for collection_named given that collection already does pretty much the same thing?

I think we need to take Typesense's exclude_fields search parameter into consideration.

Sorry for the late reply, I'm currently not available for a while.

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 19, 2025

Yes. Previously I have tried to use multiple collections of the same type with different names. I'm not sure if it's useful, and whether I will use it in the future or not. So I thought maybe collection_named might be desired by somebody.

Didn't understand about exclude_fields, please clarify. Where is it? If it's not blocking, then maybe in the next PR, or is it a breaking change?

@haydenhoang
Copy link
Contributor Author

https://typesense.org/docs/29.0/api/search.html#results-parameters

Like the name suggest, setting this param will exclude fields from search result. I think this would cause a parsing error if those fields aren't marked as optional in the struct.

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 20, 2025

It's up to a user, which struct he chooses to use. He can create another struct, without those excluded fields, and pass it to collection, or collection_named (if he does not want to use derive macro. And that's another reason to have collection_named). So I don't see no problem here.

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 20, 2025

We have some APIs implemented in this PR, but not all of them. So I think maybe we should expose pub use typesense_codegen but as legacy, so users would have some kind of access to the auto-generated code for others not implemented APIs? And it's name (legacy) shows that it's not the best implementation to use. Also it would allow them for easier update to this version, without changing much code.
And useful if a user has used the part of the now unimplemented API, so it would be possible for a such user to at least partially use a new API.
After implementing all of the APIs we will be able to safely remove this legacy re-export.

@haydenhoang
Copy link
Contributor Author

I think it's best to not reexport those legacy APIs entirely to keep it minimal.

We won't release this until we have implemented all the APIs. I have already had those lined up in the draft PR so this shouldn't take much time.

This is a major version bump, so it is expected for users of old version to migrate.

After implementing all of the APIs we will be able to safely remove this legacy re-export.

Then this will be another breaking change.

Other than that, all looks good!

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 21, 2025

I have already had those lined up in the draft PR so this shouldn't take much time.

Then this re-export is temporal until the next PR with all of the APIs. I will need to check if all of the APIs are implemented, then it can be removed or marked as deprecated. And deprecating old API is a more user-friendly way of making a new API. Not so radical as removing access to an old one.

Then this will be another breaking change.

but

We won't release this until we have implemented all the APIs.

It's ok then. Nothing to worry about.

My opinion is: nobody knows how much time it may take. So legacy is like a backup APIs for now.

@RoDmitry
Copy link
Contributor

RoDmitry commented Sep 23, 2025

@morenol it's ready for merge. We will continue in another PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants