Commit f2679ac

Update README.md
1 parent 5a0ae6e commit f2679ac

File tree

1 file changed

README.md

Lines changed: 81 additions & 28 deletions
@@ -1,22 +1,43 @@

# FastAPI-BitNet Orchestrator

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

This project provides a robust REST API built with FastAPI and Docker to manage and interact with `llama.cpp`-based BitNet model instances. It allows developers and researchers to programmatically control `llama-cli` processes for automated testing, benchmarking, and interactive chat sessions.

It serves as a backend replacement for the [Electron-BitNet](https://github.com/grctest/Electron-BitNet) project, offering enhanced performance, scalability, and persistent chat sessions.

## Key Features

* **Session Management**: Start, stop, and check the status of multiple persistent `llama-cli` and `llama-server` chat sessions.
* **Batch Operations**: Initialize, shut down, and chat with multiple instances in a single API call.
* **Interactive Chat**: Send prompts to running BitNet sessions and receive cleaned model responses.
* **Model Benchmarking**: Programmatically run benchmarks and calculate perplexity on GGUF models.
* **Resource Estimation**: Estimate maximum server capacity based on available system RAM and CPU threads.
* **VS Code Integration**: Connects directly to GitHub Copilot Chat as a tool via the Model Context Protocol.
* **Automatic API Docs**: Interactive API documentation powered by Swagger UI and ReDoc.
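The back-of-the-envelope logic behind the resource-estimation feature can be sketched as follows. The per-instance RAM and thread figures below are illustrative assumptions for a small 1.58-bit model, not values taken from this API:

```python
def estimate_max_instances(
    total_ram_gb: float,
    total_threads: int,
    ram_per_instance_gb: float = 1.5,  # assumption: rough footprint of one 2B 1.58-bit instance
    threads_per_instance: int = 2,     # assumption: threads reserved per llama-cli process
) -> int:
    """Capacity is bounded by whichever resource runs out first."""
    by_ram = int(total_ram_gb // ram_per_instance_gb)
    by_threads = total_threads // threads_per_instance
    return max(0, min(by_ram, by_threads))

# e.g. a 16 GB, 8-thread host is thread-bound under these assumptions:
print(estimate_max_instances(16, 8))  # -> 4
```

The real endpoint should be preferred, since it can inspect the host's actual RAM and CPU count at runtime.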

## Technology Stack

* [FastAPI](https://github.com/fastapi/fastapi) for the core web framework.
* [Uvicorn](https://www.uvicorn.org/) as the ASGI server.
* [Docker](https://www.docker.com/) for containerization and easy deployment.
* [Pydantic](https://docs.pydantic.dev/) for data validation and settings management.
* [fastapi-mcp](https://github.com/tadata-org/fastapi_mcp) for VS Code Copilot tool integration.

---

## Getting Started

### Prerequisites

* [Docker Desktop](https://www.docker.com/products/docker-desktop/)
* [Conda](https://www.anaconda.com/download) (or another Python environment manager)
* Python 3.10+

### 1. Set Up the Python Environment

Create and activate a Conda environment:

```bash
conda create -n bitnet python=3.11
conda activate bitnet
```
@@ -31,32 +52,64 @@

Download Microsoft's official BitNet model:

```bash
huggingface-cli download microsoft/BitNet-b1.58-2B-4T-gguf --local-dir app/models/BitNet-b1.58-2B-4T
```

---

## Running the Application

### Using Docker (Recommended)

This is the easiest and recommended way to run the application.

1. **Build the Docker image:**

   ```bash
   docker build -t fastapi_bitnet .
   ```

2. **Run the Docker container:**

   This command runs the container in detached mode (`-d`) and maps port 8080 on your host to port 8080 in the container.

   ```bash
   docker run -d --name ai_container -p 8080:8080 fastapi_bitnet
   ```
### Local Development

For development, you can run the application directly with Uvicorn, which enables auto-reloading.

```bash
uvicorn app.main:app --host 0.0.0.0 --port 8080 --reload
```

---

## API Usage

Once the server is running, you can access the interactive API documentation:

* **Swagger UI**: [http://127.0.0.1:8080/docs](http://127.0.0.1:8080/docs)
* **ReDoc**: [http://127.0.0.1:8080/redoc](http://127.0.0.1:8080/redoc)

---

## VS Code Integration

### As a Copilot Tool (MCP)

You can connect this API directly to VS Code's Copilot Chat to create and interact with models:

1. Run the application using Docker or locally.
2. In VS Code, open the Copilot Chat panel.
3. Click the wrench icon ("Configure Tools...").
4. Scroll to the bottom, select `+ Add MCP Server`, then choose `HTTP`.
5. Enter the URL: `http://127.0.0.1:8080/mcp`

Copilot will now be able to use the API to launch and chat with BitNet instances.
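If you would rather declare the server in your workspace than click through the UI, recent VS Code builds can also read MCP servers from a `.vscode/mcp.json` file. The entry below is an assumption based on VS Code's MCP configuration format (and the server name `fastapi-bitnet` is arbitrary), so double-check it against your VS Code version:

```json
{
  "servers": {
    "fastapi-bitnet": {
      "type": "http",
      "url": "http://127.0.0.1:8080/mcp"
    }
  }
}
```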
### See Also - VSCode Extension!

For a more integrated experience, check out the companion VS Code extension:

* **GitHub**: [grctest/BitNet-VSCode-Extension](https://github.com/grctest/BitNet-VSCode-Extension)
* **Marketplace**: [nftea-gallery.bitnet-vscode-extension](https://marketplace.visualstudio.com/items?itemName=nftea-gallery.bitnet-vscode-extension)
## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
