|
5 | 5 | </p> |
6 | 6 |
|
7 | 7 | <p align="center"> |
8 | | - <a href="https://docs.jan.ai/">Getting Started</a> - <a href="https://docs.jan.ai">Docs</a> |
| 8 | + <a href="https://jan.ai/nitro">Getting Started</a> - <a href="https://jan.ai/nitro">Docs</a> |
9 | 9 | - <a href="https://docs.jan.ai/changelog/">Changelog</a> - <a href="https://github.com/janhq/nitro/issues">Bug reports</a> - <a href="https://discord.gg/AsJ8krTT3N">Discord</a> |
10 | 10 | </p> |
11 | 11 |
|
@@ -67,11 +67,27 @@ curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \ |
67 | 67 | "llama_model_path": "/path/to/your_model.gguf", |
68 | 68 | "ctx_len": 2048, |
69 | 69 | "ngl": 100, |
70 | | - "embedding": true |
| 70 | + "embedding": true, |
| 71 | + "n_parallel": 4, |
| 72 | + "pre_prompt": "A chat between a curious user and an artificial intelligence", |
| 73 | + "user_prompt": "what is AI?" |
71 | 74 | }' |
72 | 75 | ``` |
73 | 76 |
|
74 | | -`ctx_len` and `ngl` are typical llama C++ parameters, and `embedding` determines whether to enable the embedding endpoint or not. |
| 77 | +Table of parameters |
| 78 | + |
| 79 | +| Parameter | Type | Description | |
| 80 | +|------------------|---------|--------------------------------------------------------------| |
| 81 | +| `llama_model_path` | String | The file path to the LLaMA model. | |
| 82 | +| `ngl` | Integer | The number of GPU layers to use. | |
| 83 | +| `ctx_len` | Integer | The context length for the model operations. | |
| 84 | +| `embedding` | Boolean | Whether to enable the embedding endpoint. | |
| 85 | +| `n_parallel` | Integer | The number of parallel operations. Uses Drogon thread count if not set. | |
| 86 | +| `cont_batching` | Boolean | Whether to use continuous batching. | |
| 87 | +| `user_prompt` | String | The prompt to use for the user. | |
| 88 | +| `ai_prompt` | String | The prompt to use for the AI assistant. | |
| 89 | +| `system_prompt` | String | The prompt to use for system rules. | |
| 90 | +| `pre_prompt` | String | The prompt to use for internal configuration. | |
75 | 91 |
|
76 | 92 | **Step 4: Perform Inference on Nitro for the First Time** |
77 | 93 |
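The parameter table added in this diff can be exercised from a script as well as from `curl`. The following is a minimal Python sketch, not part of the Nitro codebase: the helper names `build_loadmodel_payload` and `build_loadmodel_request` are hypothetical, the type map is transcribed from the table above, and treating `llama_model_path` as required and sending a JSON `Content-Type` header are assumptions. The URL is the one used in the `curl` example.

```python
import json
import urllib.request

# Endpoint taken from the curl example in the diff; adjust host and port as needed.
LOADMODEL_URL = "http://localhost:3928/inferences/llamacpp/loadmodel"

# Expected Python types, transcribed from the parameter table in the diff.
PARAM_TYPES = {
    "llama_model_path": str,
    "ngl": int,
    "ctx_len": int,
    "embedding": bool,
    "n_parallel": int,
    "cont_batching": bool,
    "user_prompt": str,
    "ai_prompt": str,
    "system_prompt": str,
    "pre_prompt": str,
}


def build_loadmodel_payload(**params):
    """Hypothetical helper: check names and types against the table, return a dict."""
    # Assumption: a model path is always needed to load a model.
    if "llama_model_path" not in params:
        raise ValueError("llama_model_path is required")
    for name, value in params.items():
        expected = PARAM_TYPES.get(name)
        if expected is None:
            raise ValueError(f"unknown parameter: {name}")
        # bool is a subclass of int in Python, so reject True/False for Integer fields.
        if expected is int and isinstance(value, bool):
            raise TypeError(f"{name} must be an integer, got a boolean")
        if not isinstance(value, expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")
    return dict(params)


def build_loadmodel_request(payload, url=LOADMODEL_URL):
    """Build (but do not send) the POST request for the loadmodel endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: the server expects a JSON request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_loadmodel_payload(
    llama_model_path="/path/to/your_model.gguf",
    ctx_len=2048,
    ngl=100,
    embedding=True,
    n_parallel=4,
)
request = build_loadmodel_request(payload)
# With a Nitro server running locally, urllib.request.urlopen(request) would send it.
```

Validating on the client side catches misspelled keys and wrong types (for example `"ngl": true`) before the request reaches the server; whether the server itself rejects such input is not stated in the diff.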
|
|