The Spice Runtime operates as an independent microservice. Multiple replicas may be deployed behind a load balancer to achieve high availability and handle spikes in demand.

<img
 width='740'
 alt='microservice'
 src='https://github.com/user-attachments/assets/b46f050b-e500-4d53-b354-24f0ab30cad3'
/>

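With the runtime deployed as a shared service, application code reaches it over the network, typically through the load balancer. As an illustration of the high-availability aspect, here is a minimal Python sketch of client-side failover across replicas. The replica URLs and the injected `send` callable are hypothetical placeholders, not part of any Spice SDK; in practice the load balancer would handle most of this.

```python
from typing import Callable, Sequence

# Hypothetical replica addresses; in a real deployment, service discovery
# or the load balancer supplies these.
REPLICAS = [
    "http://spice-0.internal:8090",
    "http://spice-1.internal:8090",
    "http://spice-2.internal:8090",
]

def query_with_failover(sql: str, replicas: Sequence[str],
                        send: Callable[[str, str], str]) -> str:
    """Try each replica in turn; return the first successful response.

    `send(base_url, sql)` performs the actual HTTP call and raises on
    failure; it is injected so the failover logic stays transport-agnostic.
    """
    last_error: Exception | None = None
    for base_url in replicas:
        try:
            return send(base_url, sql)
        except Exception as exc:  # a real client would catch narrower errors
            last_error = exc
    raise RuntimeError(f"all {len(replicas)} replicas failed") from last_error
```

Because the transport is injected, the failover behavior can be exercised without any network: a stub `send` that fails for the first replica shows the query transparently landing on the second.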
**Benefits**

- Loose coupling between the application and the Spice Runtime.
- Independent scaling and upgrades.
- Can serve multiple applications or services within an organization.
- Helps achieve high availability and redundancy.

**Considerations**

- The additional network hop adds latency compared to the sidecar deployment.
- More complex infrastructure: service discovery and load balancing are required.
- Potentially higher cost due to the additional infrastructure components.
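The service discovery and load balancing noted above are usually provided by the platform (for example, a load balancer in front of the replica pool), but the core idea can be sketched in a few lines. This is a minimal round-robin balancer over a static, hypothetical replica list, standing in for that infrastructure layer:

```python
from itertools import cycle
from typing import Sequence

class RoundRobinBalancer:
    """Distribute requests evenly across a fixed set of replica URLs."""

    def __init__(self, replicas: Sequence[str]) -> None:
        if not replicas:
            raise ValueError("need at least one replica")
        self._replicas = cycle(replicas)

    def next_replica(self) -> str:
        # Each call yields the next replica, wrapping around at the end.
        return next(self._replicas)
```

A real deployment would also need health checking and dynamic membership, which is exactly the operational complexity this consideration is pointing at.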

**Use This Approach When**

- A loosely coupled architecture and the ability to independently scale the AI service are desired.
- Multiple services or teams need to share the same AI engine.
- Heavy or varying traffic is anticipated, requiring independent scaling of the Spice Runtime.
- Resiliency and redundancy are prioritized over simplicity.

**Example Use Case**

A large organization where multiple services (recommendations, analytics, etc.) need to share AI insights. A centralized Spice Runtime microservice cluster lets separate teams consume AI outputs without duplicating effort.