NATS Streaming High Availability
Developing and deploying applications and services that communicate in distributed systems can be complex and difficult. However, there are two basic patterns: request/reply (or RPC) for services, and event and data streams. A modern messaging system needs to support multiple communication patterns, be secure by default, support multiple qualities of service, and provide secure multi-tenancy for a truly shared infrastructure.
NATS is simple and secure messaging made for developers and operators who want to spend more time developing modern applications and services than worrying about a distributed communication system. NATS and NATS Streaming are two different things: NATS is a lightweight pub/sub messaging system, and NATS Streaming is a log-based streaming system built on top of NATS. NATS was originally built (and then open sourced) as the control plane for Cloud Foundry. NATS Streaming was built in response to the community's request for higher-level guarantees—durability, at-least-once delivery, and so forth—beyond what NATS provided, and was built as a separate layer on top of NATS. An important point to note is that NATS and NATS Streaming are distinct systems with distinct protocols, distinct APIs, and distinct client libraries.
Let’s look at how to set up a highly available NATS Streaming server cluster. There are two kinds of high availability setups:
- Clustering
- Fault Tolerance
Clustering
Clustering uses the Raft consensus algorithm for high availability. It protects against some of the nodes in the cluster failing, but since the leader handles all incoming data from publishers and all outgoing data to subscribers, it is not horizontally scalable. The cluster size should be limited to 3 to 5 nodes (Raft recommends an odd number of nodes).
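To see why 3 to 5 nodes (and an odd count) is the sweet spot: Raft needs a majority (a quorum) of nodes to elect a leader and commit writes. A small standalone sketch—not part of NATS, just the illustrative arithmetic:

```go
package main

import "fmt"

// quorum is the majority needed for a Raft cluster of n nodes.
func quorum(n int) int { return n/2 + 1 }

// tolerated is how many node failures a cluster of n nodes survives
// while still being able to form a quorum.
func tolerated(n int) int { return n - quorum(n) }

func main() {
	for _, n := range []int{3, 4, 5} {
		fmt.Printf("n=%d quorum=%d tolerates=%d failure(s)\n", n, quorum(n), tolerated(n))
	}
	// A 4-node cluster tolerates no more failures than a 3-node one
	// (both survive a single failure), which is why an odd number of
	// nodes is recommended: even sizes add cost without adding safety.
}
```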
nats-streaming-server embeds a NATS server, so to cluster nats-streaming-server we need to cluster NATS as well. We have two alternatives here: either set up a separate NATS cluster, or cluster the one already embedded in nats-streaming-server. I chose to use the embedded one.
Example
package main

import (
	"fmt"
	"log"
	"os"
	"strings"
	"time"

	"github.com/nats-io/stan"
)

func main() {
	// Connect to the cluster; server URLs are passed as command-line
	// arguments and joined into the comma-separated list stan expects.
	sc, err := stan.Connect(
		"test-cluster",
		"client-1",
		stan.Pings(1, 3),
		stan.NatsURL(strings.Join(os.Args[1:], ",")),
	)
	if err != nil {
		log.Fatalln(err)
	}
	defer sc.Close()

	// Print a dot for every message received on "foo".
	sub, err := sc.Subscribe("foo", func(m *stan.Msg) {
		fmt.Print(".")
	})
	if err != nil {
		log.Fatalln(err)
	}
	defer sub.Unsubscribe()

	// Publish a message to "foo" every 100ms.
	for {
		if err := sc.Publish("foo", []byte("msg")); err != nil {
			log.Fatalln(err)
		}
		time.Sleep(time.Millisecond * 100)
	}
}
The client connects to the NATS Streaming server, publishes a message every 100 ms, and prints a . on the screen for every message received.
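Two details of the client are worth spelling out: stan.NatsURL takes a single comma-separated string of server URLs, which is why the arguments are joined, and stan.Pings(interval, maxOut) makes the client ping the server every interval seconds, declaring the connection lost after maxOut missed pings. A standalone sketch of both (no server required; joinURLs and detectionWindow are my helper names, not part of the stan API):

```go
package main

import (
	"fmt"
	"strings"
)

// joinURLs builds the single comma-separated URL string that
// stan.NatsURL expects, the same way main.go joins os.Args[1:].
func joinURLs(urls []string) string {
	return strings.Join(urls, ",")
}

// detectionWindow is the approximate worst-case time, in seconds,
// before a lost connection is noticed with stan.Pings(interval, maxOut).
func detectionWindow(interval, maxOut int) int {
	return interval * maxOut
}

func main() {
	urls := []string{"nats://localhost:4221", "nats://localhost:4222", "nats://localhost:4223"}
	fmt.Println(joinURLs(urls))
	fmt.Printf("connection loss detected within ~%ds\n", detectionWindow(1, 3))
}
```

With stan.Pings(1, 3) the client notices a dead server within roughly three seconds, which keeps failover to another cluster node quick.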
So, now we can just start both:
$ ./nats-streaming-server
$ go run main.go localhost:4222
So, now let’s stop both the client and the server, and start a nats-streaming-server cluster instead.
Create 3 config files as follows:
; a.conf
port: 4221
cluster {
  listen: 0.0.0.0:6221
  routes: [
    "nats-route://localhost:6222",
    "nats-route://localhost:6223",
  ]
}
streaming {
  id: test-cluster
  store: file
  dir: storea
  cluster {
    node_id: "a"
    peers: ["b", "c"]
  }
}
; b.conf
port: 4222
cluster {
  listen: 0.0.0.0:6222
  routes: [
    "nats-route://localhost:6221",
    "nats-route://localhost:6223",
  ]
}
streaming {
  id: test-cluster
  store: file
  dir: storeb
  cluster {
    node_id: "b"
    peers: ["a", "c"]
  }
}
; c.conf
port: 4223
cluster {
  listen: 0.0.0.0:6223
  routes: [
    "nats-route://localhost:6221",
    "nats-route://localhost:6222",
  ]
}
streaming {
  id: test-cluster
  store: file
  dir: storec
  cluster {
    node_id: "c"
    peers: ["a", "b"]
  }
}
Note that each config listens on different ports:
- a: 4221 and 6221
- b: 4222 and 6222
- c: 4223 and 6223
Also note that in each config’s cluster block we set up routes to the other two instances. This cluster config is the actual NATS cluster.
The streaming.cluster config is the nats-streaming-server cluster configuration; it simply gives each node an ID and lists the other two nodes as peers. The streaming.id must match the cluster ID the client connects with (test-cluster, which is also the server’s default).
Since we are running all nodes on the same machine, the streaming.dir option must be different in each config.
Once that’s done, we can start the 3 servers:
$ ./nats-streaming-server -c a.conf
$ ./nats-streaming-server -c b.conf
$ ./nats-streaming-server -c c.conf
Once all of them are up, you should see logs like the following on each of them:
[11361] 2019/04/09 11:09:39.997893 [INF] ::1:87920 - rid:8 - Route connection created
[11361] 2019/04/09 11:09:39.993830 [INF] ::1:87921 - rid:9 - Route connection created
Now, we can start our client again:
$ go run main.go nats://localhost:4221 nats://localhost:4222 nats://localhost:4223
I hope this is helpful!