import org.apache.pulsar.client.api.PulsarClientException
import org.springframework.pulsar.core.PulsarTemplate
import org.springframework.stereotype.Component

@Component
class MyBean(private val pulsarTemplate: PulsarTemplate<String>) {

    @Throws(PulsarClientException::class)
    fun someMethod() {
        pulsarTemplate.send("someTopic", "Hello")
    }
}
import org.springframework.pulsar.reactive.core.ReactivePulsarTemplate
import org.springframework.stereotype.Component

@Component
class MyBean(private val pulsarTemplate: ReactivePulsarTemplate<String>) {

    fun someMethod() {
        pulsarTemplate.send("someTopic", "Hello").subscribe()
    }
}
import org.springframework.pulsar.annotation.PulsarListener
import org.springframework.stereotype.Component

@Component
class MyBean {

    @PulsarListener(topics = ["someTopic"])
    fun processMessage(content: String?) {
        // ...
    }
}
import org.springframework.pulsar.reactive.config.annotation.ReactivePulsarListener
import org.springframework.stereotype.Component
import reactor.core.publisher.Mono

@Component
class MyBean {

    @ReactivePulsarListener(topics = ["someTopic"])
    fun processMessage(content: String?): Mono<Void> {
        // ...
        return Mono.empty()
    }
}
All significant improvements in system scalability come from parallelism.
Individual CPU cores are not getting much faster these days; instead, we are getting more of them. We therefore need a way to use all of the cores and CPU resources that the cloud gives us.
One approach to parallelism is pipelining, sketched below.
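As a rough illustration of what pipelining means here (a minimal Project Reactor sketch; the stage contents and the send() stand-in are invented for illustration and are not code from the telemetry processor), several asynchronous stages are chained so that an earlier stage can already work on the next message while later stages are still busy with earlier ones:

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class PipeliningSketch {

    public static void main(String[] args) {
        Flux.range(1, 1_000)
                // stage 1: build a message key (cheap, CPU-bound work)
                .map(i -> "device" + (i % 10) + "/sensor1")
                // stage 2: transform on the parallel scheduler; up to 8 messages are in flight at once
                .flatMap(key -> Mono.fromCallable(key::toUpperCase)
                        .subscribeOn(Schedulers.parallel()), 8)
                // stage 3: hand the result off to a separate scheduler, so the publishing-style work
                // overlaps with the transformation of later messages
                .flatMap(value -> Mono.fromCallable(() -> send(value))
                        .subscribeOn(Schedulers.boundedElastic()), 8)
                .blockLast();
    }

    // stand-in for a real producer call, e.g. ReactiveMessageSender.sendOne(...)
    private static String send(String value) {
        return value;
    }
}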
// create a subscription to the result topic before executing the operation
String subscriptionName = "testSubscription" + UUID.randomUUID();
ReactiveMessageConsumer<TelemetryEvent> messageConsumer = reactivePulsarClient
.messageConsumer(Schema.JSON(TelemetryEvent.class))
.topic(topicNameResolver.resolveTopicName(TelemetryProcessor.TELEMETRY_MEDIAN_TOPIC_NAME))
.subscriptionType(SubscriptionType.Exclusive)
.subscriptionName(subscriptionName)
.subscriptionInitialPosition(SubscriptionInitialPosition.Latest)
.acknowledgeAsynchronously(false)
.build();
// create the consumer and close it immediately. This is just to create the Pulsar subscription
messageConsumer.consumeNothing().block();
ReactiveMessageSender<TelemetryEvent> messageSender = reactivePulsarClient
.messageSender(Schema.JSON(TelemetryEvent.class))
.topic(topicNameResolver.resolveTopicName(IngestController.TELEMETRY_INGEST_TOPIC_NAME))
.build();
// when
// 100 values for 100 devices are sent to the ingest topic
messageSender
.sendMany(
Flux
.range(1, DEVICE_COUNT)
.flatMap(value -> {
String name = "device" + value + "/sensor1";
return Flux
.range(1, 100)
.map(entryCounter -> TelemetryEvent.builder().n(name).v(entryCounter).build());
})
.map(telemetryEvent -> MessageSpec.builder(telemetryEvent).key(telemetryEvent.getN()).build())
)
.blockLast();
// then the TelemetryProcessor should have aggregated a single median value for each sensor in the result topic
Set<String> deviceNames = new HashSet<>();
messageConsumer
.consumeMany(messageFlux -> messageFlux.map(MessageResult::acknowledgeAndReturn))
.as(StepVerifier::create)
.expectSubscription()
.thenConsumeWhile(message -> {
assertThat(deviceNames.add(message.getValue().getN()))
.as("there shouldn't be more than 1 message per device")
.isTrue();
assertThat(message.getValue().getV()).isEqualTo(51.0);
return deviceNames.size() < DEVICE_COUNT;
})
.expectNoEvent(Duration.ofSeconds(1))
.thenCancel()
.verify(Duration.ofSeconds(10));
- cast: modifies the key or value schema to a target compatible schema.
- drop-fields: drops fields from structured data.
- merge-key-value: merges the fields of KeyValue records where both the key and value are structured data with the same schema type.
- unwrap-key-value: if the record is a KeyValue, extracts the KeyValue's key or value and makes it the record value.
- flatten: flattens structured data.
- drop: drops a record from further processing.
- compute: computes new properties, values, or field values on the fly, or replaces existing ones.
{key={keyField1: key1, keyField2: key2, keyField3: key3},
value={valueField1: value1, valueField2: value2, valueField3: value3}} (KeyValue<AVRO, AVRO>)
|
| "type": "drop-fields", "fields": "keyField1,keyField2", "part": "key"
|
{key={keyField3: key3}, value={valueField1: value1,
valueField2: value2, valueField3: value3}} (KeyValue<AVRO, AVRO>)
|
| "type": "merge-key-value"
|
{key={keyField3: key3}, value={keyField3: key3, valueField1: value1,
valueField2: value2, valueField3: value3}} (KeyValue<AVRO, AVRO>)
|
| "type": "unwrap-key-value"
|
{keyField3: key3, valueField1: value1,
valueField2: value2, valueField3: value3} (AVRO)
|
| "type": "cast", "schema-type": "STRING"
|
{"keyField3": "key3", "valueField1": "value1",
"valueField2": "value2", "valueField3": "value3"} (STRING)
- Persistent message store
- Fast, low-impact horizontal scaling
- Reduced long-term and day-to-day expenses
- Broker
  - Stateless
  - Built-in load balancing
  - Instantaneous scaling
  - Zero-impact disaster recovery
- Bookies
  - Scalable, WAL-based, fault-tolerant, low-latency storage service
  - Tunable consistency from message replication
    - Ensemble size, Write Quorum, ACK Quorum (see the sketch after this list)
  - Fast write guarantees through Journals
  - Segment-centric data persistence via Ledgers
- Topic Partitions
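As a small sketch of how those replication knobs can be tuned per namespace with the Pulsar Java admin client (the admin URL, namespace, and quorum values below are examples, not recommendations):

import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.PersistencePolicies;

public class PersistenceTuning {

    public static void main(String[] args) throws Exception {
        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080") // example admin endpoint
                .build()) {
            // ensemble size 3, write quorum 3, ack quorum 2, unthrottled mark-delete rate
            admin.namespaces().setPersistence("public/default",
                    new PersistencePolicies(3, 3, 2, 0));
        }
    }
}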
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sn-pulsar
  finalizers:
    - resources-finalizer.argocd.argoproj.io
  namespace: argo
spec:
  destination:
    namespace: argo
    server: https://kubernetes.default.svc
  source:
    repoURL: 'https://github.com/streamnative/charts'
    path: examples/argocd/charts/sn-pulsar
    targetRevision: master
    helm:
      values: |
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
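Assuming the manifest above is saved to a file (the file name here is only an example), registering it is a single kubectl command; Argo CD then keeps the chart deployed and in sync thanks to the automated sync policy:

kubectl apply -f sn-pulsar-application.yaml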