Message Brokers

Apache Pulsar

Spring Boot

Auto-Config Clients

Imperative Produce/Consume

Reactive Produce/Consume

Message Process Model

Spring Boot

Sending a Message

import org.apache.pulsar.client.api.PulsarClientException
import org.springframework.pulsar.core.PulsarTemplate
import org.springframework.stereotype.Component

@Component
class MyBean(private val pulsarTemplate: PulsarTemplate<String>) {

    @Throws(PulsarClientException::class)
    fun someMethod() {
        pulsarTemplate.send("someTopic", "Hello")
    }

}

Sending a Message Reactively

import org.springframework.pulsar.reactive.core.ReactivePulsarTemplate
import org.springframework.stereotype.Component

@Component
class MyBean(private val pulsarTemplate: ReactivePulsarTemplate<String>) {

    fun someMethod() {
        pulsarTemplate.send("someTopic", "Hello").subscribe()
    }

}

Receiving a Message

import org.springframework.pulsar.annotation.PulsarListener
import org.springframework.stereotype.Component

@Component
class MyBean {

    @PulsarListener(topics = ["someTopic"])
    fun processMessage(content: String?) {
        // ...
    }

}

Receiving a Message Reactively

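import org.springframework.pulsar.reactive.config.annotation.ReactivePulsarListener
import org.springframework.stereotype.Component
import reactor.core.publisher.Mono

@Component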
class MyBean {

    @ReactivePulsarListener(topics = ["someTopic"])
    fun processMessage(content: String?): Mono<Void> {
        // ...
        return Mono.empty()
    }

}

Planned: Spring Integration Support

Example

Parallel

Pipelining

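All significant improvements in system scalability come from parallelism.

Individual CPU cores are no longer getting much faster; instead, we get more of them. To scale, we therefore need a way to use all the cores and other CPU resources that the cloud gives us.

One approach to parallelism is pipelining: processing is split into stages that run concurrently, so an early stage can start on the next message while a later stage is still handling the previous one.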
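As a minimal sketch of the pipelining idea, the following plain-Java example connects two stages with a bounded queue, each stage running on its own thread. The class name `PipelineSketch`, the stage logic (parse, then square) and the sentinel value are assumptions made for illustration; none of this is Pulsar or Spring API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PipelineSketch {

    // Hypothetical end-of-stream marker; real input must not contain this value.
    private static final Integer END = Integer.MIN_VALUE;

    // Runs two stages (parse, then square) on separate threads joined by a queue.
    public static List<Integer> run(List<String> input) {
        BlockingQueue<Integer> parsed = new ArrayBlockingQueue<>(16);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Stage 1: parse each string and hand the number to stage 2.
            Future<Object> stage1 = pool.submit(() -> {
                for (String s : input) {
                    parsed.put(Integer.parseInt(s));
                }
                parsed.put(END);
                return null;
            });
            // Stage 2: square values as they arrive while stage 1 keeps parsing.
            Future<List<Integer>> stage2 = pool.submit(() -> {
                List<Integer> out = new ArrayList<>();
                for (Integer v = parsed.take(); !v.equals(END); v = parsed.take()) {
                    out.add(v * v);
                }
                return out;
            });
            stage1.get();
            return stage2.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("pipeline failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("1", "2", "3"))); // prints [1, 4, 9]
    }
}
```

The bounded queue also provides backpressure: if stage 2 falls behind, stage 1 blocks on `put` instead of filling memory.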

Scaling Performance of Message Processing

reactive pulsar showcase

TeleMetryProcessorIntegrationTests

// create a subscription to the result topic before executing the operation
String subscriptionName = "testSubscription" + UUID.randomUUID();
ReactiveMessageConsumer<TelemetryEvent> messageConsumer = reactivePulsarClient
    .messageConsumer(Schema.JSON(TelemetryEvent.class))
    .topic(topicNameResolver.resolveTopicName(TelemetryProcessor.TELEMETRY_MEDIAN_TOPIC_NAME))
    .subscriptionType(SubscriptionType.Exclusive)
    .subscriptionName(subscriptionName)
    .subscriptionInitialPosition(SubscriptionInitialPosition.Latest)
    .acknowledgeAsynchronously(false)
    .build();
// create the consumer and close it immediately. This is just to create the Pulsar subscription
messageConsumer.consumeNothing().block();

ReactiveMessageSender<TelemetryEvent> messageSender = reactivePulsarClient
    .messageSender(Schema.JSON(TelemetryEvent.class))
    .topic(topicNameResolver.resolveTopicName(IngestController.TELEMETRY_INGEST_TOPIC_NAME))
    .build();

// when
// 100 values for 100 devices are sent to the ingest topic
messageSender
    .sendMany(
        Flux
            .range(1, DEVICE_COUNT)
            .flatMap(value -> {
                String name = "device" + value + "/sensor1";
                return Flux
                    .range(1, 100)
                    .map(entryCounter -> TelemetryEvent.builder().n(name).v(entryCounter).build());
            })
            .map(telemetryEvent -> MessageSpec.builder(telemetryEvent).key(telemetryEvent.getN()).build())
    )
    .blockLast();

// then the TelemetryProcessor should have aggregated a single median value for each sensor in the result topic
Set<String> deviceNames = new HashSet<>();
messageConsumer
    .consumeMany(messageFlux -> messageFlux.map(MessageResult::acknowledgeAndReturn))
    .as(StepVerifier::create)
    .expectSubscription()
    .thenConsumeWhile(message -> {
        assertThat(deviceNames.add(message.getValue().getN()))
            .as("there shouldn't be more than 1 message per device")
            .isTrue();
        assertThat(message.getValue().getV()).isEqualTo(51.0);
        return deviceNames.size() < DEVICE_COUNT;
    })
    .expectNoEvent(Duration.ofSeconds(1))
    .thenCancel()
    .verify(Duration.ofSeconds(10));

Pulsar Functions

Transformations

Pulsar IO

Pulsar transformations

  • cast: modifies the key or value schema to a target compatible schema.

  • drop-fields: drops fields from structured data.

  • merge-key-value: merges the fields of KeyValue records where both the key and value are structured data with the same schema type.

  • unwrap-key-value: if the record is a KeyValue, extract the KeyValue's key or value and make it the record value.

  • flatten: flattens structured data.

  • drop: drops a record from further processing.

  • compute: computes new properties, values or field values on the fly or replaces existing ones.
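To make the semantics of a few of these steps concrete, here is a hedged sketch that mimics drop-fields, merge-key-value, unwrap-key-value and cast on plain Java maps. It is not the Pulsar transformations API: the class `TransformSketch`, its method names and the naive JSON rendering are all assumptions for illustration, and real records carry schemas (for example AVRO) rather than `Map<String, String>`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TransformSketch {

    // drop-fields: copy of one record part without the listed fields.
    public static Map<String, String> dropFields(Map<String, String> part, List<String> fields) {
        Map<String, String> out = new LinkedHashMap<>(part);
        fields.forEach(out::remove);
        return out;
    }

    // merge-key-value: the value gains the key's fields; the key itself is kept by the caller.
    public static Map<String, String> mergeKeyValue(Map<String, String> key, Map<String, String> value) {
        Map<String, String> out = new LinkedHashMap<>(key);
        out.putAll(value);
        return out;
    }

    // cast to STRING: naive JSON rendering, no escaping (illustration only).
    public static String castToString(Map<String, String> record) {
        return record.entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
            .collect(Collectors.joining(",", "{", "}"));
    }

    // Chains the steps: drop two key fields, merge, unwrap the value, cast to a string.
    public static String run() {
        Map<String, String> key = new LinkedHashMap<>();
        key.put("keyField1", "key1");
        key.put("keyField2", "key2");
        key.put("keyField3", "key3");
        Map<String, String> value = new LinkedHashMap<>();
        value.put("valueField1", "value1");
        value.put("valueField2", "value2");

        Map<String, String> trimmedKey = dropFields(key, List.of("keyField1", "keyField2"));
        Map<String, String> merged = mergeKeyValue(trimmedKey, value);
        // unwrap-key-value: from here on, the merged value alone is the record.
        return castToString(merged);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In a real deployment the steps are declared as configuration and applied by the transformation function, so no such code has to be written by hand.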

Example

{key={keyField1: key1, keyField2: key2, keyField3: key3}, 
	value={valueField1: value1, valueField2: value2, valueField3: value3}}(KeyValue<AVRO, AVRO>)
           |
           | "type": "drop-fields", "fields": "keyField1,keyField2", "part": "key"
           |
{key={keyField3: key3}, value={valueField1: value1, 
	valueField2: value2, valueField3: value3}} (KeyValue<AVRO, AVRO>)
           |
           | "type": "merge-key-value"
           |
{key={keyField3: key3}, value={keyField3: key3, valueField1: value1, 
	valueField2: value2, valueField3: value3}} (KeyValue<AVRO, AVRO>)
           |
           | "type": "unwrap-key-value"
           |
{keyField3: key3, valueField1: value1, 
	valueField2: value2, valueField3: value3} (AVRO)
           |
           | "type": "cast", "schema-type": "STRING"
           |
{"keyField3": "key3", "valueField1": "value1", 
	"valueField2": "value2", "valueField3": "value3"} (STRING)

Scaling

ZooKeeper

  • Holds cluster metadata and handles coordination tasks between Pulsar clusters

BookKeeper

- Persistent message store
  - Fast, low-impact, horizontal scaling
  - Reduced long-term and day-to-day expenses
- Broker
  - Stateless
  - Built-in load balancing
  - Instantaneous scaling
  - Zero-impact disaster recovery
- Bookies
  - Scalable, WAL-based (write-ahead log), fault-tolerant, low-latency storage service
  - Tunable consistency from message replication
    - Ensemble size, Write Quorum, ACK Quorum
  - Fast write guarantee through Journals
  - Segment-centric data persistency via Ledgers
  - Topic Partitions
Deployments

Luna Streaming

1) Helm Chart for Pulsar Cluster

2) Luna Streaming tarball

3) Pulsar Ansible

4) Luna Streaming Examples

5) StreamNative Helm Charts

argocd-deploy.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sn-pulsar
  finalizers:
  - resources-finalizer.argocd.argoproj.io
  namespace: argo
spec:
  destination:
    namespace: argo
    server: https://kubernetes.default.svc
  source:
    repoURL: 'https://github.com/streamnative/charts'
    path: examples/argocd/charts/sn-pulsar
    targetRevision: master
    helm:
      values: |
  project: default
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

6) StreamNative Terraform

Additional Sources

StreamNative Academy

StreamNative Examples

Apache Pulsar

By Benjamin Nothdurft
