This repository has been archived by the owner on May 30, 2018. It is now read-only.

example code for adding a custom adjuster to zipkin-sparkstreaming job #1

Merged
merged 3 commits into from
Feb 22, 2017

Conversation

naoman (Collaborator) commented Feb 16, 2017

zipkin-sparkstreaming custom adjuster example

This example shows how to add a custom adjuster for zipkin-sparkstreaming.
We'll create a jar with a simple adjuster that trims the length of binary annotation values.
The adjuster is applied by placing the jar on the classpath of the spark job.

Steps for creating an adjuster jar

  1. Create a Java source file that:
  • extends zipkin.sparkstreaming.Adjuster
  • has the annotation org.springframework.context.annotation.Configuration
  2. Create a resource file META-INF/spring.factories that contains:
org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
your.package.FooAdjuster
  3. Put the compiled class and META-INF/spring.factories into a jar.

In Maven, the following source layout accomplishes this:

src/main/java/your/package/FooAdjuster.java
src/main/resources/META-INF/spring.factories
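As a sketch of the trimming logic itself (plain Java, independent of the zipkin types; the class name, 1024-character limit, and method are illustrative assumptions, not part of the upstream API):

```java
public class ValueTrimmer {
  /** Maximum length kept for a binary annotation value (illustrative default). */
  static final int MAX_LENGTH = 1024;

  /** Returns the value unchanged when short enough, otherwise cut down to MAX_LENGTH. */
  static String trim(String value) {
    if (value == null || value.length() <= MAX_LENGTH) return value;
    return value.substring(0, MAX_LENGTH);
  }
}
```

In the real FooAdjuster, logic like this would run over each span's binary annotation values inside the Adjuster callback.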

Building example

mvn clean package

Running adjuster jar with the job

Download the spark-streaming job jar:

wget -O zipkin-sparkstreaming-job.jar 'https://search.maven.org/remote_content?g=io.zipkin.sparkstreaming&a=zipkin-sparkstreaming-job&v=LATEST'

Run the job by adding the adjuster jar to the classpath.

Note: we can't run the job with the -jar flag; if we use it, the -cp option is ignored.

java -cp "zipkin-sparkstreaming-job.jar:zipkin-sparkstreaming-example-*.jar" \
  zipkin.sparkstreaming.job.ZipkinSparkStreamingJob \
  --zipkin.storage.type=elasticsearch \
  --zipkin.storage.elasticsearch.hosts=http://127.0.0.1:9200 \
  --zipkin.sparkstreaming.stream.kafka.bootstrap-servers=127.0.0.1:9092

Why this should work..

The spring.factories file lets Spring Boot pick up and load the class you made. This is called auto-configuration: like the Java ServiceLoader, but better. Read more here.

@Override
protected boolean shouldAdjust(Span span) {
  return span.binaryAnnotations.stream()
      .filter(ba -> ((ba.type.equals(BinaryAnnotation.Type.BYTES)


BYTES probably isn't a great example, as it shouldn't be used in zipkin. mind if we remove this?

Collaborator Author

Sure, we can take it out. But why shouldn't we use BYTES in zipkin?

<dependency>
  <groupId>io.zipkin.sparkstreaming</groupId>
  <artifactId>zipkin-sparkstreaming-job</artifactId>
  <version>0.1.0</version>


this can be a provided dep I think

Collaborator Author

fixed

<version>4.12</version>
<scope>test</scope>
</dependency>
<dependency>


I don't think you are using this dep, maybe drop it?

Collaborator Author

dropped.

* or implied. See the License for the specific language governing permissions and limitations under
* the License.
*/
package zipkin.sparkstreaming.adjuster;


better to not use core package names. maybe sparkstreaming.tagtruncator which is short and to the point.

On that note, a lot of users don't use the term "binary annotation", and we are actually changing the model to the more accessible term "tag".

To make the example easier you could consider just using the word tag. That's what's used in opentracing, brave4 and the upcoming simplified model.

In the description you can say that tag is a binary annotation of type string (which happens to be the only usable binary annotation).

import zipkin.sparkstreaming.Adjuster;

@Configuration
public class BinaryAnnotationTrimAdjusterConfiguration {


I think this could hit harder actually using spring

@Configuration
public class TagTruncatorConfiguration {
  /** This lets you override the max length via commandline, like --tagtruncator.max-length=255 */
  @Bean
  Adjuster tagTruncator(@Value("${tagtruncator.max-length:1024}") int maxLength) {
    return new TagTruncator(maxLength);
  }
}


import zipkin.sparkstreaming.adjuster.BinaryAnnotationTrimAdjusterConfiguration;
import zipkin.sparkstreaming.autoconfigure.adjuster.finagle.ZipkinFinagleAdjusterAutoConfiguration;

public class BinaryAnnotationTrimAdjusterPropertiesTest {


usually the test class matches the name of the subject; there's no class BinaryAnnotationTrimAdjusterProperties

@@ -0,0 +1,37 @@
/**


we don't put license headers on example code. it distracts from the content, which isn't being distributed etc.

README.md Outdated
We'll create a jar with a simple adjuster that trims the length of binary annotation values.
This adjuster will be applied by placing it in the class path of the spark job.

## Steps for creating an adjuster jar


For the readme to help people get started, the primary goal is "how to use this", not which steps maven is doing.

For example, in the Brave example, we don't tell people how the maven plugin works to make a war file.

README.md Outdated

In maven, the following structure would accomplish this.
```
src/main/java/your/package/FooAdjuster.java


in other words, I think this whole section can be taken out and replaced with a section that says that this project is an example setup that creates an adjuster jar.

README.md Outdated
--zipkin.sparkstreaming.stream.kafka.bootstrap-servers=127.0.0.1:9092
```

## Why this should work..


this is copy/paste from the issue again. The README of the example should concentrate on what we are doing, and not use tentative language like "this should work". I used phrasing like that in the issue as I hadn't tested it yet :)

@codefromthecrypt

network broke earlier...

so the key takeaway is let's put detailed documentation in the upstream project, then link to that if folks need or want to learn more.

This README could probably do better at explaining the value of what this is doing. For example, an intro like:

Adjusters are used to clean or change data that goes into zipkin, such that it is more usable. For the sake of example, we assume you have an application adding very large tags in trace data. These slow down the UI and eat up more storage. This is an example adjuster for the spark streaming job, which truncates tags to a configured length.

then, the build/run instructions you have already.

what's left after that is how to actually run the example.

For example, how would you know if this works? Maybe by passing the job an arg that limits tags to 10 characters, then making a test trace. If this part isn't ready due to a missing "testing mode" upstream, it can be a TODO.

@codefromthecrypt

codefromthecrypt commented Feb 17, 2017 via email

@codefromthecrypt

codefromthecrypt commented Feb 22, 2017 via email

@naoman naoman merged commit 7947504 into openzipkin-attic:master Feb 22, 2017