
Enable setting compression via the KafkaAttribute #175

Status: Open · wants to merge 2 commits into base: dev
Conversation

@gliljas (Contributor) commented Sep 17, 2020

No description provided.

@TsuyoshiUshio (Contributor) commented:

Hi @gliljas

Thank you for your contribution. The code looks good. Could you update the documentation on our Readme?

@gliljas (Contributor, Author) commented Oct 31, 2020

Documentation added.

Please note that I also fixed all the nullable properties on KafkaAttribute. Nullable value-type properties are unusable on attributes, because they are not valid attribute parameter types in C#.
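
For context, a minimal sketch (illustrative names, not from this PR) of why a nullable value-type property cannot be assigned when applying an attribute in C#:

    using System;

    [AttributeUsage(AttributeTargets.Method)]
    sealed class ExampleAttribute : Attribute
    {
        // Nullable<int> is not a valid attribute parameter type.
        public int? MaxMessageBytes { get; set; }
    }

    class Demo
    {
        // Does not compile: error CS0655: 'MaxMessageBytes' is not a valid
        // named attribute argument because it is not a valid attribute parameter type.
        [Example(MaxMessageBytes = 1_000_000)]
        static void Run() { }
    }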

@gliljas (Contributor, Author) commented Nov 5, 2020

@TsuyoshiUshio As mentioned, this fixes a bug in the current attribute that makes quite a few of the attribute parameters completely useless.

@TsuyoshiUshio (Contributor) left a comment:

Could you have a look at the comment? I'd be happy to close my PR and accept your contribution. Could you resolve the conflict and use KafkaOptions instead?

@@ -15,6 +15,14 @@ namespace Microsoft.Azure.WebJobs.Extensions.Kafka
[Binding]
public sealed class KafkaAttribute : Attribute
{
private int? compressionLevel;
Contributor commented:

Could you use KafkaOptions to configure it? We talked with @fbeltrao about whether we should go with the Kafka annotation or host.json, and this time we decided to go with host.json. We are reusing the KafkaProducer, so we can't have a different config for each producer. Also, if we introduce the change for the Kafka attribute, we need to update https://github.com/Azure/azure-functions-java-library as well.

#218

@gliljas (Contributor, Author) commented Mar 13, 2021:

@TsuyoshiUshio I can make that change, but I don't think it makes much sense. The attribute is already used to configure things that are much more global in scope, like the "brokerList", and things on roughly the same level, such as "EnableIdempotence". There's a reason, I guess, that the KafkaProducerFactory.GetProducerConfig method takes the attribute as a parameter, and I can't see the KafkaProducer being reused beyond this point.

"Also if we introduce the change for the Kafka attribute, we need to update here as well. https://github.com/Azure/azure-functions-java-library"

Is there a technical reason for this, or is there just a desire to have feature parity?

Contributor commented:

Hi @gliljas
It is true. I have two reasons for it.

  1. The compression setting would be global, which avoids customer confusion. The KafkaAttribute should carry parameters that are per-producer, while host.json is for global settings (even if there is past legacy here).
  2. Simply workload. If we introduce the change, we also need to update https://github.com/Azure/azure-functions-java-library, and update and test it here: https://github.com/Azure/azure-functions-java-worker/tree/dev/endtoendtests/src/main/java/com/microsoft/azure/functions/endtoend

So there are both technical and workload reasons, which is why I thought host.json might be the better place. However, I'd like to hear @fbeltrao's opinion.

Contributor commented:

I do see you both have good points.

When creating producers we reuse the same reference to librdkafka. Reuse is based on the value of the producer config, which is created from the KafkaOptions (global) and the Kafka attribute (per function).
So, @TsuyoshiUshio, unlike what I suggested to you before, there is no technical requirement forcing us to make the compression setting global; sorry for the misinformation. We can make it function-level as well, at the cost of creating additional librdkafka producers (the same applies if the batch size or other Kafka attribute settings change).

What I don't know is:

  • Will customers want to mix different compression settings in the same function deployment?
  • Is it easier for customers to set it once at the global level, or always at the function level?

If we don't have a clear answer, perhaps being more flexible is better (defining it at the function level). We should also validate and document what we recommend in terms of base producers per function.
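
As a rough sketch of that reuse (hypothetical code, not the extension's actual implementation; ProducerCache and GetConfigKey are invented names):

    using System.Collections.Concurrent;
    using System.Linq;
    using Confluent.Kafka;

    // Hypothetical sketch: producers are cached by the value of their config, so
    // functions with identical settings share one librdkafka producer, while any
    // differing setting (e.g. compression) costs an additional producer instance.
    public class ProducerCache
    {
        private readonly ConcurrentDictionary<string, IProducer<byte[], byte[]>> producers =
            new ConcurrentDictionary<string, IProducer<byte[], byte[]>>();

        public IProducer<byte[], byte[]> GetOrCreate(ProducerConfig config)
        {
            var key = GetConfigKey(config);
            return producers.GetOrAdd(key, _ => new ProducerBuilder<byte[], byte[]>(config).Build());
        }

        // ProducerConfig enumerates as key/value pairs, so its full value can serve as a cache key.
        private static string GetConfigKey(ProducerConfig config) =>
            string.Join(";", config.Select(kv => $"{kv.Key}={kv.Value}"));
    }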

Contributor (Author) commented:

I think that any setting that isn't explicitly global in nature should be possible to set at the function/attribute level, but having a fallback to a global setting is very good. In fact, I would say that broker connection settings (authentication, etc.) are good candidates for global configuration, whereas "topic" is not, and something like compression could work in both places.

Contributor commented:

Thank you @fbeltrao, that makes sense. For future feature expansion, it can stay flexible, so let's go with the attribute.
As @gliljas suggested, we should make it clear which parameters are global and which are not. In a future release, when a major change is introduced, we can make that change. Until then I'll find time to add comments on the attribute.

@ghost added the "no recent activity" label Mar 13, 2021

@ghost commented Mar 13, 2021:

This pull request has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 7 days. It will be closed if no further activity occurs within 7 days of this comment.


@@ -15,6 +15,14 @@ namespace Microsoft.Azure.WebJobs.Extensions.Kafka
[Binding]
public sealed class KafkaAttribute : Attribute
{
private int? compressionLevel;
private int? maxMessageBytes;
Contributor commented:

Wonder if we shouldn't change this to have the default values set on the field/property declarations instead of in the getter implementation body?

Contributor (Author) commented:

I guess it looked a bit weird the way I did it, but my reason was to start a pattern where it could be determined whether a setting had been specified or not. It relates to the function/global setting scope issue.
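
The pattern being described might look something like this (a sketch; the internal accessor name is hypothetical):

    using System;

    public sealed class ExampleKafkaAttribute : Attribute
    {
        private int? maxMessageBytes;

        /// <summary>
        /// Gets or sets the Maximum transmit message size. Default: 1MB
        /// </summary>
        public int MaxMessageBytes
        {
            get => maxMessageBytes.GetValueOrDefault(1_000_000);
            set => maxMessageBytes = value;
        }

        // Hypothetical accessor: null means "not specified on the attribute",
        // which lets a global option or the default win when configs are merged.
        internal int? MaxMessageBytesOrNull => maxMessageBytes;
    }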

@fbeltrao (Contributor) commented Mar 15, 2021:

Ok, I see. I haven't seen any usage. Are we using this information (whether or not a setting was set) anywhere?

Contributor (Author) commented:

No, I didn't get that far. I pushed a commit to demonstrate.

Contributor commented:

I was looking for a use case of determining whether a value was defined and, if not, taking a different action rather than applying a default value, but I couldn't find one.

If that is true, I wonder if we couldn't make it simpler like this:

        /// <summary>
        /// Gets or sets the Maximum transmit message size. Default: 1MB
        /// </summary>
        public int MaxMessageBytes { get; set; } = 1_000_000;

instead of

        /// <summary>
        /// Gets or sets the Maximum transmit message size. Default: 1MB
        /// </summary>
        public int MaxMessageBytes { get => maxMessageBytes.GetValueOrDefault(1000000); set => maxMessageBytes = value; }

@@ -54,33 +62,33 @@ public KafkaAttribute()
 /// <summary>
 /// Gets or sets the Maximum transmit message size. Default: 1MB
 /// </summary>
-public int? MaxMessageBytes { get; set; }
+public int MaxMessageBytes { get => maxMessageBytes.GetValueOrDefault(1000000); set => maxMessageBytes = value; }
Contributor commented:

nit: recommend using thousand separators to make it easier to read:

Suggested change
public int MaxMessageBytes { get => maxMessageBytes.GetValueOrDefault(1000000); set => maxMessageBytes = value; }
public int MaxMessageBytes { get => maxMessageBytes.GetValueOrDefault(1_000_000); set => maxMessageBytes = value; }

There are other occurrences.

@TsuyoshiUshio (Contributor) commented:

Hi @gliljas and @fbeltrao
What is the current status? No rush, but I'm curious. I'm preparing to release 3.3.0 :) and will release it this weekend.
If this feature is coming, I'll include it in the next release.

@fbeltrao (Contributor) left a comment:

We could make the properties a bit cleaner by removing the unnecessary getter and setter bodies, and ensure that all properties have a getter.


@@ -139,5 +148,17 @@ public KafkaAttribute()
/// ssl.key.password in librdkafka
/// </summary>
public string SslKeyPassword { get; set; }

/// <summary>
Contributor commented:

Properties should have getters, according to .NET guidelines.

I can imagine that we might want to log/write the value, and not having the getters prevents that.

Contributor (Author) commented:

In my latest commit I have used one possible implementation. The getters have been reintroduced to comply with the guidelines, but the backing fields are still nullable. The values from the attribute are then merged (coalesced) with the values from the global options (this can also be done as a one-liner using JsonConvert, among many other approaches), and the default value is used if neither supplies one.

The same approach could definitely be used for consumers as well. It may seem a bit convoluted, and there may well be better ways of doing it, but I think it provides maximum flexibility while maintaining backwards compatibility.
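
A sketch of that coalescing for a single setting (the helper and its parameter names are assumptions for illustration):

    using Confluent.Kafka;

    public static class ProducerConfigMerge
    {
        // Hypothetical merge order: the attribute value wins, then the global
        // options value, then the hard-coded default.
        public static ProducerConfig Build(int? attributeMaxMessageBytes, int? optionsMaxMessageBytes)
        {
            return new ProducerConfig
            {
                MessageMaxBytes = attributeMaxMessageBytes ?? optionsMaxMessageBytes ?? 1_000_000
            };
        }
    }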

Contributor (Author) commented:

The merging could also look like this:

var mergedOptions = DefaultProducerOptions.CreateNew()
                    .MergeWithNonNullPropertiesFrom(kafkaOptions)
                    .MergeWithNonNullPropertiesFrom(entity.Attribute);

using a simple extension like this:

using Newtonsoft.Json;

public static class MergeExtensions
{
    // Copies every non-null property of 'other' onto 'instance' by serializing
    // 'other' without null values and repopulating 'instance' from the result.
    public static T MergeWithNonNullPropertiesFrom<T>(this T instance, T other)
    {
        if (other != null)
        {
            JsonConvert.PopulateObject(
                JsonConvert.SerializeObject(other, new JsonSerializerSettings { NullValueHandling = NullValueHandling.Ignore }),
                instance);
        }
        return instance;
    }
}

@gliljas (Contributor, Author) commented Jul 8, 2022

Rebased on dev and reverted controversial stuff.

@shrohilla (Contributor) commented Dec 13, 2022

@gliljas could you please resolve these comments? I am happy to close this PR and accept your contribution.
You can also open a fresh PR if this contribution has lost history.
Moreover, you also need to address the regression.

Please also add a description to define the scope of the PR.

@shrohilla (Contributor) commented:

/azp run

@azure-pipelines commented:

Azure Pipelines successfully started running 1 pipeline(s).

@shrohilla self-requested a review on December 13, 2022