Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: JSON format access logging support #1479

Merged
merged 5 commits into from
Nov 14, 2024

Conversation

hongzhidao
Copy link
Contributor

@hongzhidao hongzhidao commented Oct 25, 2024

Allow format to be an object to generate JSON logs. The object keys
become JSON field names, and values support string, variable, and JS.

Note that when there is no JS in the format values, the object will
be pre-serialized to a JSON template string at configuration phase
for better performance.

Example config:

  {
    "access_log": {
      "path": "/tmp/access.log",
      "format": {
        "remote_addr": "$remote_addr",
        "time_local": "$time_local",
        "request_line": "$request_line",
        "status": "$status",
        "body_bytes_sent": "$body_bytes_sent",
        "header_referer": "$header_referer",
        "header_user_agent": "$header_user_agent"
      }
    }
  }

@hongzhidao hongzhidao marked this pull request as draft October 25, 2024 08:27
@hongzhidao hongzhidao marked this pull request as ready for review October 29, 2024 05:19

if (nxt_conf_type(value) != NXT_CONF_STRING) {
return nxt_conf_vldt_error(vldt, "In the access log format, the value "
"must be only string.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that could be re-worded slightly to "must be a string."

Comment on lines +182 to +183
next = 0;
has_js = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know what I'm going to say!

These could just be declared (and initialised) here to reduce their scope...


next = 0;

for (i = 0; i < n; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we're compiling with -std=gnu11, you could take advantage of C99 for loop initialisers... I.e.

for (uint32_t i = 0, i < n; i++) {

nxt_router_conf_t *rtcf;

if (nxt_tstr_is_const(tstr)) {
nxt_tstr_str(tstr, &ctx->text);

} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, Mr Reduce Scope again!

} else {
    nxt_int_t          ret;
    nxt_router_conf_t  *rtcf;

return NXT_ERROR;
}

for (i = 0; i < format->nmembers; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i could be for loop initialised...

if (nxt_tstr_is_const(member->tstr)) {
nxt_tstr_str(member->tstr, &str);

} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like ret could be declared here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

member could be declared in the for loop...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know, thanks for the suggestions.
I'm fine with both styles. And I'm used to declaring variables at the beginning of functions, so I'd like to keep this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it is more than about style, it's good programming practice, which is why I won't stop suggesting it ;)

Note: declaring variables at start of scope is not the same as declaring variables randomly throughout the function, which I certainly wouldn't suggest to do!

Copy link
Contributor Author

@hongzhidao hongzhidao Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both ways have pros and cons. Declaring at the start makes all variables visible at a glance and makes reuse easier.

Take this function as an example:

static nxt_int_t
nxt_router_access_log_json(nxt_task_t *task, nxt_http_request_t *r,
    nxt_router_access_log_ctx_t *ctx, nxt_router_access_log_format_t *format)
{
    nxt_uint_t                      i;
    nxt_conf_value_t                *value;
    nxt_router_access_log_member_t  *member;

    value = nxt_conf_create_object(r->mem_pool, format->nmembers);
    if (nxt_slow_path(value == NULL)) {
        return NXT_ERROR;
    }

    for (i = 0; i < format->nmembers; i++) {
        member = &format->member[i];

        ...

        nxt_conf_set_member_string(value, &member->name, &str, i);
    }

    ...

    return NXT_OK;
}

I feel it's common to write like it. If we asked someone to totally follow something like the above "practice", it looks like that:

static nxt_int_t
nxt_router_access_log_json(nxt_task_t *task, nxt_http_request_t *r,
    nxt_router_access_log_ctx_t *ctx, nxt_router_access_log_format_t *format)
{
    nxt_conf_value_t  *value;

    value = nxt_conf_create_object(r->mem_pool, format->nmembers);
    if (nxt_slow_path(value == NULL)) {
        return NXT_ERROR;
    }

    for (nxt_uint_t i = 0; i < format->nmembers; i++) {
        nxt_router_access_log_member_t  *member;

        member = &format->member[i];

        ...

        nxt_conf_set_member_string(value, &member->name, &str, i);
    }

    ...

    return NXT_OK;
}

From the point of code style, both ways look fine.
It doesn't seem to make a difference in terms of whether it's more buggy or not.

I'm just using this function as an example, because it's not big, and I'm not sure if it's going to be different if it's really big

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed it based on the specific code which I listed above. Please do not expand the scope and apply one "principle" to all situations.

I am talking generally (not specific examples)...

Reducing variable scope is just good programming practice... ipso facto...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take nxt_router_access_log_json() as an example, i and member are only used in the for sentence, I can't see benefits to force their declaration within it.
I would be confused if it can introduce issues in this way, I'm only talking about the function since we are reviewing it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take nxt_router_access_log_json() as an example, i and member are only used in the for sentence,

Yes, that's exactly the point! They are only used in the for loop so that is where they should be scoped.

I can't see benefits to force their declaration within it.

I already gave an example of a benefit...

I would be confused if it can introduce issues in this way, I'm only talking about the function since we are reviewing it.

Never underestimate the propensity for introducing bugs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a function is as small as array_sum(), should it follow the practice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly no harm... and it's more concise to read...

@hongzhidao hongzhidao force-pushed the json-access-log branch 2 times, most recently from 8e2149a to d4fa165 Compare November 7, 2024 05:12
Copy link
Member

@ac000 ac000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this doesn't change wildly...

This is a preparatory refactoring for upcoming JSON format support
in access log.

No functional changes.
This is a preparatory refactoring for upcoming JSON format support
in access log.

No functional changes.
This is a preparatory refactoring for upcoming JSON format support
in access log. We will extend format option to access object for
JSON support.

No functional changes.
Allow format to be an object to generate JSON logs. The object keys
become JSON field names, and values support string, variable, and JS.

Note that when there is no JS in the format values, the object will
be pre-serialized to a JSON template string at configuration phase
for better performance.

Example config:
  {
    "access_log": {
      "path": "/tmp/access.log",
      "format": {
        "remote_addr": "$remote_addr",
        "time_local": "$time_local",
        "request_line": "$request_line",
        "status": "$status",
        "body_bytes_sent": "$body_bytes_sent",
        "header_referer": "$header_referer",
        "header_user_agent": "$header_user_agent"
      }
    }
  }
@hongzhidao hongzhidao merged commit a355179 into nginx:master Nov 14, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants