enhancement (json): enhance JSON Process Tool extraction to return st… #10575

BenjaminX · 2024-11-12T06:37:33Z

…ructured messages and improve error handling

Checklist:

Important

Please review the checklist below before submitting your pull request.

Please open an issue before creating a PR or link to an existing issue
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

Description

Link issue #10559

Demo data:

{ 
    "Tables": [
        { 
            "Name": "123Name", 
            "ID": "1", 
            "DDL": "rewrew", 
            "QAs": "rewrewfew", 
            "SQLs": "fwrfre", 
            "Memo": "freferfre" 
        }, 
        { 
            "Name": "321Name", 
            "ID": "2", 
            "DDL": "fdsfdsfdswer", 
            "QAs": "32423r3", 
            "SQLs": "654654g54", 
            "Memo": "54332423423" 
        } 
    ] 
}

JSONPath filters:
$.Tables[*]

Expected

{
  "text": "[{\"Name\": \"123Name\", \"ID\": \"1\", \"DDL\": \"rewrew\", \"QAs\": \"rewrewfew\", \"SQLs\": \"fwrfre\", \"Memo\": \"freferfre\"}, {\"Name\": \"321Name\", \"ID\": \"2\", \"DDL\": \"fdsfdsfdswer\", \"QAs\": \"32423r3\", \"SQLs\": \"654654g54\", \"Memo\": \"54332423423\"}]\n",
  "files": [],
  "json": [
    {
      "0": {
        "Name": "123Name",
        "ID": "1",
        "DDL": "rewrew",
        "QAs": "rewrewfew",
        "SQLs": "fwrfre",
        "Memo": "freferfre"
      },
      "1": {
        "Name": "321Name",
        "ID": "2",
        "DDL": "fdsfdsfdswer",
        "QAs": "32423r3",
        "SQLs": "654654g54",
        "Memo": "54332423423"
      }
    }
  ]
}

Actual

{
  "text": "[{\"Name\": \"123Name\", \"ID\": \"1\", \"DDL\": \"rewrew\", \"QAs\": \"rewrewfew\", \"SQLs\": \"fwrfre\", \"Memo\": \"freferfre\"}, {\"Name\": \"321Name\", \"ID\": \"2\", \"DDL\": \"fdsfdsfdswer\", \"QAs\": \"32423r3\", \"SQLs\": \"654654g54\", \"Memo\": \"54332423423\"}]",
  "files": [],
  "json": []
}

The tool did not parse the JSON array objects, so the "json" field was returned empty. This PR fixes the objects returned by filters.
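
A minimal sketch of how the Expected payload above could be assembled from the list matched by $.Tables[*]. The helper name build_output and the index-keyed layout of the "json" entry are assumptions read off the Expected sample, not the tool's actual code.

```python
import json


def build_output(matches: list) -> dict:
    """Assemble a text/files/json payload shaped like the Expected sample."""
    # Text form: the matched list serialized as a JSON string, plus a newline.
    text = json.dumps(matches) + "\n"
    # JSON form: one object keyed by the stringified index of each match.
    indexed = {str(i): item for i, item in enumerate(matches)}
    return {"text": text, "files": [], "json": [indexed] if matches else []}


# Shortened version of the demo data (assumption: only two fields kept).
demo = {"Tables": [{"Name": "123Name", "ID": "1"}, {"Name": "321Name", "ID": "2"}]}
result = build_output(demo["Tables"])
print(result["json"][0]["1"]["Name"])
```

With an empty match list, this sketch returns an empty "json" array, which is the shape shown under Actual.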

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update, included: Dify Document
Improvement, including but not limited to code refactoring, performance optimization, and UI/UX improvement
Dependency upgrade

Testing Instructions

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

Test A
Test B

BenjaminX · 2024-11-14T02:16:10Z

@crazywoola
Could you please review the code and check it?
Please help me merge it into the main branch.

If you have any questions about the code, please let me know.
Thanks a lot.

BenjaminX · 2024-11-18T04:42:43Z

@crazywoola
Do you have any concerns about this PR?

PedroGomes02 · 2024-11-18T17:56:43Z

Hi there,
It seems you're looking for a tool that outputs the entire JSON representation of an object, allowing it to be used in subsequent workflow nodes. Is that correct? Essentially, the tool would return the full JSON object in its entirety.
In my opinion:
Rename JSON Parse to JSON Parse & Extractor:
This updated tool will always return the full JSON object (parse) and, if a JSONPath is provided, extract and return the corresponding values (extractor), current behavior.
This ensures the tool serves both purposes without disrupting existing workflows, maintaining backward compatibility while adding clarity and functionality (real JSON parse of the string).
Introduce a JSON Filter Tool:
This new tool would perform the opposite of JSON Delete by returning a JSON object (or its text representation) that contains only the fields matching the specified JSONPath.
This complements JSON Parse & Extractor by offering more fine-grained filtering options.

With these updates, the toolset would address a wider range of needs while remaining easy to use. Let me know your thoughts on this proposal!
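
To make the difference between extracting and filtering concrete, here is a rough sketch. It uses simple dotted paths instead of real JSONPath, and both function names (extract_value, filter_fields) are hypothetical; they are not part of the existing toolset.

```python
from typing import Any


def extract_value(data: dict, path: str) -> Any:
    """Extractor behavior: return only the value found at a dotted path."""
    current: Any = data
    for key in path.split("."):
        current = current[key]
    return current


def filter_fields(data: dict, path: str) -> dict:
    """Filter behavior: return an object that keeps only the given path."""
    keys = path.split(".")
    value = extract_value(data, path)
    # Rebuild the nesting from the innermost key outwards.
    for key in reversed(keys):
        value = {key: value}
    return value


doc = {"user": {"name": "Ana", "age": 30}, "id": 5}
print(extract_value(doc, "user.name"))
print(filter_fields(doc, "user.name"))
```

The extractor returns the bare value, while the filter keeps the surrounding structure and drops everything else.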

BenjaminX · 2024-11-19T03:01:11Z

Hi Pedro,
'It seems you're looking for a tool that outputs the entire JSON representation of an object, allowing it to be used in subsequent workflow nodes. Is that correct? Essentially, the tool would return the full JSON object in its entirety.
In my opinion:
Rename JSON Parse to JSON Parse & Extractor:
This updated tool will always return the full JSON object (parse) and, if a JSONPath is provided, extract and return the corresponding values (extractor), current behavior.'

Yes, absolutely correct.

As you said, this keeps backward compatibility. I also agree that making a new JSON Parse & Extractor tool might be a better choice than modifying the current one.

I will modify this part of the code, adding a JSON Extractor to the JSON Process Tool while retaining the existing four functions: Parse, Insert, Delete, Replace.

@crazywoola do you have any comments or suggestions about this?

BenjaminX · 2024-11-22T01:35:08Z

@crazywoola
I have revised the implementation entirely according to @PedroGomes02's suggestion. Please review the code.

Thanks, PedroGomes02.

PedroGomes02 · 2024-11-22T12:15:16Z

Hi, my original idea is to adapt the parse tool to:

Return the filtered JSONPath text message, as it currently does (backward compatibility), but make this optional (with the "json_filter" parameter not required in parse.yaml).
Additionally, I propose adding a new boolean parameter in parse.yaml to optionally return the fully parsed JSON.

This way, we can use the parse tool either to filter/extract specific information or to return the fully parsed JSON, or both.
I think the introduction of a new JSON Filter Tool (returning a JSON object, or its text representation, that contains only the fields matching the specified JSONPath) should be done separately from this one.

parse.py

    def _invoke(
        self,
        user_id: str,
        tool_parameters: dict[str, Any],
    ) -> Union[ToolInvokeMessage, list[ToolInvokeMessage]]:
        """
        invoke tools
        """
        # get tool parameters
        content = tool_parameters.get("content", "")
        json_filter = tool_parameters.get("json_filter", "")
        ensure_ascii = tool_parameters.get("ensure_ascii", True)
        output_full_parsed_json = tool_parameters.get("output_full_parsed_json", True)
        
        if not content:
            return self.create_text_message("Invalid parameter content")

        try:
            final_result = []
            if output_full_parsed_json:
                # parse full json
                json_content = json.loads(content)

                # append json_messages to final_result
                if isinstance(json_content, list):
                    for item in json_content:
                        final_result.append(self.create_json_message(item))
                else:
                    final_result.append(self.create_json_message(json_content))

            if json_filter:
                filtered_result = self._extract(content, json_filter, ensure_ascii)
                final_result.append(self.create_text_message(str(filtered_result)))

            return final_result

        except Exception:
            return self.create_text_message("Failed to extract JSON content")
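
The control flow above can be tried outside Dify with a stand-in like the following. Messages are modeled as plain (kind, payload) tuples, and extract_top_level is a hypothetical replacement for self._extract that only handles a single top-level key rather than JSONPath.

```python
import json


def extract_top_level(content: str, key: str, ensure_ascii: bool) -> str:
    """Hypothetical stand-in for _extract: look up one top-level key."""
    value = json.loads(content)[key]
    return json.dumps(value, ensure_ascii=ensure_ascii)


def invoke(params: dict):
    content = params.get("content", "")
    json_filter = params.get("json_filter", "")
    ensure_ascii = params.get("ensure_ascii", True)
    output_full = params.get("output_full_parsed_json", True)
    if not content:
        return ("text", "Invalid parameter content")
    try:
        messages = []
        if output_full:
            parsed = json.loads(content)
            # A top-level array yields one json message per element.
            items = parsed if isinstance(parsed, list) else [parsed]
            messages.extend(("json", item) for item in items)
        if json_filter:
            messages.append(("text", extract_top_level(content, json_filter, ensure_ascii)))
        return messages
    except Exception:
        return ("text", "Failed to extract JSON content")


payload = '{"Name": "123Name", "ID": "1"}'
print(invoke({"content": payload, "json_filter": "ID"}))
print(invoke({"content": payload, "output_full_parsed_json": False}))
```

Note that when the filter is empty and output_full_parsed_json is false, the result is an empty list, so that combination may deserve an explicit message.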

parse.yaml

identity:
  name: parse
  author: Mingwei Zhang
  label:
    en_US: JSON Parse
    zh_Hans: JSON 解析
    pt_BR: JSON Parse
description:
  human:
    en_US: A tool for extracting JSON objects
    zh_Hans: 一个解析JSON对象的工具
    pt_BR: A tool for extracting JSON objects
  llm: A tool for extracting JSON objects
parameters:
  - name: content
    type: string
    required: true
    label:
      en_US: JSON data
      zh_Hans: JSON数据
      pt_BR: JSON data
    human_description:
      en_US: JSON data
      zh_Hans: JSON数据
      pt_BR: JSON data
    llm_description: JSON data to be processed
    form: llm
  - name: json_filter
    type: string
    required: false
    label:
      en_US: JSON filter
      zh_Hans: JSON解析对象
      pt_BR: JSON filter
    human_description:
      en_US: JSON fields to be parsed
      zh_Hans: 需要解析的 JSON 字段
      pt_BR: JSON fields to be parsed
    llm_description: JSON fields to be parsed
    form: llm
  - name: ensure_ascii
    type: boolean
    default: true
    label:
      en_US: Ensure ASCII
      zh_Hans: 确保 ASCII
      pt_BR: Ensure ASCII
    human_description:
      en_US: Ensure the JSON output is ASCII encoded
      zh_Hans: 确保输出的 JSON 是 ASCII 编码
      pt_BR: Ensure the JSON output is ASCII encoded
    form: form
  - name: output_full_parsed_json
    type: boolean
    default: true
    label:
      en_US: Output Full Parsed JSON
      zh_Hans: 输出完整解析的 JSON
      pt_BR: Output Full Parsed JSON
    human_description:
      en_US: The full parsed JSON is also outputted
      zh_Hans: 完整解析的 JSON 也已输出
      pt_BR: The full parsed JSON is also outputted
    form: form

BenjaminX · 2024-11-22T14:21:33Z

This implementation is much better; I have re-committed.

PedroGomes02 · 2024-11-22T14:30:28Z

This is meant to replace the parse.py tool, not to create a new extractor tool.

BenjaminX · 2024-11-22T14:34:24Z

Sorry, I will follow your code.

dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. 💪 enhancement New feature or request labels Nov 12, 2024

BenjaminX mentioned this pull request Nov 12, 2024

Buildin JSON Process tool return issues with JSONPath #10559

Closed

5 tasks

BenjaminX added 4 commits November 22, 2024 09:27

enhancement (json): enhance JSON Process Tool extraction to return st…

a8021be

…ructured messages and improve error handling

Revert: process json triple arguments

29ee137

feat(json): add extractor module for JSON processing

cc492c9

feat(json): implement JSON extractor tool with filtering capabilities

e3474b6

BenjaminX force-pushed the buildin_json_process_issue branch from 643fb5c to e3474b6 Compare November 22, 2024 01:28

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Nov 22, 2024

BenjaminX added 3 commits November 22, 2024 22:21

feat(json): enhance JSON extractor to support full parsed output and …

000a5c2

…optional filtering

refactor(json): clean up JSON extractor code by removing unused impor…

3849157

…ts and improving readability

feat(json): implement JSON extraction with filtering using jsonpath_ng

f4f7cc6

BenjaminX added 2 commits November 22, 2024 22:32

refactor(json): remove JSON extractor tool and update parse tool para…

1a5d60b

…meters

fix(json): remove unnecessary whitespace in JSONParseTool class

f122a0c

dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Nov 22, 2024

fix(yaml): add missing newline at end of parse.yaml file

132225f

PedroGomes02 approved these changes Nov 22, 2024
