Possible to return empty string rather than None for the zero_or_more case?
#22
Comments
I note the following difference, I think I read somewhere that you use

>>> m=re.search(r"-(?P<a>(.+)?) (?P<b>(.+)?)", "- "); f"a={m.group('a')} b={m.group('b')}"
'a= b='
>>> m.groups()
('', None, '', None)
>>> m=re.search(r"-(?P<a>(.*)?) (?P<b>(.*)?)", "- "); f"a={m.group('a')} b={m.group('b')}"
'a= b='
>>> m.groups()
('', '', '', '')

If it's not a match an error is raised:

>>> m=re.search(r"-(?P<a>(.*)?) (?P<b>(.*)?)", "x"); f"a={m.group('a')} b={m.group('b')}"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
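
The difference above comes from the unnamed inner group: `(.+)?` cannot match an empty string, so it does not participate and is reported as `None`, whereas `(.*)?` matches the empty string. A minimal sketch using only the standard `re` module; `named_fields` is a hypothetical helper (not part of `parse` or `parse_type`) that returns `None` on a failed match instead of raising:

```python
import re

PLUS = re.compile(r"-(?P<a>(.+)?) (?P<b>(.+)?)")
STAR = re.compile(r"-(?P<a>(.*)?) (?P<b>(.*)?)")


def named_fields(pattern, text):
    """Return the named groups as a dict, or None when nothing matches."""
    m = pattern.search(text)
    return m.groupdict() if m is not None else None


# The named (outer) groups are empty strings for both patterns ...
print(named_fields(PLUS, "- "))
print(named_fields(STAR, "- "))

# ... but the unnamed inner groups differ: None for (.+)?, '' for (.*)?.
print(PLUS.search("- ").groups())
print(STAR.search("- ").groups())

# No match: the helper returns None rather than raising AttributeError.
print(named_fields(STAR, "x"))
```

Checking the match object before calling `.group()` is the usual way to avoid the `AttributeError` shown in the traceback.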
On closer review it looks like the culprit is

This is the example of:

@with_pattern(r".*")
def parse_str(text: str) -> str:
    return text

extra_types = {"Stringlike": parse_str}
parser = Parser("-{content:Stringlike?}", extra_types=extra_types)
parser.parse("-")

(Pdb) p m.groupdict()
{'content': ''}
...
(Pdb) n
> /home/louis/miniconda3/lib/python3.10/site-packages/parse.py(580)evaluate_result()
-> if k in self._type_conversions:
(Pdb) p self._type_conversions
{'content': <parse.convert_first object at 0x7f3d9deb7d90>}

This function is just a helper (source):

class convert_first:
    """Convert the first element of a pair.
    This equivalent to lambda s,m: converter(s). But unlike a lambda function, it can be pickled
    """
    def __init__(self, converter):
        self.converter = converter
    def __call__(self, string, match):
        return self.converter(string)

You can use it with anything

>>> parse.convert_first(print)(1,2)
1

i.e. it equals

(Pdb) n
> /home/louis/miniconda3/lib/python3.10/site-packages/parse.py(581)evaluate_result()
-> value = self._type_conversions[k](groupdict[k], m)

It turns out this is where the

(Pdb) p self._type_conversions[k].converter
<function TypeBuilder.with_zero_or_one.<locals>.convert_optional at 0x7f3d9dea7520>
(Pdb) n
> /home/louis/miniconda3/lib/python3.10/site-packages/parse.py(585)evaluate_result()
-> named_fields[korig] = value
(Pdb) p value
None
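
The trace suggests that the converter built by `with_zero_or_one` turns the empty capture `''` into `None`. The sketch below is a simplified stand-in for that behaviour, not `parse_type`'s actual source: `make_optional` and `make_zero_or_more_chars` are hypothetical names used only to put the `None` sentinel next to an empty-string sentinel.

```python
def make_optional(converter):
    """Stand-in for an optional-field converter: empty text becomes None."""
    def convert_optional(text, m=None):
        return converter(text) if text else None
    return convert_optional


def make_zero_or_more_chars(converter):
    """Same shape, but empty text stays an empty string."""
    def convert_zero_or_more(text, m=None):
        return converter(text) if text else ""
    return convert_zero_or_more


nullable = make_optional(str)
any_width = make_zero_or_more_chars(str)

# Empty input is where the two converters disagree.
print(repr(nullable("")))
print(repr(any_width("")))

# Non-empty input is passed to the wrapped converter in both cases.
print(repr(nullable("hello")))
print(repr(any_width("hello")))
```

Under this reading, the regex already captures the empty string correctly; only the final conversion step decides whether the caller sees `None` or `""`.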
Update: here is the implementation I ended up with, for reference. I found this quite involved but it appears to be robust. I think this might constitute a candidate for adding to the library; I'd be interested in your thoughts.

from parse import Parser, with_pattern
from parse_type import TypeBuilder


@with_pattern(r".*")
def parse_str(text: str) -> str:
    return text


class SmolStrTypeBuilder(TypeBuilder):
    @classmethod
    def with_zero_or_more_chars(cls, converter, pattern=None):
        # Reuse the zero-or-one builder for its pattern and regex group count.
        nullable_optional = cls.with_zero_or_one(converter=converter, pattern=pattern)

        @with_pattern(nullable_optional.pattern)
        def convert_optional(text, m=None):
            """Uses the empty string as the sentinel instead of `None`."""
            return converter(text) if text else ""

        convert_optional.regex_group_count = nullable_optional.regex_group_count
        return convert_optional


def check(parser: Parser, schema: str, expected: list[str], /) -> None:
    """Validate the parsed field values against their expected values."""
    result = parser.parse(schema)
    try:
        assert result is not None, f"Parse failed for {schema!r} ({expected=})"
        values = [result[f] for f in parser.named_fields]
        assert values == expected, f"Parsed {schema!r} as {values} ({expected=})"
    except AssertionError as exc:
        # Report the failure and carry on with the remaining cases.
        print(f" F {exc}")
    else:
        print(f" P {schema!r} ---> {result}")


parse_any_width_string = SmolStrTypeBuilder.with_zero_or_more_chars(parse_str)
extra_types = {"?": parse_any_width_string}

# Single field
parser = Parser("-{content:?}", extra_types=extra_types)
print(f"EXPR {parser._expression}")
check(parser, "-hello world", ["hello world"])
check(parser, "-", [""])
print()

# Two fields separated by a space
parser = Parser("-{a:?} {b:?}", extra_types=extra_types)
print(f"EXPR {parser._expression}")
check(parser, "-A B", ["A", "B"])
check(parser, "-A ", ["A", ""])
check(parser, "- B", ["", "B"])
check(parser, "- ", ["", ""])

This outputs all "P" (passed tests):
Hi there, I was very pleased to find a solution to the inability to generate a regex for (.*?) in capture groups via the parse library, only (.+?). I feel it's a shame that the libraries could not be merged, but such is open source.
I've studied your docs and comments on the other repo and written out test cases for the behaviour I'm after.
I've only managed to make "optional strings" (nullable strings, Union[str,None]) whereas what I really want is "any width strings" (length 0+, str).
Here's the code I wrote to achieve it:
Which results in
Note that in my code I extract the field value or "" so I can test against lists of strings including the empty string rather than None.
What I would really like here is to eliminate that or statement, I really just want strings.
I suspect that the place to do so would be to hook into the TypeBuilder but I'm falling very far down the rabbit hole at this point! If you could guide me I would appreciate it greatly :-)
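
For illustration only, the `or ""` fallback described above might look like the sketch below. This is not the code from the issue: `fields_as_strings` is a hypothetical helper, and a plain dict stands in for the named fields of a parse result.

```python
def fields_as_strings(named):
    """Replace None (or any other falsy) field value with the empty string."""
    return {name: (value or "") for name, value in named.items()}


# Stand-in for parsed named fields where the optional field "b" was empty.
parsed = {"a": "A", "b": None}
print(fields_as_strings(parsed))
```

Because `value or ""` replaces every falsy value, this only suits fields that are always strings, which is part of why moving the fallback into the type converter is more attractive.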