-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Regex support #69
Comments
I think this is an excellent idea. Could you give me an example of the kind of thing you need to run a regex over? Please be as specific as possible: it would be helpful in designing such a feature to have some example input, the expected output, and the kind of regex you'd need. |
Let's imagine I want to check the stderr output of a given command containing a particular pattern. For example, let's say I want to check if an error in a specific location occurs while running
PS. The error pattern in the example is used by GHC compiler, if I'm not mistaken. |
That seems quite sensible. You could run a filter over it to remove the unimportant parts but that might make a test failure message very confusing. How do you feel about this kind of syntax? command:
- MyTool
tests:
- name: negative
args:
- tests/negative/MyFile.X
exit-status: 1
stderr:
match: '(.+)/([^/]+)\.X:11:7-8: error.*' (You can probably get away with fewer escapes if we use Perl-compatible regular expressions and don't need to support Perl- or sed-style syntax.) You do lose one nice feature, in that you can't show a diff when the match is a regex, but as long as you're using it on short outputs, I don't think that's an issue. |
That syntax seems excellent. Fewer escapes for a regex, the better. Another possible option that may become handy is if we specify if the match has to occur in the whole output or a line. Maybe, something like this. But for now, the initial proposal sounds fantastic! ...
exit-status: 1
stderr:
match:
inline: '^(.+)/([^/]+)\.X:11:7-\d+: error.*'
... |
I suggest we make use of PCRE's native functionality. You can use I suggest a syntax along the lines of: match: <pattern> or: match:
pattern: <pattern>
flags:
- multiline # or `m`
- caseless # or `i`
- etc. The flags are specified in the pcrepattern specification. We won't have a way to match against multiple regular expressions and I can probably take a crack at this sometime in the next week, though I make no guarantees as to when I will finish it. If you'd like to give it a go, you're more than welcome. What would you prefer? |
Just to give you an update: in trying to implement this, I have found a few bugs in edge cases when interpreting the YAML, which are exacerbated by trying to add another type of matcher. So it's taking longer than expected. (I'm also not working on this full-time, as you might have guessed.) Long story short, probably don't implement even a simple expression language in YAML. I hope I can get this done soon-ish, but there is of course the family, who come first, and the day job, who pay me, so please don't hold your breath on it happening in the next couple of weeks. |
Understandable. I have similar time constraints and a trip this week, so please don't feel pressured. Maybe you want to share a branch to see the progress or suggest which places in the codebase to look first for anyone to experiment. |
This boils down to adding a new assertion type in I have laid a little bit of the groundwork in the |
Thinking about it, we may be able to just ignore the parsing issue that tripped me up. The main issue is that we can write:
But we can't write:
This is inconsistent with assertions that use a file. But perhaps we can just live with it for now, as long as we can write |
I have managed to make the time to sort out those vaguely-related issues. The next challenge is picking a regex engine. I was hoping to support Perl-compatible regular expressions (PCRE) as they're very powerful and I like them a lot. However, all the implementations require linking to a native library, which (a) means that we have a dynamic dependency we didn't have before, and (b) Windows support will be a nightmare. I tried getting both I guess I'll be trying |
I left this for a few months and when I came back, Windows support magically started working. I've now added this in #79. Please give it a shot, following the README changes, and let me know if you have any feedback. I'll merge it and ship a new release if it looks good for you (or in a few days if I get no response). |
I ended up reverting this change because I couldn't release; there seems to be no way to use text-icu statically across operating systems and architectures. Now I'm considering the regex-tdfa library, which supports POSIX regular expressions. They would be more limited. If you would still find this feature useful, could you let me know if that suits your purposes? |
Gonna re-open this. In #100, @ivan19871002 points out that regex-tdfa has some memory consumption issues, so that's not really an option. I will keep looking for a pure Haskell alternative. |
I revert this commit 888148d. re-add regex support. works well on all platform (x86_x64 linux , x86 mac, aarch64 linux) |
I'm glad it's working well for you. I have done the same thing and pushed the changes as the regex-icu branch, which you are welcome to use instead. I will support this branch and fix any issues until I can find a better solution, or figure out how to compile things statically on Windows and macOS. |
macOS brew link icu4c before compile export LDFLAGS="-L/usr/local/opt/icu4c/lib" then |
Hey
I didn't see it any support for testing the output against a regex expression in the documentation.
Sorry if I miss it, but I consider it essential. One of the reasons I'm still using ShellTestRunner.
The text was updated successfully, but these errors were encountered: