Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed regex in unbound2plain.py #62

Closed
wants to merge 5 commits into from

Conversation

tjohnson314
Copy link
Collaborator

Fixes #25

@bbriggs
Copy link
Contributor

bbriggs commented Jan 21, 2016

The new regex looks like it's working.

Now we need to update asv_plain.txt and spanish_plain.txt to reflect the new standard. Can you add that data and push it onto this PR?

@tjohnson314
Copy link
Collaborator Author

Oops, this isn't what I wanted. I updated the files inside the Vagrant box by running the new version of unbound2plain.py, exited, and committed them. But somehow once I'm outside the Vagrant box, the old version of the files is still there.

I'm confused now. I thought the corpora folders were synced? Maybe that doesn't guarantee that the subfolders under them are synced?

@bbriggs
Copy link
Contributor

bbriggs commented Jan 21, 2016

When you run unbound2plain.py from the tests folder, it outputs a file called asv_plain.txt.out. Move that to asv_plain.txt. Do the same for spanish. Those will serve as our test standards.

You can do this from outside the Mgiza machine by starting at the root of the git repo and running sh ./mgiza/tests/unbound2plain.sh

-- Briggs

On Jan 20, 2016, at 9:35 PM, tjohnson314 [email protected] wrote:

Oops, this isn't what I wanted. I updated the files inside the Vagrant box by running the new version of unbound2plain.py, exited, and committed them. But somehow once I'm outside the Vagrant box, the old version of the files is still there.

I'm confused now. I thought the corpora folders were synced? Maybe that doesn't guarantee that the subfolders under them are synced?


Reply to this email directly or view it on GitHub.

@bbriggs
Copy link
Contributor

bbriggs commented Jan 21, 2016

Now that I think about it, the output from my_tokenizer.py depends on correct input from unbound2plain.py, so you'll need to update the following files in /tests/data:

asv_plain.txt
spanish_plain.txt
asv_tokenized.txt
spanish_tokenized.txt

@tjohnson314
Copy link
Collaborator Author

The log might be messier than it needed to be, but everything should work now, haha.

@bbriggs
Copy link
Contributor

bbriggs commented Jan 22, 2016

Okay, @tjohnson314, I hate to do this to you, but here goes: We merged that reeeeally big re-org into development last night. Can you please pull and merge that change into your PR, then push it back? That should get us fixed for good on the testing front.

@bbriggs bbriggs self-assigned this Jan 22, 2016
bbriggs pushed a commit to bbriggs/rtt-linguiflow that referenced this pull request Jan 22, 2016
bbriggs pushed a commit to bbriggs/rtt-linguiflow that referenced this pull request Jan 23, 2016
stealthybox added a commit that referenced this pull request Jan 25, 2016
Integrates changes from #62 into new dev dir structure. Connects #25.
@stealthybox
Copy link
Collaborator

#66 is merged.
We should ensure that @tjohnson314 has a commit history though.

@bbriggs
Copy link
Contributor

bbriggs commented Jan 25, 2016

Agreed. His works as a hotfix on master and will show up in the commit history, but might not show up in the git blame once we push dev onto master.

@tjohnson314
Copy link
Collaborator Author

Wow, I didn't know git blame was a thing. That's pretty cool.
So what should I do?

@bbriggs
Copy link
Contributor

bbriggs commented Jan 25, 2016

I don't think you really need to do anything. You submitted a critically important hotfix to the master branch, which we really needed.

It all depends on how you want to be credited. Your commit will remain in the git history, but the blame will most likely shift to me for that line since I submitted that change in dev after yours on master. It's just a single line, though, and you have lots of other code contributed. Not to mention that you're driving a lot of our algorithm design. I wouldn't sweat credit over a single line in a regex :)

@bbriggs
Copy link
Contributor

bbriggs commented Jan 29, 2016

I really hate to do this and it's my mistake, but I fixed this in another PR thinking that this one was targeted at the master branch. Merging it in would require a lot of de-conflicting to just bring us back to where we already are.

I messed up, sorry.

@bbriggs bbriggs closed this Jan 29, 2016
@tjohnson314
Copy link
Collaborator Author

If you're worried about me getting credit, it doesn't really matter to me. That's not why I'm working on this. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants