Follow-up of GSoC: remove obsolete parts from main gem #405

zverok · 2017-09-01T21:44:49Z

I believe after release of daru-io there should be a period of cleaning up daru of duplicating things (importers and exporters), and related outdated stuff, like mentioned here: Please do not show the Install optional dependencies message everytime on runtime #404
I believe daru-io should be hard dependency for daru (like rspec installs rspec-core, rspec-expectations and so on) -- maybe in future with extraction of daru-dataframe-only gem, maybe not. But the point is, if somebody just install daru and wants his DF from CSV, they don't need to think about daru-io explicitly.
Currently I am not sure about parts of daru that can be removed due to daru-views existance, subject to investigate.
I believe such API change is marking next major version, and probably several other things should be done before the release, like Drop support for Ruby 2.0 #336, and probably Enhance specs style #317 and Make all inline docs proper YARD #318
And by this time we probably can do a Great Cleanup, like removing completely everything that was deprecated in last versions, and deprecate a bunch of another stuff, and process #cleanup-tagged issues, and merge some stuff like Defining group_by#aggregate #340 and ...

I am ready to lead the work (and do the most complicated parts myself), and propose the following (really hi-level) timeline:

September: finalizing GSoC, planning new release and define roadmap 🤔 💭
October: hard work
Beginning of November, hopefully (the earlier, the better, maybe I am overestimating the complexity of the road): Big Shiny Release 🎉

@v0dro @lokeshh @athityakumar @Shekharrajak @parthm @rainchen please feel free to share your thoughts, and notify if you are willing to take some part of the work.

The text was updated successfully, but these errors were encountered:

zverok · 2017-09-01T21:48:00Z

Probably, the work should be done on some daru-1.0 branch, and (as GitHub allows it) PRs on the road should be merged there, not in master?.. Or master would became our moving target, and stable branch should be maintained for current bugs fixing/small enchancements.

athityakumar · 2017-09-01T21:59:24Z

Since it's usually an understanding that master (or default branch) is usually the stable branch, I feel that it'd be better if we continue on daru-1.0 branch. Regarding the port of daru-io, I'd definitely like to contribute in removing obsolete IO parts from daru. A few pointers related to this porting :

IMO, #save and #load methods are better suited to reside within daru.
All other IO methods have already been ported to daru-io, and can be removed from daru.
Though #to_html seems like an Exporter method at first look, I feel that it'd be better to have it in daru-view. Of course, it can be ported to daru-io too if that'd be better.

zverok · 2017-09-01T22:06:39Z

Since it's usually an understanding that master (or default branch) is usually the stable branch, I feel that it'd be better if we continue on daru-1.0 branch.

👍

Regarding the port of daru-io, I'd definitely like to contribute in removing obsolete IO parts from daru.

👍

IMO, #save and #load methods are better suited to reside within daru.

👍

All other IO methods have already been ported to daru-io, and can be removed from daru.

👍

Though #to_html seems like an Exporter method at first look, I feel that it'd be better to have it in daru-view.

👍

Looks like we are on the same page currently :)

Shekharrajak · 2017-09-02T15:21:01Z

daru uses Nyaplot, Gruff plotting libraries. daru-view uses Nyaplot (same as daru but it is able to generate html code) , HighCharts, GoogleCharts for plotting.

So if we want to shift the whole plotting libraries to daru-view then we need to shift the Gruff library to daru-view (without any change) and need some changes in daru-view/nyaplot module.

I will try to do this part of work.

parthm · 2017-09-03T13:48:00Z

@zverok looks like a good plan. I would be glad to help on some pieces as time permits.

Since this is toward v1.0, it may be worth reviewing Daru API for consistency. As 1.0 is a major version I would think it would be significantly more difficult to make any API adjustments post 1.0. Two examples that come to mind are the discussion on to/from-read/write on issue #280 and the inconsistency between DataFrame#vector_sum and Vector#sum in issue #391 (the latter ignores nil by default, the former doesn't). With Daru usage growing, it may be worth using the major version change to iron out any API inconsistencies.

A somewhat related question would be, does Daru have a deprecation policy and versioning guideline in place to help with breaking changes? This can help users with migration to newer versions of Daru and becomes important due to its growing adoption.

parthm · 2017-09-03T13:58:32Z

Just too keep track of another API proposal discussed, issue #131 talks about changing #dv to #to_daru_vector.

zverok · 2017-09-03T17:19:15Z

I would be glad to help on some pieces as time permits.

Nice to hear that!

Since this is toward v1.0, it may be worth reviewing Daru API for consistency.

Yes, it would definitely a huge review. As @v0dro officially made me the maintainer, I am planning a Big Autumn Cleanup for a huge part of our ancestry Hold your hats.

A somewhat related question would be, does Daru have a deprecation policy and versioning guideline in place to help with breaking changes? This can help users with migration to newer versions of Daru and becomes important due to its growing adoption.

We have deprecate API (which marks methods as deprecated and prints warnings), but to that point never removed anything deprecated. Probably, it is a good point to explicitly set the policy (and even add some automation tool for check it, like "should ve drop this or that methods in current version?")

zverok · 2017-09-03T19:21:12Z

Just too keep track of another API proposal discussed, issue #131 talks about changing #dv to #to_daru_vector.

To be honest, current preferred by community (outside Rails one, ofc) Ruby style is "avoid monkey patches of core classes if only possible". I believe we'll follow this rule in 1.0

v0dro · 2017-09-05T13:49:46Z

@baarkerlounger, since you're working on release cycles, will it possible for you to take inspiration from your previous work and other gems and frame a 'release policy' for daru which we will consistetly follow for every single new release in the future? I think it is very important given that this is becoming a popular project and more people have started contributing to it. Please share your thoughts on this issue: #411

v0dro · 2017-09-05T13:58:14Z

Also, I think the next version (after the great cleanup) should be 0.2 not 1.0. We'll release 1.0 once the data storage issue is sorted out: #328

zverok · 2017-09-05T15:47:50Z

Also, I think the next version (after the great cleanup) should be 0.2 not 1.0. We'll release 1.0 once the data storage issue is sorted out: #328

Well, you are the boss, but I believe with current usage, number of GitHub stars, and so on and so forth -- the new version with seriously updated docs, removed deprecations, new importers/exporters, ... well, I really believe it could be called The One-Point-Oh.

As about #328 -- I am not sure that it should become the main implementation of Daru. Because Daru is useful also as a small, easy-to-install, almost-no-dependencies concept of DataFrame and Vector, and with daru-io and daru-view even more so (like "take those 300 rows from DB and export them to Excel").

I understand the goal to compete with pandas, yet, to be honest, I still believe all internal speedups with dependencies on side libraries and compiled codes should be only plugins, not the core of the library.

Of course, you may have another opinion, and, like I've said, you are the boss.

baarkerlounger · 2017-09-05T15:53:10Z

I guess the outcome of this version numbering should be part of the 'release policy' - personally I like have x.y.z with x number bumps being used only for breaking changes that will require user code to change, y being bumped for new features, z being used for bug fixes only.

That way users can have consistent expectations about behaviour from the numbering. So I think 0.2.0 or 1.0.0 should depend on whether the great cleanup is backwards compatible or not.

v0dro · 2017-09-05T17:02:27Z

Hmmmm yes @baarkerlounger is right. @zverok I don't think there will be too many breaking changes due to the cleanup given that we're essentially only moving out the IO and view methods, right?

About the data storage, I think it will probably be a plugin which will over-ride all core daru functionality once required.

baarkerlounger · 2017-09-05T17:05:19Z

I also think there should be another release when the currently open PRs are merged.

parthm · 2017-09-05T17:24:23Z

Based on autumn cleanup review @zverok mentoned in an earlier note, if we have API consistency being handled in this release, labeling it 1.0 might make sense. In case it's only internal cleanup I suppose we could call it 0.2.

I still believe all internal speedups with dependencies on side libraries and compiled codes should be only plugins, not the core of the library.

Just to cast my vote here, this approach makes sense IMHO. It makes it very easy to get started with Daru. Having a huge install by default can sometimes be painful. I had issues earlier when I tried the full install of SciRuby (IIRC it was some compilation issues with gsl) and nearly ended up going with Pandas. Fortunately, a Daru only install worked smoothly and I could happily use Ruby/Daru.

zverok · 2017-09-06T06:21:55Z

@zverok I don't think there will be too many breaking changes due to the cleanup given that we're essentially only moving out the IO and view methods, right?

Well, my idea of the cleanup (described in original issue) is:

move out IO/view
it already will introduce some changes and most probably incompatibilities, therefore...
remove what was deprecated for a long time
go through all unmerged PRs, unfinished issues and style and fix it

So, I've meant that splitting out daru-io is a BIG move, and it could be a reason for dropping the outdated things, and doing a New Big Shiny release.

v0dro · 2017-10-08T03:20:16Z

Yes I agree. @zverok do you plan to work on this?

zverok · 2017-10-08T13:31:33Z

Right now sitting and looking into the large file with a plan :) Will publish today or tomorrow's morning!

zverok · 2017-10-08T22:37:01Z

I've created branch v-1-pre targeting all "big" (e.g. incompatible) changes and GitHub issues milestone.
The latter is NOT complete currently (I am still reviewing through all we have).

As for myself, I am planning currently to work on updating specs (and making a list of methods to deprecate/redesign), and if evrybody is OK with it I'll work just in the branch, without PRs.

baarkerlounger · 2017-10-08T22:40:06Z

@zverok why no PRs? Will end up with a potentially huge branch to review?

zverok · 2017-10-08T22:46:16Z

@baarkerlounger because I am planning mostly to do "maintenance" tasks (change minimal Ruby version, facelift code style here and there and so on), and also gradually update specs between that, it would be work-in-progress of preparation of Big Release. It would be iterations like "look at Index, fix specs, discuss possible deprecations, fix a bit more specs, cleanup some method" and so on.

(And one huge PR "fix all specs" would be almost un-reviewable anyways.)

athityakumar · 2017-10-13T10:01:24Z

I've submitted PR #430 to port IO modules away from daru, please review. (I'll resolve the merge conflict along with review changes) 😄

zverok · 2017-11-11T12:02:59Z

JFYI: not sleeping. Started to refactor specs & docs, which somehow leaded to HUGE lib review, current results (WIP, not passing specs at the moment) in https://github.com/SciRuby/daru/tree/awfully-big-refactoring branch. Current plan is to finish it there, then merge into v-1-pre branch, then fix everything that will left of milestone :) ETA ~1-2 weeks on my side.

athityakumar mentioned this issue Sep 2, 2017

Add #to_html method SciRuby/daru-view#62

Open

zverok added this to the Version 1.0 milestone Oct 8, 2017

Shekharrajak mentioned this issue Jan 9, 2019

Error When Using Highcharts / Googlecharts SciRuby/daru-view#130

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Follow-up of GSoC: remove obsolete parts from main gem #405

Follow-up of GSoC: remove obsolete parts from main gem #405

zverok commented Sep 1, 2017 •

edited

Loading

zverok commented Sep 1, 2017

athityakumar commented Sep 1, 2017

zverok commented Sep 1, 2017

Shekharrajak commented Sep 2, 2017

parthm commented Sep 3, 2017

parthm commented Sep 3, 2017

zverok commented Sep 3, 2017

zverok commented Sep 3, 2017

v0dro commented Sep 5, 2017

v0dro commented Sep 5, 2017

zverok commented Sep 5, 2017

baarkerlounger commented Sep 5, 2017

v0dro commented Sep 5, 2017

baarkerlounger commented Sep 5, 2017

parthm commented Sep 5, 2017

zverok commented Sep 6, 2017

v0dro commented Oct 8, 2017

zverok commented Oct 8, 2017

zverok commented Oct 8, 2017

baarkerlounger commented Oct 8, 2017

zverok commented Oct 8, 2017

athityakumar commented Oct 13, 2017

zverok commented Nov 11, 2017

Follow-up of GSoC: remove obsolete parts from main gem #405

Follow-up of GSoC: remove obsolete parts from main gem #405

Comments

zverok commented Sep 1, 2017 • edited Loading

zverok commented Sep 1, 2017

athityakumar commented Sep 1, 2017

zverok commented Sep 1, 2017

Shekharrajak commented Sep 2, 2017

parthm commented Sep 3, 2017

parthm commented Sep 3, 2017

zverok commented Sep 3, 2017

zverok commented Sep 3, 2017

v0dro commented Sep 5, 2017

v0dro commented Sep 5, 2017

zverok commented Sep 5, 2017

baarkerlounger commented Sep 5, 2017

v0dro commented Sep 5, 2017

baarkerlounger commented Sep 5, 2017

parthm commented Sep 5, 2017

zverok commented Sep 6, 2017

v0dro commented Oct 8, 2017

zverok commented Oct 8, 2017

zverok commented Oct 8, 2017

baarkerlounger commented Oct 8, 2017

zverok commented Oct 8, 2017

athityakumar commented Oct 13, 2017

zverok commented Nov 11, 2017

zverok commented Sep 1, 2017 •

edited

Loading