#777 #778

Draft
wants to merge 1 commit into
base: master

Conversation

tribbloid

Experimental prototype of two-stage encoder derivation for record-like objects:

  • generic
  • typedRow

@pomadchin
Member

pomadchin commented Nov 26, 2023

Could you add more description and re-open the PR as a draft?
Also, that's way too many fmt changes; it's hard to review. I know that it's WIP, but just making sure we're on the same page.

tribbloid marked this pull request as draft on November 26, 2023 04:05
tribbloid force-pushed the RecordEncoder/spike2 branch 2 times, most recently from 5b65761 to 89bcd8b, on November 26, 2023 05:29
@tribbloid
Author

tribbloid commented Nov 26, 2023

@pomadchin Yep, easy. I just submitted a minimal version and converted the PR to a draft.

In this PR, the new TypedRow[T <: HList] uses the intermediate Record type G in RecordEncoder as its compile-time schema. This schema representation is as informative as a product type but more flexible.

It has its own encoder, which shares most of its implementation and implicit evidence with the old RecordEncoder. This allows a TypedDataset[TypedRow[_]] to be converted to and from an RDD[TypedRow[_]] and a Spark DataFrame at will.

The new test case under RecordEncoderTests demonstrates its usage.
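
To illustrate the general idea of a record type acting as a compile-time schema, here is a minimal sketch. It is not the code in this PR: it assumes Scala 2 with shapeless 2.3.x on the classpath, and the names Row, ColumnNames, columnsOf and Person are hypothetical stand-ins invented for the example. Stage 1 turns a case class into its labelled record; stage 2 derives something (here just the column names) from the record type alone.

```scala
import shapeless.{::, HList, HNil, LabelledGeneric, Witness}
import shapeless.labelled.FieldType

object RecordSchemaSketch {

  // Hypothetical stand-in for a typed row: wraps a shapeless record R.
  final case class Row[R <: HList](repr: R)

  object Row {
    // Stage 1: convert a case class A into its labelled record representation R.
    def fromProduct[A, R <: HList](a: A)(
        implicit gen: LabelledGeneric.Aux[A, R]
    ): Row[R] = Row(gen.to(a))
  }

  // Stage 2: a type class resolved from the record type only,
  // with no reference to the original case class.
  trait ColumnNames[R <: HList] { def names: List[String] }

  object ColumnNames {
    // Base case: the empty record has no columns.
    implicit val hnil: ColumnNames[HNil] =
      new ColumnNames[HNil] { def names: List[String] = Nil }

    // Inductive case: read the head field's key from its singleton type.
    implicit def hcons[K <: Symbol, V, T <: HList](
        implicit key: Witness.Aux[K], tail: ColumnNames[T]
    ): ColumnNames[FieldType[K, V] :: T] =
      new ColumnNames[FieldType[K, V] :: T] {
        def names: List[String] = key.value.name :: tail.names
      }
  }

  // Column names are recovered purely from the row's type parameter.
  def columnsOf[R <: HList](row: Row[R])(
      implicit c: ColumnNames[R]
  ): List[String] = c.names

  // Example product type (an assumption, used only for this demo).
  final case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val row = Row.fromProduct(Person("Ada", 36))
    println(columnsOf(row))
  }
}
```

Because the second stage depends only on the record type, the same derivation would apply to a record built by hand or by appending a field, which is the sense in which a record schema can be more flexible than a fixed product type.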

tribbloid force-pushed the RecordEncoder/spike2 branch from 89bcd8b to 51b9d71 on November 26, 2023 05:48
@tribbloid
Author

In the future we can declare TypedDataFrame as an alias of TypedDataset[TypedRow[_]], and make the behaviour of functions like withColumn more consistent with their vanilla Spark counterparts.

@tribbloid
Author

@pomadchin can the example be reproduced in your environment? Do you anticipate any difficulty in implementing this roadmap?


2 participants