Can the writer support the scenario when the output is a positional file with header&data?

## Question
Can the writer produce a positional output file that contains both a header and data?

Consider the header and the data as two sets of records with different schemas; usually the header is 1 row and the data is N rows.

What I was considering:
- Manage header and data in the same Spark DataFrame, whose schema is the union of the header columns and the data columns. Header columns have values only in the first row and nulls in the remaining N rows, while data columns have nulls in the first row and values only in the N rows after it (where N is the number of data records in my DataFrame).
- Write the DataFrame with 2 different copybooks: one with the header schema (writing only the first row) and one with the data schema (writing all rows except the first). When writing, columns that are not in the copybook will not be written to the output.
- The result: 2 separate files.
- Merge them in the order header > data.
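
The last step (merging in the order header > data) could be sketched as below. This is only a minimal illustration, not part of the writer: it assumes both writes produce uncompressed Spark output directories with `part-*` files containing plain fixed-length records, so a byte-level concatenation is valid. The function name `merge_header_and_data` and the directory layout are hypothetical.

```python
import shutil
from pathlib import Path


def merge_header_and_data(header_dir, data_dir, out_path):
    """Concatenate Spark part files: all header parts first, then all data parts.

    Assumption: the part files hold raw fixed-length records with no
    per-file framing, so appending bytes yields a valid positional file.
    """
    out_path = Path(out_path)
    with out_path.open("wb") as out:
        for directory in (Path(header_dir), Path(data_dir)):
            # Spark names its outputs part-00000, part-00001, ...; sorting by
            # name keeps the partition order within each directory. Marker
            # files such as _SUCCESS and .crc checksums do not match the pattern.
            for part in sorted(directory.glob("part-*")):
                with part.open("rb") as src:
                    shutil.copyfileobj(src, out)
    return out_path
```

For example, `merge_header_and_data("out/header", "out/data", "out/merged.dat")` would write the header records followed by the data records into a single file. Note that the row order inside the data file is only as deterministic as the partition order of the DataFrame that was written.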

Do you think it will work? Is there something smarter?

Can the writer support the scenario when the output is a positional file with header&data? #843