Fix typos, grammar, and comment inconsistencies in parquet.thrift#573
Fix typos, grammar, and comment inconsistencies in parquet.thrift#573iemejia wants to merge 2 commits into
Conversation
- Fix typos: "to be be", "documention", "not necessary"
- Remove off-by-one in DataPageHeaderV2 is_compressed comment
- Fix article agreement ("a element" -> "an element", "a OffsetIndex" -> "an OffsetIndex")
- Disambiguate compressed_page_size comment in PageLocation
- Fix "edges interpolation" -> "edge interpolation" in Geospatial comments
- Capitalize proper nouns: Hive, Pig; normalize GZIP casing
- Add terminal periods for consistency
- Clarify BIT_PACKED is superseded by RLE (cross-reference Encodings.md)
- Missing space before parenthesis in frameworks list
Thrift validation passes after these edits.
etseidl
left a comment
There was a problem hiding this comment.
Thanks for taking this on. Just a few minor nits.
| * Embedded Geometry logical type annotation | ||
| * | ||
| * Geospatial features in the Well-Known Binary (WKB) format and edges interpolation | ||
| * Geospatial features in the Well-Known Binary (WKB) format and edge interpolation |
There was a problem hiding this comment.
I think "edges" is the common usage in GeoParquet. Maybe this should be in backticks (edges) here and elsewhere.
There was a problem hiding this comment.
Good point, changed to edges interpolation with backticks.
| * plain type. | ||
| * in a data page use RLE_DICTIONARY instead. | ||
| * in a Dictionary page use PLAIN instead | ||
| * in a Dictionary page use PLAIN instead. |
There was a problem hiding this comment.
| * in a Dictionary page use PLAIN instead. | |
| * In a Dictionary page use PLAIN instead. |
If we're adding punctuation, then shouldn't we also capitalize? And now that I think of it, should it be "For a..." rather than "In a..."
There was a problem hiding this comment.
Done. Changed both sentences to start with "For a ..." and capitalized: For a data page use RLE_DICTIONARY instead. / For a Dictionary page use PLAIN instead.
| RLE = 3; | ||
|
|
||
| /** Bit packed encoding. This can only be used if the data has a known max | ||
| /** Deprecated: Bit packed encoding. This can only be used if the data has a known max |
There was a problem hiding this comment.
We're pretty inconsistent here. Sometimes "DEPRECATED:", others "Deprecated:". Perhaps we should standardize?
There was a problem hiding this comment.
Agreed, standardized all label-style occurrences to DEPRECATED: (uppercase), inspired by the RFC convention for keywords. Inline prose uses (e.g. "this field is now deprecated") are left lowercase as normal English.
… standardize DEPRECATED
etseidl
left a comment
There was a problem hiding this comment.
Thanks. Just a few more "edges" to clean up.
| * | ||
| * Geospatial features in the WKB format with an explicit (non-linear/non-planar) | ||
| * edges interpolation algorithm. | ||
| * edge interpolation algorithm. |
There was a problem hiding this comment.
| * edge interpolation algorithm. | |
| * `edges` interpolation algorithm. |
| * defaults to "OGC:CRS84". | ||
| * | ||
| * An optional algorithm can be set to correctly interpret edges interpolation | ||
| * An optional algorithm can be set to correctly interpret edge interpolation |
There was a problem hiding this comment.
| * An optional algorithm can be set to correctly interpret edge interpolation | |
| * An optional algorithm can be set to correctly interpret `edges` interpolation |
Summary
Fix typos, grammar, and comment inconsistencies in the canonical Thrift schema definition.
Changes
is_compressedcommentcompressed_page_sizecomment in PageLocation (it includes the header; the field of the same name on PageHeader does not)Validation
Thrift definition compiles cleanly after all changes. No semantic/behavioral changes to the format specification.
Split from #572 for easier review.