docs: update data_types.md to reflect current Arrow type mappings #20072
Which issue does this PR close?
- `data_types.md` page #18314
Rationale for this change
The documentation in `data_types.md` was outdated: it showed `Utf8` as the default mapping for character types (CHAR, VARCHAR, TEXT, STRING), but the current implementation defaults to `Utf8View`. The mismatch with the actual behavior confused users reading the documentation.
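
To make the mismatch concrete, here is a minimal sketch of how the default mapping could be checked with `arrow_typeof`. It assumes a session with default settings (for example, a fresh `datafusion-cli`); the column aliases are illustrative only.

```sql
-- Assumption: default session settings, so
-- datafusion.sql_parser.map_string_types_to_utf8view is left untouched.
-- Per the description above, each column should report Utf8View rather than Utf8.
SELECT
  arrow_typeof(CAST('abc' AS CHAR))    AS char_type,
  arrow_typeof(CAST('abc' AS VARCHAR)) AS varchar_type,
  arrow_typeof(CAST('abc' AS TEXT))    AS text_type,
  arrow_typeof(CAST('abc' AS STRING))  AS string_type;
```
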
Additionally, the "Supported Arrow Types" section at the end was redundant: `arrow_typeof` now supports all Arrow types, so the comprehensive list is unnecessary.
What changes are included in this PR?
- `Utf8` to `Utf8View` for CHAR, VARCHAR, TEXT, and STRING types
- `datafusion.sql_parser.map_string_types_to_utf8view` setting that allows users to switch back to `Utf8` if needed
Are these changes tested?
This is a documentation-only change. The documentation accurately reflects the current behavior of DataFusion:
- `Utf8View` is the current implementation behavior
- The `datafusion.sql_parser.map_string_types_to_utf8view` configuration option exists and works as documented
Are there any user-facing changes?
Yes, documentation changes only. Users will now see accurate information about:
- `Utf8` behavior
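
For readers who want to try the `Utf8` behavior themselves, the following is a rough sketch, assuming the option can be changed per session with a `SET` statement as other `datafusion.*` options can. The literal and the choice of `VARCHAR` are illustrative only.

```sql
-- Assumption: the option is session-settable via SET.
-- Turning it off should restore the older Utf8 mapping for string types.
SET datafusion.sql_parser.map_string_types_to_utf8view = false;

-- Per the PR description, this should now report Utf8 instead of Utf8View.
SELECT arrow_typeof(CAST('abc' AS VARCHAR)) AS varchar_type;
```

Setting the option back to `true` should return the session to the `Utf8View` default.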