AnalysisException: Attribute name contains invalid character(s) issue #462
It's not only this column name (with special characters): a normal column name like "Transaction description" also fails with the same error. That means if Spark reads a CSV file whose column names contain spaces into a DataFrame, writing it out throws the same AnalysisException. Is there any way to fix this automatically, e.g. by replacing the space with a default character? Or is there any other solution for this scenario?
You can use select(col("Transaction description").alias("Transaction_description")).
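The alias workaround above can be sketched as follows (a hedged sketch assuming PySpark; `spark` and the CSV path are hypothetical, so the Spark calls are shown as comments and the snippet stands alone):

```python
# Hypothetical PySpark usage (assumes a SparkSession `spark` and a CSV
# with a "Transaction description" column):
#
#   from pyspark.sql.functions import col
#   df = spark.read.csv("transactions.csv", header=True)
#   df = df.select(
#       col("Transaction description").alias("Transaction_description")
#   )
#
# The alias simply swaps the space in the header for an underscore:
def to_safe_name(name: str) -> str:
    # Replace spaces, which Parquet disallows in column names
    return name.replace(" ", "_")

print(to_safe_name("Transaction description"))  # Transaction_description
```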
Yes, we can do it programmatically. Is there any option to fix it without doing so?
Parquet doesn't allow storing such column names. I'd say it's better engineering practice for you to follow some convention yourself and fix the names, rather than having some system arbitrarily fix them for you.
I'm able to do the same operation using the DataFrame API. I would say it's a feature of Delta Lake that it enforces following that good practice.
That has indeed been one of our core design principles: an opinionated view of how to manage data without shooting yourself in the foot.
Closing this. Invalid characters are disallowed by design. |
Renaming each column in a loop (`for c in df.columns: ...`) worked fine in my case.
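The rename loop can be sketched like this (a hedged sketch with hypothetical column names; the sanitization strips exactly the characters listed in the exception message below, and the Spark calls are commented so the snippet runs without a cluster):

```python
# Characters Spark's Parquet writer rejects in attribute names,
# per the AnalysisException message: " ,;{}()\n\t="
INVALID = " ,;{}()\n\t="

def sanitize(name: str) -> str:
    # Replace each disallowed character with an underscore
    return "".join("_" if ch in INVALID else ch for ch in name)

# Pure-Python check with hypothetical column names:
cols = ["Code région", "Transaction description"]
print([sanitize(c) for c in cols])  # ['Code_région', 'Transaction_description']

# Applying it to a Spark DataFrame (assuming `df` exists):
#
#   for c in df.columns:
#       df = df.withColumnRenamed(c, sanitize(c))
```

Note that accented characters such as "é" are legal; only the listed punctuation and whitespace characters need replacing.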
Exception in thread "main" org.apache.spark.sql.AnalysisException: Attribute name "Code région" contains invalid character(s) among " ,;{}()\n\t=". Please use alias to rename it.;