Unable to Save Apache Spark parquet file to csv with Databricks


I’m trying to save/convert a parquet file to CSV on Apache Spark with Databricks, but I’m not having much luck.

The following code successfully writes to a folder called tempDelta:

df.coalesce(1).write.format("parquet").mode("overwrite").option("header","true").save(saveloc+"/tempDelta") 

I would then like to convert the parquet file to CSV as follows:

df.coalesce(1).write.format("parquet").mode("overwrite").option("header","true").save(saveloc+"/tempDelta").csv(saveloc+"/tempDelta")

AttributeError                            Traceback (most recent call last)
<command-2887017733757862> in <module>
----> 1 df.coalesce(1).write.format("parquet").mode("overwrite").option("header","true").save(saveloc+"/tempDelta").csv(saveloc+"/tempDelta")

AttributeError: 'NoneType' object has no attribute 'csv'

I have also tried the following after writing to the location:

df.write.option("header","true").csv(saveloc+"/tempDelta2") 

But I get the error:

A transaction log for Databricks Delta was found at `/CURATED/F1Area/F1Domain/final/_delta_log`, but you are trying to write to `/CURATED/F1Area/F1Domain/final/tempDelta2` using format("csv"). You must use 'format("delta")' when reading and writing to a delta table. 

And when I try to save as a CSV to a folder that isn’t a delta folder, I get the following error:

df.write.option("header","true").csv("testfolder")

AnalysisException: CSV data source does not support struct data type.

Can someone let me know the best way of saving/converting from parquet to CSV with Databricks?


Answer 1: Unable to Save Apache Spark parquet file to csv with Databricks

You can use either of the below 2 options:

1. df.write.option("header", "true").csv(path)
2. df.write.format("csv").save(path)

Note: You can't specify format("parquet") and then call the .csv function in the same write; .save() returns None, which is why chaining .csv() after it raised the AttributeError above.
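Even with a separate CSV write, the struct-column and Delta-folder errors from the question can still occur. Below is a minimal PySpark sketch (the output folder name tempCsv and the to_json-based flattening are illustrative assumptions, not part of the original answer) that converts struct columns to JSON strings and writes the CSV to a path outside the existing Delta table:

from pyspark.sql import functions as F
from pyspark.sql.types import StructType

# The CSV writer cannot serialise struct columns, so convert any struct column
# to a JSON string first and leave the other columns unchanged.
df_flat = df.select([
    F.to_json(F.col(f.name)).alias(f.name) if isinstance(f.dataType, StructType)
    else F.col(f.name)
    for f in df.schema.fields
])

# Write to a folder that is not inside an existing Delta table directory,
# otherwise Spark insists on format("delta").
df_flat.coalesce(1).write.mode("overwrite").option("header", "true").csv(saveloc + "/tempCsv")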
