Write Spark dataframe to file using Python and ‘|’ delimiter

I have constructed a Spark dataframe from a query. What I wish to do is print the dataframe to a text file with all information delimited by ‘|’, like the following:

+-------+----+----+----+
|Summary|col1|col2|col3|
+-------+----+----+----+
|row1   |1   |14  |17  |
|row2   |3   |12  |2343|
+-------+----+----+----+

How can I do this?

Answer

You can write the dataframe as CSV and choose | as the delimiter:

df.write.option("sep","|").option("header","true").csv(filename)

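This will not be exactly the same as your example but it is close: the output has no +----+ border lines and no column padding, and Spark writes a directory of part files at that path rather than a single text file.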

Alternatively, you can collect the rows to the driver and write the file yourself, e.g.:

myprint(df.collect())

or

myprint(df.take(100))

Here myprint stands for a function you write yourself. df.collect and df.take both return a list of rows.
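As a rough illustration of the "do it yourself" route, here is a minimal sketch of what such a helper could look like. The names format_table and myprint, and the (rows, columns, path) signature, are assumptions made for this example and are not part of Spark. It relies only on each row being iterable, which holds for Spark Row objects because they are tuples.

```python
def format_table(rows, columns):
    """Render rows as a '|'-delimited text table with +---+ borders.

    Hypothetical helper: `rows` is any list of tuples (for example the
    result of df.collect()), `columns` is the list of column names.
    """
    # Convert every cell to text so that widths can be measured.
    text_rows = [[str(cell) for cell in row] for row in rows]

    # Each column is as wide as its longest header or cell.
    widths = [len(name) for name in columns]
    for row in text_rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))

    border = "+" + "+".join("-" * w for w in widths) + "+"

    def line(cells):
        # Left-justify each cell to its column width.
        return "|" + "|".join(c.ljust(w) for c, w in zip(cells, widths)) + "|"

    body = [line(r) for r in text_rows]
    return "\n".join([border, line(columns), border] + body + [border])


def myprint(rows, columns, path):
    """Write the formatted table to a text file (hypothetical signature)."""
    with open(path, "w") as f:
        f.write(format_table(rows, columns) + "\n")
```

With a Spark dataframe this could be called as myprint(df.collect(), df.columns, "out.txt"), or with df.take(100) to limit how many rows are brought to the driver.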

Lastly, you can collect to the driver using toPandas and use pandas tools.
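A minimal sketch of the pandas route, assuming pandas is installed on the driver. The function name write_pipe_delimited is made up for this example; to_csv with sep and index is the standard pandas call. Like the CSV writer above, this gives plain '|'-separated values without borders or padding.

```python
import pandas as pd


def write_pipe_delimited(pdf, path):
    """Write a pandas DataFrame to `path` as '|'-delimited text.

    Hypothetical helper: `pdf` would typically come from df.toPandas().
    """
    # index=False leaves out the pandas row index, which the Spark
    # dataframe does not have.
    pdf.to_csv(path, sep="|", index=False)
```

Usage would be write_pipe_delimited(df.toPandas(), "out.txt"). Note that toPandas brings the whole dataframe to the driver, so this only suits data that fits in the driver's memory.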

Attribution
Source: Link, Question Author: Brian Waters, Answer Author: Assaf Mendelson
