Skip to content

Commit

Permalink
[SPARK-2130] End-user friendly String repr for StorageLevel in Python
Browse files Browse the repository at this point in the history
JIRA issue https://issues.apache.org/jira/browse/SPARK-2130

This PR adds an end-user friendly String representation for StorageLevel
in Python, similar to ```StorageLevel.description``` in Scala.
```
>>> rdd = sc.parallelize([1,2])
>>> storage_level = rdd.getStorageLevel()
>>> storage_level
StorageLevel(False, False, False, False, 1)
>>> print(storage_level)
Serialized 1x Replicated
```

Author: Kan Zhang <[email protected]>

Closes apache#1096 from kanzhang/SPARK-2130 and squashes the following commits:

7c8b98b [Kan Zhang] [SPARK-2130] Prettier epydoc output
cc5bf45 [Kan Zhang] [SPARK-2130] End-user friendly String representation for StorageLevel in Python
  • Loading branch information
kanzhang authored and pwendell committed Jun 17, 2014
1 parent 7afa912 commit d81c08b
Show file tree
Hide file tree
Showing 2 changed files with 12 additions and 0 deletions.
3 changes: 3 additions & 0 deletions python/pyspark/rdd.py
Original file line number Diff line number Diff line change
Expand Up @@ -1448,9 +1448,12 @@ def toDebugString(self):
def getStorageLevel(self):
"""
Get the RDD's current storage level.
>>> rdd1 = sc.parallelize([1,2])
>>> rdd1.getStorageLevel()
StorageLevel(False, False, False, False, 1)
>>> print(rdd1.getStorageLevel())
Serialized 1x Replicated
"""
java_storage_level = self._jrdd.getStorageLevel()
storage_level = StorageLevel(java_storage_level.useDisk(),
Expand Down
9 changes: 9 additions & 0 deletions python/pyspark/storagelevel.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,15 @@ def __repr__(self):
return "StorageLevel(%s, %s, %s, %s, %s)" % (
self.useDisk, self.useMemory, self.useOffHeap, self.deserialized, self.replication)

def __str__(self):
result = ""
result += "Disk " if self.useDisk else ""
result += "Memory " if self.useMemory else ""
result += "Tachyon " if self.useOffHeap else ""
result += "Deserialized " if self.deserialized else "Serialized "
result += "%sx Replicated" % self.replication
return result

StorageLevel.DISK_ONLY = StorageLevel(True, False, False, False)
StorageLevel.DISK_ONLY_2 = StorageLevel(True, False, False, False, 2)
StorageLevel.MEMORY_ONLY = StorageLevel(False, True, False, True)
Expand Down

0 comments on commit d81c08b

Please sign in to comment.