[SPARK-8903] Fix bug in cherry-pick of SPARK-8803
This fixes a bug introduced in the cherry-pick of apache#7201 which led to a NullPointerException when cross-tabulating a data set that contains null values.

Author: Josh Rosen <[email protected]>

Closes apache#7295 from JoshRosen/SPARK-8903 and squashes the following commits:

5489948 [Josh Rosen] [SPARK-8903] Fix bug in cherry-pick of SPARK-8803
JoshRosen committed Jul 8, 2015
1 parent 4df0f1b commit 3f6e6e0
Showing 1 changed file with 3 additions and 3 deletions.
@@ -113,7 +113,7 @@ private[sql] object StatFunctions extends Logging {
if (element == null) "null" else element.toString
}
// get the distinct values of column 2, so that we can make them the column names
- val distinctCol2: Map[Any, Int] =
+ val distinctCol2: Map[String, Int] =
counts.map(e => cleanElement(e.get(1))).distinct.zipWithIndex.toMap
val columnSize = distinctCol2.size
require(columnSize < 1e4, s"The number of distinct values for $col2, can't " +
@@ -128,7 +128,7 @@ private[sql] object StatFunctions extends Logging {
countsRow.setLong(columnIndex + 1, row.getLong(2))
}
// the value of col1 is the first value, the rest are the counts
- countsRow.setString(0, cleanElement(col1Item.toString))
+ countsRow.setString(0, cleanElement(col1Item))
countsRow
}.toSeq
// Back ticks can't exist in DataFrame column names, therefore drop them. To be able to accept
@@ -139,7 +139,7 @@ private[sql] object StatFunctions extends Logging {
// In the map, the column names (._1) are not ordered by the index (._2). This was the bug in
// SPARK-8681. We need to explicitly sort by the column index and assign the column names.
val headerNames = distinctCol2.toSeq.sortBy(_._2).map { r =>
- StructField(cleanColumnName(r._1.toString), LongType)
+ StructField(cleanColumnName(r._1), LongType)
}
val schema = StructType(StructField(tableName, StringType) +: headerNames)
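
The null-handling difference between `cleanElement(col1Item.toString)` and `cleanElement(col1Item)` can be reproduced outside Spark. The following is a minimal sketch, not the actual `StatFunctions` code: the object name `CrosstabNullSketch`, the methods `beforeFix` and `afterFix`, and the sample values are hypothetical; only `cleanElement` mirrors the helper visible in the diff.

```scala
import scala.util.Try

// Hypothetical stand-alone sketch; not part of Spark.
object CrosstabNullSketch {
  // Mirrors the helper in the diff: a null element becomes the string "null".
  def cleanElement(element: Any): String =
    if (element == null) "null" else element.toString

  // Pattern removed by the commit: .toString is evaluated on the raw value
  // before cleanElement runs, so a null value throws NullPointerException.
  def beforeFix(items: Seq[Any]): Try[Seq[String]] =
    Try(items.map(item => cleanElement(item.toString)))

  // Pattern introduced by the commit: the raw value is passed straight
  // to cleanElement, which performs the null check itself.
  def afterFix(items: Seq[Any]): Seq[String] =
    items.map(item => cleanElement(item))

  def main(args: Array[String]): Unit = {
    // Hypothetical column values; the null stands in for a missing entry.
    val col1Items: Seq[Any] = Seq("a", null, 3)
    println(beforeFix(col1Items).isFailure)
    println(afterFix(col1Items))
  }
}
```

Because `cleanElement` already returns a `String`, the map of distinct values can be typed as `Map[String, Int]` and the later `.toString` call on its keys becomes unnecessary, which appears to be why the other two changed lines drop it as well.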
